Diffusion modelling is a very recent approach to text generation, following the immense success of diffusion models in image generation. The project aims to explore and build a diffusion-based generative multimodal LLM for code and text generation tasks, such as code generation from natural language instructions, code summarization, and related tasks. In progress
Connected assets are assets that are jointly involved in performing one or more business processes. If a vulnerability or threat on one asset is exploited, the connectedness means it could lead to failures or disruptions in the assets it is connected to, or in the entire network. The first objective is to build a generalised model for the threat posture of connected assets. A second objective is to research how the risk score of an asset varies with the number and nature of the assets it is connected to. In progress
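As a rough illustration of how a risk score might depend on the number and nature of connected assets, the sketch below combines an asset's own risk with a damped contribution from each neighbour. The scoring rule, the `coupling` parameter, the function name `connected_risk`, and the example assets are all assumptions made for illustration; they are not the project's model.

```python
from math import prod


def connected_risk(base_risk, neighbours, coupling=0.5):
    """Hypothetical rule: an asset stays safe only if its own threat and
    every neighbour's (damped) threat fail to materialise.

    base_risk:  dict mapping asset name -> standalone risk in [0, 1]
    neighbours: dict mapping asset name -> list of connected asset names
    coupling:   assumed fraction of a neighbour's risk that spills over
    """
    scores = {}
    for asset, own in base_risk.items():
        # Probability that neither the asset nor any neighbour causes a failure.
        safe = (1.0 - own) * prod(
            1.0 - coupling * base_risk[n] for n in neighbours.get(asset, ())
        )
        scores[asset] = 1.0 - safe
    return scores


# Illustrative data only.
base_risk = {"db": 0.2, "api": 0.4, "laptop": 0.1}
neighbours = {"db": ["api"], "api": ["db", "laptop"], "laptop": ["api"]}
scores = connected_risk(base_risk, neighbours)
```

Under this rule an asset's score never falls below its standalone risk, and it grows with both the number of neighbours and how risky those neighbours are.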
Populate and enrich a benchmark of tabular data tasks for evaluating LLMs with our evaluation framework. The primary work includes selecting tasks and datasets, writing data loaders, and preparing task cards with input/output details, pre-processing steps, and prompts for the tasks. Test the data processing pipeline using our framework and evaluate a selected set of tasks with LLMs. This benchmark standardizes the evaluation of tabular data tasks against LLMs in a uniform manner. The goal is to add a variety of tabular data tasks and make the benchmark a rich resource. In progress
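A task card of the kind described above might look roughly like the following. This is a minimal sketch, not the evaluation framework's actual interface: the `TaskCard` class, its field names, the `strip_strings` step, and the example task are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TaskCard:
    """Hypothetical task card: inputs, output, pre-processing, and prompt."""
    name: str
    input_columns: list
    output_column: str
    prompt_template: str
    preprocessing: list = field(default_factory=list)

    def render_prompt(self, row):
        # Apply each pre-processing step in order, then serialise the inputs.
        for step in self.preprocessing:
            row = step(row)
        table = " | ".join(f"{col}: {row[col]}" for col in self.input_columns)
        return self.prompt_template.format(table=table, target=self.output_column)


def strip_strings(row):
    """Example pre-processing step: trim whitespace from string cells."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


# Illustrative task and record only.
card = TaskCard(
    name="income-classification",
    input_columns=["age", "occupation"],
    output_column="income_bracket",
    prompt_template="Given the record [{table}], predict {target}.",
    preprocessing=[strip_strings],
)
prompt = card.render_prompt(
    {"age": 39, "occupation": " clerk ", "income_bracket": "<=50K"}
)
```

Keeping the prompt template and pre-processing steps on the card means every task is rendered the same way, which is one way to get the uniform evaluation the benchmark aims for.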
The project aims to explore the impact of a dynamic in-context learning approach on a generative multimodal LLM and to develop an algorithm that enhances LLM performance through representation learning. In progress
To analyze a user's handwriting in order to infer psychological behaviour. In progress
Design a data pipeline to ingest data from REST APIs, using a REST client and a mechanism to schedule batch data processing with cron expressions. The pipeline should include: data extraction from APIs or a Db2 table; loading data to IBM Cloud Object Storage (COS) as a landing area; and a monitoring mechanism for batch data loads (verification and validation). Devise a design to handle errors and capture error details. Ensure no data loss through retry mechanisms or a notification system, and audit the ETL processes. In progress
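The retry and error-capture requirements could be sketched as below. This is one possible design under stated assumptions, not the project's implementation: `run_with_retries`, the audit record fields, and the exponential backoff schedule are hypothetical choices, and the extraction step is passed in as a plain callable so that no particular REST client is assumed.

```python
import time
from datetime import datetime, timezone


def run_with_retries(task, attempts=3, base_delay=1.0, sleep=time.sleep,
                     audit_log=None):
    """Run `task`, retrying on failure and recording every attempt.

    task:       zero-argument callable, e.g. one batch extraction call
    attempts:   maximum number of tries before giving up
    base_delay: seconds before the first retry; doubles on each retry
    sleep:      injectable so tests can skip real waiting
    audit_log:  list that receives one dict per attempt (assumed format)
    """
    if audit_log is None:
        audit_log = []
    for attempt in range(1, attempts + 1):
        stamp = datetime.now(timezone.utc).isoformat()
        try:
            result = task()
        except Exception as exc:
            # Capture error details so that failed loads can be audited.
            audit_log.append({"attempt": attempt, "status": "error",
                              "error": repr(exc), "at": stamp})
            if attempt == attempts:
                raise  # retries exhausted: let a notifier handle it upstream
            sleep(base_delay * 2 ** (attempt - 1))
        else:
            audit_log.append({"attempt": attempt, "status": "ok", "at": stamp})
            return result
```

A scheduler triggered by a cron expression would call `run_with_retries` once per batch, and an exception that escapes after the final attempt is the natural point at which to send a notification.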
To develop techniques using state-of-the-art AI methods to enrich the experience around LLM usage. Skills required: Python, ML, DL, LLMs, Hugging Face, and exposure to UI development. In progress
Generalize variational QEC to a larger class of error-correcting codes. Reference: https://arxiv.org/abs/2204.03560 In progress
With powerful models such as ChatGPT and GPT-4 demonstrating astonishing capabilities, there is a need to test whether a text was generated by a generative AI model or written by a human. In progress
1. To train an LLM for use in legal text analytics tasks. 2. To explore extracting a knowledge graph from an LLM. In progress