Research Summary
Research Interests
- CodeLLMs
- State Space Models
- Reinforcement Learning
- Diffusion LMs
- Reliable AI
- Interpretability of AI Models
- Computer Vision
Doctoral Research
In my doctoral research, I have been working towards developing a framework to train LLMs to generate secure code. This is a big challenge because models trained on security-based feedback often lose some of their ability to generate correct code. Additionally, there is no established method for providing security-related feedback. To avoid online verification of code security, I have focused on developing offline and off-policy RL algorithms for Code LLMs.
While working on LLMs, I have also focused on state space models to create efficient architectures. I especially focus on S4 and S4D instead of Mamba. This line of research has resulted in a high-performance model called CodeSSM. We are not scaling the model to billions of parameters. The model is particularly strong at length generalization: it is capable of generalizing up to 32x the pretraining context.
In the first year of my PhD, I also developed a framework for analysing the attention and hidden representations of CodeLLMs. With this framework, we showed the limitations of the Transformer architecture in understanding code syntax and semantics.
More recently, we used the framework for a comparative study of the hidden representations of SSMs and Transformers, establishing that SSMs are superior to Transformers in understanding code syntax and semantics. We also developed a new framework for analysing the convolutional kernels of S4 and S4D.
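For readers unfamiliar with S4D, the convolutional kernel mentioned above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions (zero-order-hold discretisation, an S4D-Lin-style initialisation, random output weights, and hypothetical function names). It is not the analysis framework itself, nor the CodeSSM implementation.

```python
import numpy as np


def s4d_kernel(A, B, C, dt, length):
    """Convolution kernel of a diagonal SSM (S4D-style), illustrative only.

    A, B, C: complex arrays of shape (N,) holding the diagonal state matrix
    and the input/output vectors. dt: step size. length: kernel length.
    """
    dA = np.exp(dt * A)        # zero-order-hold discretisation of A
    dB = (dA - 1.0) / A * B    # matching discretisation of B
    powers = dA[:, None] ** np.arange(length)[None, :]  # dA^l, shape (N, length)
    # K[l] = Re( sum_n C_n * dA_n^l * dB_n )
    return ((C * dB)[:, None] * powers).sum(axis=0).real


def ssm_recurrence(A, B, C, dt, u):
    """The same model run step by step, used to sanity-check the kernel."""
    dA = np.exp(dt * A)
    dB = (dA - 1.0) / A * B
    x = np.zeros_like(A)
    ys = []
    for u_k in u:
        x = dA * x + dB * u_k          # state update
        ys.append((C * x).sum().real)  # real-valued output
    return np.array(ys)


# Assumed setup: S4D-Lin-style diagonal A, unit B, random C.
N, L, dt = 8, 64, 0.01
rng = np.random.default_rng(0)
A = -0.5 + 1j * np.pi * np.arange(N)
B = np.ones(N, dtype=complex)
C = rng.standard_normal(N) + 1j * rng.standard_normal(N)

K = s4d_kernel(A, B, C, dt, L)
u = rng.standard_normal(L)
y_conv = np.convolve(u, K)[:L]          # convolutional view
y_rec = ssm_recurrence(A, B, C, dt, u)  # recurrent view
```

The two views should agree up to floating-point error, which is what makes the kernel `K` a convenient object to analyse: it summarises how much weight the layer places on each past position.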
Research During Master's
My research involved analysing the performance of self-driving cars under Indian road conditions and in adverse weather. I specifically focused on the performance of state-of-the-art object detection models in rain.
The major challenge in obtaining good performance from these models in rain (or in other adverse weather such as fog or snow) is a lack of data for those conditions. However, given the dynamic nature of these weather conditions, no amount of data can capture the entire distribution.
Rain comes with its own set of challenges and diversity. On the one hand, there are low-frequency components such as water droplets; on the other, high-frequency ones like rain streaks. Further, there is the effect of depth: performance decreases significantly with depth, as more rain accumulates in front of the camera. Then there are reflections, which get identified as objects.
It's really difficult to capture all this diversity in a dataset. And even if we succeed in doing that, we will have to make a model learn all of it, and maybe even come up with a new architecture with many more parameters to achieve this. Then a new adverse condition will arrive, or we will want to apply our model to a new problem, say off-road vehicles or vehicles in villages, and we are back to collecting data.
Throwing more and more data at the problem won't bring us closer to the solution, at least not a generalized solution. Instead, it is important that we make the current models learn a better representation of the world. It will also help if the model understands its uncertainties and its fallacies.
My research interest lies in this direction. For this, I have explored ideas from self-supervised learning, contrastive learning, domain adaptation, transfer learning, knowledge distillation, and monocular depth estimation. I have also used image-to-image translation models as a tool for data augmentation, and explored image deraining models and their effects on various downstream tasks.