I am a pre-doctoral Research Fellow at Microsoft. I work with Gustavo Soares, Arjun Radhakrishna, Ashish Tiwari, and other collaborators on the PROSE team, led by Sumit Gulwani.
My research focuses on how AI systems can write correct and verifiable code, and reason with human-like flexibility. I study this through the lens of Software Engineering, Artificial Intelligence, Formal Methods, and Cognitive Science. Much of my work sits in AI4Code, AI4Math, and Human-AI Interaction, with an emphasis on structured and verifiable reasoning. Recently, I built IndiMathBench, a Lean 4 benchmark for evaluating formal theorem proving on Olympiad math problems, along with a human-in-the-loop system for autoformalization.
At PROSE, I worked on developing an end-to-end LLM-based software agent. I also helped design a conversational debugger, now deployed in the Visual Studio IDE, as well as automatic evaluation of human-AI conversations, which is currently used to evaluate millions of conversations across the Visual Studio IDE and GitHub Copilot. Before PROSE, I worked at Adobe as a Software Developer and interned at American Express AI Labs under Narayanan Edakunni and at the Speech and Language Lab, NTU Singapore, under Chng Eng Siong. I did my undergrad in computer science at BITS Goa.
Outside of research, I like to swim, dive, read manga, and play grand strategy video games.
Check out my CV, or reach me through email.
IndiMathBench: Autoformalizing Mathematical Reasoning Problems with a Human Touch
, Shashank Kirtania, Yasharth Bajpai, Sumit Gulwani, Ashish Tiwari
P-AI-FM AAAI 2026 | AAAI 2026 Workshop on Post-AI Formal Methods (P-AI-FM); Under Submission ICLR 2026
pdf
code
Improving Language Agents Through BREW
Shashank Kirtania, , Priyanshu Gupta, Yasharth Bajpai, Roshni Iyer, Sumit Gulwani, Gustavo Soares
MTI-LLM NeurIPS 2025 | NeurIPS 2025 Workshop on Multi-Turn Interactions in Large Language Models; Under Submission ICLR 2026
pdf
Best Paper Presentation Award
RUBICON: Rubric-Based Evaluation of Domain-Specific Human AI Conversations
, Yasharth Bajpai, Arjun Radhakrishna, Gustavo Soares, Sumit Gulwani
AIware 2024 | AIware: Proceedings of the 1st ACM International Conference on AI-Powered Software 2024 (co-located with FSE 2024)
blog
pdf
web
Best Paper Award
Let's Fix this Together: Conversational Debugging with GitHub Copilot
Yasharth Bajpai, Bhavya Chopra, , Cagri Aslan, Sumit Gulwani, Dustin Coleman, Chris Parnin, Arjun Radhakrishna, Gustavo Soares
VL/HCC 2024 | IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) 2024
pdf
Exploring Interaction Patterns for Debugging: Enhancing Conversational Capabilities of AI-assistants
Bhavya Chopra, Yasharth Bajpai, , Gustavo Soares, Arjun Radhakrishna, Chris Parnin, and Sumit Gulwani
HCI-NLP NAACL 2024 | NAACL 2024: Proceedings of the Third Workshop on Bridging Human-Computer Interaction and Natural Language Processing
pdf