I'm a Senior Research Scientist at Databricks Mosaic, NYC, where I build efficient, high-performance agentic systems for enterprise deployment. My research centers on synthetic data generation and reinforcement learning for post-training large language models, with a focus on agentic reasoning.

Prior to Databricks, I was a Senior Research Scientist at NVIDIA, where I contributed to the Nemotron model family and led the OpenMath series of datasets and models — including work that earned 1st place at the AI Math Olympiad 2 among 2,212 teams. Before NVIDIA, I was a Research Scientist at FAIR, Meta AI, where I worked on advancing reasoning capabilities in language models.

I received my Ph.D. in Computer Science from TTI Chicago, advised by Kevin Gimpel and Karen Livescu, and my B.Tech. in Computer Science from IIT Kanpur.

Research Interests

My research examines how large language models can reason reliably in complex environments. I focus on agentic reasoning: how models use tools and maintain and update world state over long horizons, and how we can make such reasoning scalable, efficient, and robust for real-world deployment.

Synthetic Data for Scalable Post-Training

High-quality training data is central to advancing large language models, particularly during post-training. I study how to generate large-scale, targeted synthetic data to improve reasoning capabilities. At Databricks, this translates into designing scalable pipelines that enhance efficiency, robustness, and performance in enterprise applications.

Previously at NVIDIA, I developed post-training strategies for reasoning-centric models, specifically large-scale synthetic data generation for mathematical reasoning. I was a core contributor to the OpenMath series of datasets and models, including OpenMathInstruct-1 (NeurIPS 2024, Oral), OpenMathInstruct-2 (ICLR 2025), and OpenMathReasoning, which earned 1st place at the AI Math Olympiad 2 (among 2,212 teams). These datasets and models have been widely adopted by the research community and contributed to the reasoning performance of the Nemotron model family.
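The core loop behind this style of synthetic data generation is to sample many candidate solutions per problem and keep only those whose final answer can be verified against a reference. The sketch below is purely illustrative, not the actual OpenMath pipeline; `sample_solutions` is a hypothetical stand-in for a call to a teacher LLM.

```python
# Minimal sketch of answer-filtered synthetic data generation for math
# reasoning. Illustrative only; a real pipeline samples from a strong
# teacher model and verifies answers far more carefully.

def sample_solutions(problem: dict, n: int) -> list[tuple[str, str]]:
    """Stand-in for an LLM: return n candidate (reasoning, answer) pairs.

    Here we deterministically alternate correct and incorrect answers
    so the example is runnable without any model.
    """
    return [
        ("step-by-step reasoning ...", problem["answer"] if i % 2 == 0 else "0")
        for i in range(n)
    ]

def build_dataset(problems: list[dict], samples_per_problem: int = 4) -> list[dict]:
    """Keep only candidates whose final answer matches the reference."""
    dataset = []
    for problem in problems:
        for reasoning, answer in sample_solutions(problem, samples_per_problem):
            if answer == problem["answer"]:  # cheap, verifiable filter
                dataset.append(
                    {
                        "question": problem["question"],
                        "solution": reasoning,
                        "answer": answer,
                    }
                )
    return dataset

problems = [{"question": "What is 2 + 2?", "answer": "4"}]
dataset = build_dataset(problems)
```

The verifiable-answer filter is what makes this recipe scale: correctness checking is much cheaper than solution generation, so one can sample aggressively and discard most of the output.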

World Modeling and Memory Mechanisms

Robust reasoning requires that models and agents represent and track world state across long contexts. A recurring thread in my research is understanding how language models build such internal representations and how memory mechanisms support state tracking over extended horizons.

In Chess as a Testbed for Language Model State Tracking (AAAI 2022), we demonstrated that transformers trained purely on move sequences develop a latent representation of the underlying board state despite receiving no explicit supervision. In Code Pretraining Improves Entity Tracking Abilities of Language Models (arXiv 2024), we found that training on code significantly improves entity tracking across model families. At FAIR, we proposed Learning to Reason and Memorize with Self-Notes (NeurIPS 2023), which introduced a simple note-taking mechanism that externalizes intermediate reasoning steps and state details, improving performance on long and complex reasoning tasks.

Coreference and Entity Tracking

Understanding entities and their references is a fundamental problem in natural language understanding. During my Ph.D., I developed neural models with explicit memory mechanisms that achieved state-of-the-art results in coreference resolution, including PeTra: A Sparsely Supervised Memory Model for People Tracking (ACL 2020), Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks (EMNLP 2020), and On Generalization in Coreference Resolution (Best Paper Award at CRAC@EMNLP 2021).

As large language models reshaped NLP, we revisited coreference through more practical formulations. In Major Entity Identification: A Generalizable Alternative to Coreference Resolution (EMNLP 2024), we restricted the coreference task to pre-selected major entities. In IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark for LLMs (NAACL 2025), we introduced a long-context benchmark for mention resolution to evaluate the referential abilities of frontier models.

This progression reflects my sustained focus on foundational questions in representation, memory, and reasoning — specifically, how models construct structured world state, maintain it over long horizons, and leverage it to enable scalable agentic systems.


Recent Highlights

Feb 2026 Our work Learning Generative Selection for Best-of-N is on arXiv!
Oct 2025 Joined Databricks Mosaic. Excited to work on agentic reasoning!
Jul 2025 Attended ICML 2025 and presented three papers at the AI for Math Workshop: GenSelect: A Generative Approach to Best-of-N, The Challenge of Teaching Reasoning to LLMs Without RL or Distillation, and Scaling Mathematical Reasoning through Data, Tools, and Generative Selection.
Apr 2025 Our team NemoSkills won the AIMO-2 competition among 2200+ teams! We also released the OpenMathReasoning dataset, models, and report.