Saurabh Pujar - AI Researcher

About Me

I am fascinated by how AI can bridge human intent and executable code, whether by translating natural language into production-ready workflows or by establishing benchmarks that ensure self-consistent, reliable code understanding. I believe in harnessing the transformative power of intelligent systems to tackle real-world challenges in software engineering, security, and developer productivity.

5+ Years Research

15+ Publications

5 Patents

1000+ Citations

View/Download CV

Recent Updates

2025-06-20 Happy to share that our paper "Cross-lingual Transfer in Programming Languages: An Extensive Empirical Study" was published in TMLR (06/2025).
2025-06-16 Our paper "How Does LLM Reasoning Work for Code? A Survey and a Call to Action", in collaboration with Columbia University, was published on arXiv.
2025-06-10 Our paper "Understanding Software Engineering Agents Through the Lens of Traceability: An Empirical Study", in collaboration with Columbia University, was published on arXiv.
2025-04-11 Our paper "SeaView: Software Engineering Agent Visual Interface for Enhanced Workflow" is now available on arXiv.

Selected Publications

Project CodeNet: A Large‑Scale AI for Code Dataset

Ruchir Puri, David S. Kung, Geert Janssen, Wei Zhang, Giacomo Domeniconi, Vladimir Zolotov, Julian Dolby, Jie Chen, Mihir Choudhury, Lindsey Decker, Veronika Thost, Luca Buratti, Saurabh Pujar, Shyam Ramji, Ulrich Finkler, Susan Malaika, Frederick Reiss

Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, 2021

...In this paper, we present a large-scale dataset CodeNet, consisting of over 14 million code samples and about 500 million lines of code in 55 different programming languages, which is aimed at teaching AI to code. In addition to its large scale, CodeNet has a rich set of high-quality annotations to benchmark and help accelerate research in AI techniques for a variety of critical coding tasks, including code similarity and classification, code translation between a large variety of programming languages, and code performance (runtime and memory) improvement techniques. ...

363 citations

Paper Code

Automated Code generation for Information Technology Tasks in YAML through Large Language Models

Saurabh Pujar, Luca Buratti, Xiaojie Guo, Nicolas Dupuis, Burn Lewis, Sahil Suneja, Atin Sood, Ganesh Nalawade, Matt Jones, Alessandro Morari, Ruchir Puri

2023 60th ACM/IEEE Design Automation Conference (DAC)

... The recent improvement in code generation capabilities due to the use of large language models has mainly benefited general purpose programming languages. Domain specific languages, such as the ones used for IT Automation, received far less attention, despite involving many active developers and being an essential component of modern cloud platforms. This work focuses on the generation of Ansible YAML, a widely used markup language for IT Automation. ...

22 citations

Paper Demo

LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks

Saad Ullah, Mingji Han, Saurabh Pujar, Hammond Pearce, Ayse Coskun, Gianluca Stringhini

2024 IEEE Symposium on Security and Privacy (SP)

... We thus develop SecLLMHolmes, a fully automated evaluation framework that performs the most detailed investigation to date on whether LLMs can reliably identify and reason about security-related bugs. We construct a set of 228 code scenarios and analyze eight of the most capable LLMs across eight different investigative dimensions using our framework. Our evaluation shows LLMs provide non-deterministic responses, incorrect and unfaithful reasoning, and perform poorly in real-world scenarios. ...

123 citations

Paper Code

Towards Learning (Dis)-Similarity of Source Code from Program Contrasts

Yangruibo Ding, Luca Buratti, Saurabh Pujar, Alessandro Morari, Baishakhi Ray, Saikat Chakraborty

60th Annual Meeting of the Association for Computational Linguistics, 2022

... we design structure-guided code transformation algorithms to generate synthetic code clones and inject real-world security bugs, augmenting the collected datasets in a targeted way. We propose to pre-train the Transformer model with such automatically generated program contrasts to better identify similar code in the wild and differentiate vulnerable programs from benign ones. ...

61 citations

Paper Code

View All Publications on Google Scholar

Research Areas

Code Understanding & Generation

Vulnerability Detection & Analysis

Developer Productivity & Tools

Get In Touch

Let's collaborate on AI research!