Research

Training LLMs for Honesty via Confessions (2025) [pdf] [Blog post]
M Joglekar, J Chen, G Wu, J Yosinski, J Wang, B Barak, A Glaese
Work done at OpenAI
Deduction-Projection Estimators for Understanding Neural Networks (2025) [pdf]
Gabriel Wu, advised by Sitan Chen
Undergraduate thesis presented to the Department of Computer Science
Estimating the Probabilities of Rare Outputs in Language Models (2024) [arXiv] [Blog post]
Gabriel Wu, Jacob Hilton
International Conference on Learning Representations (ICLR 2025), Spotlight paper
Work done at the Alignment Research Center in Berkeley, CA
Harvard Undergraduate Survey on Generative AI (2024) [arXiv]
Shikoh Hirabayashi, Rishab Jain, Nikola Jurkovic, Gabriel Wu
Led the first Harvard survey studying student perspectives on AI
Testing Tensor Products of Algebraic Codes (2023) [arXiv]
Sumegha Garg, Madhu Sudan, Gabriel Wu
International Conference on Randomization and Computation (RANDOM 2025)