Noam Razin

Postdoctoral Fellow

Princeton University

Email: noamrazin (at) princeton.edu

I am a Postdoctoral Fellow at Princeton Language and Intelligence, Princeton University. My research focuses on the fundamentals of artificial intelligence (AI). By combining mathematical analyses with systematic experimentation, I aim to develop theories that shed light on how modern AI works, identify potential failures, and yield principled methods for improving efficiency, reliability, and performance.

My work is supported in part by a Zuckerman Postdoctoral Scholarship. Previously, I obtained my PhD in Computer Science at Tel Aviv University, where I was fortunate to be advised by Nadav Cohen. During my PhD, I interned at Apple Machine Learning Research and at the Microsoft Recommendations team, and received the Apple Scholars in AI/ML and Tel Aviv University Center for AI & Data Science fellowships.

I am on the academic and industry job market for 2025/26.

Recent Research

Recently, I have been working on language model alignment, including reinforcement learning and preference optimization approaches.

  • In [1], we identify a connection between reward variance and the flatness of the reinforcement learning objective landscape (see the sketch following this list). Building on this, [2] provides an optimization perspective on what makes a good reward model for RLHF, establishing that more accurate reward models are not necessarily better teachers.
  • In [3], we investigate why language models are often poor implicit reward models, and show that they tend to rely on superficial token-level cues.
  • In [4], we characterize the causes of likelihood displacement, the counter-intuitive phenomenon where preference optimization decreases the probability of preferred responses instead of increasing it as intended (formalized in the sketch following this list). We demonstrate that likelihood displacement can cause surprising failures in alignment and provide guidelines for preventing them.
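
As context for these results, here is a minimal sketch of the two objectives involved, written in standard RLHF/DPO notation; it summarizes the general setup rather than the papers' exact analyses. For [1] and [2], the RLHF objective is J(\theta) = \mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}[r(x, y)]. Since \mathbb{E}_{y \sim \pi_\theta}[\nabla_\theta \log \pi_\theta(y \mid x)] = 0, the policy gradient equals \mathbb{E}_{y \sim \pi_\theta}[(r(x, y) - \mathbb{E}[r]) \nabla_\theta \log \pi_\theta(y \mid x)], and Cauchy–Schwarz gives

    \|\nabla_\theta J(\theta)\| \le \sqrt{\mathrm{Var}_{y \sim \pi_\theta}[r(x, y)]} \cdot \sqrt{\mathbb{E}_{y \sim \pi_\theta}\big[\|\nabla_\theta \log \pi_\theta(y \mid x)\|^2\big]},

so a reward model that induces low reward variance under the policy yields small policy gradients, i.e., a flat objective landscape. For [4], the standard DPO loss is

    \mathcal{L}_{\mathrm{DPO}}(\theta) = -\mathbb{E}_{(x, y_w, y_l)}\Big[\log \sigma\Big(\beta \log \tfrac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \tfrac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\Big)\Big],

which constrains only the margin between the preferred response y_w and the dispreferred response y_l; the loss can therefore decrease while \log \pi_\theta(y_w \mid x) itself falls, provided \log \pi_\theta(y_l \mid x) falls faster.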

Publications

* denotes equal contribution

Why is Your Language Model a Poor Implicit Reward Model?

Noam Razin, Yong Lin, Jiarui Yao, Sanjeev Arora

International Conference on Learning Representations (ICLR), 2026

Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting

Howard Chen, Noam Razin, Karthik Narasimhan, Danqi Chen

arXiv:2510.18874, 2025

What Makes a Reward Model a Good Teacher? An Optimization Perspective

Noam Razin, Zixuan Wang, Hubert Strauss, Stanley Wei, Jason D. Lee, Sanjeev Arora

Advances in Neural Information Processing Systems (NeurIPS), 2025

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin

International Conference on Learning Representations (ICLR), 2025

The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels

Yonatan Slutzky*, Yotam Alexander*, Noam Razin, Nadav Cohen

Advances in Neural Information Processing Systems (NeurIPS), 2025

Understanding Deep Learning via Notions of Rank

Noam Razin

arXiv:2408.02111 (PhD thesis), 2024

Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States

Noam Razin*, Yotam Alexander*, Edo Cohen-Karlik, Raja Giryes, Amir Globerson, Nadav Cohen

International Conference on Machine Learning (ICML), 2024

Vanishing Gradients in Reinforcement Finetuning of Language Models

Noam Razin, Hattie Zhou, Omid Saremi, Vimal Thilak, Arwen Bradley, Preetum Nakkiran, Joshua Susskind, Etai Littwin

International Conference on Learning Representations (ICLR), 2024

What Algorithms Can Transformers Learn? A Study in Length Generalization

Hattie Zhou, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Joshua Susskind, Samy Bengio, Preetum Nakkiran

International Conference on Learning Representations (ICLR), 2024

Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning

Nadav Cohen, Noam Razin

arXiv:2408.13767, 2024

What Makes Data Suitable for a Locally Connected Neural Network? A Necessary and Sufficient Condition Based on Quantum Entanglement

Yotam Alexander*, Nimrod De La Vega*, Noam Razin, Nadav Cohen

Advances in Neural Information Processing Systems (NeurIPS), 2023

On the Ability of Graph Neural Networks to Model Interactions Between Vertices

Noam Razin, Tom Verbin, Nadav Cohen

Advances in Neural Information Processing Systems (NeurIPS), 2023

Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks

Noam Razin, Asaf Maman, Nadav Cohen

International Conference on Machine Learning (ICML), 2022

Implicit Regularization in Tensor Factorization

Noam Razin*, Asaf Maman*, Nadav Cohen

International Conference on Machine Learning (ICML), 2021

Implicit Regularization in Deep Learning May Not Be Explainable by Norms

Noam Razin, Nadav Cohen

Advances in Neural Information Processing Systems (NeurIPS), 2020

RecoBERT: A Catalog Language Model for Text-Based Recommendations

Itzik Malkiel, Oren Barkan, Avi Caciularu, Noam Razin, Ori Katz, Noam Koenigstein

Findings of the Association for Computational Linguistics: EMNLP, 2020

Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding

Oren Barkan*, Noam Razin*, Itzik Malkiel, Ori Katz, Avi Caciularu, Noam Koenigstein

AAAI Conference on Artificial Intelligence (AAAI), 2020

Selected Talks

  • Understanding and Overcoming Pitfalls in Language Model Alignment

    EPFL AI Fundamentals Seminar · Sep 2025

  • Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

    Deep Learning: Classics and Trends Seminar · Jan 2025

  • Analyses of Policy Gradient for Language Model Finetuning and Optimal Control

    MPI MiS + UCLA Math Machine Learning Seminar · Mar 2024

  • Two Analyses of Modern Deep Learning: Graph Neural Networks and Language Model Finetuning

    Princeton Alg-ML Seminar · Dec 2023

  • On the Ability of Graph Neural Networks to Model Interactions Between Vertices

    Learning on Graphs and Geometry Reading Group · Jan 2023

  • Generalization in Deep Learning Through the Lens of Implicit Rank Lowering

    ICTP Youth in High-Dimensions: Recent Progress in Machine Learning, High-Dimensional Statistics and Inference · Jun 2022

  • Implicit Regularization in Tensor Factorization

    The Hebrew University Machine Learning Club · Jun 2021

  • Implicit Regularization in Deep Learning May Not Be Explainable by Norms

    Tel Aviv University Machine Learning Seminar · May 2020

Teaching

  • Fundamentals of Deep Learning (COS 514)

    Guest Lecturer · Princeton University · 2025

  • Introduction to Reinforcement Learning (COS 435)

    Guest Lecturer · Princeton University · 2025

  • First Steps in Research Honors Seminar

    Guest Lecturer · Tel Aviv University · 2021–2024

  • Foundations of Deep Learning

    Teaching Assistant · Tel Aviv University · 2021–2023