I am a Postdoctoral Fellow at Princeton Language and Intelligence, Princeton University. My research focuses on the fundamentals of artificial intelligence (AI). By combining mathematical analyses with systematic experimentation, I aim to develop theories that shed light on how modern AI works, identify potential failures, and yield principled methods for improving efficiency, reliability, and performance.
My work is supported in part by a Zuckerman Postdoctoral Scholarship. Previously, I obtained my PhD in Computer Science at Tel Aviv University, where I was fortunate to be advised by Nadav Cohen. During my PhD, I interned at Apple Machine Learning Research and with the Microsoft Recommendations team, and received the Apple Scholars in AI/ML fellowship and the Tel Aviv University Center for AI & Data Science fellowship.
I am on the academic and industry job market for 2025/26.
Recent Research
Recently, I have been working on language model alignment, including reinforcement learning and preference optimization approaches.
- In [1], we identify a connection between reward variance and the flatness of the reinforcement learning objective landscape. Building on this, [2] provides an optimization perspective on what makes a good reward model for RLHF, establishing that more accurate reward models are not necessarily better.
- In [3], we investigate why language models are often poor implicit reward models, and show that they tend to rely on superficial token-level cues.
- In [4], we characterize the causes of likelihood displacement, the counter-intuitive phenomenon in which preference optimization decreases the probability of preferred responses instead of increasing it as intended (sketched below). We demonstrate that likelihood displacement can cause surprising alignment failures and provide guidelines for preventing it.
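For readers less familiar with the setting of [4], here is a minimal sketch of the standard direct preference optimization (DPO) objective; the notation below ($\pi_\theta$ for the trained model, $\pi_{\mathrm{ref}}$ for the reference model, $\beta$ for the regularization strength, $y^+$/$y^-$ for the preferred/dispreferred responses) is the usual one and not necessarily that of the paper:

\[
\mathcal{L}_{\mathrm{DPO}}(\theta) \;=\; -\,\mathbb{E}_{(x,\, y^+,\, y^-)} \left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y^+ \mid x)}{\pi_{\mathrm{ref}}(y^+ \mid x)} \;-\; \beta \log \frac{\pi_\theta(y^- \mid x)}{\pi_{\mathrm{ref}}(y^- \mid x)} \right) \right].
\]

The loss depends only on the margin between the two log-probability ratios, so it can decrease even while $\pi_\theta(y^+ \mid x)$ itself decreases; this drop in the probability of the preferred response is the likelihood displacement phenomenon characterized in [4].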
News
"Why is Your Language Model a Poor Implicit Reward Model?" received a best paper runner-up award at the NeurIPS 2025 Reliable Machine Learning from Unreliable Data Workshop and was accepted to ICLR 2026!
Two papers accepted to NeurIPS 2025: one provides an optimization perspective on what makes a good reward model for RLHF, and the other proves that the implicit bias of state space models (SSMs) can be poisoned with clean labels.
Honored to receive the Zuckerman and Israeli Council for Higher Education Postdoctoral Scholarships.
Joined Princeton Language and Intelligence as a Postdoctoral Fellow.
Publications
* denotes equal contribution
Understanding Deep Learning via Notions of Rank
Noam Razin
arXiv:2408.02111 (PhD thesis), 2024
Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning
Nadav Cohen, Noam Razin
arXiv:2408.13767, 2024
RecoBERT: A Catalog Language Model for Text-Based Recommendations
Itzik Malkiel, Oren Barkan, Avi Caciularu, Noam Razin, Ori Katz, Noam Koenigstein
Findings of the Association for Computational Linguistics: EMNLP, 2020
Selected Talks
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Deep Learning: Classics and Trends Seminar · Jan 2025
Two Analyses of Modern Deep Learning: Graph Neural Networks and Language Model Finetuning
Princeton Alg-ML Seminar · Dec 2023
Implicit Regularization in Deep Learning May Not Be Explainable by Norms
Tel Aviv University Machine Learning Seminar · May 2020
Teaching
Fundamentals of Deep Learning (COS 514)
Guest Lecturer · Princeton University · 2025
Introduction to Reinforcement Learning (COS 435)
Guest Lecturer · Princeton University · 2025
First Steps in Research Honors Seminar
Guest Lecturer · Tel Aviv University · 2021–2024
Foundations of Deep Learning
Teaching Assistant · Tel Aviv University · 2021–2023