Noam Razin
Noam Razin
News
Publications
Talks
Blog Posts
Teaching
Language Models
What Makes a Reward Model a Good Teacher? An Optimization Perspective
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Vanishing Gradients in Reinforcement Finetuning of Language Models
What Algorithms Can Transformers Learn? A Study in Length Generalization
Cite
×