Out-of-Distribution Generalization

Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States