Yang Cai
COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences
Yixin Liu, Argyris Oikonomou, Weiqiang Zheng, Yang Cai, Arman Cohan
January 2026
Type: Conference paper
Publication: The 14th International Conference on Learning Representations (ICLR)
AI Alignment
NLHF
RLHF
last-iterate convergence
Related
From Average-Iterate to Last-Iterate Convergence in Games: A Reduction and Its Applications
On Tractable Φ-Equilibria in Non-Concave Games
Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms
Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games with Bandit Feedback
Doubly Optimal No-Regret Learning in Monotone Games