RLHF: Training Language Models to Follow Instructions with Human Feedback

Ouyang et al., OpenAI  •  Mar 22, 2026  •  2 views

Presents InstructGPT, demonstrating that fine-tuning language models using reinforcement learning from human feedback produces models better aligned with user intent than larger models trained on next...

Read Paper