AI Research — AI Promptory Hub

RLHF: Training Language Models to Follow Instructions with Human Feedback

Ouyang et al., OpenAI • Mar 22, 2026

Presents InstructGPT, demonstrating that fine-tuning language models using reinforcement learning from human feedback produces models better aligned with user intent than larger models trained on next...

RLHF