AI Research Hub
Explore 1 paper in machine learning, NLP, computer vision & more
RLHF: Training Language Models to Follow Instructions with Human Feedback
Ouyang et al., OpenAI • Mar 22, 2026
Presents InstructGPT, demonstrating that fine-tuning language models using reinforcement learning from human feedback produces models better aligned with user intent than larger models trained on next...