AI Research Hub
Explore 9 papers in machine learning, NLP, computer vision & more
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
This paper introduces Retrieval-Augmented Generation (RAG), combining parametric memory of pre-trained models with non-parametric memory through a retrieval mechanism over Wikipedia, dramatically impr...
RLHF: Training Language Models to Follow Instructions with Human Feedback
Presents InstructGPT, demonstrating that fine-tuning language models using reinforcement learning from human feedback produces models better aligned with user intent than larger models trained on next...
LLaMA: Open and Efficient Foundation Language Models
Introduces the LLaMA family of foundation language models, ranging from 7B to 65B parameters and trained exclusively on publicly available data; LLaMA-13B outperforms GPT-3 (175B) on most benchmarks despite being more than ten times smaller.
The Llama 3 Herd of Models
Describes the development of the Llama 3 model family, including a new tokenizer, grouped-query attention, and training on over 15 trillion tokens. Llama 3.1 405B achieves competitive performance with...
Gemini: A Family of Highly Capable Multimodal Models
Presents Gemini, Google DeepMind's multimodal model family trained from the ground up to be natively multimodal across text, images, audio, and video, achieving state-of-the-art performance on 30 o...
Attention Is All You Need (Revisited)
A comprehensive analysis of the transformer architecture five years after its introduction, examining how the original attention mechanism has evolved across modern LLMs.
Constitutional AI: Harmlessness from AI Feedback
This paper introduces Constitutional AI, a method for training harmless AI assistants using AI-generated feedback rather than human labels.
Scaling Laws for Neural Language Models
Presents an empirical study showing that language model loss follows predictable power-law relationships with model size, dataset size, and training compute.