Retrieval-Augmented Generation (RAG)
A technique that enhances AI model responses by retrieving relevant information from external data sources before generating an answer.
Retrieval-augmented generation (RAG) combines the generative capabilities of language models with information retrieval systems. Instead of relying solely on what a model learned during training, RAG first searches a knowledge base — such as company documents, databases, or the web — and feeds the retrieved information into the model as context.
This approach reduces hallucinations, keeps responses grounded in up-to-date facts, and allows models to work with proprietary data they were never trained on. RAG architectures typically use vector databases to store and retrieve document embeddings efficiently.
RAG has become a foundational pattern in enterprise AI. Engineers who can design and optimize RAG pipelines — from chunking strategies to retrieval ranking to prompt composition — are among the most sought-after in the AI job market.
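Chunking, the first pipeline stage mentioned above, splits long documents into retrievable pieces. A minimal sliding-window version might look like the following; the window sizes are illustrative (real pipelines tune them, and usually count tokens rather than words), and `chunk_words` is a hypothetical helper name.

```python
def chunk_words(text, size=8, overlap=2):
    """Split text into word windows of `size`, each sharing `overlap`
    words with the previous window so context isn't cut mid-thought."""
    words = text.split()
    step = size - overlap
    chunks = []
    for i in range(0, len(words), step):
        chunks.append(" ".join(words[i:i + size]))
        if i + size >= len(words):  # last window already covers the tail
            break
    return chunks
```

The overlap is a design trade-off: larger overlaps make it likelier that a fact straddling a chunk boundary survives intact in at least one chunk, at the cost of storing and searching more redundant text.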
Related Terms
Vector Database
A specialized database designed to store and efficiently search high-dimensional vector embeddings.
Embeddings
Dense numerical representations of data (text, images, etc.) that capture semantic meaning in a format AI models can process.
Large Language Model (LLM)
A neural network trained on massive text datasets that can understand and generate human language.
Inference
The process of running a trained AI model to generate predictions or outputs from new input data.