Combines retrieval systems (over structured or unstructured sources) with generative models to answer complex, potentially multimodal queries.
RAG enhances large language models by retrieving relevant information from external knowledge sources before generating responses. This approach combines the strengths of knowledge retrieval and text generation to produce more accurate, up-to-date, and verifiable outputs.
RAG architectures typically involve three components: a retriever that finds relevant documents using vector embeddings, a context builder that formats retrieved information appropriately, and a generator (usually an LLM) that produces final responses incorporating the retrieved knowledge.
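The sketch below illustrates these three components as plain Python functions; it is a minimal, self-contained example, not any particular library's API. The `embed` function is a toy hashing-based encoder and `generate` is a placeholder standing in for an LLM call, both introduced here purely for illustration.

```python
# Minimal sketch of the three RAG components: retriever, context builder, generator.
# embed() and generate() are placeholder assumptions, not a real library's API.
import numpy as np

DOCUMENTS = [
    "RAG retrieves documents before generating an answer.",
    "Vector embeddings map text to points in a shared space.",
    "The generator is usually a large language model.",
]

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy hashing-based embedding; a real system would use a trained encoder."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Retriever: rank documents by cosine similarity between query and document embeddings."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: float(q @ embed(d)), reverse=True)
    return ranked[:k]

def build_context(query: str, passages: list[str]) -> str:
    """Context builder: format the retrieved passages into a prompt for the generator."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

def generate(prompt: str) -> str:
    """Generator placeholder; in practice this would be a call to an LLM."""
    return f"[LLM response conditioned on a prompt of {len(prompt)} characters]"

if __name__ == "__main__":
    question = "What does RAG do before generating a response?"
    passages = retrieve(question, DOCUMENTS)
    print(generate(build_context(question, passages)))
```

In a production system, the toy embedder would be replaced by a trained encoder with an approximate-nearest-neighbor index, and the placeholder generator by an actual LLM call, but the retrieve-then-format-then-generate flow stays the same.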