A retrieval paradigm that encodes queries and documents into dense embedding vectors and uses vector similarity for ranking. Dense retrieval powers semantic search in systems where exact keyword matching falls short.
Dense retrieval uses dual encoder models to independently map queries and documents into a shared embedding space. At query time, the query is encoded into a vector, and the most similar document vectors are found via approximate nearest neighbor (ANN) search. This approach captures semantic similarity rather than relying on exact keyword overlap.
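The mechanics can be sketched with a toy dual encoder. Here the encoder is a hypothetical stand-in (a character-frequency histogram) rather than a trained transformer, and exact brute-force search stands in for ANN, but the pipeline shape is the same: encode both sides into one space, then rank by cosine similarity.

```python
import numpy as np

def encode(text, dim=16):
    """Toy deterministic encoder: character-frequency histogram, L2-normalized.
    A real system would use a trained transformer dual encoder here."""
    v = np.zeros(dim)
    for ch in text.lower():
        v[ord(ch) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

docs = [
    "dense retrieval with embeddings",
    "keyword matching with an inverted index",
    "vector similarity search for ranking",
]
# Document embeddings are computed once, up front.
doc_matrix = np.stack([encode(d) for d in docs])

def retrieve(query, k=2):
    """Encode the query, then rank documents by cosine similarity
    (a dot product, since all vectors are unit-normalized). This is
    exact search; production systems swap in an ANN index instead."""
    q = encode(query)
    scores = doc_matrix @ q
    top = np.argsort(-scores)[:k]
    return [(docs[i], float(scores[i])) for i in top]
```

A query identical to a document encodes to the same vector, so `retrieve("dense retrieval with embeddings")` ranks that document first with cosine similarity near 1.0.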
Models like DPR, E5, and BGE use transformer-based dual encoders trained with contrastive learning on query-document pairs. Document embeddings are pre-computed and indexed in vector databases. Query latency is dominated by the ANN search step since query encoding is a single forward pass. Typical embedding dimensions range from 384 to 1024.