A data structure that maps content tokens to the documents containing them, enabling fast full-text search. Inverted indices are the foundation of keyword search and complement vector-based retrieval in hybrid multimodal search systems.
An inverted index reverses the relationship between documents and terms. Instead of listing which terms appear in each document, it lists which documents contain each term. For a given search term, a single dictionary lookup returns its posting list: the matching documents and, where the index stores them, the term's frequency and positions in each document. Multi-term queries combine posting lists with Boolean operations: AND intersects them, OR unions them, and NOT removes one list's documents from another.
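The lookup and Boolean combination described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular engine implements it: the function names (`build_index`, `intersect`, `union`, `difference`), the sample documents, and the lowercase whitespace tokenizer are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased whitespace token to a sorted list of doc IDs."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            postings[token].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def intersect(a, b):
    """AND: walk two sorted posting lists in step, keeping shared IDs."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def union(a, b):
    """OR: every doc ID that appears in either list, sorted."""
    return sorted(set(a) | set(b))


def difference(a, b):
    """AND NOT: doc IDs in a that do not appear in b."""
    exclude = set(b)
    return [doc_id for doc_id in a if doc_id not in exclude]


# Hypothetical sample corpus, keyed by document ID.
docs = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "a quick red fox",
}
index = build_index(docs)

quick_and_brown = intersect(index["quick"], index["brown"])
quick_or_brown = union(index["quick"], index["brown"])
fox_not_brown = difference(index["fox"], index["brown"])
```

Keeping each posting list sorted is what makes the linear-time merge in `intersect` possible; the set-based `union` and `difference` are written for brevity and could be replaced by similar merges.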
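An inverted index consists of a dictionary (the set of all unique terms, typically kept sorted or hashed for fast lookup) and posting lists (sorted lists of the document IDs containing each term). Posting lists may also carry term frequencies, positions, and payloads. Compression techniques such as variable-byte encoding and PForDelta (PFOR) reduce index size, usually by encoding the gaps between consecutive document IDs rather than the IDs themselves. Skip pointers and block-max techniques accelerate query processing by letting it jump over document IDs that cannot match or cannot score highly enough. Lucene and Tantivy are widely used inverted index libraries, and Elasticsearch is a search engine built on Lucene.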
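As an illustration of posting-list compression, the sketch below combines gap (delta) encoding with one common textbook variant of variable-byte encoding, in which each byte holds 7 bits of the number and a set high bit marks the final byte. The function names and the sample IDs are assumptions for the example; the on-disk formats used by Lucene or Tantivy differ in detail.

```python
def to_gaps(doc_ids):
    """Replace each sorted doc ID with its distance from the previous one."""
    gaps = []
    previous = 0
    for doc_id in doc_ids:
        gaps.append(doc_id - previous)
        previous = doc_id
    return gaps


def from_gaps(gaps):
    """Invert to_gaps by taking a running sum."""
    doc_ids = []
    total = 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


def vbyte_encode(numbers):
    """Encode non-negative ints, 7 bits per byte, low-order bits first.

    The high bit is set only on the last byte of each number.
    """
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)
    return bytes(out)


def vbyte_decode(data):
    """Decode bytes produced by vbyte_encode back into a list of ints."""
    numbers = []
    n = 0
    shift = 0
    for byte in data:
        if byte & 0x80:  # final byte of this number
            n |= (byte & 0x7F) << shift
            numbers.append(n)
            n = 0
            shift = 0
        else:
            n |= byte << shift
            shift += 7
    return numbers


def compress(doc_ids):
    return vbyte_encode(to_gaps(doc_ids))


def decompress(data):
    return from_gaps(vbyte_decode(data))


# Hypothetical posting list: sorted doc IDs for one term.
posting_list = [1000, 1003, 1010, 1200]
encoded = compress(posting_list)
restored = decompress(encoded)
```

Gap encoding helps because the document IDs in a dense posting list sit close together, so most gaps fit in a single byte even when the IDs themselves would need two or more.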