Embedding versioning addresses the operational problem of migrating from one embedding model to another. Because vectors from different model versions occupy incompatible spaces, upgrading requires re-encoding all stored data, which is expensive, slow, and risky. Without a versioning strategy, organizations either stay on outdated models or face painful bulk migrations.
When a new embedding model is released (or an existing model is fine-tuned), the vector space changes. Queries encoded with the new model cannot be meaningfully compared against documents encoded with the old model, because similarity scores are only valid between vectors produced by the same model version. Embedding versioning solves this by maintaining parallel indexes, migrating data incrementally, and routing each query to the index whose vectors match the model version used to encode that query. The simplest approach is dual-write: every new document is encoded with both models and written to both the old and new indexes, while a background job re-encodes historical data into the new space. Once the new index covers all historical data and traffic has moved to it, the old index is retired.
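The dual-write and backfill flow described above can be sketched with a toy in-memory store. This is a minimal illustration under stated assumptions, not a production design: the class VersionedStore, its method names, and the two toy encoders are hypothetical and stand in for a real vector database and real embedding models.

```python
from typing import Callable, Dict, List

Vector = List[float]
Encoder = Callable[[str], Vector]


class VersionedStore:
    """Hypothetical in-memory store with one index per embedding model version."""

    def __init__(self, version: str, encoder: Encoder) -> None:
        self.encoders: Dict[str, Encoder] = {version: encoder}
        self.indexes: Dict[str, Dict[str, Vector]] = {version: {}}
        self.docs: Dict[str, str] = {}  # raw text, kept so it can be re-encoded

    def add_version(self, version: str, encoder: Encoder) -> None:
        # Register a new model; its index starts empty and fills over time.
        self.encoders[version] = encoder
        self.indexes[version] = {}

    def write(self, doc_id: str, text: str) -> None:
        # Dual-write: encode new data with every registered model version.
        self.docs[doc_id] = text
        for version, encode in self.encoders.items():
            self.indexes[version][doc_id] = encode(text)

    def backfill(self, version: str, batch_size: int = 100) -> int:
        # Re-encode up to batch_size documents missing from the given index.
        missing = [d for d in self.docs if d not in self.indexes[version]]
        batch = missing[:batch_size]
        for doc_id in batch:
            self.indexes[version][doc_id] = self.encoders[version](self.docs[doc_id])
        return len(batch)

    def coverage(self, version: str) -> float:
        # Fraction of known documents present in the given index.
        return len(self.indexes[version]) / len(self.docs) if self.docs else 1.0

    def search(self, query: str, version: str, top_k: int = 3) -> List[str]:
        # Route the query: encode it with the same model that built the index.
        q = self.encoders[version](query)
        index = self.indexes[version]

        def score(doc_id: str) -> float:
            return sum(a * b for a, b in zip(q, index[doc_id]))

        return sorted(index, key=score, reverse=True)[:top_k]

    def retire(self, version: str) -> None:
        # Only drop an index once another one fully covers the data.
        remaining = [v for v in self.indexes if v != version]
        if not any(self.coverage(v) == 1.0 for v in remaining):
            raise RuntimeError("no other index has full coverage yet")
        del self.indexes[version]
        del self.encoders[version]


# Toy encoders standing in for two model versions (different dimensions).
def encode_v1(text: str) -> Vector:
    return [float(len(text)), float(text.count("a"))]


def encode_v2(text: str) -> Vector:
    return [float(text.count("a")), float(text.count("e")), 1.0]
```

A typical sequence would be to create the store with the old model, call add_version for the new one, keep calling write for incoming data, run backfill in batches until coverage reaches 1.0, and only then call retire on the old version. Keeping the raw text alongside the vectors is a design choice made here so that re-encoding does not depend on an external source of truth.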
Three main strategies exist: shadow indexing, blue-green migration, and progressive rollout. Shadow indexing creates a second index alongside the primary one, encodes all new data with both models, and backfills historical data in the background. While the backfill runs, each search is sent to both indexes (with the query encoded separately by each model) and the results are merged using reciprocal rank fusion or score normalization. Once the new index reaches full coverage, traffic shifts entirely to it. Blue-green migration builds the new index completely offline, validates retrieval quality against a test set, and performs an atomic cutover. Progressive rollout re-encodes data in priority order (most queried documents first) and gradually increases the share of traffic served by the new index. Each strategy makes a different trade-off among migration speed, compute cost, and retrieval continuity.
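Two pieces of these strategies lend themselves to a short sketch: merging result lists from the old and new indexes with reciprocal rank fusion, and splitting traffic for a progressive rollout. The function names, the hash-based bucketing scheme, and the parameter new_share are assumptions made for illustration; k = 60 is the conventional default for reciprocal rank fusion, not a value taken from the text.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of document ids; each list contributes 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def choose_index(request_key: str, new_share: float) -> str:
    """Deterministically send roughly new_share of requests to the new index."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "new" if bucket < new_share else "old"


# Rankings returned by the old and new indexes for the same query.
old_results = ["a", "b", "c"]
new_results = ["b", "d", "a"]
merged = reciprocal_rank_fusion([old_results, new_results])
print(merged)  # ['b', 'a', 'd', 'c']
```

Rank-based fusion sidesteps the fact that raw similarity scores from two different models are not on a comparable scale, which is why it is a common alternative to score normalization. Hashing a stable request key keeps a given user or query on the same index as new_share is raised from 0.0 toward 1.0.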