A second-stage ranking process that reorders initial search results using a more computationally expensive but accurate scoring model. Re-ranking is essential for maximizing precision in multimodal retrieval pipelines where first-stage recall is prioritized over exact ordering.
Re-ranking takes the top-k results from a fast first-stage retrieval system and rescores them using a more powerful model. Cross-encoder rerankers process the query and each candidate document jointly, enabling fine-grained interaction between query and document tokens. This produces more accurate relevance scores than bi-encoder models that encode query and document independently.
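The two-stage flow described above can be sketched in a few lines. This is a minimal illustration, not a production reranker: `rerank` and `toy_overlap_score` are hypothetical names, and the token-overlap scorer is only a stand-in for a real cross-encoder, which would score each (query, document) pair with a joint forward pass.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Rescore first-stage candidates and return the top_n by new score."""
    # One scoring call per candidate: cost grows linearly with the rerank depth.
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # Python's sort is stable, so ties keep their first-stage order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def toy_overlap_score(query: str, doc: str) -> float:
    """Stand-in scorer (assumption): fraction of query tokens found in the doc.

    A real cross-encoder would replace this function.
    """
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


# Pretend these came back from a fast first-stage retriever, in its order.
first_stage = [
    "red running shoes for men",
    "how to train for a marathon",
    "best marathon training shoes",
]

reranked = rerank("marathon training shoes", first_stage, toy_overlap_score, top_n=2)
for doc, score in reranked:
    print(f"{score:.2f}  {doc}")
```

Because the scorer is passed in as a function, swapping the toy scorer for a real model only changes the `score_fn` argument; the top-k selection and ordering logic stay the same.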
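Cross-encoder rerankers (for example BGE-reranker and Cohere Rerank) take concatenated query-document pairs and output one relevance score per pair. Late-interaction models such as ColBERT are also used for reranking, but they are not cross-encoders: they encode query and document separately and compare token-level embeddings afterwards. Reranking cost is O(n) model forward passes per query, where n is the number of candidates to rerank. Typical rerank depths are 50-200 candidates, and reranking 100 candidates typically adds roughly 50-200 ms of latency, depending on model size and hardware. Learning-to-rank (LTR) models combine multiple features (BM25 score, semantic score, metadata) into a final ranking.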
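As a rough illustration of how an LTR-style ranker combines features, the sketch below min-max normalizes each feature across the candidate set and takes a weighted sum. Everything here is an assumption made for illustration: the function names, the `freshness` metadata feature, the candidate values, and the hand-set weights. A trained LTR model would learn how to combine the features from labeled relevance data rather than using fixed weights.

```python
from typing import Dict, List


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1] so features on different scales are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates share the same value: the feature carries no signal.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def ltr_rank(candidates: List[Dict], weights: Dict[str, float]) -> List[Dict]:
    """Rank candidates by a weighted sum of normalized features."""
    features = list(weights)
    normalized = {f: min_max([c[f] for c in candidates]) for f in features}
    scored = []
    for i, cand in enumerate(candidates):
        final = sum(weights[f] * normalized[f][i] for f in features)
        scored.append({**cand, "final_score": final})
    return sorted(scored, key=lambda c: c["final_score"], reverse=True)


# Hypothetical candidates with a lexical score, a semantic score, and one
# metadata-derived feature.
candidates = [
    {"id": "a", "bm25": 12.0, "semantic": 0.60, "freshness": 0.2},
    {"id": "b", "bm25": 8.0, "semantic": 0.90, "freshness": 0.9},
    {"id": "c", "bm25": 4.0, "semantic": 0.70, "freshness": 0.5},
]

# Hand-set weights (assumption); a trained model would learn these.
weights = {"bm25": 0.3, "semantic": 0.5, "freshness": 0.2}

ranked = ltr_rank(candidates, weights)
for cand in ranked:
    print(cand["id"], round(cand["final_score"], 3))
```

Normalizing before combining matters because raw BM25 scores are unbounded while similarity scores usually fall in a fixed range; without it, the feature with the largest magnitude would dominate the sum.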