The Rerank stage uses cross-encoder models to re-score and reorder search results. Unlike bi-encoder models (used in semantic search), cross-encoders process the query and document together, enabling more accurate relevance scoring at the cost of higher latency.
Stage Category: SORT (reorders documents)
Transformation: N documents → top_n documents (re-ranked by relevance)
When to Use
| Use Case | Description |
| --- | --- |
| Two-stage retrieval | Fast recall (search) + precise ranking (rerank) |
| High-precision requirements | When ranking quality is critical |
| Top-N optimization | Improve quality of final displayed results |
| RAG applications | Better context selection for LLM generation |
When NOT to Use
| Scenario | Recommended Alternative |
| --- | --- |
| Large result sets (1000+) | Too slow; use sort_by_field |
| Real-time requirements (< 20ms) | Use search scores directly |
| Simple attribute sorting | sort_by_field |
Parameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | string | Required | Reranker model to use |
| top_n | integer | 10 | Number of results to return after reranking |
| query | string | {{INPUT.query}} | Query for relevance scoring |
Available Models
| Model | Speed | Quality | Best For |
| --- | --- | --- | --- |
| bge-reranker-v2-m3 | Fast | High | General purpose, multilingual |
| cohere-rerank-v3 | Medium | Highest | Maximum accuracy |
| jina-reranker-v2 | Fast | High | Multilingual, long documents |
Configuration Examples
Basic Reranking
High-Quality Reranking
Custom Query
Large Candidate Pool
{
  "stage_type": "sort",
  "stage_id": "rerank",
  "parameters": {
    "model": "bge-reranker-v2-m3",
    "top_n": 10
  }
}
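The other example tabs listed above (High-Quality Reranking, Custom Query, Large Candidate Pool) follow the same stage shape. The sketches below are illustrative, built only from the parameters documented on this page; the {{INPUT.rerank_query}} variable name in the custom-query example is hypothetical.

High-Quality Reranking (maximum accuracy, higher latency):

```json
{
  "stage_type": "sort",
  "stage_id": "rerank",
  "parameters": {
    "model": "cohere-rerank-v3",
    "top_n": 10
  }
}
```

Custom Query (score relevance against a different input than the search query):

```json
{
  "stage_type": "sort",
  "stage_id": "rerank",
  "parameters": {
    "model": "bge-reranker-v2-m3",
    "top_n": 10,
    "query": "{{INPUT.rerank_query}}"
  }
}
```

Large Candidate Pool (widen the search stage's top_k, staying within the optimal 50-200 document input size):

```json
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 200
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "model": "bge-reranker-v2-m3",
      "top_n": 10
    }
  }
]
```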
How Cross-Encoders Work
| Bi-Encoder (Search) | Cross-Encoder (Rerank) |
| --- | --- |
| Query and doc encoded separately | Query + doc encoded together |
| Pre-compute doc embeddings | Must process each pair |
| Fast (< 10ms for millions) | Slower (50-100ms for 100 docs) |
| Good approximate ranking | Precise relevance scoring |
Cross-encoders see the full context of both query and document together, enabling better understanding of semantic relationships.
Two-Stage Retrieval Pattern
The recommended pattern is fast recall followed by precise reranking:
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "model": "bge-reranker-v2-m3",
      "top_n": 10
    }
  }
]
Why this works:
- Search stage: fast, retrieves 100 candidates (< 20ms)
- Rerank stage: slower but precise, picks the best 10 (50-100ms)
- Total: high-quality results in 70-120ms
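The two-stage pattern can be sketched in plain Python. This is a toy illustration only: the scoring functions below are stand-ins (a dot product for the bi-encoder, token overlap for the cross-encoder), not real models, and none of these names come from the Mixpeek API.

```python
def bi_encoder_score(query_vec, doc_vec):
    # Bi-encoder: query and document embedded separately; relevance is
    # approximated by a dot product over precomputed document vectors.
    return sum(q * d for q, d in zip(query_vec, doc_vec))

def cross_encoder_score(query, doc):
    # Cross-encoder stand-in: sees query and document together.
    # Token overlap is used here as a cheap proxy for joint scoring.
    q_tokens, d_tokens = set(query.split()), set(doc.split())
    return len(q_tokens & d_tokens) / max(len(q_tokens), 1)

def search_then_rerank(query, query_vec, corpus, top_k=100, top_n=10):
    # Stage 1 (filter): fast recall over the whole corpus via vector scores.
    candidates = sorted(
        corpus,
        key=lambda d: bi_encoder_score(query_vec, d["vec"]),
        reverse=True,
    )[:top_k]
    # Stage 2 (sort): precise pairwise rescoring of the small candidate pool.
    reranked = sorted(
        candidates,
        key=lambda d: cross_encoder_score(query, d["text"]),
        reverse=True,
    )
    return reranked[:top_n]
```

The expensive pairwise pass only ever sees top_k documents, which is why capping the search stage's candidate pool keeps total latency bounded.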
| Metric | Value |
| --- | --- |
| Latency | 50-100ms (depends on candidate count) |
| Optimal input size | 50-200 documents |
| Maximum practical | ~500 documents |
| Batching | Automatic |
Reranking 1000+ documents is not recommended. Use top_k limits in the search stage to control candidate pool size.
Output
Each returned document includes:
| Field | Type | Description |
| --- | --- | --- |
| document_id | string | Unique document identifier |
| score | float | Reranker relevance score |
| original_score | float | Score from previous stage |
| rerank_position | integer | Position after reranking |
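For example, a reranked document might look like the following (all values illustrative):

```json
{
  "document_id": "doc_123",
  "score": 0.94,
  "original_score": 0.71,
  "rerank_position": 1
}
```

Comparing score against original_score shows how much the cross-encoder disagreed with the first-stage ranking.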
Common Pipeline Patterns
Search + Rerank + Limit
[
  {
    "stage_type": "filter",
    "stage_id": "hybrid_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "model": "bge-reranker-v2-m3",
      "top_n": 20
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "limit",
    "parameters": {
      "limit": 5
    }
  }
]
Search + Filter + Rerank
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 200
    }
  },
  {
    "stage_type": "filter",
    "stage_id": "structured_filter",
    "parameters": {
      "conditions": {
        "field": "metadata.category",
        "operator": "eq",
        "value": "{{INPUT.category}}"
      }
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "model": "bge-reranker-v2-m3",
      "top_n": 10
    }
  }
]
Custom Reranker (BYO Model)
Use your own reranker model deployed as a custom extractor instead of the built-in models. Set feature_uri to route reranking through your extractor’s inference endpoint.
Parameters
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| feature_uri | string | null | Feature URI of a custom reranker plugin. Overrides model when set. |
Your plugin must accept {pairs: [[query, doc], ...]} and return {scores: [float, ...]}.
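A minimal Python sketch of that request/response contract is shown below. The scoring logic is a placeholder (token overlap); a real plugin would run model inference, and the handler name is illustrative rather than part of the plugin API.

```python
def rerank_handler(payload):
    """Accept {"pairs": [[query, doc], ...]} and return {"scores": [float, ...]}."""
    scores = []
    for query, doc in payload["pairs"]:
        # Placeholder scoring: fraction of query tokens present in the doc.
        q_tokens = set(query.lower().split())
        d_tokens = set(doc.lower().split())
        scores.append(len(q_tokens & d_tokens) / max(len(q_tokens), 1))
    return {"scores": scores}
```

The key constraint is shape: one float per input pair, in the same order, so the stage can reorder documents by the returned scores.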
Configuration Example
{
  "stage_name": "my_rerank",
  "config": {
    "stage_id": "rerank",
    "parameters": {
      "feature_uri": "mixpeek://my_reranker@1.0.0/rerank",
      "top_n": 10
    }
  }
}
Set inference_type: "rerank" in your plugin’s manifest to declare compatibility with the rerank stage.
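As a rough sketch, the relevant manifest fragment might look like the following; only the inference_type field comes from this page, and the other field names are assumptions about the manifest format:

```json
{
  "name": "my_reranker",
  "version": "1.0.0",
  "inference_type": "rerank"
}
```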
Trade-offs
| Aspect | Impact |
| --- | --- |
| Higher precision | Better relevance scoring |
| Higher latency | 50-100ms per batch |
| Limited scale | Best for < 500 candidates |
| API costs | Per-document scoring |