Rerank stage showing cross-encoder model re-scoring search results
The Rerank stage uses cross-encoder models to re-score and reorder search results. Unlike bi-encoder models (used in semantic search), cross-encoders process the query and document together, enabling more accurate relevance scoring at the cost of higher latency.
Stage Category: SORT (Reorders documents)
Transformation: N documents → top_k documents (re-ranked by relevance)

When to Use

| Use Case | Description |
| --- | --- |
| Two-stage retrieval | Fast recall (search) + precise ranking (rerank) |
| High-precision requirements | When ranking quality is critical |
| Top-N optimization | Improve quality of final displayed results |
| RAG applications | Better context selection for LLM generation |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| Large result sets (1000+) | Too slow; use sort_by_field |
| Real-time requirements (< 20ms) | Use search scores directly |
| Simple attribute sorting | sort_by_field |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| inference_name | string | BAAI__bge_reranker_v2_m3 | Reranking inference service. List options with GET /engine/inference. Ignored when feature_uri is set. |
| feature_uri | string | null | Custom reranker plugin (mixpeek://...). Overrides inference_name. |
| top_k | integer | null | Number of results to keep after reranking (omit to keep all, reordered). |
| query | string | {{INPUT.query}} | Query for relevance scoring |
| document_field | string | content | Document field to rerank against |

Available Models

The built-in reranker is BAAI__bge_reranker_v2_m3 (multilingual cross-encoder). For other models, deploy your own via a custom extractor and reference it with feature_uri.

Configuration Examples

{
  "stage_name": "rerank",
  "stage_type": "sort",
  "config": {
    "stage_id": "rerank",
    "parameters": {
      "inference_name": "BAAI__bge_reranker_v2_m3",
      "top_k": 10
    }
  }
}
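
When stage definitions are generated in code rather than written by hand, a small helper can keep the parameter rules in one place. The following is a minimal Python sketch, not part of any Mixpeek SDK: `build_rerank_stage` is a hypothetical name, and the helper only mirrors the behaviour described in the parameters table (feature_uri takes precedence over inference_name, and leaving out top_k keeps every document).

```python
import json
from typing import Any, Optional

DEFAULT_RERANKER = "BAAI__bge_reranker_v2_m3"  # documented default for inference_name


def build_rerank_stage(
    top_k: Optional[int] = None,
    inference_name: str = DEFAULT_RERANKER,
    feature_uri: Optional[str] = None,
    stage_name: str = "rerank",
) -> dict[str, Any]:
    """Hypothetical helper: assemble a rerank stage definition as a dict."""
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be a positive integer or None")

    parameters: dict[str, Any] = {}
    if feature_uri is not None:
        # feature_uri overrides inference_name, so send only the one in use.
        parameters["feature_uri"] = feature_uri
    else:
        parameters["inference_name"] = inference_name
    if top_k is not None:
        # Omitting top_k keeps all documents, just reordered.
        parameters["top_k"] = top_k

    return {
        "stage_name": stage_name,
        "stage_type": "sort",
        "config": {"stage_id": "rerank", "parameters": parameters},
    }


stage = build_rerank_stage(top_k=10)
print(json.dumps(stage, indent=2))
```

Called with `top_k=10` and no other arguments, the helper produces the same structure as the configuration example above.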

How Cross-Encoders Work

| Bi-Encoder (Search) | Cross-Encoder (Rerank) |
| --- | --- |
| Query and doc encoded separately | Query + doc encoded together |
| Pre-compute doc embeddings | Must process each pair |
| Fast (< 10ms for millions) | Slower (50-100ms for 100 docs) |
| Good approximate ranking | Precise relevance scoring |

Cross-encoders see the full context of both query and document together, enabling better understanding of semantic relationships.
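
The structural difference can be illustrated with a toy Python sketch. Neither function below is a neural model or anything Mixpeek runs: `encode` is a stand-in bag-of-words "bi-encoder" and `cross_score` is a stand-in "cross-encoder" that rewards word order, both invented for this illustration. The point is only that the first scores from vectors built in isolation, while the second sees the pair together.

```python
from collections import Counter
from math import sqrt


def encode(text: str) -> Counter:
    """Toy 'bi-encoder': embeds one text on its own as a bag of words."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cross_score(query: str, doc: str) -> float:
    """Toy 'cross-encoder': looks at query and document together.

    It rewards query word pairs that appear adjacent and in order in the
    document, an interaction these bag-of-words vectors cannot express.
    """
    q, d = query.lower().split(), doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    phrase_hits = sum(1 for pair in zip(q, q[1:]) if pair in doc_bigrams)
    return cosine(encode(query), encode(doc)) + phrase_hits


query = "dog bites man"
docs = ["dog bites man today", "man bites dog today"]

# Bi-encoder style: document vectors can be computed once, before any query arrives.
doc_vectors = [encode(d) for d in docs]
bi_scores = [cosine(encode(query), v) for v in doc_vectors]

# Cross-encoder style: every (query, document) pair is scored at query time.
cross_scores = [cross_score(query, d) for d in docs]

print("bi-encoder scores:   ", bi_scores)
print("cross-encoder scores:", cross_scores)
```

Both documents contain the same words, so the separately encoded vectors cannot tell them apart, while the pairwise scorer ranks the first document higher. The cost is that nothing about the pairwise score can be precomputed.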

Two-Stage Retrieval Pattern

The recommended pattern is fast recall followed by precise reranking:
[
  {
    "stage_name": "search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          {
            "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1",
            "query": { "input_mode": "text", "value": "{{INPUT.query}}" },
            "top_k": 100
          }
        ],
        "final_top_k": 100
      }
    }
  },
  {
    "stage_name": "rerank",
    "stage_type": "sort",
    "config": {
      "stage_id": "rerank",
      "parameters": {
        "inference_name": "BAAI__bge_reranker_v2_m3",
        "top_k": 10
      }
    }
  }
]
Why this works:
  1. Search stage: Fast, retrieves 100 candidates (< 20ms)
  2. Rerank stage: Slower but precise, picks best 10 (50-100ms)
  3. Total: High-quality results in 70-120ms
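
The same recall-then-rerank logic can be sketched outside the pipeline in plain Python. This is an illustration of the general pattern under simplifying assumptions, not how the stages are implemented: `two_stage_retrieve`, `overlap`, and `phrase_aware` are hypothetical names, and the two scorers are toy stand-ins for vector search and a cross-encoder.

```python
import heapq
from typing import Callable, Sequence

Scorer = Callable[[str, str], float]


def two_stage_retrieve(
    query: str,
    corpus: Sequence[str],
    cheap_score: Scorer,
    precise_score: Scorer,
    recall_k: int = 100,
    final_k: int = 10,
) -> list[tuple[float, str]]:
    """Recall with the cheap scorer, then rerank only the survivors."""
    # Stage 1: fast recall over the whole corpus.
    candidates = heapq.nlargest(recall_k, corpus, key=lambda d: cheap_score(query, d))
    # Stage 2: the expensive scorer runs on at most recall_k documents.
    reranked = sorted(((precise_score(query, d), d) for d in candidates), reverse=True)
    return reranked[:final_k]


def overlap(query: str, doc: str) -> float:
    """Toy cheap scorer: number of shared words."""
    return float(len(set(query.split()) & set(doc.split())))


precise_calls = 0


def phrase_aware(query: str, doc: str) -> float:
    """Toy precise scorer: word overlap plus a bonus for an exact phrase match."""
    global precise_calls
    precise_calls += 1
    return overlap(query, doc) + (2.0 if query in doc else 0.0)


corpus = [f"filler document {i}" for i in range(1000)] + [
    "pizza in york that is new",
    "best new york pizza",
]
top = two_stage_retrieve("new york pizza", corpus, overlap, phrase_aware)
print(top[0], "| precise scorer calls:", precise_calls)
```

The expensive scorer is called once per recalled candidate rather than once per corpus document, which is why the candidate count set in the search stage governs rerank latency.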

Performance

| Metric | Value |
| --- | --- |
| Latency | 50-100ms (depends on candidate count) |
| Optimal input size | 50-200 documents |
| Maximum practical | ~500 documents |
| Batching | Automatic |

Reranking 1000+ documents is not recommended. Use top_k limits in the search stage to control candidate pool size.
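
One way to catch an oversized candidate pool before a pipeline is deployed is a small check over the stage list. The sketch below is a hypothetical lint, not a Mixpeek feature: `rerank_pool_warnings` is an invented name, the thresholds are taken from the table above, and it assumes the pool size is bounded by the `final_top_k` of a preceding `feature_search` stage (filters in between can only shrink it).

```python
OPTIMAL_MAX = 200    # upper end of the documented optimal input size
MAX_PRACTICAL = 500  # documented practical maximum


def rerank_pool_warnings(pipeline: list[dict]) -> list[str]:
    """Hypothetical lint: flag rerank stages fed by a large candidate pool."""
    warnings: list[str] = []
    upper_bound = None  # most documents that can reach the next stage, if known
    for stage in pipeline:
        config = stage.get("config", {})
        params = config.get("parameters", {})
        stage_id = config.get("stage_id")
        name = stage.get("stage_name", stage_id)
        if stage_id == "feature_search":
            upper_bound = params.get("final_top_k")
        elif stage_id == "rerank":
            if upper_bound is None:
                warnings.append(f"{name}: candidate pool has no known limit")
            elif upper_bound > MAX_PRACTICAL:
                warnings.append(
                    f"{name}: up to {upper_bound} candidates, above the practical maximum of ~{MAX_PRACTICAL}"
                )
            elif upper_bound > OPTIMAL_MAX:
                warnings.append(
                    f"{name}: up to {upper_bound} candidates, above the optimal range (<= {OPTIMAL_MAX})"
                )
            # A rerank top_k caps what later stages receive.
            top_k = params.get("top_k")
            if top_k is not None:
                upper_bound = top_k if upper_bound is None else min(upper_bound, top_k)
    return warnings


pipeline = [
    {"stage_name": "search", "config": {"stage_id": "feature_search", "parameters": {"final_top_k": 1000}}},
    {"stage_name": "rerank", "config": {"stage_id": "rerank", "parameters": {"top_k": 10}}},
]
for message in rerank_pool_warnings(pipeline):
    print(message)
```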

Output

Each returned document includes:
| Field | Type | Description |
| --- | --- | --- |
| document_id | string | Unique document identifier |
| score | float | Reranker relevance score |
| original_score | float | Score from previous stage |
| rerank_position | integer | Position after reranking |
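
Because each document carries both scores, the effect of reranking can be inspected on the client side. The Python sketch below uses made-up sample values and a hypothetical helper, `rank_shifts`. It derives ranks by sorting rather than reading rerank_position as an absolute index, since this page does not say whether positions start at 0 or 1.

```python
# Made-up sample output for illustration; real values will differ.
results = [
    {"document_id": "doc_a", "score": 0.91, "original_score": 0.62, "rerank_position": 1},
    {"document_id": "doc_b", "score": 0.74, "original_score": 0.80, "rerank_position": 2},
    {"document_id": "doc_c", "score": 0.33, "original_score": 0.71, "rerank_position": 3},
]


def rank_shifts(docs: list[dict]) -> dict[str, int]:
    """Return how many places each document moved up (+) or down (-)."""
    # Rank implied by the previous stage, among the returned documents only.
    by_original = sorted(docs, key=lambda r: r["original_score"], reverse=True)
    original_rank = {r["document_id"]: i for i, r in enumerate(by_original)}
    # Rank after reranking, taken from the order of rerank_position.
    by_rerank = sorted(docs, key=lambda r: r["rerank_position"])
    return {r["document_id"]: original_rank[r["document_id"]] - i for i, r in enumerate(by_rerank)}


for doc_id, shift in rank_shifts(results).items():
    print(f"{doc_id}: {shift:+d}")
```

Shifts are relative to the returned documents only. Anything removed by top_k is absent from the output, so its original rank cannot be recovered here.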

Common Pipeline Patterns

Search + Rerank + Limit

[
  {
    "stage_name": "search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          {
            "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1",
            "query": { "input_mode": "text", "value": "{{INPUT.query}}" },
            "top_k": 100
          }
        ],
        "final_top_k": 100
      }
    }
  },
  {
    "stage_name": "rerank",
    "stage_type": "sort",
    "config": {
      "stage_id": "rerank",
      "parameters": {
        "inference_name": "BAAI__bge_reranker_v2_m3",
        "top_k": 20
      }
    }
  },
  {
    "stage_name": "limit",
    "stage_type": "reduce",
    "config": {
      "stage_id": "limit",
      "parameters": {
        "limit": 5
      }
    }
  }
]

Search + Filter + Rerank

[
  {
    "stage_name": "search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          {
            "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1",
            "query": { "input_mode": "text", "value": "{{INPUT.query}}" },
            "top_k": 200
          }
        ],
        "final_top_k": 200
      }
    }
  },
  {
    "stage_name": "category_filter",
    "stage_type": "filter",
    "config": {
      "stage_id": "attribute_filter",
      "parameters": {
        "field": "metadata.category",
        "operator": "eq",
        "value": "{{INPUT.category}}"
      }
    }
  },
  {
    "stage_name": "rerank",
    "stage_type": "sort",
    "config": {
      "stage_id": "rerank",
      "parameters": {
        "inference_name": "BAAI__bge_reranker_v2_m3",
        "top_k": 10
      }
    }
  }
]

Custom Reranker (BYO Model)

Use your own reranker model, deployed as a custom extractor, instead of the built-in model. Set feature_uri to route reranking through your extractor’s inference endpoint.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| feature_uri | string | null | Feature URI of a custom reranker plugin. Overrides inference_name when set. |

Your plugin must accept {pairs: [[query, doc], ...]} and return {scores: [float, ...]}.
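
The request and response shapes can be sketched as a plain Python function. This covers only the payload contract stated above. How the function is registered inside a custom extractor is not shown, and `make_rerank_handler` and `token_overlap` are hypothetical names, with the scorer being a toy stand-in for a real model.

```python
from typing import Callable


def make_rerank_handler(score_pair: Callable[[str, str], float]) -> Callable[[dict], dict]:
    """Wrap a pairwise scorer in the {pairs: ...} -> {scores: ...} contract."""

    def handle(payload: dict) -> dict:
        pairs = payload.get("pairs")
        if not isinstance(pairs, list):
            raise ValueError('payload must contain a "pairs" list')
        scores = []
        for pair in pairs:
            if not (isinstance(pair, (list, tuple)) and len(pair) == 2):
                raise ValueError("each pair must be [query, doc]")
            query, doc = pair
            # One float per pair, in the same order as the input.
            scores.append(float(score_pair(query, doc)))
        return {"scores": scores}

    return handle


def token_overlap(query: str, doc: str) -> float:
    """Toy scorer: fraction of query words that appear in the document."""
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q) if q else 0.0


handler = make_rerank_handler(token_overlap)
response = handler({"pairs": [["red shoes", "red running shoes"], ["red shoes", "blue hat"]]})
print(response)
```

Returning exactly one score per pair, in input order, matters because the scores list carries no document identifiers of its own.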

Configuration Example

{
  "stage_name": "my_rerank",
  "config": {
    "stage_id": "rerank",
    "parameters": {
      "feature_uri": "mixpeek://my_reranker@1.0.0/rerank",
      "top_k": 10
    }
  }
}
Set inference_type: "rerank" in your plugin’s manifest to declare compatibility with the rerank stage.

Trade-offs

| Aspect | Impact |
| --- | --- |
| Higher precision | Better relevance scoring |
| Higher latency | 50-100ms per batch |
| Limited scale | Best for < 500 candidates |
| API costs | Per-document scoring |