

[Figure: Rerank stage showing a cross-encoder model re-scoring search results]
The Rerank stage uses cross-encoder models to re-score and reorder search results. Unlike bi-encoder models (used in semantic search), cross-encoders process the query and document together, enabling more accurate relevance scoring at the cost of higher latency.
Stage Category: SORT (reorders documents)
Transformation: N documents → top_n documents (re-ranked by relevance)

When to Use

| Use Case | Description |
|---|---|
| Two-stage retrieval | Fast recall (search) + precise ranking (rerank) |
| High-precision requirements | When ranking quality is critical |
| Top-N optimization | Improve quality of the final displayed results |
| RAG applications | Better context selection for LLM generation |

When NOT to Use

| Scenario | Recommended Alternative |
|---|---|
| Large result sets (1000+) | Too slow; use `sort_by_field` |
| Real-time requirements (< 20ms) | Use search scores directly |
| Simple attribute sorting | `sort_by_field` |

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | Required | Reranker model to use |
| `top_n` | integer | 10 | Number of results to return after reranking |
| `query` | string | `{{INPUT.query}}` | Query used for relevance scoring |

Available Models

| Model | Speed | Quality | Best For |
|---|---|---|---|
| `bge-reranker-v2-m3` | Fast | High | General purpose, multilingual |
| `cohere-rerank-v3` | Medium | Highest | Maximum accuracy |
| `jina-reranker-v2` | Fast | High | Multilingual, long documents |

Configuration Examples

```json
{
  "stage_type": "sort",
  "stage_id": "rerank",
  "parameters": {
    "model": "bge-reranker-v2-m3",
    "top_n": 10
  }
}
```

How Cross-Encoders Work

| Bi-Encoder (Search) | Cross-Encoder (Rerank) |
|---|---|
| Query and document encoded separately | Query + document encoded together |
| Document embeddings can be pre-computed | Must process each pair at query time |
| Fast (< 10ms for millions of docs) | Slower (50-100ms for 100 docs) |
| Good approximate ranking | Precise relevance scoring |
Cross-encoders see the full context of both query and document together, enabling better understanding of semantic relationships.
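The contrast can be sketched with toy scoring functions. The overlap-based scorers below are illustrative stand-ins for real neural encoders, not Mixpeek APIs; the point is the shape of the computation, not the scores themselves:

```python
from typing import List

def bi_encode(text: str) -> List[float]:
    # Bi-encoder: each text is embedded independently, so document
    # vectors can be computed once, ahead of query time, and cached.
    return [float(text.count(w)) for w in ("rerank", "search", "model")]

def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def cross_score(query: str, doc: str) -> float:
    # Cross-encoder: query and document are scored as one joint input,
    # so nothing can be precomputed -- every (query, doc) pair costs a
    # full pass. A toy term-overlap ratio stands in for the model.
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)

docs = ["rerank search results with a model", "unrelated text"]
query = "rerank search results"

# Bi-encoder path: precompute doc vectors once, then score via dot product.
doc_vecs = [bi_encode(d) for d in docs]
bi_scores = [dot(bi_encode(query), v) for v in doc_vecs]

# Cross-encoder path: one joint pass per (query, doc) pair.
cross_scores = [cross_score(query, d) for d in docs]
```

The bi-encoder's per-document work happens at index time; the cross-encoder's happens at query time, which is why it is reserved for a small candidate pool.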

Two-Stage Retrieval Pattern

The recommended pattern is fast recall followed by precise reranking:
```json
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "model": "bge-reranker-v2-m3",
      "top_n": 10
    }
  }
]
```
Why this works:
  1. Search stage: Fast, retrieves 100 candidates (< 20ms)
  2. Rerank stage: Slower but precise, picks best 10 (50-100ms)
  3. Total: High-quality results in 70-120ms
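The pattern itself can be sketched in a few lines. The function name and the keyword-overlap scorers below are illustrative assumptions standing in for the actual search and rerank stages:

```python
from typing import Dict, List

def two_stage(query: str, corpus: List[str],
              recall_k: int = 100, top_n: int = 10) -> List[Dict]:
    # Stage 1 (fast recall): cheap score over the whole corpus,
    # keep the best `recall_k` candidates. Simple term overlap
    # stands in for the vector search here.
    def cheap(doc: str) -> int:
        return len(set(query.split()) & set(doc.split()))
    candidates = sorted(corpus, key=cheap, reverse=True)[:recall_k]

    # Stage 2 (precise rerank): expensive pairwise score over the
    # small candidate pool only, keep the best `top_n`.
    def precise(doc: str) -> float:
        q, d = set(query.lower().split()), set(doc.lower().split())
        return len(q & d) / max(len(q | d), 1)  # Jaccard as a stand-in
    reranked = sorted(candidates, key=precise, reverse=True)[:top_n]
    return [{"document": d, "score": precise(d)} for d in reranked]
```

The design choice is that the expensive scorer never sees more than `recall_k` documents, so total latency stays bounded regardless of corpus size.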

Performance

| Metric | Value |
|---|---|
| Latency | 50-100ms (depends on candidate count) |
| Optimal input size | 50-200 documents |
| Maximum practical | ~500 documents |
| Batching | Automatic |
Reranking 1000+ documents is not recommended. Use top_k limits in the search stage to control candidate pool size.

Output

Each returned document includes:
| Field | Type | Description |
|---|---|---|
| `document_id` | string | Unique document identifier |
| `score` | float | Reranker relevance score |
| `original_score` | float | Score from the previous stage |
| `rerank_position` | integer | Position after reranking |

Common Pipeline Patterns

Search + Rerank + Limit

```json
[
  {
    "stage_type": "filter",
    "stage_id": "hybrid_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 100
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "model": "bge-reranker-v2-m3",
      "top_n": 20
    }
  },
  {
    "stage_type": "reduce",
    "stage_id": "limit",
    "parameters": {
      "limit": 5
    }
  }
]
```

Search + Filter + Rerank

```json
[
  {
    "stage_type": "filter",
    "stage_id": "semantic_search",
    "parameters": {
      "query": "{{INPUT.query}}",
      "vector_index": "text_extractor_v1_embedding",
      "top_k": 200
    }
  },
  {
    "stage_type": "filter",
    "stage_id": "structured_filter",
    "parameters": {
      "conditions": {
        "field": "metadata.category",
        "operator": "eq",
        "value": "{{INPUT.category}}"
      }
    }
  },
  {
    "stage_type": "sort",
    "stage_id": "rerank",
    "parameters": {
      "model": "bge-reranker-v2-m3",
      "top_n": 10
    }
  }
]
```

Custom Reranker (BYO Model)

Use your own reranker model, deployed as a custom extractor, instead of the built-in models. Set `feature_uri` to route reranking through your extractor’s inference endpoint.

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `feature_uri` | string | null | Feature URI of a custom reranker plugin. Overrides `model` when set. |

Your plugin must accept `{"pairs": [[query, doc], ...]}` and return `{"scores": [float, ...]}`.
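A minimal sketch of a handler that satisfies that contract. The handler name and the token-overlap scorer are illustrative assumptions; a real plugin would run a model forward pass and wire the handler into its serving framework:

```python
from typing import Dict

def rerank_handler(payload: Dict) -> Dict:
    # Contract: accept {"pairs": [[query, doc], ...]} and return
    # {"scores": [float, ...]} with exactly one score per input
    # pair, in the same order. Token overlap stands in for a model.
    def score(query: str, doc: str) -> float:
        q = set(query.lower().split())
        d = set(doc.lower().split())
        return len(q & d) / max(len(q), 1)

    return {"scores": [score(q, d) for q, d in payload["pairs"]]}
```

Preserving pair order in the returned scores is essential, since the stage maps scores back to documents positionally.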

Configuration Example

```json
{
  "stage_name": "my_rerank",
  "config": {
    "stage_id": "rerank",
    "parameters": {
      "feature_uri": "mixpeek://my_reranker@1.0.0/rerank",
      "top_n": 10
    }
  }
}
```
Set `inference_type: "rerank"` in your plugin’s manifest to declare compatibility with the rerank stage.

Trade-offs

| Aspect | Impact |
|---|---|
| Higher precision | Better relevance scoring |
| Higher latency | 50-100ms per batch |
| Limited scale | Best for < 500 candidates |
| API costs | Per-document scoring |