Figure: Limit stage showing result truncation to top-N documents
The Limit stage truncates the document set to a maximum number of results, optionally with an offset for pagination-style behavior. This is the retriever pipeline equivalent of SQL’s LIMIT/OFFSET clause.
Stage Category: REDUCE (Truncates documents)
Transformation: N documents → min(N, limit) documents
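The transformation above covers the default case where offset is 0. As a rough sketch of how the output size changes once an offset is set, the snippet below assumes the stage behaves like a plain list slice, with offset applied before limit (this matches the cases listed under Error Handling). `output_count` is a hypothetical helper written for illustration, not part of any SDK.

```python
def output_count(n_docs: int, limit: int = 10, offset: int = 0) -> int:
    """Documents left after skipping `offset` and keeping at most `limit`.

    Assumption: offset is applied first, then limit, as in a list slice.
    """
    remaining = max(n_docs - offset, 0)  # nothing is left if offset passes the end
    return min(remaining, limit)


print(output_count(100))                       # 10
print(output_count(7))                         # 7
print(output_count(25, limit=10, offset=20))   # 5
print(output_count(5, offset=8))               # 0
```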

When to Use

| Use Case | Description |
|---|---|
| Top-K results | Return only the best N results after reranking |
| Pagination | Implement page-based result access with offset |
| Cost control | Cap document count before expensive LLM stages |
| Bounded output | Guarantee at most N results for downstream consumers |
| Mid-pipeline trim | Reduce candidates between expensive stages |
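For the pagination use case, the page number has to be translated into the stage's limit and offset parameters. A minimal sketch, assuming 1-based page numbers and a fixed page size; `page_to_params` is a hypothetical helper, not an API of the product:

```python
def page_to_params(page: int, page_size: int = 10) -> dict:
    """Translate a 1-based page number into limit/offset parameters."""
    if page < 1:
        raise ValueError("page must be >= 1")
    return {"limit": page_size, "offset": (page - 1) * page_size}


print(page_to_params(1))  # {'limit': 10, 'offset': 0}
print(page_to_params(3))  # {'limit': 10, 'offset': 20}
```

Two caveats follow from the rest of this page. The upstream stage has to return at least offset + limit documents, otherwise later pages come back short or empty. The documented offset range (0-10000) also caps how deep pagination can go.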

When NOT to Use

| Scenario | Recommended Alternative |
|---|---|
| Random sampling | sample stage |
| Filtering by criteria | attribute_filter or llm_filter |
| Initial retrieval limit | Set limit in feature_search directly |
| Statistical reduction | aggregate stage |
| Grouping results | group_by stage |

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| limit | integer | 10 | Maximum number of documents to return (1-10000) |
| offset | integer | 0 | Number of documents to skip from the beginning (0-10000) |
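The documented ranges can be checked on the client before a pipeline is submitted. The sketch below is an assumption-laden illustration: `build_limit_stage` is a hypothetical helper, the stage layout is copied from the configuration example on this page, and the page does not say whether the service rejects or clamps out-of-range values.

```python
import json


def build_limit_stage(limit: int = 10, offset: int = 0) -> dict:
    """Build a limit stage definition after checking the documented ranges."""
    if not 1 <= limit <= 10_000:
        raise ValueError("limit must be between 1 and 10000")
    if not 0 <= offset <= 10_000:
        raise ValueError("offset must be between 0 and 10000")
    return {
        "stage_name": "limit",
        "stage_type": "reduce",
        "config": {
            "stage_id": "limit",
            "parameters": {"limit": limit, "offset": offset},
        },
    }


# Skip the first 20 documents and keep the next 10.
print(json.dumps(build_limit_stage(limit=10, offset=20), indent=2))
```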

Configuration Examples

{
  "stage_name": "limit",
  "stage_type": "reduce",
  "config": {
    "stage_id": "limit",
    "parameters": {
      "limit": 10
    }
  }
}
Place the limit stage after sorting/reranking to ensure you’re keeping the highest-quality results. Limiting before reranking loses potentially relevant documents.
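A toy example makes the ordering effect concrete. The documents and scores below are invented for illustration, and `rerank` and `limit` are simplified stand-ins for the real stages, not their implementation.

```python
# Toy candidates in retrieval order; "rerank_score" stands in for a reranker's output.
candidates = [
    {"id": "a", "rerank_score": 0.40},
    {"id": "b", "rerank_score": 0.35},
    {"id": "c", "rerank_score": 0.95},
    {"id": "d", "rerank_score": 0.90},
]


def rerank(docs):
    """Sort by reranker score, best first."""
    return sorted(docs, key=lambda d: d["rerank_score"], reverse=True)


def limit(docs, n):
    """Keep the first n documents."""
    return docs[:n]


rerank_then_limit = [d["id"] for d in limit(rerank(candidates), 2)]
limit_then_rerank = [d["id"] for d in rerank(limit(candidates, 2))]

print(rerank_then_limit)  # ['c', 'd']
print(limit_then_rerank)  # ['a', 'b']
```

Limiting first discards c and d before the reranker ever sees them, so the two strongest documents never reach the final result.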

Performance

| Metric | Value |
|---|---|
| Latency | < 1ms |
| Memory | O(1) |
| Cost | Free |
| Complexity | O(1) list slicing |

Common Pipeline Patterns

Rerank Then Limit

[
  {
    "stage_name": "feature_search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [{"feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1", "query": {"input_mode": "text", "value": "{{INPUT.query}}"}, "top_k": 100}],
        "final_top_k": 100
      }
    }
  },
  {
    "stage_name": "rerank",
    "stage_type": "sort",
    "config": {
      "stage_id": "rerank",
      "parameters": {
        "inference_name": "BAAI__bge_reranker_v2_m3",
        "query": "{{INPUT.query}}",
        "document_field": "content"
      }
    }
  },
  {
    "stage_name": "limit",
    "stage_type": "reduce",
    "config": {
      "stage_id": "limit",
      "parameters": {
        "limit": 10
      }
    }
  }
]

Cost-Controlled LLM Pipeline

[
  {
    "stage_name": "feature_search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [{"feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1", "query": {"input_mode": "text", "value": "{{INPUT.query}}"}, "top_k": 200}],
        "final_top_k": 200
      }
    }
  },
  {
    "stage_name": "limit",
    "stage_type": "reduce",
    "config": {
      "stage_id": "limit",
      "parameters": {
        "limit": 20
      }
    }
  },
  {
    "stage_name": "llm_enrich",
    "stage_type": "enrich",
    "config": {
      "stage_id": "llm_enrich",
      "parameters": {
        "provider": "openai",
        "model_name": "gpt-4o-mini",
        "prompt": "Summarize: {{DOC.content}}",
        "output_field": "summary"
      }
    }
  }
]
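The saving in this pipeline can be estimated with simple arithmetic. The sketch assumes the llm_enrich stage makes one LLM call per document, which the per-document prompt ({{DOC.content}}) suggests but this page does not state outright.

```python
retrieved = 200  # final_top_k in the feature_search stage
limit = 20       # limit parameter in the limit stage

# Assumption: one LLM call per document that reaches llm_enrich.
calls_without_limit = retrieved
calls_with_limit = min(retrieved, limit)
reduction = 1 - calls_with_limit / calls_without_limit

print(calls_without_limit, calls_with_limit)  # 200 20
print(f"{reduction:.0%}")                     # 90%
```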

Error Handling

| Error | Behavior |
|---|---|
| Limit > input count | Returns all available documents |
| Offset > input count | Returns empty result set |
| Empty input | Returns empty result set |
| Offset + Limit > count | Returns documents from offset to end |
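These four cases match what an ordinary list slice does, which is consistent with the "list slicing" note under Performance. A minimal sketch, where `apply_limit` is a hypothetical stand-in for the stage rather than its actual implementation:

```python
def apply_limit(docs: list, limit: int = 10, offset: int = 0) -> list:
    """Skip `offset` documents, then keep at most `limit`."""
    return docs[offset:offset + limit]


docs = ["d1", "d2", "d3", "d4", "d5"]

print(apply_limit(docs, limit=10))           # limit > count: ['d1', 'd2', 'd3', 'd4', 'd5']
print(apply_limit(docs, limit=2, offset=8))  # offset > count: []
print(apply_limit([], limit=3))              # empty input: []
print(apply_limit(docs, limit=3, offset=3))  # offset + limit > count: ['d4', 'd5']
```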

Related Stages

  • Sample - Random or stratified sampling
  • Deduplicate - Remove duplicates before limiting
  • Rerank - Re-score before limiting to ensure best results