Summarize stage showing LLM-powered document summarization
The Summarize stage uses language models to generate summaries from document sets. It can create single summaries from multiple documents, per-document summaries, or answer questions based on the retrieved content.
Stage Category: REDUCE (Aggregates documents)
Transformation: N documents → 1 summary document (or N documents with summaries)

When to Use

| Use Case | Description |
| --- | --- |
| RAG summarization | Generate answers from search results |
| Document synthesis | Combine multiple sources into one summary |
| Key points extraction | Distill long documents to essentials |
| Question answering | Answer user questions from retrieved docs |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| Just formatting for LLM | rag_prepare (no LLM call) |
| Extracting structured data | llm_enrich |
| Real-time low-latency | Pre-compute summaries |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | string | Required | Summarization instructions (must include {{DOCUMENTS}}) |
| provider | string | google | LLM provider: openai, google, anthropic |
| model_name | string | provider default | Specific LLM model to use |
| content_field | string | content | Field containing text to summarize |
| group_by | string | none | Field to group by (one summary per group); omit for a single summary |
| max_input_tokens | integer | 8000 | Max tokens to send to LLM |
| include_sources | boolean | true | Add source document IDs to output |
| output_field | string | summary | Field for summary output |
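The parameter rules above can be checked before a pipeline is submitted. The following is a minimal sketch, not part of any SDK: build_summarize_stage is a hypothetical helper that assembles the stage dictionary in the shape used by the configuration examples on this page and rejects a prompt that lacks the required {{DOCUMENTS}} placeholder.

```python
# Parameter names taken from the parameter table on this page.
KNOWN_PARAMETERS = {
    "prompt", "provider", "model_name", "content_field", "group_by",
    "max_input_tokens", "include_sources", "output_field",
}


def build_summarize_stage(prompt, **overrides):
    """Hypothetical helper: return a summarize stage dict or raise ValueError."""
    if "{{DOCUMENTS}}" not in prompt:
        raise ValueError("prompt must include the {{DOCUMENTS}} placeholder")
    unknown = set(overrides) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Only explicitly passed values are written; omitted ones keep stage defaults.
    parameters = {"prompt": prompt, **overrides}
    return {
        "stage_name": "summarize",
        "stage_type": "reduce",
        "config": {"stage_id": "summarize", "parameters": parameters},
    }


stage = build_summarize_stage(
    "Summarize the key points:\n\n{{DOCUMENTS}}",
    provider="openai",
    model_name="gpt-4o-mini",
)
```

Leaving omitted parameters out of the payload, rather than copying defaults into it, keeps the sketch from drifting if the stage defaults change.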

Available Models

Set provider and model_name together. If provider is omitted, it is inferred from model_name. If both are omitted, the stage defaults to google with gemini-2.5-flash-lite.
| Provider | model_name examples | Speed | Quality |
| --- | --- | --- | --- |
| google | gemini-2.5-flash-lite | Fast | Good |
| openai | gpt-4o-mini | Fast | Good |
| openai | gpt-4o | Medium | Excellent |
| anthropic | claude-haiku-4-5-20251001 | Fast | Good |
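How the provider is inferred from model_name is not specified on this page. As an illustration only, the sketch below assumes a simple prefix match (gemini, gpt, claude); the mapping and the function name resolve_provider_and_model are hypothetical and may not match the service's actual rule.

```python
# Assumed prefix-to-provider mapping; the real inference rule is not documented here.
PREFIX_TO_PROVIDER = {"gemini": "google", "gpt": "openai", "claude": "anthropic"}
DEFAULT_PROVIDER = "google"
DEFAULT_MODEL = "gemini-2.5-flash-lite"


def resolve_provider_and_model(provider=None, model_name=None):
    """Return a (provider, model_name) pair under the assumptions stated above."""
    if provider is None and model_name is None:
        return DEFAULT_PROVIDER, DEFAULT_MODEL
    if provider is None:
        for prefix, inferred in PREFIX_TO_PROVIDER.items():
            if model_name.startswith(prefix):
                return inferred, model_name
        raise ValueError(f"cannot infer provider from model_name {model_name!r}")
    # Provider given: a missing model_name means "use that provider's default".
    return provider, model_name
```

Setting both values explicitly, as the text recommends, avoids depending on any inference rule at all.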

Configuration Examples

{
  "stage_name": "summarize",
  "stage_type": "reduce",
  "config": {
    "stage_id": "summarize",
    "parameters": {
      "provider": "openai",
      "model_name": "gpt-4o-mini",
      "prompt": "Based on the provided documents, answer the user's question: {{INPUT.query}}\n\n{{DOCUMENTS}}",
      "include_sources": true
    }
  }
}
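
To make the placeholders in the prompt concrete, here is a rough sketch of how {{INPUT.query}} and {{DOCUMENTS}} might be expanded before the LLM call. The actual rendering happens inside the stage; the render_prompt function and the choice to join document texts with blank lines are assumptions made for illustration.

```python
import re

# Matches {{DOCUMENTS}} or {{INPUT.<name>}} in a single pass.
PLACEHOLDER = re.compile(r"\{\{(INPUT\.(\w+)|DOCUMENTS)\}\}")


def render_prompt(template, inputs, documents, content_field="content"):
    """Hypothetical renderer: fill pipeline inputs and document text into a prompt."""
    # Assumption: documents are concatenated with blank lines between them.
    docs_text = "\n\n".join(str(doc.get(content_field, "")) for doc in documents)

    def substitute(match):
        if match.group(1) == "DOCUMENTS":
            return docs_text
        return str(inputs[match.group(2)])  # KeyError if the input is missing

    return PLACEHOLDER.sub(substitute, template)


rendered = render_prompt(
    "Answer the question: {{INPUT.query}}\n\n{{DOCUMENTS}}",
    {"query": "What changed in Q3?"},
    [{"content": "Revenue rose."}, {"content": "Costs fell."}],
)
```

Substituting in one pass means text that arrives through an input value is never re-scanned for placeholders.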

Grouping

Single Summary (default)

With no group_by, all documents are combined into one summary (N→1):
[Doc1, Doc2, Doc3] → "Combined summary of all documents..."

Per-Group Summaries

Set group_by to a field path to produce one summary per unique group value (N→M). Use {{GROUP_VALUE}} in the prompt to reference the current group:
group_by: "metadata.category"
[Doc1(A), Doc2(A), Doc3(B)] → ["A" summary, "B" summary]
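
The N→M behavior can be pictured as a bucketing step that runs before summarization. The sketch below is an illustration under assumptions, not the stage's implementation: get_path and group_documents are hypothetical names, and placing documents that lack the field under a None key is a choice made here, not documented behavior.

```python
def get_path(doc, path):
    """Resolve a dotted field path such as "metadata.category"; None if absent."""
    value = doc
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def group_documents(documents, group_by=None):
    """Bucket documents by group value; with no group_by, return one bucket."""
    if group_by is None:
        return {None: list(documents)}
    groups = {}
    for doc in documents:
        groups.setdefault(get_path(doc, group_by), []).append(doc)
    return groups


docs = [
    {"id": "Doc1", "metadata": {"category": "A"}},
    {"id": "Doc2", "metadata": {"category": "A"}},
    {"id": "Doc3", "metadata": {"category": "B"}},
]
groups = group_documents(docs, "metadata.category")
```

Each key of the returned dictionary corresponds to one value of {{GROUP_VALUE}}, and each bucket to one summary.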

Output Schema

The summary is written to output_field (default summary). When include_sources is true, source_document_ids is added; when include_metadata is true, document_count and tokens_used are added.

Single Summary (no group_by)

{
  "summary": "Based on the documents, the answer is...",
  "source_document_ids": ["doc_123", "doc_456"],
  "document_count": 2,
  "tokens_used": 1250
}

Per-Group (with group_by)

One summary document per unique group value:
[
  {
    "summary": "Summary for the electronics category...",
    "source_document_ids": ["doc_123", "doc_456"],
    "document_count": 2
  },
  {
    "summary": "Summary for the clothing category...",
    "source_document_ids": ["doc_789"],
    "document_count": 1
  }
]
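
The way the output flags shape a summary document can be sketched as follows. This is a hedged illustration of the schema described above: build_summary_document is a hypothetical helper, the document_id key used to read source IDs is an assumption, and include_metadata has no default here because this page does not state one.

```python
def build_summary_document(summary_text, source_docs, *, include_metadata,
                           include_sources=True, output_field="summary",
                           tokens_used=None):
    """Hypothetical assembler for one summary document."""
    result = {output_field: summary_text}
    if include_sources:
        # Assumption: each source document exposes its ID under "document_id".
        result["source_document_ids"] = [d["document_id"] for d in source_docs]
    if include_metadata:
        result["document_count"] = len(source_docs)
        if tokens_used is not None:
            result["tokens_used"] = tokens_used
    return result


sources = [{"document_id": "doc_123"}, {"document_id": "doc_456"}]
single = build_summary_document(
    "Based on the documents, the answer is...",
    sources,
    include_metadata=True,
    tokens_used=1250,
)
```

With group_by set, the same helper would be called once per group, producing the list shown in the per-group example.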

Performance

| Metric | Value |
| --- | --- |
| Latency | 500-2000ms |
| Token usage | Depends on input size |
| Max input | Model context window |
| Streaming | Supported |
Summarization calls the LLM and incurs API costs. Use rag_prepare if you only need to format content for external LLM calls.

Common Pipeline Patterns

Full RAG Pipeline

[
  {
    "stage_name": "hybrid_search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          { "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1", "query": { "input_mode": "text", "value": "{{INPUT.query}}" }, "top_k": 50 }
        ],
        "final_top_k": 50
      }
    }
  },
  {
    "stage_name": "rerank",
    "stage_type": "sort",
    "config": {
      "stage_id": "rerank",
      "parameters": {
        "inference_name": "BAAI__bge_reranker_v2_m3",
        "top_k": 10
      }
    }
  },
  {
    "stage_name": "summarize",
    "stage_type": "reduce",
    "config": {
      "stage_id": "summarize",
      "parameters": {
        "provider": "openai",
        "model_name": "gpt-4o",
        "prompt": "Answer the user's question based on the provided documents: {{INPUT.query}}\n\n{{DOCUMENTS}}",
        "include_sources": true
      }
    }
  }
]

Multi-Document Synthesis

[
  {
    "stage_name": "semantic_search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          { "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1", "query": { "input_mode": "text", "value": "{{INPUT.topic}}" }, "top_k": 20 }
        ],
        "final_top_k": 20
      }
    }
  },
  {
    "stage_name": "structured_filter",
    "stage_type": "filter",
    "config": {
      "stage_id": "attribute_filter",
      "parameters": {
        "conditions": {
          "field": "metadata.type",
          "operator": "eq",
          "value": "research_paper"
        }
      }
    }
  },
  {
    "stage_name": "summarize",
    "stage_type": "reduce",
    "config": {
      "stage_id": "summarize",
      "parameters": {
        "provider": "anthropic",
        "model_name": "claude-haiku-4-5-20251001",
        "prompt": "Synthesize the research findings from these papers on {{INPUT.topic}}. Identify common themes, contradictions, and gaps in the research.\n\n{{DOCUMENTS}}",
        "max_input_tokens": 32000
      }
    }
  }
]

Preview Summaries

[
  {
    "stage_name": "semantic_search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          { "feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1", "query": { "input_mode": "text", "value": "{{INPUT.query}}" }, "top_k": 10 }
        ],
        "final_top_k": 10
      }
    }
  },
  {
    "stage_name": "summarize",
    "stage_type": "reduce",
    "config": {
      "stage_id": "summarize",
      "parameters": {
        "provider": "openai",
        "model_name": "gpt-4o-mini",
        "prompt": "Create a one-sentence summary for the '{{GROUP_VALUE}}' group:\n\n{{DOCUMENTS}}",
        "group_by": "metadata.source",
        "output_field": "preview"
      }
    }
  }
]

Comparison: summarize vs rag_prepare

| Feature | summarize | rag_prepare |
| --- | --- | --- |
| Calls LLM | Yes | No |
| Output | Generated summary | Formatted context |
| Latency | 500-2000ms | < 10ms |
| Cost | LLM API costs | Free |
| Use case | End-to-end RAG | Prepare for external LLM |

Error Handling

| Error | Behavior |
| --- | --- |
| Token limit exceeded | Truncates input, continues |
| LLM timeout | Retry once, then fail |
| Rate limit | Automatic backoff |
| Empty input | Returns empty summary |
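
The four behaviors listed above can be combined into one control flow. The sketch below is illustrative only and rests on several assumptions: call_llm and RateLimitError are hypothetical stand-ins for a provider client and its rate-limit exception, whitespace-separated words stand in for model tokens, and the exponential backoff schedule is a guess, since this page says only that backoff is automatic.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def truncate_to_budget(text, max_input_tokens=8000):
    # Crude approximation: whitespace-separated words stand in for model tokens,
    # so original line breaks are collapsed to single spaces.
    return " ".join(text.split()[:max_input_tokens])


def summarize_with_handling(documents_text, call_llm, max_input_tokens=8000,
                            max_backoff_attempts=5, base_delay=1.0,
                            sleep=time.sleep):
    if not documents_text.strip():
        return ""  # empty input: return an empty summary without calling the LLM
    text = truncate_to_budget(documents_text, max_input_tokens)
    timeouts = 0
    rate_limit_hits = 0
    while True:
        try:
            return call_llm(text)
        except TimeoutError:
            timeouts += 1
            if timeouts > 1:
                raise  # already retried once, so fail
        except RateLimitError:
            rate_limit_hits += 1
            if rate_limit_hits > max_backoff_attempts:
                raise
            # Assumed schedule: base_delay, then 2x, 4x, ... per rate-limit hit.
            sleep(base_delay * 2 ** (rate_limit_hits - 1))
```

Passing sleep as a parameter lets the backoff delays be observed in tests without actually waiting.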