The Score Threshold stage applies an absolute quality gate: it drops every document whose score on a chosen field fails a minimum bar. When nothing clears the bar, it returns an empty result set and sets all_below_threshold to true in the response metadata. Your UI can use that signal to show a “no good results” state instead of presenting weak matches.
Stage Category: REDUCE (N documents → ≤ N documents)
Transformation: keeps only documents meeting min_score; the set may become empty.

Why not just normalize and filter?

score_normalize with min_max always rescales the top result to 1.0, so a threshold on the normalized score can never reject an all-bad result set: the best match passes any bar up to 1.0, however weak it is. Score Threshold gates on the raw or calibrated score, so “everything is below the bar” is expressible. Prefer a calibrated score: the rerank cross-encoder score is ideal (score_field: "scores.rerank"), since it is far more absolute and comparable across queries than a raw cosine similarity.
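The effect is easy to reproduce outside the pipeline. The following is a minimal sketch, not the stage's implementation: min_max here is a hand-written stand-in for min-max normalization, the scores are made up, and returning 1.0 when all scores are equal is an assumption of this sketch.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; the maximum always maps to 1.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption for this sketch: identical scores all map to 1.0.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

# Hypothetical raw rerank scores for a query with no good matches.
weak_scores = [0.08, 0.04, 0.02]
normalized = min_max(weak_scores)

min_score = 0.5
# Thresholding after normalization: the weak top hit still passes.
passes_after_normalize = [s for s in normalized if s >= min_score]
# Thresholding the raw score: nothing passes, which is the desired outcome.
passes_on_raw = [s for s in weak_scores if s >= min_score]

print(normalized[0], len(passes_after_normalize), len(passes_on_raw))
```

The top document normalizes to 1.0 and clears the bar even though its raw score is 0.08, while the same bar applied to the raw scores rejects everything.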

When to Use

| Use Case | Description |
| --- | --- |
| Suppress weak matches | Don’t show results that aren’t good enough to be useful |
| “No good results” UX | Branch to a no-results / suggestion state via all_below_threshold |
| Quality gate after rerank | Hard-gate on the calibrated cross-encoder score |
| Confidence cutoffs | Only surface high-confidence matches to end users |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| Rescale scores for comparison | score_normalize |
| Keep top-N regardless of quality | limit |
| Filter by metadata fields | attribute_filter |
| Reorder by score | sort_relevance |

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| min_score | float | (required) | Absolute minimum score a document must meet to be kept |
| score_field | string | score | Score field to gate on. Dot-paths supported (e.g. scores.rerank, metadata.quality) |
| comparison | string | gte | Keep docs whose score is gte (≥) or gt (strictly >) min_score |
| missing_score | string | drop | What to do with documents lacking score_field: drop or keep |

Response Metadata

| Field | Description |
| --- | --- |
| all_below_threshold | true when no document met the bar (empty result set) |
| input_count / output_count | Documents in / kept |
| dropped | Documents removed |
| missing_score_docs | Documents that lacked score_field |
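To make the parameter semantics and the metadata fields concrete, here is a rough Python sketch of how such a gate could behave. It is an illustration under stated assumptions, not the stage's actual code: the function names are hypothetical, dropped and missing_score_docs are treated as counts, and all_below_threshold is taken to mean "the output is empty".

```python
from typing import Any

_MISSING = object()  # sentinel: the dot-path did not resolve

def resolve(doc: dict, path: str) -> Any:
    """Follow a dot-path such as 'scores.rerank' through nested dicts."""
    cur: Any = doc
    for key in path.split("."):
        if not isinstance(cur, dict) or key not in cur:
            return _MISSING
        cur = cur[key]
    return cur

def score_threshold(docs, min_score, score_field="score",
                    comparison="gte", missing_score="drop"):
    """Single pass over docs; returns (kept_docs, metadata)."""
    kept, missing = [], 0
    for doc in docs:
        value = resolve(doc, score_field)
        if value is _MISSING:
            missing += 1
            if missing_score == "keep":
                kept.append(doc)
            continue
        passes = value >= min_score if comparison == "gte" else value > min_score
        if passes:
            kept.append(doc)
    metadata = {
        "all_below_threshold": len(kept) == 0,  # assumption: empty output
        "input_count": len(docs),
        "output_count": len(kept),
        "dropped": len(docs) - len(kept),       # assumption: a count
        "missing_score_docs": missing,          # assumption: a count
    }
    return kept, metadata

docs = [
    {"id": "a", "scores": {"rerank": 0.9}},
    {"id": "b", "scores": {"rerank": 0.5}},
    {"id": "c", "scores": {"rerank": 0.2}},
    {"id": "d"},  # no scores.rerank field
]
kept_gte, meta_gte = score_threshold(docs, 0.5, "scores.rerank")
kept_gt, _ = score_threshold(docs, 0.5, "scores.rerank", comparison="gt")
kept_none, meta_none = score_threshold(docs, 0.95, "scores.rerank")
```

With gte, documents a and b survive; with gt, b (exactly 0.5) is dropped as well. Raising the bar to 0.95 empties the set and flips all_below_threshold to true in this sketch.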

Configuration Examples

{
  "stage_name": "score_threshold",
  "stage_type": "reduce",
  "config": {
    "stage_id": "score_threshold",
    "parameters": {
      "min_score": 0.5,
      "score_field": "scores.rerank"
    }
  }
}

“No Good Results → Suggestions” Pattern

Gate on the rerank score. When nothing qualifies, the empty result set and all_below_threshold tell your application to fall back to query_expand for adjacent suggestions instead of showing weak matches.
[
  {
    "stage_name": "feature_search",
    "stage_type": "filter",
    "config": {
      "stage_id": "feature_search",
      "parameters": {
        "searches": [
          {"feature_uri": "mixpeek://text_extractor@v1/multilingual_e5_large_instruct_v1", "query": {"input_mode": "text", "value": "{{INPUT.query}}"}}
        ],
        "final_top_k": 50
      }
    }
  },
  {
    "stage_name": "rerank",
    "stage_type": "sort",
    "config": {
      "stage_id": "rerank",
      "parameters": {
        "inference_name": "BAAI__bge_reranker_v2_m3",
        "query": "{{INPUT.query}}",
        "document_field": "content"
      }
    }
  },
  {
    "stage_name": "score_threshold",
    "stage_type": "reduce",
    "config": {
      "stage_id": "score_threshold",
      "parameters": {
        "min_score": 0.5,
        "score_field": "scores.rerank"
      }
    }
  }
]
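On the application side, the branch might look like the sketch below. The response shape (a documents list plus a metadata dict) and the suggest callback are assumptions made for illustration; check the actual retriever response for where all_below_threshold is reported.

```python
from typing import Callable

def render_results(response: dict, suggest: Callable[[], list]) -> dict:
    """Pick a UI state from a (hypothetical) retriever response."""
    meta = response.get("metadata", {})
    if meta.get("all_below_threshold"):
        # Nothing cleared the bar: show suggestions, not weak matches.
        return {"view": "no_good_results", "suggestions": suggest()}
    return {"view": "results", "documents": response.get("documents", [])}

# Hypothetical responses, shaped as assumed above.
empty_response = {"documents": [], "metadata": {"all_below_threshold": True}}
good_response = {
    "documents": [{"id": "a", "scores": {"rerank": 0.9}}],
    "metadata": {"all_below_threshold": False},
}

# suggest stands in for whatever produces adjacent queries (e.g. a query_expand call).
empty_view = render_results(empty_response, lambda: ["related query"])
good_view = render_results(good_response, lambda: [])
```

Passing the suggestion step in as a callback means it only runs when the result set is actually empty.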
Calibrate min_score empirically: run representative queries (including ones that should return nothing) and pick the value that separates good from bad on the rerank score. The right cutoff is query-domain specific.
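One way to do that calibration offline is sketched below. It assumes you have hand-labeled rerank scores as (score, is_good) pairs and picks the cutoff that maximizes F1; F1 is just one reasonable criterion chosen for this example, and the data is invented.

```python
def pick_min_score(labeled):
    """labeled: list of (score, is_good) pairs. Returns (threshold, f1)."""
    total_good = sum(1 for _, good in labeled if good)
    best_t, best_f1 = None, -1.0
    # Try each observed score as a candidate cutoff (keep if score >= t).
    for t in sorted({s for s, _ in labeled}):
        kept = [(s, g) for s, g in labeled if s >= t]
        tp = sum(1 for _, g in kept if g)
        if tp == 0:
            continue  # no good documents kept; F1 is zero
        precision = tp / len(kept)
        recall = tp / total_good
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:  # strict '>' keeps the lowest cutoff on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Invented labels: rerank scores from representative queries, judged by hand.
labeled = [
    (0.92, True), (0.81, True), (0.64, True),
    (0.41, False), (0.33, False), (0.12, False),
]
threshold, f1 = pick_min_score(labeled)
print(threshold, f1)
```

On this toy data the good and bad scores separate cleanly, so the sketch picks 0.64, the lowest good score. Real data rarely separates this cleanly; weight precision more heavily if showing a weak match is worse than showing nothing.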

Performance

| Metric | Value |
| --- | --- |
| Latency | < 1ms |
| Memory | O(N) |
| Cost | Free |
| Complexity | O(N) (single pass) |

Related Stages

  • Score Normalize - Rescale scores to a common range
  • Rerank - Calibrated cross-encoder scores to gate on
  • Query Expand - Adjacent suggestions when nothing qualifies
  • Limit - Truncate to top-N regardless of score