Unified semantic and hybrid search across multiple embedding features with configurable fusion strategies
The Feature Search stage is the primary search stage for retrieval pipelines. It performs vector similarity search across one or more embedding features, supporting single-modal, multimodal, and hybrid search patterns. Results from multiple searches are fused using configurable strategies (RRF, DBSF, weighted, max, or learned).
Stage Category: FILTER (Retrieves documents)
Transformation: 0 documents → N documents (retrieves from collection based on vector similarity)
Maps interaction types to reward magnitudes. Positive values reinforce the associated feature; negative values penalize it.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| min_interactions | integer | 5 | Minimum interactions before personal-level weights are used. Below this, falls back to demographic or global. |
| exploration_bonus | float | 1.0 | Initial multiplier for weight distribution variance. |
| exploration_decay | float | 0.99 | Per-interaction decay of exploration_bonus. |
| exploration_floor | float | 0.1 | Minimum exploration bonus (prevents full exploitation). |
| decay_factor | float | 0.995 | Per-day exponential decay on older interactions. 1.0 = no decay. |
| decay_window_days | integer | 365 | Interactions older than this are excluded entirely. |
| min_weight | float | 0.05 | Floor for any feature’s weight after sampling. |
| max_weight | float | 0.95 | Ceiling for any feature’s weight after sampling. |
| rollout_pct | float | 100.0 | Percentage of requests using learned weights (0-100). |
| shadow_mode | boolean | false | Compute learned weights but serve static results. |
See Auto-Tune for the full concept overview, Reward Signals for reward map customization, and Rollout & Safety for traffic splitting and kill switch details.
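The decay, floor, and clamp parameters above can be read as simple arithmetic. The sketch below is one plausible interpretation, not the service's implementation: the class and function names are hypothetical, and the exact formulas (geometric decay of the bonus, `decay_factor ** age_days` for interaction age) are assumptions derived from the parameter descriptions.

```python
from dataclasses import dataclass


@dataclass
class LearnedFusionParams:
    # Defaults copied from the parameter descriptions; the class itself is hypothetical.
    min_interactions: int = 5
    exploration_bonus: float = 1.0
    exploration_decay: float = 0.99
    exploration_floor: float = 0.1
    decay_factor: float = 0.995
    decay_window_days: int = 365
    min_weight: float = 0.05
    max_weight: float = 0.95


def exploration_multiplier(p, n_interactions):
    """ASSUMED formula: the bonus shrinks geometrically per interaction, never below the floor."""
    return max(p.exploration_floor, p.exploration_bonus * p.exploration_decay ** n_interactions)


def interaction_weight(p, age_days):
    """ASSUMED formula: per-day exponential decay; interactions past the window are excluded."""
    if age_days > p.decay_window_days:
        return 0.0
    return p.decay_factor ** age_days


def clamp_weight(p, sampled):
    """Keep a sampled feature weight inside [min_weight, max_weight]."""
    return min(p.max_weight, max(p.min_weight, sampled))


def uses_personal_weights(p, n_interactions):
    """Below min_interactions the caller would fall back to demographic or global weights."""
    return n_interactions >= p.min_interactions


params = LearnedFusionParams()
print(exploration_multiplier(params, 100))  # bonus after 100 interactions
print(interaction_weight(params, 30))       # weight of a 30-day-old interaction
```

With the defaults, a user who keeps interacting eventually settles at the exploration floor of 0.1, which is what keeps some exploration alive.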
Multi-content — embed multiple files together in one API call. Only valid when the feature_uri points to an extractor whose vector index has supports_multi_query=True (currently: gemini_multifile_extractor). Attempting this with any other feature URI returns a 400 error.
Each item in values is auto-detected: URLs (http://, https://, s3://) are fetched and embedded as files; all other strings are embedded as text. All items are passed to the underlying model in one call, producing a single query vector that mirrors how objects were indexed.
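The auto-detection rule can be sketched as a prefix check. This is an illustration of the rule as described, not the service's code: the function names are hypothetical, and case-sensitive prefix matching is an assumption.

```python
# Prefixes listed in the description; anything else is treated as text.
URL_PREFIXES = ("http://", "https://", "s3://")


def classify_value(item):
    """Return 'file' for URL-like items and 'text' for everything else.

    ASSUMPTION: matching is a case-sensitive prefix test.
    """
    return "file" if item.startswith(URL_PREFIXES) else "text"


def classify_values(values):
    """Pair each item with how it would be embedded."""
    return [(value, classify_value(value)) for value in values]


for value, kind in classify_values([
    "https://example.com/brochure.pdf",
    "s3://my-bucket/clip.mp4",
    "red running shoes",
]):
    print(f"{kind:4}  {value}")
```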
Set lexical: true on a search to run keyword/BM25 matching instead of vector similarity. The query text is matched against the namespace’s full-text index — it is not embedded into a vector. BM25 catches exact tokens that dense embeddings routinely miss: brand names, SKUs, prices like $9.99, promo codes, error strings, and CTAs.
| Behavior | Detail |
| --- | --- |
| Input | Must be text (input_mode: "text"). The query string is used verbatim. |
| Matching | Across all text-indexed string payload fields, not a single field. |
| feature_uri | Used only for collection scoping; no vector index is queried. |
Searching only one field (e.g. OCR text)
BM25 matches across all text-indexed string fields; it can’t be scoped to a single field like ocr_text. To make one field independently searchable, give it its own dense index by running a text_extractor over it (map the extractor’s input to ocr_text), then feature_search that feature URI directly. For coarse exact-substring filtering on a single field, an attribute_filter with the contains operator works, but its results are not relevance-ranked.
The real power is hybrid retrieval — fuse a dense (vector) search with a lexical (BM25) search under rrf so semantic recall and exact-keyword precision reinforce each other:
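A minimal sketch of what such a stage might look like, built as a Python dict. The names stage_type, config.stage_id, config.parameters, final_top_k, feature_uri, lexical, and the query shape come from this page; the `searches` and `fusion` key names and the helper function are assumptions for illustration, and the feature URI is a placeholder. Check the API reference before relying on any of them.

```python
import json

# Placeholder: replace with a feature URI that exists in your namespace.
FEATURE_URI = "<your_feature_uri>"


def hybrid_stage(query_text, final_top_k=10):
    """Build a feature_search stage with one dense and one lexical search."""
    query = {"input_mode": "text", "value": query_text}
    return {
        "stage_type": "filter",
        "config": {
            "stage_id": "feature_search",
            "parameters": {
                # ASSUMED key name for the list of searches.
                "searches": [
                    # Dense (vector) search.
                    {"feature_uri": FEATURE_URI, "query": query},
                    # Lexical (BM25) search; feature_uri only scopes the collection.
                    {"feature_uri": FEATURE_URI, "query": query, "lexical": True},
                ],
                # ASSUMED key name for the fusion strategy.
                "fusion": "rrf",
                "final_top_k": final_top_k,
            },
        },
    }


print(json.dumps(hybrid_stage("promo code SAVE20"), indent=2))
```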
Use rrf fusion for dense+lexical hybrid — it ranks by position, so it is immune to the score-scale mismatch between cosine similarity and BM25. Avoid weighted/max here unless you have a specific reason.
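To see why rank-based fusion sidesteps the score-scale problem, here is a small standalone sketch of reciprocal rank fusion. It uses the textbook formula (each list contributes 1 / (k + rank) per document) with k = 60, a commonly used constant; the constant and tie-breaking used by the service are not stated here, so treat both as assumptions.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with reciprocal rank fusion.

    Only positions matter: raw cosine or BM25 scores never enter the formula.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_ranking = ["a", "b", "c"]    # e.g. ordered by cosine similarity
lexical_ranking = ["c", "a", "d"]  # e.g. ordered by BM25 score

print(rrf_fuse([dense_ranking, lexical_ranking]))
# → ['a', 'c', 'b', 'd']
```

Documents that appear near the top of both lists ("a" and "c") outrank documents that appear in only one, regardless of how large either list's raw scores were.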
When searching with large files (videos, PDFs, long documents) as input, query_preprocessing decomposes the file into chunks using the same extractor pipeline that indexed your data, runs parallel searches for each chunk, and fuses the results.
This is ingestion applied to the query: same decomposition and embedding, but the vectors are used for search instead of storage.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| feature_uri | string | null | Extractor pipeline for decomposition (inherits from search feature_uri if not set) |
| params | object | null | Extractor parameters; identical schema to the collection’s extractor config for that feature_uri |
| max_chunks | integer | 20 | Max chunks to search (1-100). Each chunk = 1 credit |
| aggregation | string | rrf | Fusion strategy: rrf, max, or avg |
| dedup_field | string | null | Field to deduplicate results by |
params uses the extractor’s own parameter schema. Whatever parameters the extractor accepts during ingestion (e.g. split_method, time_split_interval for video; chunk_size, chunk_overlap for text) are the same parameters you pass here. There is no separate preprocessing-specific schema — the extractor drives the decomposition exactly as it would during collection processing. Refer to the extractor’s own documentation for valid parameter names.
You can also set query_preprocessing at the stage level (on parameters) to apply it to all searches as a default. Per-search settings override the stage default.
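The override rule can be sketched as a small resolver. This is an illustration under stated assumptions: the function name is hypothetical, and it assumes a per-search block replaces the stage default as a whole (a field-by-field merge is equally plausible; the behavior is not specified here). The defaults and the 1-100 range for max_chunks come from the parameter descriptions above.

```python
# Documented defaults for query_preprocessing fields.
PREPROCESSING_DEFAULTS = {
    "feature_uri": None,
    "params": None,
    "max_chunks": 20,
    "aggregation": "rrf",
    "dedup_field": None,
}


def effective_preprocessing(stage_default, per_search):
    """Resolve the settings one search would use, or None if preprocessing is off.

    ASSUMPTION: a per-search block replaces the stage default entirely.
    """
    chosen = per_search if per_search is not None else stage_default
    if chosen is None:
        return None
    resolved = {**PREPROCESSING_DEFAULTS, **chosen}
    if not 1 <= resolved["max_chunks"] <= 100:
        raise ValueError("max_chunks must be between 1 and 100")
    if resolved["aggregation"] not in ("rrf", "max", "avg"):
        raise ValueError("aggregation must be rrf, max, or avg")
    return resolved


stage_level = {"max_chunks": 10}
print(effective_preprocessing(stage_level, None))                    # stage default applies
print(effective_preprocessing(stage_level, {"aggregation": "max"}))  # per-search wins
```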
Aggregation strategies:
| Strategy | Best For | How It Works |
| --- | --- | --- |
| rrf | General purpose (recommended) | Rank-based fusion, immune to score magnitude differences |
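The aggregation parameter also accepts max and avg. As a rough sketch of how score-based aggregation across per-chunk results could work, assuming max keeps each document's best chunk score and avg averages over the chunks in which the document appeared (the service's exact definitions are not given here, and the function name is hypothetical):

```python
from collections import defaultdict


def aggregate_chunks(chunk_results, strategy="max"):
    """Combine per-chunk result lists of (doc_id, score) pairs into one ranking."""
    per_doc = defaultdict(list)
    for results in chunk_results:
        for doc_id, score in results:
            per_doc[doc_id].append(score)

    if strategy == "max":
        fused = {doc_id: max(scores) for doc_id, scores in per_doc.items()}
    elif strategy == "avg":
        # ASSUMPTION: average over the chunks where the document was returned.
        fused = {doc_id: sum(scores) / len(scores) for doc_id, scores in per_doc.items()}
    else:
        raise ValueError("strategy must be 'max' or 'avg' in this sketch")

    # Highest score first; ties broken by document ID.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


chunks = [
    [("x", 0.9), ("y", 0.2)],  # results for query chunk 1
    [("y", 0.8), ("z", 0.6)],  # results for query chunk 2
]
print([doc_id for doc_id, _ in aggregate_chunks(chunks, "max")])
print([doc_id for doc_id, _ in aggregate_chunks(chunks, "avg")])
```

Under these assumptions, max rewards a document that matches any single chunk strongly, while avg penalizes a document whose match quality varies across chunks.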
Preprocessing uses the same extractor pipeline that indexed your data. The params accept the same fields you configured on your collection’s feature extractor (e.g., split_method, chunk_size). If you don’t specify params, extractor defaults are used.
The response includes preprocessing metadata showing what happened:
Filtered fields must have payload indexes on your namespace. Without indexes, filtering is slow and the response includes warnings about unindexed fields.
For best performance, use pre-filters to reduce the search space. Filtering at the vector index level is much faster than post-filtering in later stages.
The following is a complete working example of creating a retriever that uses the feature_search stage, then executing it. Pay close attention to the field names — several are easy to confuse.
Common mistakes:
Use collection_identifiers (not collection_ids) in the retriever body.
input_schema is a flat map keyed by field name ({"query": {"type": "text"}}) — do not wrap it in a JSON Schema object ({"properties": {...}, "type": "object"}).
Use type: "text" (not "string") in input_schema values.
stage_type at the outer level must be "filter" — passing stage_type: "feature_search" is rejected (feature_search is a stage_id, not a stage_type).
stage_id: "feature_search" lives inside the config object, not at the outer stage_id.
Inside each search, the query value uses {"input_mode": "text", "value": "..."} — the value key, not a bare text key.
final_top_k lives inside config.parameters, not at the top level.
If a feature_uri is wrong, the error lists the available_feature_uris for your target collections — copy the exact URI (e.g. multilingual_e5_large_instruct_v1, not embedding).
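The checklist above can be turned into a quick local sanity check before sending a request. The sketch below is a hypothetical helper, not part of any SDK; it assumes the retriever body holds its stages under a `stages` key and each stage's searches under `config.parameters.searches`, neither of which is confirmed on this page.

```python
def check_retriever_body(body):
    """Return a list of likely mistakes in a retriever body; empty means none found."""
    problems = []
    if "collection_identifiers" not in body:
        problems.append("use collection_identifiers (not collection_ids)")

    schema = body.get("input_schema", {})
    if "properties" in schema or schema.get("type") == "object":
        problems.append("input_schema must be a flat map keyed by field name")
    else:
        for name, spec in schema.items():
            if spec.get("type") == "string":
                problems.append(f"input_schema.{name}: use type 'text', not 'string'")

    if "final_top_k" in body:
        problems.append("final_top_k belongs in config.parameters")

    for i, stage in enumerate(body.get("stages", [])):  # 'stages' key is ASSUMED
        config = stage.get("config", {})
        names = (stage.get("stage_type"), stage.get("stage_id"), config.get("stage_id"))
        if "feature_search" not in names:
            continue  # not a feature_search stage; nothing to check
        if stage.get("stage_type") != "filter":
            problems.append(f"stages[{i}]: stage_type must be 'filter'")
        if config.get("stage_id") != "feature_search":
            problems.append(f"stages[{i}]: stage_id 'feature_search' belongs inside config")
        if "final_top_k" in stage or "final_top_k" in config:
            problems.append(f"stages[{i}]: final_top_k belongs in config.parameters")
        searches = config.get("parameters", {}).get("searches", [])  # ASSUMED key
        for j, search in enumerate(searches):
            if "value" not in search.get("query", {}):
                problems.append(f"stages[{i}].searches[{j}]: query needs a 'value' key")
    return problems


body = {
    "collection_ids": ["my_collection"],
    "input_schema": {"query": {"type": "string"}},
    "stages": [{"stage_type": "feature_search", "config": {}}],
}
for problem in check_retriever_body(body):
    print("-", problem)
```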
The feature_uri must match an embedding index that exists in your namespace. To discover available feature URIs, list the vector indexes in a collection:
Feature search automatically applies task-aware embedding conditioning to instruction-aware models (E5, Gemini) at query time. This means query embeddings are optimized for asymmetric retrieval without any configuration.
How it works:
Index time: Extractors embed documents with retrieval_document task (configurable via embedding_task on the extractor — see Text Extractor)
Query time: Feature search automatically uses retrieval_query task for all query embeddings
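As a rough illustration of what asymmetric conditioning means for an instruction-aware model, the sketch below picks the task by phase and formats input the way the publicly documented E5-instruct convention does: queries get an "Instruct: ... Query: ..." wrapper and documents are embedded as-is. How this service maps retrieval_query and retrieval_document to model prompts is not stated here, so the function names, the instruction text, and the mapping itself are assumptions.

```python
# Task names as used on this page; the lookup function is hypothetical.
TASK_BY_PHASE = {"index": "retrieval_document", "query": "retrieval_query"}

# Example instruction in the style used for E5-instruct models (an assumption here).
DEFAULT_INSTRUCTION = "Given a web search query, retrieve relevant passages that answer the query"


def embedding_task(phase):
    """Return the embedding task for 'index' or 'query' time."""
    return TASK_BY_PHASE[phase]


def format_for_e5_instruct(text, task, instruction=DEFAULT_INSTRUCTION):
    """Wrap queries with an instruction; leave documents untouched."""
    if task == "retrieval_query":
        return f"Instruct: {instruction}\nQuery: {text}"
    return text


print(format_for_e5_instruct("waterproof hiking boots", embedding_task("query")))
print(format_for_e5_instruct("Our boots use a sealed membrane.", embedding_task("index")))
```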
This asymmetric pairing (document vs. query) improves retrieval quality by ~10% for instruction-aware models like E5-Large.
The embedding_task used is included in the stage response metadata: