Retriever Stages
Build powerful search pipelines by composing modular stages
24 stages available
Feature Search
Search and filter documents by vector similarity using feature embeddings
feature_search
Attribute Filter
Filter documents by metadata attribute values using boolean logic
attribute_filter
LLM Filter
Filter documents using LLM-based semantic evaluation
llm_filter
Query Expand
Generate query variations with an LLM and fuse the search results via Reciprocal Rank Fusion (RRF)
query_expand
Sort Relevance
Sort documents by relevance score
sort_relevance
Sort Attribute
Sort documents by any metadata field value
sort_attribute
MMR
Reorder results using Maximal Marginal Relevance for diversity
mmr
Rerank
Rerank documents using cross-encoder models for accurate relevance
rerank
Aggregate
Compute aggregations (COUNT, SUM, AVG, etc.) on pipeline results
aggregate
Group By
Group documents by field value for decompose/recompose workflows
group_by
Cluster
Cluster documents by embedding similarity to discover themes
cluster
Sample
Sample a subset of documents using random or stratified sampling
sample
How Retriever Pipelines Work
Combine stages to build sophisticated search and retrieval pipelines. Each stage type serves a specific purpose in the data flow.
1. Search
Retrieve documents from collections using semantic search, keyword search, or hybrid approaches.
2. Filter
Remove documents that don't match criteria using attribute filters, score thresholds, or LLM evaluation.
3. Rank
Reorder documents by relevance using sorting, cross-encoders, or LLM-based reranking.
4. Generate
Create responses from retrieved documents using RAG, summarization, or custom LLM generation.
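The four steps above can be sketched as a chain of small functions, each taking a list of documents and returning a new one. This is a minimal illustration of the composition idea only, not the product's API: `Doc`, `keyword_search`, `attribute_filter`, `sort_relevance`, `generate`, and `run_pipeline` are all hypothetical names, the search step uses naive term overlap in place of embeddings, and the generate step is a stub standing in for an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Doc:
    id: str
    text: str
    score: float = 0.0
    metadata: dict = field(default_factory=dict)


# A stage is any function that maps a list of documents to a new list.
Stage = Callable[[List[Doc]], List[Doc]]


def keyword_search(corpus: List[Doc], query: str) -> Stage:
    """1. Search: score corpus documents by the fraction of query terms they contain."""
    terms = set(query.lower().split())

    def stage(_docs: List[Doc]) -> List[Doc]:
        hits = []
        for d in corpus:
            overlap = len(terms & set(d.text.lower().split()))
            if overlap:
                hits.append(Doc(d.id, d.text, overlap / len(terms), d.metadata))
        return hits

    return stage


def attribute_filter(key: str, value: object) -> Stage:
    """2. Filter: keep documents whose metadata[key] equals value."""
    return lambda docs: [d for d in docs if d.metadata.get(key) == value]


def sort_relevance() -> Stage:
    """3. Rank: order documents by descending score."""
    return lambda docs: sorted(docs, key=lambda d: d.score, reverse=True)


def generate(docs: List[Doc], query: str) -> str:
    """4. Generate: stub that builds a context string (an LLM call would go here)."""
    context = " | ".join(d.text for d in docs)
    return f"Question: {query}\nContext: {context}"


def run_pipeline(stages: List[Stage], docs: Optional[List[Doc]] = None) -> List[Doc]:
    """Feed the output of each stage into the next, in order."""
    current = list(docs or [])
    for stage in stages:
        current = stage(current)
    return current
```

A pipeline is then just an ordered list such as `[keyword_search(corpus, query), attribute_filter("lang", "en"), sort_relevance()]` passed to `run_pipeline`, with `generate` applied to the final documents. Keeping every stage on the same list-in, list-out signature is what makes reordering or swapping stages cheap.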
