inputs.
Stage shape. Every stage is
{ "stage_name", "stage_type", "config": { "stage_id", "parameters" } }. stage_name is your label; stage_id is the implementation (feature_search, rerank, attribute_filter, …); stage_type is the category (filter, sort, reduce, group, enrich, apply). See Retrievers and Stages.
Create + execute (every recipe uses this)
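Before a retriever is created, its stages are assembled in the shape described above. The following is a minimal sketch of that shape as plain Python dicts, not the official client: the helper name make_stage, the pairing of attribute_filter with the filter category, and the field/value parameters are illustrative assumptions. Only the key names and the six stage_type categories come from the text.

```python
# Categories listed in the stage-shape description.
STAGE_TYPES = {"filter", "sort", "reduce", "group", "enrich", "apply"}


def make_stage(stage_name, stage_type, stage_id, parameters=None):
    """Build one stage dict in the documented shape (hypothetical helper)."""
    if stage_type not in STAGE_TYPES:
        raise ValueError(f"unknown stage_type: {stage_type!r}")
    return {
        "stage_name": stage_name,      # your own label
        "stage_type": stage_type,      # the category
        "config": {
            "stage_id": stage_id,      # the implementation
            "parameters": dict(parameters or {}),
        },
    }


# Assumed pairing and parameters, for illustration only.
scope = make_stage(
    "in_stock_only",
    "filter",
    "attribute_filter",
    {"field": "in_stock", "value": True},
)
```

Validating stage_type up front catches a mistyped category locally instead of at request time.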
Multimodal RAG
Retrieve context for an LLM across a video’s visual content and transcript, fuse with RRF, rerank, and format into a single prompt-ready context block.
Hybrid Search (dense + keyword/BM25)
Fuse semantic recall (dense vectors) with exact-keyword precision (BM25) under RRF, so brand names, SKUs, and prices like $9.99 still match. Requires a text payload index (see Text Indexes).
Use rrf here: it ranks by position, so it is immune to the cosine-vs-BM25 score-scale mismatch.
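To see why rank-based fusion sidesteps the score-scale problem, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It illustrates the general technique rather than the service's internal implementation; the function name rrf_fuse, the document IDs, and the constant k=60 (a commonly used default) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked ID lists using Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per document, so only the
    position matters, never the raw cosine or BM25 score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense = ["doc_a", "doc_b", "doc_c"]  # ordered by cosine similarity
bm25 = ["doc_c", "doc_a", "doc_d"]   # ordered by BM25 score
fused = rrf_fuse([dense, bm25])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear in both lists (doc_a, doc_c) rise above those found by only one retriever, regardless of how large or small the underlying scores were.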
Video Moment Localization
Search a video and collapse matching segments into a handful of seekable moments with start/end timestamps.
Face Search (1:N identification)
Find every clip a person appears in by passing a reference face image as a content query against the face embedding.
Reverse Image Search + Dedup
Find visually similar items from other sources, deduplicated.
Search + Classify
Search, then attach an LLM/taxonomy label to each result in one pipeline.
Scan a batch of inputs
Any retriever can run many queries at once, which is ideal for moderating an upload queue or scanning a catalog.
Composing recipes
Stages are independent: add, remove, or reorder them. Common extensions: append a rerank for precision, an attribute_filter for metadata scoping (the optimizer pushes it down for you), or a rag_prepare to format for an LLM.
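Because a pipeline is just an ordered list of stages, composing recipes amounts to ordinary list edits. The sketch below is illustrative only: the helper names, the stage_name labels, and the stage_type chosen for each stage_id are assumptions, and parameters are left empty.

```python
def stage(name, stage_type, stage_id):
    """Minimal stage dict in the documented shape (parameters left empty)."""
    return {
        "stage_name": name,
        "stage_type": stage_type,
        "config": {"stage_id": stage_id, "parameters": {}},
    }


def insert_after(stages, after_name, new_stage):
    """Return a new list with new_stage placed right after the named stage."""
    out = []
    for s in stages:
        out.append(s)
        if s["stage_name"] == after_name:
            out.append(new_stage)
    return out


def without(stages, name):
    """Return a new list with the named stage removed."""
    return [s for s in stages if s["stage_name"] != name]


# stage_type values below are assumed pairings, not confirmed by the docs.
base = [stage("search", "filter", "feature_search")]
with_rerank = base + [stage("rerank", "sort", "rerank")]
scoped = insert_after(with_rerank, "search", stage("scope", "filter", "attribute_filter"))
for_llm = scoped + [stage("format", "apply", "rag_prepare")]
no_rerank = without(for_llm, "rerank")

print([s["config"]["stage_id"] for s in for_llm])
# ['feature_search', 'attribute_filter', 'rerank', 'rag_prepare']
```

Each helper returns a new list, so a base recipe can be reused as the starting point for several variants without being mutated.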
