Feature Search
Vector similarity search across embeddings, including large-file inputs such as videos and PDFs, with query preprocessing, multi-chunk fusion, and configurable distance metrics.
Why do anything?
Embeddings are useless without search: to power semantic search, you need a way to find the vectors most similar to a query.
Why now?
Every AI application needs vector search. Users expect semantic understanding, not keyword matching. Large files (>100MB videos, PDFs) previously could not be used directly as search queries.
Why this feature?
High-performance vector search with cosine, euclidean, and dot-product distance metrics. Query preprocessing decomposes large inputs into chunks, runs parallel searches, and fuses the results; no pre-splitting is required.
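The three distance metrics behave differently: dot product is unnormalized, cosine normalizes by vector magnitude, and euclidean measures straight-line distance (smaller is more similar). A minimal pure-Python sketch of each, for intuition only (the actual search computes these inside the vector store):

```python
import math

def dot(a, b):
    # Dot product: larger means more aligned (equals cosine for unit vectors).
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Cosine: dot product normalized by magnitudes, range [-1, 1].
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    # Euclidean distance: smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

q, d = [1.0, 0.0], [0.6, 0.8]  # d is already unit-length
similarity = cosine_similarity(q, d)
```

Because `d` is unit-length here, cosine and dot agree (both 0.6); with unnormalized embeddings they diverge, which is why the metric is configurable per retriever.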
How It Works
Feature search is the core retriever stage for vector similarity search. Query preprocessing extends it to handle large files by decomposing them using the ingestion pipeline and fusing multi-chunk results.
Query Preprocessing
Decompose a large input into chunks (video frames, PDF pages) using the configured extractor. Skipped for text and image inputs.
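The pass-through-or-decompose logic can be sketched as below. The `extract_chunks` callable stands in for the configured extractor and the input/chunk shapes are illustrative assumptions, not the actual internal types:

```python
def preprocess_query(query_input, extract_chunks):
    # Text and image inputs are already single-chunk; pass through unchanged.
    if query_input["type"] in ("text", "image"):
        return [query_input]
    # Large inputs (video, PDF) go through the same extractor used at
    # ingestion time, e.g. sampled frames or per-page renders.
    return extract_chunks(query_input)

# Hypothetical extractor: a 3-page PDF becomes three page-level chunks.
chunks = preprocess_query(
    {"type": "pdf", "pages": ["p1", "p2", "p3"]},
    extract_chunks=lambda q: [{"type": "page", "data": p} for p in q["pages"]],
)
```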
Batch Embedding
Embed all chunks in parallel via the inference service.
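A thread pool is one straightforward way to fan out the embedding calls; `embed_stub` below is a stand-in for the real inference-service call, and the worker count is an assumption:

```python
from concurrent.futures import ThreadPoolExecutor

def embed_stub(chunk):
    # Stand-in for the inference-service call; returns a toy 2-d vector.
    return [float(len(chunk)), 1.0]

def embed_batch(chunks, embed_fn, max_workers=8):
    # Embed all chunks in parallel; executor.map preserves input order,
    # so vectors[i] always corresponds to chunks[i].
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(embed_fn, chunks))

vectors = embed_batch(["frame-1", "frame-2", "page-3"], embed_stub)
```

Preserving input order matters downstream: each search result set must be attributable to the chunk that produced it.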
Multi-Query Search
Run concurrent Qdrant searches, one per chunk embedding.
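The fan-out can be expressed with `asyncio.gather`, one task per chunk embedding. `search_one` here is a stub; in the real pipeline it would be an async call into Qdrant (e.g. via the async Qdrant client):

```python
import asyncio

async def search_one(vector, top_k=5):
    # Stand-in for an async Qdrant search for one chunk embedding;
    # returns (doc_id, score) pairs.
    await asyncio.sleep(0)  # simulate network I/O
    return [(f"doc-{int(vector[0])}", 1.0)]

async def multi_query_search(vectors):
    # One concurrent search per chunk embedding; results keep input order.
    return await asyncio.gather(*(search_one(v) for v in vectors))

per_chunk_results = asyncio.run(multi_query_search([[1.0], [2.0]]))
```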
Result Fusion
Fuse per-chunk results using RRF, max, or avg. Deduplicate by document.
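Reciprocal Rank Fusion (RRF) is the fusion mode worth sketching, since it is the one that ignores score magnitudes entirely: each document contributes 1/(k + rank) per result list, summed across lists. A minimal version (the constant k=60 is the commonly used default, assumed here):

```python
from collections import defaultdict

def rrf_fuse(result_lists, k=60):
    # Reciprocal Rank Fusion: each document scores 1 / (k + rank) in each
    # list it appears in, summed across lists. Only ranks are used, so
    # differing score scales across chunk searches don't matter.
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Deduplication by document falls out naturally: one fused score per id.
    return sorted(scores, key=scores.get, reverse=True)

# "b" ranks highly in all three chunk result lists, so it fuses to the top.
fused = rrf_fuse([["a", "b", "c"], ["b", "a"], ["b", "d"]])
```

The max and avg modes would instead take the maximum or mean of the raw per-chunk scores for each document, which does depend on score scale.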
Filtering & Ranking
Apply pre-filters on payload fields and return a unified top-k list.
Why This Approach
Using the same decomposition logic as ingestion ensures that query and index embeddings stay aligned. Parallel search across chunks is faster than sequential search, and RRF fusion is agnostic to score magnitudes.
Integration
```python
# Text query
results = client.retrievers.execute(retriever_id=retriever_id, inputs={"query": "..."})

# Large file query
results = client.retrievers.execute(retriever_id=retriever_id, inputs={"video": "s3://bucket/clip.mp4"})
```
