Cross-Modal Join
Join collections across shared embedding spaces or overlapping time ranges. This enables investigations, analytics, and multi-source RAG.
"Find security incidents in December 2024 by joining video footage with incident logs"
Why This Matters
Joins are infrastructure operations, not ML models. Once collections share an embedding space, you can query across them.
from mixpeek import Mixpeek

client = Mixpeek(api_key="your-api-key")

# Create collections with shared embedding space
video_collection = client.collections.create(
    collection_name="video_library",
    feature_extractor={
        "feature_extractor_name": "multimodal_extractor",
        "version": "v1"
    }
)

transcript_collection = client.collections.create(
    collection_name="transcripts",
    feature_extractor={
        "feature_extractor_name": "text_extractor",
        "version": "v1"
    }
)

# Cross-modal search across both
results = client.retrievers.execute(
    retriever_id="cross-modal-retriever",
    inputs={
        "query_text": "security incident",
        "collections": ["video_library", "transcripts"],
        "time_range": {
            "start": "2024-12-01T00:00:00Z",
            "end": "2024-12-31T23:59:59Z"
        }
    }
)
Retrieval Flow
1. Search across multiple collections
2. Filter candidates by time-based overlap
3. Merge results from multiple collections (sketched below)
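To make the merge step concrete, here is a minimal sketch in plain Python. It assumes each hit carries a score plus start/end timestamps; the helpers overlaps and merge_by_time are hypothetical, not part of the Mixpeek SDK.

from datetime import datetime

def overlaps(a_start, a_end, b_start, b_end):
    # Two time ranges overlap when each starts before the other ends.
    return a_start < b_end and b_start < a_end

def merge_by_time(video_hits, transcript_hits):
    # Pair hits whose time ranges overlap and combine their scores,
    # so joined results rank above single-source matches.
    joined = []
    for v in video_hits:
        for t in transcript_hits:
            if overlaps(v["start"], v["end"], t["start"], t["end"]):
                joined.append({"video": v, "transcript": t,
                               "score": v["score"] + t["score"]})
    return sorted(joined, key=lambda hit: hit["score"], reverse=True)

# Example: the transcript falls inside the clip's window, so they join.
ts = datetime.fromisoformat
video_hits = [{"id": "clip_42", "score": 0.81,
               "start": ts("2024-12-03T14:00:00"), "end": ts("2024-12-03T14:05:00")}]
transcript_hits = [{"id": "log_7", "score": 0.77,
                    "start": ts("2024-12-03T14:02:00"), "end": ts("2024-12-03T14:03:00")}]
print(merge_by_time(video_hits, transcript_hits))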
Feature Extractors
Text Embedding: Extract semantic embeddings from documents, transcripts, and text content.
Image Embedding: Generate visual embeddings for similarity search and clustering.
Video Embedding: Generate vector embeddings for video content (see the shared-space sketch below).
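Joins work because these extractors can write into a shared embedding space, where a single similarity metric scores every modality. A minimal illustration with made-up 4-dimensional vectors; real extractors emit far higher-dimensional embeddings.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Angle-based similarity, independent of vector magnitude.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings for illustration only.
query_text = np.array([0.1, 0.9, 0.2, 0.4])    # "security incident" (text)
video_frame = np.array([0.2, 0.8, 0.1, 0.5])   # frame from video_library
unrelated = np.array([0.9, 0.1, 0.7, 0.0])     # off-topic document

# One metric ranks both modalities because the vectors share a space.
print(cosine_similarity(query_text, video_frame))  # ~0.98: cross-modal match
print(cosine_similarity(query_text, unrelated))    # ~0.28: filtered out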
Retriever Stages
feature search: Search collections using multimodal embeddings.
attribute filter: Filter documents by metadata attributes.
compose: Compose multiple retriever pipelines together (sketched after this list).
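One way to picture how these stages fit together: each stage maps a document list to a document list, and compose chains them into a pipeline. The sketch below is illustrative only; the stage behaviors are simplified stand-ins, not the Mixpeek implementations.

from typing import Callable

Doc = dict
Stage = Callable[[list[Doc]], list[Doc]]

def feature_search(min_score: float) -> Stage:
    # Stand-in for embedding search: keep docs above a similarity threshold.
    def stage(docs: list[Doc]) -> list[Doc]:
        return [d for d in docs if d["score"] >= min_score]
    return stage

def attribute_filter(**attrs) -> Stage:
    # Keep docs whose metadata matches every requested attribute.
    def stage(docs: list[Doc]) -> list[Doc]:
        return [d for d in docs if all(d.get(k) == v for k, v in attrs.items())]
    return stage

def compose(*stages: Stage) -> Stage:
    # Run stages left to right, feeding each stage's output to the next.
    def stage(docs: list[Doc]) -> list[Doc]:
        for s in stages:
            docs = s(docs)
        return docs
    return stage

pipeline = compose(feature_search(min_score=0.5),
                   attribute_filter(source="video_library"))

docs = [
    {"id": "clip_42", "score": 0.81, "source": "video_library"},
    {"id": "log_7", "score": 0.77, "source": "transcripts"},
    {"id": "clip_9", "score": 0.30, "source": "video_library"},
]
print(pipeline(docs))  # only clip_42 survives both stages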
