
    Cross-Modal Join & Correlation

    Join collections through a shared embedding space or by time overlap. This enables investigations, analytics, and multi-source RAG.

    video · image · text · Multi-Stage · 45.0K runs

    Why This Matters

    Joins are infrastructure operations, not ML models: once collections share an embedding space, a single query can run across all of them.

    from mixpeek import Mixpeek

    client = Mixpeek(api_key="your-api-key")

    # Create collections with shared embedding space
    video_collection = client.collections.create(
        collection_name="video_library",
        feature_extractor={
            "feature_extractor_name": "multimodal_extractor",
            "version": "v1"
        }
    )

    transcript_collection = client.collections.create(
        collection_name="transcripts",
        feature_extractor={
            "feature_extractor_name": "text_extractor",
            "version": "v1"
        }
    )

    # Cross-modal search across both
    results = client.retrievers.execute(
        retriever_id="cross-modal-retriever",
        inputs={
            "query_text": "security incident",
            "collections": ["video_library", "transcripts"],
            "time_range": {
                "start": "2024-12-01T00:00:00Z",
                "end": "2024-12-31T23:59:59Z"
            }
        }
    )
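To make the "time overlap" half of the join concrete, here is a minimal, library-free sketch of an interval join between two collections. It is an illustration only, not Mixpeek's implementation: the `Segment` record, the `overlaps` and `time_overlap_join` helpers, and the sample IDs are all hypothetical names introduced for this example.

```python
from datetime import datetime
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical record: one document with the time span it covers."""
    doc_id: str
    start: datetime
    end: datetime


def overlaps(a: Segment, b: Segment) -> bool:
    # Two closed intervals overlap when each one starts before the other ends.
    return a.start <= b.end and b.start <= a.end


def time_overlap_join(left: list[Segment], right: list[Segment]) -> list[tuple[str, str]]:
    # Nested-loop join: fine for a sketch, quadratic on large inputs.
    return [(a.doc_id, b.doc_id) for a in left for b in right if overlaps(a, b)]


def ts(text: str) -> datetime:
    # Parse an ISO 8601 timestamp with an explicit UTC offset.
    return datetime.fromisoformat(text)


# Toy data standing in for two collections (assumed, not real Mixpeek documents).
video = [
    Segment("clip-1", ts("2024-12-05T10:00:00+00:00"), ts("2024-12-05T10:05:00+00:00")),
    Segment("clip-2", ts("2024-12-09T08:00:00+00:00"), ts("2024-12-09T08:01:00+00:00")),
]
transcripts = [
    Segment("call-7", ts("2024-12-05T10:03:00+00:00"), ts("2024-12-05T10:20:00+00:00")),
    Segment("call-8", ts("2024-12-20T00:00:00+00:00"), ts("2024-12-20T00:30:00+00:00")),
]

pairs = time_overlap_join(video, transcripts)
print(pairs)
```

Only `clip-1` and `call-7` cover a common stretch of time, so they are the only pair the join returns. A production system would typically replace the nested loop with a sort-based or index-based interval join.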

    Retrieval Flow

    1. Search across multiple collections
    2. Time-based overlap filtering
    3. Merge results from multiple collections (compose)
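The three stages of the retrieval flow can be pictured with a small in-memory sketch. This is an assumption-laden illustration rather than how the platform works internally: the `search`, `filter_by_time`, and `merge` functions, the two-dimensional toy vectors, and the document IDs are all hypothetical, and cosine similarity is used here only as a stand-in scoring function.

```python
import math
from datetime import datetime, timezone


def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def search(collections, query_vec, top_k=2):
    # Stage 1: score documents in every collection against the same query vector.
    hits = []
    for name, docs in collections.items():
        scored = [
            {**d, "collection": name, "score": cosine(query_vec, d["vector"])}
            for d in docs
        ]
        scored.sort(key=lambda h: h["score"], reverse=True)
        hits.extend(scored[:top_k])
    return hits


def filter_by_time(hits, start, end):
    # Stage 2: keep hits whose [start, end] span overlaps the query window.
    return [h for h in hits if h["start"] <= end and start <= h["end"]]


def merge(hits):
    # Stage 3: combine hits from all collections into one ranked list.
    return sorted(hits, key=lambda h: h["score"], reverse=True)


def day(month, d):
    return datetime(2024, month, d, tzinfo=timezone.utc)


# Toy collections whose vectors live in the same (assumed) embedding space.
collections = {
    "video_library": [
        {"id": "v1", "vector": [0.9, 0.1], "start": day(12, 5), "end": day(12, 6)},
        {"id": "v2", "vector": [0.0, 1.0], "start": day(12, 10), "end": day(12, 11)},
    ],
    "transcripts": [
        {"id": "t1", "vector": [1.0, 0.0], "start": day(11, 20), "end": day(11, 21)},
        {"id": "t2", "vector": [0.6, 0.8], "start": day(12, 15), "end": day(12, 16)},
    ],
}

hits = search(collections, query_vec=[1.0, 0.0])
in_window = filter_by_time(hits, start=day(12, 1), end=day(12, 31))
ranked = merge(in_window)
print([(h["collection"], h["id"]) for h in ranked])
```

Here `t1` is the closest match by score but falls outside the December window, so the time filter drops it before the merge. Scores from different collections are comparable in this sketch only because every vector is assumed to come from one shared embedding space.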

    Feature Extractors

    Text Embedding (827K runs): Extract semantic embeddings from documents, transcripts, and text content

    Image Embedding (752K runs): Generate visual embeddings for similarity search and clustering

    Video Embedding (610K runs): Generate vector embeddings for video content

    Retriever Stages

    feature search (search): Search collections using multimodal embeddings

    attribute filter (filter): Filter documents by metadata attributes

    compose (compose): Compose multiple retriever pipelines together
