    Semantic Multimodal Search

    Unified semantic search across all content types. Query in natural language and retrieve relevant video clips, images, audio segments, and documents based on meaning rather than keywords or manual tags.

    video
    image
    audio
    text
    Multi-Tier
    125.0K runs

    Why This Matters

    Semantic search is the foundation for all retrieval workflows. Because content of any type is indexed by meaning, you can search for concepts, not just exact matches.

    import requests

    API_URL = "https://api.mixpeek.com"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY",
        "X-Namespace": "your-namespace"
    }

    # Create collection with multimodal extractor
    collection = requests.post(f"{API_URL}/v1/collections", headers=headers, json={
        "collection_name": "media_library",
        "source": {"type": "bucket", "bucket_id": "my-bucket"},
        "feature_extractor": {
            "feature_extractor_name": "multimodal_extractor",
            "version": "v1",
            "input_mappings": {"video": "source_video"}
        }
    }).json()

    # Index content from object storage
    requests.post(f"{API_URL}/v1/buckets/my-bucket/objects", headers=headers, json={
        "blobs": [{"property": "source_video", "url": "s3://bucket/video.mp4"}],
        "metadata": {"category": "demos"}
    })

    # Search semantically across all modalities
    results = requests.post(
        f"{API_URL}/v1/retrievers/semantic-retriever/execute",
        headers=headers,
        json={"query": {"text": "product demo with customer testimonials"}}
    ).json()

    # Print each matching document with its relevance score
    for doc in results["documents"]:
        print(f"{doc['document_id']}: {doc['score']:.3f}")
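
The loop above assumes the request succeeded and that the response always carries a `documents` list. A minimal sketch of more defensive result handling follows. The helper name `top_documents` and the `min_score` threshold are hypothetical, and the response shape (`documents`, `document_id`, `score`) is inferred only from the example above, not from a confirmed API schema.

```python
from typing import Any


def top_documents(results: dict[str, Any], min_score: float = 0.0) -> list[tuple[str, float]]:
    """Return (document_id, score) pairs at or above min_score, best first.

    Hypothetical helper: the field names mirror the example response above.
    """
    pairs = []
    for doc in results.get("documents", []):
        # Skip entries that are missing either expected field
        if "document_id" not in doc or "score" not in doc:
            continue
        score = float(doc["score"])
        if score >= min_score:
            pairs.append((doc["document_id"], score))
    # Highest score first
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hand-written sample standing in for a parsed JSON response
sample = {
    "documents": [
        {"document_id": "doc_a", "score": 0.42},
        {"document_id": "doc_b", "score": 0.91},
        {"score": 0.99},  # malformed entry, ignored by the helper
    ]
}

for document_id, score in top_documents(sample, min_score=0.5):
    print(f"{document_id}: {score:.3f}")
```

When calling the live API, `raise_for_status()` on the `requests` response object can be invoked before `.json()` so that HTTP errors surface before any parsing happens.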

    Feature Extractors

    Image Embedding

    Generate visual embeddings for similarity search and clustering

    752K runs

    Video Embedding

    Generate vector embeddings for video content

    610K runs

    Text Embedding

    Extract semantic embeddings from documents, transcripts and text content

    827K runs

    Audio Transcription

    Transcribe audio content to text

    450K runs

    Retriever Stages

    feature search

    Search collections using multimodal embeddings

    search

    attribute filter

    Filter documents by metadata attributes

    filter

    limit

    Limit the number of documents returned

    reduce
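
The three stage types listed above (search, filter, reduce) can be pictured as a small pipeline. The following is a local toy sketch, not the Mixpeek retriever API: the function names, the in-memory document layout, and the hand-written three-dimensional embeddings are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def feature_search(docs, query_vec):
    """Search stage: score every document against the query embedding."""
    scored = [{**d, "score": cosine(d["embedding"], query_vec)} for d in docs]
    return sorted(scored, key=lambda d: d["score"], reverse=True)


def attribute_filter(docs, **attrs):
    """Filter stage: keep documents whose metadata matches every attribute."""
    return [
        d for d in docs
        if all(d["metadata"].get(key) == value for key, value in attrs.items())
    ]


def limit(docs, n):
    """Reduce stage: keep only the first n documents."""
    return docs[:n]


# Toy corpus with made-up embeddings and metadata
DOCS = [
    {"document_id": "clip_1", "embedding": [1.0, 0.0, 0.0], "metadata": {"category": "demos"}},
    {"document_id": "clip_2", "embedding": [0.8, 0.6, 0.0], "metadata": {"category": "demos"}},
    {"document_id": "clip_3", "embedding": [0.0, 1.0, 0.0], "metadata": {"category": "webinars"}},
    {"document_id": "clip_4", "embedding": [0.0, 0.0, 1.0], "metadata": {"category": "demos"}},
]

query = [1.0, 0.0, 0.0]
hits = limit(attribute_filter(feature_search(DOCS, query), category="demos"), 2)
for hit in hits:
    print(hit["document_id"], round(hit["score"], 3))
```

Each stage takes a list of documents and returns a list of documents, so stages can be reordered or dropped without changing the others.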