Multimodal search, simplified.

Semantic and visual search across video, images, and text in just two API calls, from any database or object storage.

You can even bring your own database.

Search Across Any Media Type

Find exactly what you need using natural language, images, or video clips as search input.

"People working in a warehouse"
Video matches at 1:45 and 3:20, plus image matches.

Semantic Search

Use natural language to search across videos, images, and documents.

Example: a reference image used as the visual query returns similar content.

Visual Search

Upload an image or video clip to find visually similar content.

Example: a video input, the text input "similar scenes in black and white", and the filter Location: Outside return combined results ranked by multimodal relevance (visual: 85%+, text: 75%+).

Hybrid Search

Combine images, text, video clips, and metadata filters for precise, multimodal search results.

For developers, by developers

Integrate the latest multimodal AI search with just a few lines of code.


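import requests

# Endpoint for indexing a video by its URL.
url = "https://api.mixpeek.com/index/videos/url"
headers = {
    "Authorization": "Bearer API-KEY",  # replace API-KEY with your own key
    "Content-Type": "application/json"
}

# Video to index, target collection, and the features to extract from it.
payload = {
    "url": "https://example.com/video.mp4",
    "collection_id": "my_collection",
    "feature_extractors": [{
        "embed": [
            {
                "type": "url",
                "vector_index": "multimodal"
            }
        ],
        "transcribe": {"enabled": True},
        "describe": {"enabled": True}
    }]
}

# Send the payload as JSON (assumed here to be a POST request).
response = requests.post(url, headers=headers, json=payload)
print(response.json())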

{
    "message": "Video indexing started",
    "task_id": "task_123",
    "metadata": {
        "duration": 120,
        "format": "mp4",
        "size": 15000000
    },
    "features": {
        "embeddings": 1,
        "transcription": true,
        "description": true
    }
}

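import requests

# Endpoint for searching indexed features.
url = "https://api.mixpeek.com/features/search"
headers = {
    "Authorization": "Bearer API-KEY",  # replace API-KEY with your own key
    "Content-Type": "application/json"
}

# A text query against the multimodal index, grouped by source asset.
payload = {
    "queries": [{
        "type": "text",
        "value": "person walking on beach",
        "vector_index": "multimodal"
    }],
    "collection_ids": ["my_collection"],
    "group_by": {"field": "asset_id"}
}

# Send the payload as JSON (assumed here to be a POST request).
response = requests.post(url, headers=headers, json=payload)
print(response.json())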

{
    "results": [{
        "asset_id": "video_123",
        "group": [{
            "feature_id": "feat_123",
            "start_time": 10.5,
            "end_time": 12.5,
            "relevance": 0.92,
            "description": "Person walking along beach",
            "transcription": "A peaceful beach scene",
            "url": "https://example.com/video.mp4"
        }]
    }],
    "elapsed_time": 0.15
}
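
Once a search response like the one above comes back, you will usually want to turn it into something displayable. The snippet below is a minimal sketch of that step, not part of the Mixpeek client: `format_timestamp` and `summarize_matches` are hypothetical helper names, the `min_relevance` cutoff is an illustrative parameter, and the field names are assumed to match the sample response shown above.

```python
def format_timestamp(seconds):
    """Render a time offset in seconds as m:ss (fractions are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


def summarize_matches(response, min_relevance=0.0):
    """Flatten a grouped search response into one row per matching segment.

    Hypothetical helper: assumes the response shape from the sample above.
    """
    rows = []
    for result in response.get("results", []):
        for match in result.get("group", []):
            # Skip segments below the caller's relevance cutoff.
            if match.get("relevance", 0.0) < min_relevance:
                continue
            start = format_timestamp(match["start_time"])
            end = format_timestamp(match["end_time"])
            rows.append({
                "asset_id": result["asset_id"],
                "span": f"{start}-{end}",
                "relevance": match["relevance"],
                "description": match.get("description", ""),
            })
    # Highest-relevance segments first.
    rows.sort(key=lambda row: row["relevance"], reverse=True)
    return rows


# The sample search response from above, as a Python dict.
sample = {
    "results": [{
        "asset_id": "video_123",
        "group": [{
            "feature_id": "feat_123",
            "start_time": 10.5,
            "end_time": 12.5,
            "relevance": 0.92,
            "description": "Person walking along beach",
            "transcription": "A peaceful beach scene",
            "url": "https://example.com/video.mp4",
        }],
    }],
    "elapsed_time": 0.15,
}

for row in summarize_matches(sample, min_relevance=0.8):
    print(row["asset_id"], row["span"], row["relevance"], row["description"])
```

Filtering and sorting on the client side keeps the sketch independent of any server-side options, which are not shown here.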

The Problem

Out with the old...

Tedious Annotations

Manually logging videos is time-consuming and unscalable.

Limited Transcriptions

Transcripts miss critical elements of your video, such as visuals and sounds.

Basic Object Detection

Object-level tags miss the context needed to add real value to your video.

AWS
MongoDB
Azure
GCP

Zero Platform Risk

Fully managed or self-hosted

Easy to Use

Get started on the free plan with an easy-to-use API or the Python client.

Scalable

Scale from zero to billions of items, with no downtime and minimal latency impact.

Pay for What You Use

Start free, then pay only for what you use with usage-based pricing.

Free Forever Tier

We will never charge you as long as you stay under the file quota.

Reliable

Choose a cloud provider and region, and we'll take care of uptime, consistency, and the rest.

Secure

Mixpeek is SOC 2 Type II and GDPR-ready, and it is built to keep your data secure. See our security stance.

Become a multimodal maker.

Upgrade your application with video understanding in one line of code.