
    Video Learning Hub

    Master multimodal AI concepts through comprehensive tutorials, guides, and best practices from our expert team.

    Building an Exploratory Multimodal Retriever with the National Gallery

    Discover how to build a powerful exploratory image board using multimodal search across 120,000 images from the National Gallery. This walkthrough demonstrates combining text search, reverse image search, and document-based queries into a unified retrieval experience using hybrid search with Reciprocal Rank Fusion (RRF).

    👉 Live Demo: https://mxp.co/r/npg

    What you'll learn:
    ⚡ Building exploratory search interfaces for visual content
    ⚡ Combining text, image, and document reference queries
    ⚡ Implementing hybrid search with RRF for optimal results
    ⚡ Using Google SigLIP embeddings for image understanding
    ⚡ Creating multi-stage retriever pipelines with feature search
    ⚡ Capturing user signals for recommendation systems
    ⚡ Architecture patterns: Objects → Buckets → Collections → Retrievers

    Real-world demo: visual curation across 120k images (12 GB of data) with hybrid text + image + document queries. Full source code is available in the Mixpeek showcase repository.
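
    For readers who want to see the fusion step in code, below is a minimal, self-contained sketch of Reciprocal Rank Fusion. It is not Mixpeek's implementation, and the document IDs are invented for illustration; the point is that RRF operates on ranks rather than raw scores, so the text and image result lists need no score normalization before fusing.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked result lists into a single ranking.

    rankings: lists of document IDs, each ordered best-first
    k: smoothing constant (60 comes from the original RRF paper)
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results: a text-search ranking and a reverse-image-search ranking.
text_hits = ["painting_42", "painting_7", "painting_19"]
image_hits = ["painting_7", "painting_99", "painting_42"]
print(reciprocal_rank_fusion([text_hits, image_hits]))
# painting_7 and painting_42 surface first because both lists agree on them.
```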

    Jan 25, 2026
    8:20
    Use Cases
    Ethan

    Web Scraper Guide

    Learn how to use Mixpeek's Web Scraper to recursively crawl websites and extract multimodal content with automatic embeddings. This guide demonstrates crawling documentation sites, extracting code snippets and images, and making everything searchable with semantic embeddings.

    What you'll learn:
    ⚡ Recursive website crawling with depth control
    ⚡ Extracting text, code blocks, and images
    ⚡ Multimodal embeddings (E5-Large, Jina Code, SigLIP)
    ⚡ JavaScript rendering for SPAs
    ⚡ URL filtering and structured extraction
    ⚡ Building searchable knowledge bases from docs
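
    As a rough sketch of the "recursive crawling with depth control" idea, here is a depth-limited crawler built on the requests and BeautifulSoup libraries. This is not Mixpeek's Web Scraper (which adds JavaScript rendering, URL filtering, and automatic embeddings); it only illustrates the core recursion, and the URL below is a placeholder.

```python
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def crawl(url, max_depth=2, visited=None):
    """Recursively visit same-domain pages, stopping at max_depth."""
    if visited is None:
        visited = set()
    if max_depth < 0 or url in visited:
        return visited
    visited.add(url)
    try:
        html = requests.get(url, timeout=10).text
    except requests.RequestException:
        return visited  # skip unreachable pages rather than failing the crawl
    soup = BeautifulSoup(html, "html.parser")
    # A real pipeline would extract text, code blocks, and image URLs
    # from `soup` here, then embed them for search.
    for anchor in soup.find_all("a", href=True):
        link = urljoin(url, anchor["href"])
        if urlparse(link).netloc == urlparse(url).netloc:
            crawl(link, max_depth - 1, visited)
    return visited

pages = crawl("https://example.com/docs", max_depth=1)  # placeholder URL
```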

    Jan 22, 2026
    12:00
    Release Guides
    Mixpeek Team

    Buckets Guide

    Learn how to use Mixpeek Buckets for schema-backed data ingestion with automatic validation and lineage tracking. This guide demonstrates creating buckets, defining schemas, uploading objects with multimodal blobs, and processing them through collections.

    What you'll learn:
    ⚡ Creating buckets with JSON schema validation
    ⚡ Uploading objects with multimodal blobs (text, image, video, JSON)
    ⚡ Schema enforcement and blob type validation
    ⚡ Lineage tracking from source to documents
    ⚡ Integration with collections for feature extraction
    ⚡ Best practices for organizing multimodal data
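
    To make "schema-backed ingestion" concrete, here is a small local sketch using the jsonschema library. The schema and field names are hypothetical, and Mixpeek enforces its schemas server-side; this only shows the kind of check that rejects malformed objects at upload time.

```python
from jsonschema import ValidationError, validate

# Hypothetical schema for a bucket of captioned images.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "caption": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "caption"],
}

candidate = {"title": "Sunset", "caption": "Harbor at dusk", "tags": ["demo"]}

try:
    validate(instance=candidate, schema=schema)
    print("object conforms to the bucket schema")
except ValidationError as err:
    print(f"rejected at ingestion time: {err.message}")
```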

    Jan 22, 2026
    10:00
    Release Guides
    Mixpeek Team

    Video Understanding: From Frames to Contextual Search

    Master video understanding and how it differs from basic image understanding. This video covers frame extraction techniques (sampling, keyframe detection, scene-based), video embedding models that capture temporal context, and building sophisticated semantic video search applications.

    What you'll learn:
    ⚡ Video vs. image understanding: temporal context matters
    ⚡ Frame extraction techniques: sampling, keyframe, scene-based
    ⚡ Frame-level vs. video-level embeddings
    ⚡ How video embeddings capture motion and actions
    ⚡ Scene detection with AutoShot and semantic deduplication
    ⚡ Vertex AI multimodal embeddings for video
    ⚡ Building scene-based video search pipelines
    ⚡ Real demo: contextual video retrieval in Mixpeek Studio
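
    As a reference point for the simplest of the three extraction techniques, uniform sampling, here is a short OpenCV sketch; the filename is a placeholder. Keyframe and scene-based extraction (e.g., with AutoShot) are more involved and are what the video focuses on.

```python
import cv2

def sample_frames(video_path, every_n_seconds=2.0):
    """Sample one frame every `every_n_seconds` from a video file."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unreadable
    step = max(int(fps * every_n_seconds), 1)
    frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            frames.append(frame)  # BGR ndarray, ready for an image encoder
        index += 1
    cap.release()
    return frames

frames = sample_frames("lecture.mp4")  # placeholder filename
```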

    Jan 10, 2026
    11:18
    Multimodal University
    Ethan

    Image Understanding: Vision Encoders & Multimodal Search

    Master how computers see and search images. This video covers vision encoding models like CLIP and SigLIP, how images are converted into patches and embeddings, object detection with YOLO, and building multimodal search systems that support text-to-image, image-to-text, and image-to-image queries.

    What you'll learn:
    ⚡ How vision transformers convert images into embeddings
    ⚡ Image patches and mean pooling explained
    ⚡ CLIP vs. SigLIP embedding models
    ⚡ Object detection and classification with YOLO
    ⚡ Cross-modal search: text queries on image datasets
    ⚡ Combining text + image queries with mean pooling
    ⚡ Feature URIs for image extractors
    ⚡ Live demo: National Gallery multimodal retriever
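
    The "combining text + image queries with mean pooling" item can be sketched in plain NumPy. Random vectors stand in for real CLIP or SigLIP encoder outputs, so only the pooling and cosine-ranking mechanics carry over to a real system.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(v):
    return v / np.linalg.norm(v)

# Stand-ins for encoder outputs; a real system would use CLIP/SigLIP here.
text_emb = normalize(rng.random(512))   # e.g. encodes "impressionist harbor"
image_emb = normalize(rng.random(512))  # e.g. encodes a reference photo

# Mean pooling: average the normalized modality vectors, then re-normalize.
query = normalize((text_emb + image_emb) / 2)

# Toy index of 1,000 unit-normalized image embeddings.
index = rng.random((1000, 512))
index /= np.linalg.norm(index, axis=1, keepdims=True)

scores = index @ query               # cosine similarity (all vectors unit-length)
top5 = np.argsort(scores)[::-1][:5]  # best-matching image indices
print(top5, scores[top5])
```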

    Jan 6, 2026
    14:32
    Multimodal University
    Ethan

    Feature URIs: Evolving Embeddings Without Migration

    Learn how to evolve embedding models without painful re-indexing. Master Feature URIs, a core abstraction for managing the lifecycle of embeddings, extractors, and indexes. Discover why vector indexes are stateful, how to A/B test embedding models safely, and how to roll forward and roll back upgrades without downtime.

    What you'll learn:
    ⚡ Why vector indexes are inherently stateful and fragile
    ⚡ The 4 components of a Feature URI
    ⚡ How extractors, embedding models, versions, and inference endpoints are coupled
    ⚡ A/B testing embedding models without re-indexing
    ⚡ Rolling forward and rolling back embedding upgrades
    ⚡ Real examples using image collections and feature search
    ⚡ How Feature URIs enable hybrid search, re-ranking, and evaluation
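
    Purely as an illustration of why the four pieces listed above travel together, here is a hypothetical identifier type; the actual Mixpeek Feature URI format may differ. The takeaway is that pinning extractor, model, version, and endpoint in one value lets two index generations coexist, which is what makes A/B tests and rollbacks safe.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureURI:
    """Illustrative only; not Mixpeek's actual URI format."""
    extractor: str  # e.g. "image"
    model: str      # e.g. "siglip"
    version: str    # e.g. "2"
    endpoint: str   # inference endpoint serving this model version

    def __str__(self):
        return f"{self.extractor}/{self.model}@v{self.version}#{self.endpoint}"

current = FeatureURI("image", "siglip", "1", "prod-a")
candidate = FeatureURI("image", "siglip", "2", "prod-b")
# A/B test: route a slice of queries to indexes keyed by `candidate`;
# roll back by pointing the retriever at `current` again. No re-indexing.
```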

    Dec 29, 2025
    7:35
    Multimodal University
    Ethan

    Hybrid Search: Best of Both Worlds

    Combine keyword (BM25) and semantic (vector) search for maximum retrieval effectiveness. Learn when keyword search fails (synonyms, paraphrasing), when semantic search fails (specific IDs, acronyms), and how to use score fusion strategies to get the best of both worlds.

    What you'll learn:
    ⚡ Trade-offs between keyword and semantic search
    ⚡ Precision vs. recall in retrieval systems
    ⚡ Score fusion strategies (RRF, weighted, distribution-based)
    ⚡ The 80/20 rule for catching edge cases
    ⚡ Building hybrid retrievers with feature_filter and attribute_filter
    ⚡ Real-world example: developer documentation search
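
    Alongside the RRF sketch earlier on this page, here is a minimal example of the weighted fusion strategy: min-max normalize each score set, then blend. The scores are invented, and `alpha` is a hypothetical knob weighting the semantic side.

```python
def min_max(scores):
    """Rescale a {doc: score} dict into the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero on uniform scores
    return {doc: (s - lo) / span for doc, s in scores.items()}

def weighted_fusion(bm25_scores, vector_scores, alpha=0.5):
    """Blend normalized keyword and semantic scores; higher alpha favors semantic."""
    bm25, vec = min_max(bm25_scores), min_max(vector_scores)
    docs = set(bm25) | set(vec)
    return sorted(
        docs,
        key=lambda d: (1 - alpha) * bm25.get(d, 0.0) + alpha * vec.get(d, 0.0),
        reverse=True,
    )

bm25_scores = {"doc_a": 12.3, "doc_b": 4.1, "doc_c": 9.8}      # exact matches win
vector_scores = {"doc_b": 0.91, "doc_c": 0.84, "doc_d": 0.77}  # paraphrases win
print(weighted_fusion(bm25_scores, vector_scores, alpha=0.6))
```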

    Dec 24, 2025
    11:09
    Multimodal University
    Ethan

    Chunking Strategies: Breaking Documents into Searchable Pieces

    Master the art of breaking large documents into searchable chunks. Learn why chunking is necessary for context windows and precision, explore fixed-size, semantic, and sentence-based strategies, and understand chunk overlap techniques that prevent information loss at boundaries.

    What you'll learn:
    ⚡ Why chunking matters for context windows and precision
    ⚡ Chunking strategies: fixed-size, semantic, sentence-based, layout-based
    ⚡ Chunk overlap as a safety net (67% → 94% accuracy improvement)
    ⚡ Multimodal chunking: videos, audio, images, PDFs
    ⚡ Building object decomposition pipelines in Mixpeek
    ⚡ Real-world example: 200-page legal contract analysis
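
    Here is a minimal sketch of fixed-size chunking with overlap, the "safety net" the video describes. The sizes are illustrative; production pipelines typically chunk by tokens or sentences rather than raw characters.

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into fixed-size chunks whose edges overlap.

    Overlap means a fact straddling a chunk boundary still appears
    intact in at least one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

contract = "Lorem ipsum dolor sit amet. " * 2000  # stand-in for a long contract
chunks = chunk_text(contract, chunk_size=500, overlap=100)
# Neighboring chunks share 100 characters at their boundary.
```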

    Dec 24, 2025
    14:06
    Multimodal University
    Ethan

    Semantic Search Fundamentals

    Move beyond keyword matching to meaning-based retrieval. This video covers the evolution from keyword search to semantic search, vector similarity algorithms (cosine similarity, dot product), encoding models, and building feature search pipelines in Mixpeek.

    What you'll learn:
    ⚡ Keyword search vs. semantic search
    ⚡ Cosine similarity and dot product explained
    ⚡ How encoder models capture meaning
    ⚡ HNSW indexes for vector search
    ⚡ Building retrieval pipelines with feature search
    ⚡ Intent-based retrieval in practice
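
    To ground the two similarity measures named above, here is a tiny NumPy example; the vectors are toy stand-ins for real embeddings.

```python
import numpy as np

def cosine_similarity(a, b):
    """Angle-based similarity in [-1, 1]; 1 means identical direction."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([0.2, 0.9, 0.1])
b = np.array([0.25, 0.85, 0.05])

print(np.dot(a, b))             # dot product: sensitive to vector magnitude
print(cosine_similarity(a, b))  # cosine: magnitude-invariant
# For unit-normalized embeddings the two measures produce the same ranking.
```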

    Dec 21, 2025
    16:30
    Multimodal University
    Ethan