
    Video Learning Hub

    Master multimodal AI concepts through comprehensive tutorials, guides, and best practices from our expert team.

    Stop Paying S3 Prices: Build a Video AI Pipeline with Backblaze + Mixpeek

    Learn how to build a cost-effective video AI pipeline by replacing S3 with Backblaze B2 as your object storage backend. This walkthrough covers connecting Backblaze to Mixpeek, ingesting video content, and running multimodal feature extraction at a fraction of the cost.

    What you'll learn:
    ⚡ Setting up Backblaze B2 as a Mixpeek data source
    ⚡ Ingesting video files from B2 buckets
    ⚡ Running multimodal AI pipelines on stored content
    ⚡ Cost comparison vs. AWS S3
    ⚡ End-to-end pipeline from storage to searchable AI features
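The savings claim is easy to sanity-check with back-of-the-envelope math. The prices below are approximate published list prices (USD per GB-month) at the time of writing, not figures from the video, and exclude egress and API costs:

```python
# Illustrative monthly storage cost comparison for a video corpus.
# Prices are approximate list prices (USD per GB-month) and may change.
B2_PRICE_PER_GB = 0.006   # Backblaze B2
S3_PRICE_PER_GB = 0.023   # AWS S3 Standard

def monthly_cost(size_gb: float, price_per_gb: float) -> float:
    """Storage-only monthly cost; excludes egress and API calls."""
    return size_gb * price_per_gb

corpus_gb = 10_000  # 10 TB of video
b2 = monthly_cost(corpus_gb, B2_PRICE_PER_GB)
s3 = monthly_cost(corpus_gb, S3_PRICE_PER_GB)
print(f"B2: ${b2:.2f}/mo, S3: ${s3:.2f}/mo, savings: {1 - b2/s3:.0%}")
```

At these list prices, storage alone comes out roughly 70–75% cheaper on B2; always check current pricing before committing.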

    Apr 27, 2026
    Release Guides
    Mixpeek Team
    IP Safety Scanner: Pre-Publication Copyright Detection with Mixpeek

    A walkthrough of the IP safety scanner showing how to detect celebrity likenesses, brand logos, and copyrighted audio in video and image content before publication.

    What you'll learn:
    ⚡ How pre-publication IP clearance works
    ⚡ Face detection against custom reference corpora
    ⚡ Logo and trademark recognition in video frames
    ⚡ Audio fingerprinting for copyrighted music
    ⚡ Setting confidence thresholds for automated clearance
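The thresholding step can be pictured as a simple gate: an asset clears only if no detection reaches its category's confidence threshold. The detector names, labels, and threshold values below are illustrative, not Mixpeek's actual API:

```python
# Hypothetical sketch of threshold-based clearance; category names and
# thresholds are illustrative, not Mixpeek's actual configuration.
THRESHOLDS = {"face_match": 0.80, "logo_match": 0.85, "audio_fingerprint": 0.90}

def clearance_decision(detections: list[dict]) -> dict:
    """Flag any detection whose confidence meets its category threshold."""
    flagged = [
        d for d in detections
        if d["confidence"] >= THRESHOLDS.get(d["category"], 1.0)
    ]
    return {"cleared": not flagged, "flagged": flagged}

result = clearance_decision([
    {"category": "face_match", "confidence": 0.92, "label": "celebrity_a"},
    {"category": "logo_match", "confidence": 0.40, "label": "brand_x"},
])
# The face match exceeds its 0.80 threshold, so the asset is not cleared.
```

Lowering a threshold trades more manual review for fewer missed matches; the video covers how to tune this for automated clearance.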

    Mar 24, 2026
    IP Safety & Copyright
    Mixpeek Team
    Turn Web Scraping into Structured AI Data (Bright Data + Mixpeek Walkthrough)

    A hands-on walkthrough showing how to combine Bright Data's web scraping infrastructure with Mixpeek to transform raw web content into structured, searchable AI data.

    What you'll learn:
    ⚡ Connecting Bright Data as a data source in Mixpeek
    ⚡ Scraping and ingesting web content at scale
    ⚡ Extracting structured context with multimodal AI
    ⚡ Making scraped data searchable via retrievers
    ⚡ End-to-end pipeline from raw web data to AI-ready output

    Mar 17, 2026
    Release Guides
    Mixpeek Team
    Building an Exploratory Multimodal Retriever with the National Gallery

    Discover how to build a powerful exploratory image board using multimodal search across 120,000 images from the National Gallery. This walkthrough demonstrates combining text search, reverse image search, and document-based queries into a unified retrieval experience using hybrid search with Reciprocal Rank Fusion (RRF).

    👉 Live Demo: https://mxp.co/r/npg

    What you'll learn:
    ⚡ Building exploratory search interfaces for visual content
    ⚡ Combining text, image, and document reference queries
    ⚡ Implementing hybrid search with RRF for optimal results
    ⚡ Using Google SigLIP embeddings for image understanding
    ⚡ Creating multi-stage retriever pipelines with feature search
    ⚡ Capturing user signals for recommendation systems
    ⚡ Architecture patterns: Objects → Buckets → Collections → Retrievers

    Real-world demo: visual curation across 120k images, 12 GB of data, with text + image + document hybrid queries. Full source code available in the Mixpeek showcase repository.
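Reciprocal Rank Fusion itself fits in a few lines: each retriever contributes 1/(k + rank) per result, so documents found by several retrievers rise to the top. A minimal sketch (k = 60 is the constant from the original RRF paper; the document IDs are made up):

```python
# Reciprocal Rank Fusion: merge ranked lists from text, image, and
# document retrievers into a single ranking.
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

text_hits  = ["img_12", "img_07", "img_33"]   # from text retriever
image_hits = ["img_07", "img_90", "img_12"]   # from image retriever
fused = rrf_fuse([text_hits, image_hits])
# img_07 wins: it ranks high in both lists (1/62 + 1/61).
```

Because RRF only uses ranks, it needs no score normalization across modalities, which is why it works well for fusing text, image, and document retrievers.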

    Jan 25, 2026
    8:20
    Use Cases
    Ethan
    Web Scraper Guide

    Learn how to use Mixpeek's Web Scraper to recursively crawl websites and extract multimodal content with automatic embeddings. This guide demonstrates crawling documentation sites, extracting code snippets and images, and making everything searchable with semantic embeddings.

    What you'll learn:
    ⚡ Recursive website crawling with depth control
    ⚡ Extracting text, code blocks, and images
    ⚡ Multimodal embeddings (E5-Large, Jina Code, SigLIP)
    ⚡ JavaScript rendering for SPAs
    ⚡ URL filtering and structured extraction
    ⚡ Building searchable knowledge bases from docs
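Depth-controlled recursive crawling boils down to a breadth-first traversal with a depth cutoff. A minimal sketch with a toy in-memory link graph standing in for real HTTP fetching and link extraction (not Mixpeek's implementation):

```python
# Depth-limited BFS crawl. `fetch_links` is a stand-in for real
# HTTP fetching + link extraction.
from collections import deque

def crawl(start: str, fetch_links, max_depth: int = 2) -> set[str]:
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # depth control: don't expand beyond the limit
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return seen

# Toy link graph standing in for a docs site.
site = {
    "/docs": ["/docs/buckets", "/docs/retrievers"],
    "/docs/buckets": ["/docs/buckets/schemas"],
    "/docs/retrievers": [],
    "/docs/buckets/schemas": ["/docs/deep"],
}
pages = crawl("/docs", lambda u: site.get(u, []), max_depth=2)
# Depth 2 reaches /docs/buckets/schemas but never expands to /docs/deep.
```

URL filtering slots in naturally as a predicate applied to each `link` before it is queued.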

    Jan 22, 2026
    12:00
    Release Guides
    Mixpeek Team
    Buckets Guide

    Learn how to use Mixpeek Buckets for schema-backed data ingestion with automatic validation and lineage tracking. This guide demonstrates creating buckets, defining schemas, uploading objects with multimodal blobs, and processing them through collections.

    What you'll learn:
    ⚡ Creating buckets with JSON schema validation
    ⚡ Uploading objects with multimodal blobs (text, image, video, JSON)
    ⚡ Schema enforcement and blob type validation
    ⚡ Lineage tracking from source to documents
    ⚡ Integration with collections for feature extraction
    ⚡ Best practices for organizing multimodal data
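Schema enforcement on upload can be pictured as a field-and-blob check like the one below; the field names, object shape, and blob-type list are hypothetical illustrations, not Mixpeek's actual schema language:

```python
# Hypothetical sketch of schema + blob-type validation for an uploaded
# object; not Mixpeek's actual API.
ALLOWED_BLOB_TYPES = {"text", "image", "video", "json"}

bucket_schema = {"title": str, "duration_s": (int, float)}

def validate_object(obj: dict, schema: dict) -> list[str]:
    """Return a list of validation errors (empty list = accepted)."""
    errors = []
    for field, expected in schema.items():
        if field not in obj["metadata"]:
            errors.append(f"missing field: {field}")
        elif not isinstance(obj["metadata"][field], expected):
            errors.append(f"wrong type for {field}")
    for blob in obj.get("blobs", []):
        if blob["type"] not in ALLOWED_BLOB_TYPES:
            errors.append(f"unsupported blob type: {blob['type']}")
    return errors

obj = {
    "metadata": {"title": "keynote", "duration_s": 182.5},
    "blobs": [{"type": "video", "url": "s3://bucket/keynote.mp4"}],
}
assert validate_object(obj, bucket_schema) == []  # passes validation
```

Rejecting malformed objects at the bucket boundary is what makes downstream lineage tracking reliable: every document can be traced back to an object that was known to be well-formed.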

    Jan 22, 2026
    10:00
    Release Guides
    Mixpeek Team
    Video Understanding: From Frames to Contextual Search

    Master video understanding and how it differs from basic image understanding. This video covers frame extraction techniques (sampling, keyframe detection, scene-based), video embedding models that capture temporal context, and building sophisticated semantic video search applications.

    What you'll learn:
    ⚡ Video vs. image understanding: temporal context matters
    ⚡ Frame extraction techniques: sampling, keyframe, scene-based
    ⚡ Frame-level vs. video-level embeddings
    ⚡ How video embeddings capture motion and actions
    ⚡ Scene detection with AutoShot and semantic deduplication
    ⚡ Vertex AI multimodal embeddings for video
    ⚡ Building scene-based video search pipelines
    ⚡ Real demo: contextual video retrieval in Mixpeek Studio
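Of the extraction techniques covered, uniform sampling is the simplest: take one frame every N seconds. A sketch of the timestamp math (keyframe and scene-based extraction replace this fixed grid with content-aware cut points):

```python
# Uniform frame sampling: timestamps at a fixed interval across the clip.
def sample_timestamps(duration_s: float, interval_s: float) -> list[float]:
    """Timestamps (seconds) to extract, one frame every `interval_s`."""
    n = int(duration_s // interval_s) + 1
    return [round(i * interval_s, 3) for i in range(n)]

ts = sample_timestamps(duration_s=10.0, interval_s=2.5)
# → [0.0, 2.5, 5.0, 7.5, 10.0]
```

The trade-off: a tighter interval captures more motion but multiplies embedding cost, which is why scene detection (e.g. AutoShot) plus deduplication usually beats a dense fixed grid.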

    Jan 10, 2026
    11:18
    Multimodal University
    Ethan
    Image Understanding: Vision Encoders & Multimodal Search

    Master how computers see and search images. This video covers vision encoding models like CLIP and SigLIP, how images are converted into patches and embeddings, object detection with YOLO, and building multimodal search systems that support text-to-image, image-to-text, and image-to-image queries.

    What you'll learn:
    ⚡ How vision transformers convert images into embeddings
    ⚡ Image patches and mean pooling explained
    ⚡ CLIP vs. SigLIP embedding models
    ⚡ Object detection and classification with YOLO
    ⚡ Cross-modal search: text queries on image datasets
    ⚡ Combining text + image queries with mean pooling
    ⚡ Feature URIs for image extractors
    ⚡ Live demo: National Gallery multimodal retriever
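Mean pooling a text embedding with an image embedding yields a single query vector that blends both modalities, which can then be compared against the index with cosine similarity. A toy pure-Python sketch with 3-d stand-in vectors (real CLIP/SigLIP embeddings have hundreds of dimensions and are usually normalized first):

```python
# Mean pooling two modality embeddings into one query vector,
# then scoring with cosine similarity.
import math

def mean_pool(vectors: list[list[float]]) -> list[float]:
    """Element-wise average of equal-length vectors."""
    return [sum(dims) / len(vectors) for dims in zip(*vectors)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

text_emb  = [0.9, 0.1, 0.0]   # toy stand-ins for CLIP/SigLIP vectors
image_emb = [0.1, 0.9, 0.0]
query = mean_pool([text_emb, image_emb])   # [0.5, 0.5, 0.0]
```

The pooled query sits "between" the two inputs, so it retrieves results that satisfy both the text and the example image rather than either alone.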

    Jan 6, 2026
    14:32
    Multimodal University
    Ethan
    Feature URIs: Evolving Embeddings Without Migration

    Learn how to evolve embedding models without painful re-indexing. Master Feature URIs, a core abstraction for managing the lifecycle of embeddings, extractors, and indexes. Discover why vector indexes are stateful, how to A/B test embedding models safely, and how to roll forward and roll back upgrades without downtime.

    What you'll learn:
    ⚡ Why vector indexes are inherently stateful and fragile
    ⚡ The 4 components of a Feature URI
    ⚡ How extractors, embedding models, versions, and inference endpoints are coupled
    ⚡ A/B testing embedding models without re-indexing
    ⚡ Rolling forward and rolling back embedding upgrades
    ⚡ Real examples using image collections and feature search
    ⚡ How Feature URIs enable hybrid search, re-ranking, and evaluation
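The core idea of a versioned feature identifier can be illustrated with a toy parser; the `extractor/model@version` format below is hypothetical, not Mixpeek's actual Feature URI syntax (the video covers the real four-component scheme):

```python
# Hypothetical illustration of a versioned feature identifier; the
# actual Feature URI syntax is defined by Mixpeek and shown in the video.
from typing import NamedTuple

class FeatureURI(NamedTuple):
    extractor: str
    model: str
    version: str

def parse_feature_uri(uri: str) -> FeatureURI:
    """Parse 'extractor/model@version' (illustrative format only)."""
    path, _, version = uri.partition("@")
    extractor, _, model = path.partition("/")
    return FeatureURI(extractor, model, version)

old = parse_feature_uri("image_embed/clip-vit-b32@v1")
new = parse_feature_uri("image_embed/siglip-base@v2")
# Same extractor, different model + version: both indexes can coexist,
# letting queries A/B test, roll forward, or roll back by identifier.
```

Because each index is addressed by its full identifier rather than a mutable name, swapping models never requires rewriting the old index in place.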

    Dec 29, 2025
    7:35
    Multimodal University
    Ethan