Video Learning Hub
Master multimodal AI concepts through comprehensive tutorials, guides, and best practices from our expert team.

Hybrid Search: Best of Both Worlds
Combine keyword (BM25) and semantic (vector) search for maximum retrieval effectiveness. Learn when keyword search fails with synonyms and paraphrasing, when semantic search fails with specific IDs and acronyms, and how to use score fusion strategies to get the best of both worlds.
What you'll learn:
⚡ Trade-offs between keyword and semantic search
⚡ Precision vs recall in retrieval systems
⚡ Score fusion strategies (RRF, weighted, distribution-based)
⚡ The 80/20 rule for catching edge cases
⚡ Building hybrid retrievers with feature_filter and attribute_filter
⚡ Real-world example: Developer documentation search
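To make the score fusion idea concrete, here is a minimal sketch of reciprocal rank fusion (RRF) in plain Python. It is an illustration, not Mixpeek's implementation: the function name, the document IDs, and the constant k=60 (a commonly used default) are all assumptions, and it does not use feature_filter or attribute_filter.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword (BM25) and a semantic (vector) search.
keyword_hits = ["doc_api_v2", "doc_errors", "doc_auth"]
semantic_hits = ["doc_auth", "doc_api_v2", "doc_quickstart"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents that both retrievers return rise to the top, while a document found by only one retriever still stays in the fused list. RRF uses ranks rather than raw scores, so BM25 and vector scores do not need to be normalized onto a common scale first.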

Chunking Strategies: Breaking Documents into Searchable Pieces
Master the art of breaking large documents into searchable chunks. Learn why chunking is necessary for context windows and precision, explore fixed-size, semantic, and sentence-based strategies, and understand chunk overlap techniques that prevent information loss at boundaries.
What you'll learn:
⚡ Why chunking matters for context windows and precision
⚡ Chunking strategies: fixed-size, semantic, sentence-based, layout-based
⚡ Chunk overlap as a safety net (67% → 94% accuracy improvement)
⚡ Multimodal chunking: videos, audio, images, PDFs
⚡ Building object decomposition pipelines in Mixpeek
⚡ Real-world example: 200-page legal contract analysis
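As a rough illustration of fixed-size chunking with overlap, the sketch below splits a string into character windows that share a few characters with their neighbors. The function name and the default sizes are assumptions chosen for the example; a real pipeline would more likely count tokens or sentences than characters.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap.

    Consecutive chunks share `overlap` characters, so a sentence cut at
    one chunk boundary still appears whole in the neighboring chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# prints ['abcd', 'defg', 'ghij']
```

Each chunk starts `chunk_size - overlap` characters after the previous one, which is why the last character of one chunk reappears as the first character of the next in the output above.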

Semantic Search Fundamentals
Move beyond keyword matching to meaning-based retrieval. This video covers the evolution from keyword search to semantic search, vector similarity algorithms (cosine similarity, dot product), encoding models, and building feature search pipelines in Mixpeek.
What you'll learn:
⚡ Keyword search vs semantic search
⚡ Cosine similarity and dot product explained
⚡ How encoder models capture meaning
⚡ HNSW indexes for vector search
⚡ Building retrieval pipelines with feature search
⚡ Intent-based retrieval in practice
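The two similarity measures named above can be sketched in a few lines of plain Python. The three-dimensional vectors are made-up toy values for illustration only; real encoder models produce vectors with hundreds or thousands of dimensions.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths.

    The result ranges from -1 to 1 and ignores vector magnitude.
    """
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


# Toy vectors (assumed values, not output from a real encoder).
query = [0.9, 0.1, 0.0]
doc_a = [0.8, 0.2, 0.1]  # points in nearly the same direction as the query
doc_b = [0.0, 0.1, 0.9]  # points in a very different direction

print(cosine_similarity(query, doc_a))
print(cosine_similarity(query, doc_b))
```

The first score comes out close to 1 and the second close to 0, so a retriever ranking by cosine similarity would return doc_a ahead of doc_b. When vectors are normalized to unit length, dot product and cosine similarity give the same ranking.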

The Data Transformation Pipeline
Master the Mixpeek philosophy: Objects → Documents → Enriched Knowledge. This foundational video covers the three-layer architecture (Buckets → Collections → Retrievers), decomposition and recomposition patterns, enrichment as immutable joins, and configuration-over-code principles. Essential mental models for understanding how Mixpeek transforms raw data into searchable intelligence.
What you'll learn:
⚡ The three-layer architecture pattern
⚡ Decomposition: Breaking objects into semantic layers
⚡ Recomposition: Multi-stage pipeline assembly
⚡ Enrichment through similarity joins
⚡ Declarative pipeline configuration
⚡ Complete provenance tracking

What Are Embeddings?
Similar meaning = similar numbers. That's the entire idea. Embeddings are vector representations that capture semantic meaning, and they're the foundation of everything in modern search.
What you'll learn:
⚡ What embeddings actually are (numbers that represent meaning)
⚡ Why embeddings matter (similar things → similar vectors)
⚡ Dense vs sparse embeddings (ColBERT vs SPLADE)
⚡ Embedding dimensions and what they represent
⚡ How 'refund request' and 'want my money back' map to nearly identical vectors

IAB Content Taxonomy Mapper: Free Open-Source Tool for 2.x → 3.0 Migration
Upgrade to IAB Content Taxonomy 3.0 without the headaches. This video walks you through the open-source IAB Mapper, a free tool that helps adtech platforms, publishers, and brand safety vendors migrate from IAB 2.x to 3.0 in minutes.
👉 Demo UI: https://mxp.co/taxonomy
👉 GitHub Repo: https://github.com/mixpeek/iab-mapper
👉 Docs (Taxonomies API): https://docs.mixpeek.com/enrichment/taxonomies
What you'll learn in this video:
• Why IAB 3.0 migration is mandatory
• How to use the demo UI (upload → map → export)
• How to install & run the CLI with pip/npm
• The mapping pipeline explained: Exact label match, TF-IDF, BM25, vector KNN, and LLM reranking with Ollama
• Real-world use cases: contextual targeting, brand safety, creative attribution
• Benefits: open-source, runs locally, confidence-scored outputs, extensible AI methods
This mapper is free, open-source, and MIT licensed. Perfect for anyone who needs to migrate, validate, or experiment with IAB 3.0.

Cookies Are Dead — Context Is Alive
💀 Cookies are dead. ⚡ Context is alive. The old world: follow users with creepy IDs. The new world: understand the content itself. 👎 Apple pie ads during an Apple keynote. 👍 Nike ads during a basketball game. This isn't the end of targeting. It's the rebirth.

IAB Taxonomy: The Structure Behind AI Campaigns (60‑Second Breakdown)
AI can scan text, video, and audio at scale. But here's the problem → without structure, it's just chaos. That's where the IAB taxonomy comes in. It gives AI campaigns the shared language they need to deliver relevance, safety, comparability, and scale.
In this 60-second breakdown, you'll learn:
• Why raw AI outputs aren't enough
• How taxonomies standardize targeting across platforms
• What to show in your next RFP deck to prove you're future-ready
👉 Subscribe for more practical breakdowns on contextual targeting + AI in adtech.

Multimodal Monday #2 — From Tiny VLMs to 10M‑Token Titans
This week in multimodal AI was wild. We're talking:
• Meta's Llama 4 with 10M-token context windows
• Microsoft's Phi-4-Multimodal outperforming much larger models
• Hugging Face's SmolVLM that runs on less than 1GB RAM
• Poisoned image attacks on retrieval-augmented generation (!)
We'll break down the latest research, tools, real-world use cases, and what it all means for developers, founders, and builders in the AI space.
⏱ Timestamps:
00:00 – Welcome
00:25 – Quick Take
01:05 – Research Highlights
02:10 – Tools & Techniques
03:00 – Real-World Applications
03:40 – Trends & Predictions
04:30 – Community & Shoutouts
04:55 – Wrap-up
