Guides

Vendor-neutral, engineer-written guides to the concepts behind multimodal AI — perception, retrieval, embeddings, and the infrastructure agents use to see, hear, and search unstructured data. Learn the idea first; then see how Mixpeek applies it.

96 guides across 16 topics

Embeddings

11 min read

How Does LoRA Fine-Tuning Work? (Adapters, QLoRA, DoRA, and Fine-Tuning Retrieval Models)

LoRA fine-tunes a large model by freezing its weights and training a tiny low-rank adapter (a few hundred MB, under 1% of parameters) — trainable on one GPU, swappable at inference. How LoRA, QLoRA, and DoRA work; the rank/alpha/target-module knobs that matter in 2026; and the retrieval-specific part most guides skip: fine-tuning embedding models, rerankers, and VLMs with LoRA, why you must re-embed your corpus afterward, and how swappable adapters enable multi-tenant retrieval. Vendor-neutral, with a comparison table and FAQs.

LoRA

Fine-Tuning

QLoRA

Jul 2026
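To make the LoRA summary above concrete, here is a minimal pure-Python sketch of the low-rank update, assuming the common formulation y = W x + (alpha / rank) * B (A x) with W frozen. The function names and toy shapes are illustrative assumptions and are not taken from the guide or from any library.

```python
def lora_param_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Adapter parameters as a fraction of the frozen weight matrix."""
    frozen = d_out * d_in
    adapter = rank * (d_out + d_in)  # B is d_out x rank, A is rank x d_in
    return adapter / frozen


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha: float, rank: int):
    """y = W x + (alpha / rank) * B (A x). W is never modified; only A and B would train."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


# A rank-8 adapter on a 4096 x 4096 projection trains about 0.39% of its weights.
fraction = lora_param_fraction(4096, 4096, 8)
```

Swapping adapters at inference amounts to keeping `W` fixed and passing a different `A` and `B`. With `B` initialised to zeros, the adapted model starts out identical to the base model.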

Agent Perception

10 min read

How Do I Evaluate a RAG Pipeline? (Faithfulness, Answer Relevance, Hallucination, and Context Metrics)

Evaluating RAG means scoring two different things separately: did retrieval fetch the right context (context precision and recall), and did generation use it honestly (faithfulness/groundedness and answer relevance). The four metrics that matter, how LLM-as-judge actually scores faithfulness and where it goes wrong, how to build a golden eval set, why you must measure retrieval and generation independently to know which half is broken, the multimodal wrinkle, and how to run it on a Mixpeek pipeline. Vendor-neutral, with the RAGAS-style metric definitions.

RAG Evaluation

Faithfulness

Hallucination

Jul 2026
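As a rough illustration of scoring the retrieval half on its own, the sketch below computes set-based context precision and recall against a golden set. This is a simplification: RAGAS-style context precision is rank-aware, and the `retrieve` callable and the golden-set field names here are hypothetical.

```python
def context_precision(retrieved: list, relevant: set) -> float:
    """Share of retrieved chunks that are actually relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    return hits / len(retrieved)


def context_recall(retrieved: list, relevant: set) -> float:
    """Share of relevant chunks that retrieval managed to fetch."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


def evaluate_retrieval(golden: list, retrieve) -> dict:
    """Average both metrics over a golden set.

    golden: [{"question": str, "relevant_ids": set}, ...]  (assumed shape)
    retrieve: callable question -> list of chunk ids       (hypothetical)
    """
    if not golden:
        return {"context_precision": 0.0, "context_recall": 0.0}
    precisions, recalls = [], []
    for example in golden:
        retrieved = retrieve(example["question"])
        precisions.append(context_precision(retrieved, example["relevant_ids"]))
        recalls.append(context_recall(retrieved, example["relevant_ids"]))
    return {
        "context_precision": sum(precisions) / len(precisions),
        "context_recall": sum(recalls) / len(recalls),
    }
```

Low recall here points at retrieval regardless of how the generator behaves, which is why the two halves are scored separately.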

Data Infrastructure

9 min read

How Do I Ingest Millions of Files into a Search Index? (Images, Video, Documents at Scale)

Ingesting millions of files is not a bigger version of ingesting a hundred — it is a different problem governed by five constraints: batching, bounded concurrency, backpressure, resumability, and idempotent retries that don't re-pay for work already done. The scaling architecture (chunk into batches, cap in-flight work, checkpoint so a failure resumes instead of restarting, and dedup so a retry reuses prior extraction instead of re-running the GPU), the cost trap that quietly doubles GPU spend, how it differs from a one-off script, and how to run it on Mixpeek.

Batch Ingestion

Scale

Data Pipelines

Jul 2026
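The ingestion constraints summarised above can be sketched with the standard library alone. The version below assumes an in-memory `done` dict as the checkpoint (a real pipeline would persist it) and a hypothetical `extract` callable standing in for the expensive GPU step. It is not Mixpeek's implementation.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def content_key(data: bytes) -> str:
    """Content hash used as the idempotency key, so identical bytes are extracted once."""
    return hashlib.sha256(data).hexdigest()


def batches(items: list, size: int):
    """Yield fixed-size slices of items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def ingest(files: dict, extract, done: dict, batch_size: int = 2, max_workers: int = 2) -> dict:
    """files: name -> bytes. done: content hash -> prior extraction result (the checkpoint)."""
    index = {}
    # max_workers caps in-flight work; one batch is submitted at a time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in batches(sorted(files), batch_size):
            pending = {}
            for name in batch:
                key = content_key(files[name])
                if key not in done and key not in pending:
                    pending[key] = files[name]  # only unseen content costs anything
            results = pool.map(extract, pending.values())
            for key, result in zip(pending, results):
                done[key] = result  # checkpoint: a rerun skips this work
            for name in batch:
                index[name] = done[content_key(files[name])]
    return index
```

Calling `ingest` a second time with the same `done` mapping performs no extraction at all, which is the behaviour that keeps a retry from re-paying for finished work.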

Search & Discovery

9 min read

How Do AI Agents Search Big Datasets by Navigating Clusters? (Hierarchical Cluster Search)

Flat vector search returns top-k against one query vector, which breaks down when an agent does not know the right query, the corpus is huge and diverse, or the task is exploratory. Agentic hierarchical cluster search gives the agent a map instead: a cluster hierarchy (themes -> sub-clusters -> records) it navigates coarse-to-fine, scoring its goal against a few dozen centroids and drilling into the matching branch before running a precise retrieval at the leaf. When it beats flat ANN, the navigation loop, the cost math, the honest limits, and how to build it from clustering + composite clustering + a cluster-scoped retriever.

Agentic Retrieval

Clustering

Hierarchical Search

Jul 2026
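The coarse-to-fine navigation described above might look roughly like the following. The nested-dict tree shape, the greedy single-branch descent, and the cosine scoring are all assumptions made for illustration.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def navigate(node: dict, goal: list, top_k: int = 2):
    """Follow the best-matching centroid at each level, then rank records at the leaf."""
    path = []
    while node.get("children"):
        node = max(node["children"], key=lambda child: cosine(goal, child["centroid"]))
        path.append(node["label"])
    ranked = sorted(node["records"], key=lambda r: cosine(goal, r["vector"]), reverse=True)
    return path, [r["id"] for r in ranked[:top_k]]


# Hypothetical two-level hierarchy: themes -> sub-clusters -> records.
tree = {"children": [
    {"label": "sports", "centroid": [1.0, 0.0], "children": [
        {"label": "soccer", "centroid": [0.9, 0.1], "records": [
            {"id": "s1", "vector": [1.0, 0.0]},
            {"id": "s2", "vector": [0.8, 0.3]},
        ]},
        {"label": "tennis", "centroid": [0.7, 0.7], "records": [
            {"id": "t1", "vector": [0.6, 0.8]},
        ]},
    ]},
    {"label": "cooking", "centroid": [0.0, 1.0], "records": [
        {"id": "c1", "vector": [0.0, 1.0]},
    ]},
]}

path, ids = navigate(tree, goal=[1.0, 0.05])
```

Each step compares the goal with only a handful of centroids rather than every record. The trade-off is that a wrong turn near the root cannot be recovered by a purely greedy descent, so a fuller version would likely keep several candidate branches.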

Search & Discovery

8 min read

What Is Composite Clustering? Clustering Across Multiple Feature Spaces (and Clusters of Clusters)

People use "composite clustering" for two different things. Multi-feature clustering groups documents in a single pass using several embedding spaces together (text + image + audio + faces), with concatenate / independent / weighted combine strategies. Composite (cluster-of-clusters) clustering groups the centroids of prior clusterings to reveal how your groupings relate. How each works, the normalization and per-feature-weight traps, how it differs from multi-vector retrieval, and the honest limits (composite is a pattern map, not a per-document cross-tab).

Composite Clustering

Multimodal Clustering

Cluster of Clusters

Jul 2026
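One possible reading of the weighted combine strategy, and of the normalization trap mentioned above, is sketched here. Each feature vector is L2-normalized so that no embedding space dominates by scale alone, then scaled by the square root of its weight before concatenation. The function names and the square-root weighting are assumptions, not the guide's definition.

```python
import math


def l2_normalize(vector: list) -> list:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def combine_weighted(features: dict, weights: dict) -> list:
    """Concatenate per-feature unit vectors, each scaled by sqrt(weight).

    With this scaling, the dot product of two combined vectors equals the
    weighted sum of the per-feature cosine similarities.
    """
    combined = []
    for name in sorted(features):  # fixed order keeps dimensions aligned across documents
        scale = math.sqrt(weights.get(name, 1.0))
        combined.extend(scale * x for x in l2_normalize(features[name]))
    return combined


doc = combine_weighted(
    {"text": [3.0, 4.0], "image": [0.0, 2.0]},
    weights={"text": 1.0, "image": 4.0},
)
```

The combined vectors can then be passed to any standard clustering algorithm. Skipping the normalization step lets whichever embedding space has the largest raw magnitudes decide the clusters.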

Search & Discovery

8 min read

How Do I Filter Vector Search Results by Location? (Radius, Bounding Box, Polygon)

Filter vector-search and retriever results by geographic location with three operators in an attribute_filter stage — geo_radius (within N meters), geo_bounding_box, and geo_polygon. Exact request shapes, the lat/lon-vs-GeoJSON lon-first gotcha, when to use each operator, combining geo with semantic search and metadata for location-aware RAG, and honest scope (a precise scan over your retrieved set, not a planet-scale geo index).

Geospatial Search

Vector Search

Filters

Jul 2026
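To illustrate what a radius filter over an already-retrieved set involves, and the lon-first ordering of GeoJSON, here is a small haversine-based sketch. It shows the general technique only. The function names and the result-record shape are hypothetical and are not the attribute_filter request format.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(results: list, center_lat: float, center_lon: float, radius_m: float) -> list:
    """Keep retrieved results whose location lies within radius_m of the center."""
    return [
        r for r in results
        if haversine_m(center_lat, center_lon, r["lat"], r["lon"]) <= radius_m
    ]


def geojson_point_to_lat_lon(coordinates: list) -> tuple:
    """GeoJSON positions are [longitude, latitude]; return (lat, lon)."""
    lon, lat = coordinates
    return lat, lon


results = [
    {"id": "times-square", "lat": 40.7580, "lon": -73.9855},
    {"id": "central-park", "lat": 40.7829, "lon": -73.9654},
    {"id": "los-angeles", "lat": 34.0522, "lon": -118.2437},
]
nearby = within_radius(results, 40.7580, -73.9855, radius_m=5_000)
```

Because this scans every retrieved result, it fits a post-retrieval filter over a few hundred candidates rather than a standalone geo index.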

Search & Discovery

9 min read

How Do I Automatically Classify Content Against a Taxonomy?

Auto-classifying images, video, documents, and audio into predefined categories at scale: the four viable methods in 2026 (zero-shot, embedding-similarity, trained head, LLM) and when each wins, how taxonomy classification differs from discovered taxonomies and metadata extraction, classifying non-text content by decomposing signals, taxonomy design rules, and the query-time reclassification pattern that avoids re-paying analysis when categories change.

Taxonomy Classification

Content Classification

Auto-Tagging

Jul 2026

Search & Discovery

8 min read

Brand Safety vs Brand Suitability: How AI Classifies Video for Advertisers

Safety is a universal floor; suitability is a per-brand tolerance curve over graded risk tiers — and the GARM-style taxonomy remains the reference for both even after the organization wound down. How multimodal classification actually works on video (visual + speech + on-screen text + adjacency), why treatment beats topic, why moderation APIs can't express per-brand tiers, and the query-time reclassification pattern that avoids re-paying analysis.

Brand Safety

Brand Suitability

Content Moderation

Jul 2026

Search & Discovery

8 min read

How Do I Debug Bad Retrieval Results in RAG and Vector Search?

Bad retrieval fails in one of five layers — corpus, embedding space, query, filters, or fusion — and each layer's failure masquerades as the one below it. A practical debugging methodology: the five-layer checklist, why raw similarity scores mislead, stage-boundary tracing with a real silent-decimation incident, and what a retrieval explain plan should contain.

Retrieval Debugging

RAG

Vector Search

Jul 2026

Infrastructure

9 min read

Can You Run a Vector Database on S3? Object-Storage-Backed Vector Search, Explained

Object-storage-backed vector search went mainstream by 2026: S3 Vectors reached GA, turbopuffer and LanceDB proved the architecture, and costs dropped roughly 10x versus RAM-resident clusters. How it actually works (immutable segments, coarse partitioning, quantization plus rescoring, cache tiers), when it is the wrong choice, and what bring-your-own-bucket adds.

Vector Databases

Object Storage

Jul 2026

Search & Discovery

9 min read

How Do I Build a Deep Research Agent Over My Own Data?

Consumer deep research browses the public web — but your evidence lives in PDFs, call recordings, decks, and videos behind your own APIs. The architecture of private-corpus research agents: question decomposition, multimodal retrieval loops, stopping criteria, provenance, and verification, with the retriever stages that implement each step.

Deep Research

Agentic RAG

Retrieval Pipelines

Jul 2026

Search & Discovery

8 min read

What Concepts Exist in My Data That Nobody Has Labeled Yet?

Some queries cannot be answered by keyword search, vector search, or RAG — because the user does not know the ontology ahead of time. How discovered taxonomies answer questions like 'how does this brand visually represent trust?' and 'what stereotypes recur across our advertising?' with six worked examples.

Discovered Taxonomies

Exploratory Search

Clustering

Jul 2026

From concept to production

These guides explain how multimodal perception and retrieval actually work. Mixpeek is the platform that runs them — point it at your storage and get back relevant, timestamped results.
