S3-compatible AI pipelines at about a quarter of the storage cost
Connect Backblaze B2 buckets to Mixpeek for automatic multimodal extraction at a fraction of AWS S3 prices. Store your videos, images, and documents in B2, run feature extractors and embeddings through Mixpeek, and write indexed results back to B2 — with zero egress fees through Bandwidth Alliance partners.

Teams building multimodal AI pipelines hit a cost wall fast. AWS S3 charges $23/TB/month for storage and $0.09/GB for egress — costs that compound quickly when you're storing terabytes of video, images, and documents, then moving them to processing infrastructure. A 50TB media library costs $1,150/month just to store, and every extraction run that pulls data out of S3 adds egress fees on top. Teams end up choosing between processing everything they need and staying within budget.
Mixpeek connects directly to Backblaze B2 via the S3-compatible API: same SDKs, same tools, and no code changes beyond pointing at your B2 endpoint. B2 stores your data at $6/TB/month (about 74% less than S3's $23/TB/month) with free egress through Bandwidth Alliance partners like Cloudflare. Mixpeek reads objects from your B2 buckets, runs multimodal extractors (visual embeddings, object detection, face recognition, OCR, and transcription), then indexes everything into retrievers. Processed results and vector indexes are written back to B2 through Mixpeek Vector Store, keeping your entire pipeline on low-cost infrastructure end-to-end.
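The storage comparison can be checked with a few lines of arithmetic. The sketch below uses only the per-TB prices quoted above; the function names are illustrative, and egress and request fees are deliberately left out.

```python
# Prices quoted in the text above, in USD per TB per month.
S3_PRICE_PER_TB = 23.0
B2_PRICE_PER_TB = 6.0


def monthly_storage_cost(terabytes: float, price_per_tb: float) -> float:
    """Return the monthly storage bill for a library of the given size."""
    return terabytes * price_per_tb


def storage_savings(terabytes: float) -> tuple[float, float]:
    """Return (dollars saved per month, percent saved) when moving S3 -> B2."""
    s3 = monthly_storage_cost(terabytes, S3_PRICE_PER_TB)
    b2 = monthly_storage_cost(terabytes, B2_PRICE_PER_TB)
    saved = s3 - b2
    return saved, 100.0 * saved / s3


saved, percent = storage_savings(50)
print(f"S3: ${monthly_storage_cost(50, S3_PRICE_PER_TB):,.0f}/month")  # S3: $1,150/month
print(f"B2: ${monthly_storage_cost(50, B2_PRICE_PER_TB):,.0f}/month")  # B2: $300/month
print(f"Saved: ${saved:,.0f}/month ({percent:.0f}%)")                  # Saved: $850/month (74%)
```

For the 50TB library used as the example, that is $1,150 on S3 against $300 on B2, a difference of $850 per month before any egress charges are counted.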
What teams see after connecting Backblaze B2 to Mixpeek
About 74% lower storage costs
$6/TB/month on B2 vs $23/TB/month on AWS S3, saving $850/month on a 50TB library
Zero egress fees
Free data transfer through Bandwidth Alliance partners (Cloudflare, Fastly, Bunny CDN)
No code changes
S3-compatible API means existing SDKs and tools work out of the box with B2
Same-hour setup
Connect a B2 bucket, configure extractors, and start processing in under 60 minutes
End-to-end B2 pipeline
Source objects, extracted features, and vector indexes all stored on Backblaze
Parallel extraction at scale
Ray GPU clusters process thousands of assets concurrently across your entire library
B2 Bucket Connection
S3-Compatible API
Connect your Backblaze B2 bucket to Mixpeek using the S3-compatible API. Same endpoint format, same SDKs — just point to your B2 region (e.g., s3.us-west-004.backblazeb2.com).
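To illustrate what "just point to your B2 region" means for an existing S3 SDK, here is a minimal sketch using boto3. It shows the storage side only, not Mixpeek's own connector configuration. The helper names are hypothetical, and the region and credentials are placeholders you would replace with your own B2 application key ID and application key.

```python
def b2_endpoint_url(region: str) -> str:
    """Build the B2 S3-compatible endpoint for a region such as 'us-west-004'."""
    return f"https://s3.{region}.backblazeb2.com"


def b2_client_kwargs(region: str, key_id: str, application_key: str) -> dict:
    """Keyword arguments for boto3.client('s3', ...) that target B2 instead of AWS."""
    return {
        "endpoint_url": b2_endpoint_url(region),
        "region_name": region,
        "aws_access_key_id": key_id,               # B2 application key ID
        "aws_secret_access_key": application_key,  # B2 application key
    }


def make_b2_client(region: str, key_id: str, application_key: str):
    """Create an S3 client pointed at B2 (requires `pip install boto3`)."""
    import boto3  # imported lazily so the helpers above work without it

    return boto3.client("s3", **b2_client_kwargs(region, key_id, application_key))


# Hypothetical usage, with placeholder credentials and bucket name:
# client = make_b2_client("us-west-004", "<keyID>", "<applicationKey>")
# client.list_objects_v2(Bucket="my-media-bucket")
```

The only difference from an AWS setup is the `endpoint_url` and the credentials; the calls made on the client afterwards are the same S3 operations.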
Object Discovery
Include Patterns
Mixpeek scans your B2 bucket and applies include patterns to select which objects to process. Filter by file extension, path prefix, or naming convention.
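As a rough illustration of how include patterns narrow a bucket listing, the sketch below filters object keys with glob-style patterns from Python's standard library. The glob syntax, the `select_keys` helper, and the sample keys are assumptions made for the example; Mixpeek's actual pattern syntax may differ.

```python
from fnmatch import fnmatchcase


def select_keys(keys, include_patterns):
    """Return the keys that match at least one glob-style include pattern."""
    return [k for k in keys if any(fnmatchcase(k, p) for p in include_patterns)]


# Hypothetical object keys in a B2 bucket.
keys = [
    "raw/videos/interview_001.mp4",
    "raw/images/logo.png",
    "raw/docs/contract.pdf",
    "tmp/scratch.mp4",
]

# Combine a path prefix with a file extension in each pattern.
patterns = ["raw/videos/*.mp4", "raw/docs/*.pdf"]

print(select_keys(keys, patterns))
# ['raw/videos/interview_001.mp4', 'raw/docs/contract.pdf']
```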
Multimodal Extraction
Extractors
Selected objects are processed through parallel extractors: visual embeddings, object detection, face identity, OCR, speech transcription, and scene splitting — running on Ray GPU clusters.
Feature Indexing
Collections
Extracted features are stored in Mixpeek collections with full lineage back to the source B2 object, including bucket, key, and extraction metadata.
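One way to picture "full lineage back to the source B2 object" is a feature record that carries its bucket, key, and extraction metadata alongside the extracted value. The classes and field names below are hypothetical and do not reflect Mixpeek's actual collection schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceRef:
    """Pointer to the B2 object a feature was extracted from."""
    bucket: str
    key: str


@dataclass
class FeatureRecord:
    """A single extracted feature plus the lineage needed to trace it back."""
    feature_type: str                 # e.g. "ocr" or "transcript"
    value: object                     # extracted text, embedding, label, ...
    source: SourceRef
    extraction: dict = field(default_factory=dict)  # extractor name, timestamps, ...

    def source_path(self) -> str:
        """Return 'bucket/key' for the originating object."""
        return f"{self.source.bucket}/{self.source.key}"


record = FeatureRecord(
    feature_type="ocr",
    value="EXIT 12",
    source=SourceRef(bucket="media-library", key="raw/videos/drive_004.mp4"),
    extraction={"extractor": "ocr", "frame": 1520},
)
print(record.source_path())  # media-library/raw/videos/drive_004.mp4
```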
Search Retriever
Feature Search + Filters
A retriever combines vector similarity, face identity matching, metadata filters, and full-text search. Query across all extracted features from a single API call.
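The idea of combining vector similarity with metadata filters can be sketched in a few lines. This is a toy in-memory version, not Mixpeek's retriever API: the `search` function, record layout, and two-dimensional vectors are all invented for illustration, and face matching and full-text search are omitted.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records, query_vector, filters, top_k=2):
    """Keep records whose metadata matches every filter, then rank by similarity."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r["vector"], query_vector),
        reverse=True,
    )
    return [r["key"] for r in ranked[:top_k]]


# Hypothetical indexed features with toy 2-D embeddings.
records = [
    {"key": "videos/a.mp4", "vector": [1.0, 0.0], "metadata": {"type": "video"}},
    {"key": "videos/b.mp4", "vector": [0.6, 0.8], "metadata": {"type": "video"}},
    {"key": "images/c.png", "vector": [1.0, 0.0], "metadata": {"type": "image"}},
]

print(search(records, [1.0, 0.0], {"type": "video"}))
# ['videos/a.mp4', 'videos/b.mp4']
```

Filtering before ranking keeps the image out of the results even though its vector matches the query exactly.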
Results to B2
Mixpeek Vector Store
Processed results, vector indexes, and scan reports are written back to Backblaze B2 via Mixpeek Vector Store. Your data stays on B2 end-to-end — zero egress fees.
Point a Mixpeek connector at your B2 bucket endpoint using the S3-compatible API. Mixpeek treats B2 buckets identically to AWS S3 — no adapter code, no migration. Set up collections with the extractors you need, configure include patterns to control which objects get processed, and Mixpeek handles the rest. New objects added to B2 are detected and processed automatically. The pipeline decomposes each asset into extracted features — scene compositions, detected objects, recognized faces, on-screen text, and transcribed speech — then indexes everything into a retriever with feature search and metadata filtering. Batch processing runs across your entire library in parallel on Ray GPU clusters, and results are written back to B2 via Mixpeek Vector Store with full lineage tracking.
Get started with Mixpeek + Backblaze B2 in minutes. Read the docs, create a free account, or schedule a walkthrough with our team.