Backblaze B2 + Mixpeek
Store objects in Backblaze B2. Process and index with Mixpeek. Write results back to B2 — no egress fees.
Backblaze B2 Cloud Storage is S3-compatible object storage at roughly a quarter of the cost of AWS S3 ($6 vs. $23 per TB per month), with free egress through Bandwidth Alliance partners. Combined with Mixpeek, you get a complete pipeline for storing, processing, and retrieving multimodal data, with Backblaze B2 as both the source and destination.
Why Backblaze B2
Reference Architecture
Backblaze B2 as source and destination — Mixpeek processes in between.

How it works
Four steps to a complete multimodal processing pipeline with Backblaze B2 as your storage layer.
Store assets in Backblaze B2
Upload your brand assets, product images, video content, and reference materials to a B2 bucket using the S3-compatible API. Backblaze stores them at $6/TB/month, roughly 75% less than AWS S3 Standard at $23/TB/month.
Connect B2 as a Mixpeek bucket source
Point a Mixpeek bucket at your B2 endpoint. Mixpeek reads objects via the S3-compatible API — no code changes needed. Set up a collection with the extractors you need (face detection, logo recognition, embeddings).
Mixpeek processes and indexes everything
Mixpeek's Ray GPU clusters extract features, generate embeddings, and build searchable indexes. For copyright detection: face identity matching, logo recognition, and visual similarity scoring all run automatically.
Results stored back in Backblaze B2 via MVS
Processed results, scan reports, and vector indexes are written back to Backblaze B2 through Mixpeek Vector Store (MVS). Your data stays in B2 end-to-end — source objects and processed output in one place, with zero egress fees.
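The first step above can be sketched with boto3, the AWS SDK for Python, pointed at a B2 endpoint. This is a minimal sketch, not an official Backblaze or Mixpeek recipe: `b2_endpoint` and `upload_assets` are hypothetical helper names, the endpoint pattern is taken from the `s3.us-west-004.backblazeb2.com` example used elsewhere on this page, and boto3 is assumed to be installed.

```python
from pathlib import Path


def b2_endpoint(region: str) -> str:
    """Build the S3-compatible endpoint URL for a B2 region such as 'us-west-004'."""
    return f"https://s3.{region}.backblazeb2.com"


def upload_assets(local_dir: str, bucket: str, region: str,
                  key_id: str, app_key: str) -> list[str]:
    """Upload every file under local_dir to the B2 bucket and return the object keys."""
    import boto3  # third-party dependency (assumed installed: pip install boto3)

    s3 = boto3.client(
        "s3",
        endpoint_url=b2_endpoint(region),   # the only B2-specific setting
        aws_access_key_id=key_id,           # B2 Key ID
        aws_secret_access_key=app_key,      # B2 Application Key
    )
    root = Path(local_dir)
    keys = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Use the path relative to local_dir as the object key.
            key = path.relative_to(root).as_posix()
            s3.upload_file(str(path), bucket, key)
            keys.append(key)
    return keys
```

A call such as `upload_assets("./brand-assets", "brand-assets", "us-west-004", key_id, app_key)` would mirror a local folder into the bucket that the Mixpeek bucket source then points at.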
Primary use case
Copyright & Trademark Infringement Detection
Store brand reference assets (logos, product images, celebrity faces) in B2. Mixpeek scans incoming content against your reference library using face identity matching, logo detection, and visual similarity. Flagged violations and scan reports are written back to B2. The entire pipeline — storage, processing, and results — runs on Backblaze + Mixpeek with no egress fees.
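To make "visual similarity" concrete, the toy sketch below flags a candidate embedding when its cosine similarity to a reference embedding crosses a threshold. This illustrates the general technique only. How Mixpeek actually scores matches is not described on this page, and the function names, the two-dimensional vectors, and the 0.9 threshold are all assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_matches(candidate: list[float],
                 references: dict[str, list[float]],
                 threshold: float = 0.9) -> list[tuple[str, float]]:
    """Return (reference name, score) pairs at or above the threshold, best first."""
    scored = [(name, cosine_similarity(candidate, vec))
              for name, vec in references.items()]
    flagged = [(name, score) for name, score in scored if score >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical reference library: real embeddings have hundreds of dimensions.
reference_library = {"brand_logo": [1.0, 0.0], "unrelated_image": [0.0, 1.0]}
violations = flag_matches([2.0, 0.0], reference_library)
```

In a real pipeline the threshold would be tuned against labeled examples, since a value that is too low floods reviewers with false positives.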
Storage cost comparison
See how Backblaze B2 compares to traditional cloud storage for AI workloads.
| Item | Backblaze B2 | AWS S3 |
|---|---|---|
| Storage (per TB/month) | $6 | $23 |
| Egress (per GB) | Free via CDN partners | $0.09 |
| API transactions (per 10K) | $0.004 | $0.005 |
| Minimum storage duration | None | 30 days (varies by tier) |
| Retrieval fees | None | Varies by tier |
Pricing as of 2026. Check partner websites for current rates.
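The storage row of the table can be turned into a quick estimate. The sketch below uses only the per-TB rates listed above. The 50 TB workload is a made-up example, and the function names are illustrative.

```python
B2_PER_TB_MONTH = 6.00    # from the table above
S3_PER_TB_MONTH = 23.00   # from the table above


def monthly_storage_cost(tb_stored: float, rate_per_tb: float) -> float:
    """Monthly storage cost in USD, ignoring egress and API transactions."""
    return tb_stored * rate_per_tb


def savings_percent(cheaper: float, baseline: float) -> float:
    """How much less 'cheaper' costs than 'baseline', as a percentage."""
    return (baseline - cheaper) / baseline * 100


tb = 50  # hypothetical workload size
b2_cost = monthly_storage_cost(tb, B2_PER_TB_MONTH)
s3_cost = monthly_storage_cost(tb, S3_PER_TB_MONTH)
print(f"B2: ${b2_cost:,.2f}  S3: ${s3_cost:,.2f}  "
      f"savings: {savings_percent(b2_cost, s3_cost):.1f}%")
```

Because both costs scale linearly with volume, the percentage saved on storage is the same at any size. Egress and transaction charges change the total and are not included here.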
Quick start
Connect Backblaze B2 to Mixpeek and start processing in minutes.
from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_API_KEY")

# 1. Connect Backblaze B2 as source bucket
bucket = client.buckets.create(
    namespace_id="ns_brand_protection",
    source="s3://brand-assets/",
    credentials={
        "endpoint_url": "https://s3.us-west-004.backblazeb2.com",
        "access_key_id": "YOUR_B2_KEY_ID",
        "secret_access_key": "YOUR_B2_APP_KEY"
    }
)

# 2. Define extractors for copyright detection
collection = client.collections.create(
    namespace_id="ns_brand_protection",
    bucket_id=bucket.id,
    extractors=[
        {"type": "face_identity"},
        {"type": "logo_detection"},
        {"type": "image_embedding"},
        {"type": "video_keyframe"},
    ]
)

# 3. Scan incoming content against reference library
results = client.retrievers.execute(
    namespace_id="ns_brand_protection",
    stages=[
        {
            "type": "feature_search",
            "method": "hybrid",
            "query": {"url": "s3://incoming-content/suspect-ad.mp4"},
            "limit": 20
        },
        {"type": "rerank", "model": "cross-encoder", "limit": 5}
    ]
)

# Matches flagged; results written back to B2 via MVS
for match in results:
    print(f"{match.modality}: {match.content[:60]} (score: {match.score})")
Frequently Asked Questions
How does Mixpeek connect to Backblaze B2?
Mixpeek connects to Backblaze B2 through its S3-compatible API. You provide your B2 endpoint URL, Key ID, and Application Key when creating a Mixpeek bucket. From there, Mixpeek reads objects directly from B2 — no data migration or special connectors needed.
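Rather than hard-coding the three values, they can be read from the environment. The sketch below assumes hypothetical variable names (`B2_ENDPOINT_URL`, `B2_KEY_ID`, `B2_APP_KEY`) and returns a dictionary shaped like the `credentials` argument in the quick start. It uses only the standard library and does not call Mixpeek.

```python
import os
from typing import Mapping, Optional

# Maps credential field names (as in the quick start) to assumed env var names.
REQUIRED_VARS = {
    "endpoint_url": "B2_ENDPOINT_URL",
    "access_key_id": "B2_KEY_ID",
    "secret_access_key": "B2_APP_KEY",
}


def b2_credentials_from_env(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Collect B2 credentials from env (defaults to os.environ); fail loudly if any is unset."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {field: env[var] for field, var in REQUIRED_VARS.items()}
```

Failing at startup when a variable is missing tends to be easier to debug than an authentication error surfacing later in the pipeline.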
Do I pay egress fees when Mixpeek reads from Backblaze B2?
Backblaze B2 includes free egress up to 3x your stored data volume per month, and egress to Bandwidth Alliance CDN partners is free as well. For most workloads, this means effectively zero egress fees. Beyond the free allowance, B2 egress is $0.01/GB vs. $0.09/GB on AWS S3.
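The allowance described in this answer is simple arithmetic. The sketch below applies the 3x multiple and the $0.01/GB overage rate quoted above. The function name and the sample volumes are illustrative, and partner (CDN) egress is not modeled.

```python
def b2_egress_cost(stored_gb: float, egress_gb: float,
                   free_multiple: float = 3.0,
                   rate_per_gb: float = 0.01) -> float:
    """Estimated monthly B2 egress charge in USD for direct (non-partner) egress."""
    free_allowance_gb = stored_gb * free_multiple
    billable_gb = max(0.0, egress_gb - free_allowance_gb)
    return billable_gb * rate_per_gb


# 1,000 GB stored gives a 3,000 GB free allowance.
within_allowance = b2_egress_cost(stored_gb=1_000, egress_gb=2_500)
over_allowance = b2_egress_cost(stored_gb=1_000, egress_gb=3_500)  # 500 GB billable
```

A processing pipeline that reads each stored object about once per month stays well inside the allowance. Repeated full reprocessing passes are what would push a workload over it.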
Can processed results be stored back in Backblaze B2?
Yes. Mixpeek Vector Store (MVS) can write processed results, embeddings, and scan reports back to Backblaze B2. This means your source objects and processed output live in the same storage layer — B2 end-to-end.
What is the copyright detection use case?
Brands store reference assets (logos, celebrity faces, product images) in B2. Mixpeek indexes these with face identity, logo detection, and visual similarity extractors. When new content arrives, Mixpeek scans it against the reference library and flags potential trademark or copyright violations. See copyright.mixpeek.com for a live demo.
How much cheaper is Backblaze B2 compared to AWS S3?
Backblaze B2 storage costs $6/TB/month vs $23/TB/month on AWS S3 Standard — roughly 75% cheaper. Egress is free through CDN partners (Cloudflare, Fastly, Bunny) vs $0.09/GB on S3. There are also no minimum storage duration fees and no retrieval fees.
Is Backblaze B2 S3-compatible?
Yes. Backblaze B2 provides a fully S3-compatible API. You use the same AWS SDKs and tools — just change the endpoint URL to your B2 region (e.g., s3.us-west-004.backblazeb2.com). Mixpeek treats B2 buckets identically to AWS S3 buckets.
