
    Backblaze B2 + Mixpeek

    Store objects in Backblaze B2. Process and index with Mixpeek. Write results back to B2 — no egress fees.

    Backblaze B2 Cloud Storage is S3-compatible object storage at roughly a quarter of the cost of AWS S3 Standard ($6 vs $23 per TB per month), with free egress through Bandwidth Alliance partners. Combined with Mixpeek, you get a complete pipeline for storing, processing, and retrieving multimodal data, with Backblaze B2 as both the source and destination.

    Why Backblaze B2

    S3-compatible API — drop-in replacement, same SDKs
    Free egress via Bandwidth Alliance (Cloudflare, Fastly, Bunny CDN)
    $6/TB/month storage vs $23/TB/month on AWS S3 Standard
    11 nines (99.999999999%) durability
    No minimum file size, no minimum storage duration, no retrieval fees
    Object Lock for compliance and immutability

    Reference Architecture

    Backblaze B2 as source and destination — Mixpeek processes in between.

    Backblaze B2: Source Bucket
        Brand logos & images
        Celebrity face references
        Video content library
    (via S3-compatible API)
    Mixpeek: Processing Pipeline
        Face identity matching
        Logo detection
        Visual similarity scoring
    (via MVS)
    Backblaze B2: Destination (MVS)
        Vector indexes
        Scan reports
        Processed metadata

    Zero egress fees: data stays in Backblaze B2 end-to-end.

    How it works

    Four steps to a complete multimodal processing pipeline with Backblaze B2 as your storage layer.

    1. Store assets in Backblaze B2

    Upload your brand assets, product images, video content, and reference materials to a B2 bucket using the S3-compatible API. Backblaze stores them at $6/TB/month, about 74% less than the $23/TB/month of AWS S3 Standard.
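As a rough illustration of the upload step, the sketch below uses boto3 (the AWS SDK for Python) pointed at a B2 endpoint. The environment variable names `B2_KEY_ID` and `B2_APP_KEY`, the helper names, and the default region are assumptions made for this example, not part of any Mixpeek or Backblaze convention.

```python
import os


def b2_endpoint(region: str) -> str:
    """Build the S3-compatible endpoint URL for a B2 region such as 'us-west-004'."""
    return f"https://s3.{region}.backblazeb2.com"


def upload_asset(local_path: str, bucket: str, key: str,
                 region: str = "us-west-004") -> None:
    """Upload one local file to a B2 bucket through the S3-compatible API."""
    # boto3 is a third-party package; it is imported lazily so that
    # b2_endpoint() can be used without it installed.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url=b2_endpoint(region),
        aws_access_key_id=os.environ["B2_KEY_ID"],       # hypothetical variable name
        aws_secret_access_key=os.environ["B2_APP_KEY"],  # hypothetical variable name
    )
    s3.upload_file(local_path, bucket, key)


if __name__ == "__main__":
    # Only prints the endpoint; call upload_asset() with real credentials to upload.
    print(b2_endpoint("us-west-004"))
```

With credentials exported, a call such as `upload_asset("logo.png", "brand-assets", "logos/logo.png")` would place the file in the `brand-assets` bucket used later in the quick start.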

    2. Connect B2 as a Mixpeek bucket source

    Point a Mixpeek bucket at your B2 endpoint. Mixpeek reads objects via the S3-compatible API — no code changes needed. Set up a collection with the extractors you need (face detection, logo recognition, embeddings).

    3. Mixpeek processes and indexes everything

    Mixpeek's Ray GPU clusters extract features, generate embeddings, and build searchable indexes. For copyright detection: face identity matching, logo recognition, and visual similarity scoring all run automatically.

    4. Results stored back in Backblaze B2 via MVS

    Processed results, scan reports, and vector indexes are written back to Backblaze B2 through Mixpeek Vector Store (MVS). Your data stays in B2 end-to-end — source objects and processed output in one place, with zero egress fees.

    Primary use case

    Copyright & Trademark Infringement Detection

    Store brand reference assets (logos, product images, celebrity faces) in B2. Mixpeek scans incoming content against your reference library using face identity matching, logo detection, and visual similarity. Flagged violations and scan reports are written back to B2. The entire pipeline — storage, processing, and results — runs on Backblaze + Mixpeek with no egress fees.
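To make the idea of scanning content against a reference library concrete, here is a minimal sketch of similarity-based flagging with cosine similarity over embedding vectors. It is an illustration only: Mixpeek's actual scoring method is not described here, and the function names, the toy three-dimensional vectors, and the 0.9 threshold are all assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_matches(candidate: list[float],
                 references: dict[str, list[float]],
                 threshold: float = 0.9) -> list[tuple[str, float]]:
    """Return (reference name, score) pairs at or above the threshold, best first."""
    scored = [(name, cosine_similarity(candidate, vec))
              for name, vec in references.items()]
    flagged = [(name, score) for name, score in scored if score >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy reference library: real embeddings have hundreds of dimensions.
    references = {"logo_a": [1.0, 0.0, 0.0], "logo_b": [0.0, 1.0, 0.0]}
    for name, score in flag_matches([0.9, 0.1, 0.0], references):
        print(f"{name}: {score:.3f}")
```

The threshold trades false positives against missed matches, so in practice it would be tuned per extractor rather than fixed at one value.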

    Storage cost comparison

    See how Backblaze B2 compares to traditional cloud storage for AI workloads.

    Item                          Backblaze B2            AWS S3
    Storage (per TB/month)        $6                      $23
    Egress (per GB)               Free via CDN partners   $0.09
    API transactions (per 10K)    $0.004                  $0.005
    Minimum storage duration      None                    30 days (varies by tier)
    Retrieval fees                None                    Varies by tier

    Pricing as of 2026. Check partner websites for current rates.
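The comparison above can be turned into a quick estimate for a specific workload. The sketch below uses only the rates listed on this page and assumes all B2 egress goes through CDN partners (so it is billed at $0). The 50 TB and 10,000 GB figures are made-up inputs, and API transaction costs are left out for simplicity.

```python
# Rates taken from the comparison above (USD).
B2_STORAGE_PER_TB = 6.00
S3_STORAGE_PER_TB = 23.00
B2_EGRESS_PER_GB = 0.00   # assumption: all egress goes through CDN partners
S3_EGRESS_PER_GB = 0.09


def monthly_cost(stored_tb: float, egress_gb: float,
                 storage_rate: float, egress_rate: float) -> float:
    """Monthly bill for storage plus egress, ignoring API transaction fees."""
    return stored_tb * storage_rate + egress_gb * egress_rate


def storage_savings_pct(b2_rate: float = B2_STORAGE_PER_TB,
                        s3_rate: float = S3_STORAGE_PER_TB) -> float:
    """Percentage saved on storage alone when using B2 instead of S3."""
    return (1 - b2_rate / s3_rate) * 100


if __name__ == "__main__":
    stored_tb, egress_gb = 50, 10_000  # hypothetical workload
    b2 = monthly_cost(stored_tb, egress_gb, B2_STORAGE_PER_TB, B2_EGRESS_PER_GB)
    s3 = monthly_cost(stored_tb, egress_gb, S3_STORAGE_PER_TB, S3_EGRESS_PER_GB)
    print(f"Backblaze B2: ${b2:,.2f}/month")
    print(f"AWS S3:       ${s3:,.2f}/month")
    print(f"Storage-only savings: {storage_savings_pct():.1f}%")
```

Egress-heavy workloads widen the gap further, because the S3 egress charge scales with every GB read out while the B2 figure here stays at zero.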

    Quick start

    Connect Backblaze B2 to Mixpeek and start processing in minutes.

    backblaze_pipeline.py
    from mixpeek import Mixpeek
    
    client = Mixpeek(api_key="YOUR_API_KEY")
    
    # 1. Connect Backblaze B2 as source bucket
    bucket = client.buckets.create(
        namespace_id="ns_brand_protection",
        source="s3://brand-assets/",
        credentials={
            "endpoint_url": "https://s3.us-west-004.backblazeb2.com",
            "access_key_id": "YOUR_B2_KEY_ID",
            "secret_access_key": "YOUR_B2_APP_KEY"
        }
    )
    
    # 2. Define extractors for copyright detection
    collection = client.collections.create(
        namespace_id="ns_brand_protection",
        bucket_id=bucket.id,
        extractors=[
            {"type": "face_identity"},
            {"type": "logo_detection"},
            {"type": "image_embedding"},
            {"type": "video_keyframe"},
        ]
    )
    
    # 3. Scan incoming content against reference library
    results = client.retrievers.execute(
        namespace_id="ns_brand_protection",
        stages=[
            {
                "type": "feature_search",
                "method": "hybrid",
                "query": {"url": "s3://incoming-content/suspect-ad.mp4"},
                "limit": 20
            },
            {"type": "rerank", "model": "cross-encoder", "limit": 5}
        ]
    )
    
    # Matches flagged — results written back to B2 via MVS
    for match in results:
        print(f"{match.modality}: {match.content[:60]} (score: {match.score})")

    Frequently Asked Questions

    How does Mixpeek connect to Backblaze B2?

    Mixpeek connects to Backblaze B2 through its S3-compatible API. You provide your B2 endpoint URL, Key ID, and Application Key when creating a Mixpeek bucket. From there, Mixpeek reads objects directly from B2 — no data migration or special connectors needed.
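One way to avoid hardcoding the three values is to read them from the environment and build the credentials dictionary from there. This is a sketch under assumptions: the dictionary keys mirror the quick start above, while the environment variable names and the helper `b2_credentials` are hypothetical.

```python
import os
from typing import Mapping


def b2_credentials(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build the credentials dict for a B2-backed bucket from environment variables."""
    # Hypothetical variable names; use whatever your deployment defines.
    required = ("B2_ENDPOINT_URL", "B2_KEY_ID", "B2_APP_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "endpoint_url": env["B2_ENDPOINT_URL"],
        "access_key_id": env["B2_KEY_ID"],
        "secret_access_key": env["B2_APP_KEY"],
    }


if __name__ == "__main__":
    demo_env = {
        "B2_ENDPOINT_URL": "https://s3.us-west-004.backblazeb2.com",
        "B2_KEY_ID": "YOUR_B2_KEY_ID",
        "B2_APP_KEY": "YOUR_B2_APP_KEY",
    }
    # Print only the keys so secrets never end up in logs.
    print(sorted(b2_credentials(demo_env)))
```

The returned dictionary can be passed as the `credentials` argument shown in the quick start, and failing early on a missing variable gives a clearer error than a rejected request later.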

    Do I pay egress fees when Mixpeek reads from Backblaze B2?

    Backblaze B2 includes free egress up to 3x your stored data volume per month, and egress through Bandwidth Alliance partners (Cloudflare, Fastly, Bunny CDN) is free. For most workloads, this means effectively zero egress fees. Beyond the free allowance, B2 egress is $0.01/GB vs $0.09/GB on AWS S3.
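The free allowance is easy to check with a short calculation. The sketch below applies the 3x allowance and the $0.01/GB rate quoted above to non-partner egress. The function names and the sample volumes are illustrative assumptions, and actual billing may be computed differently (for example, on average stored volume).

```python
def billable_egress_gb(stored_gb: float, egress_gb: float,
                       free_multiple: float = 3.0) -> float:
    """Egress beyond the free allowance (free_multiple x stored volume), never negative."""
    return max(0.0, egress_gb - free_multiple * stored_gb)


def egress_cost(stored_gb: float, egress_gb: float,
                rate_per_gb: float = 0.01) -> float:
    """Monthly charge in USD for egress that exceeds the free allowance."""
    return billable_egress_gb(stored_gb, egress_gb) * rate_per_gb


if __name__ == "__main__":
    # Hypothetical month: 1,000 GB stored, 3,500 GB read out.
    stored, egress = 1_000, 3_500
    print(f"Billable egress: {billable_egress_gb(stored, egress):,.0f} GB")
    print(f"Egress cost:     ${egress_cost(stored, egress):.2f}")
```

A pipeline that reads each stored object about once per month stays well inside the allowance. Only repeated full re-scans of the same data would start to incur charges.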

    Can processed results be stored back in Backblaze B2?

    Yes. Mixpeek Vector Store (MVS) can write processed results, embeddings, and scan reports back to Backblaze B2. This means your source objects and processed output live in the same storage layer — B2 end-to-end.

    What is the copyright detection use case?

    Brands store reference assets (logos, celebrity faces, product images) in B2. Mixpeek indexes these with face identity, logo detection, and visual similarity extractors. When new content arrives, Mixpeek scans it against the reference library and flags potential trademark or copyright violations. See copyright.mixpeek.com for a live demo.

    How much cheaper is Backblaze B2 compared to AWS S3?

    Backblaze B2 storage costs $6/TB/month vs $23/TB/month on AWS S3 Standard, about 74% cheaper. Egress is free through CDN partners (Cloudflare, Fastly, Bunny) vs $0.09/GB on S3. There are also no minimum storage duration fees and no retrieval fees.

    Is Backblaze B2 S3-compatible?

    Yes. Backblaze B2 provides an S3-compatible API. You use the same AWS SDKs and tools and change only the endpoint URL to match your B2 region (e.g., s3.us-west-004.backblazeb2.com). Mixpeek treats B2 buckets the same way as AWS S3 buckets.

    Get started with Backblaze B2 + Mixpeek

    Store your objects in Backblaze B2. Process with Mixpeek. Write results back. Zero egress fees.