    What is Storage Tiering?

    Storage Tiering - Automatic lifecycle management that moves vector data between hot, warm, and cold storage tiers based on query frequency and cost targets.

    Storage tiering is what makes a multimodal data warehouse economically viable at scale. Instead of keeping every vector in expensive, always-hot memory (as in-memory vector databases typically do), a warehouse automatically moves infrequently queried data to cheaper storage tiers while keeping it searchable. This mirrors how structured data warehouses tier between compute-optimized and storage-optimized layers.

    The Three Tiers

    • Hot tier: In-memory vector engine (e.g., Qdrant). ~10ms latency. Used for actively queried collections. Highest cost per vector.
    • Warm tier: MVS (Mixpeek Vector Store) on S3-compatible object storage. ~100ms latency. 90% cheaper than the hot tier. Still fully searchable. Works with AWS S3, Backblaze B2, Tigris, Cloudflare R2, and Wasabi.
    • Cold tier: Archive storage. Minutes to rehydrate. Lowest cost. Used for compliance, backup, and rare-access data.
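
    As a rough illustration, the three tiers above can be written down as a small lookup table. The sketch below is not Mixpeek's API: `TierSpec`, `TIERS`, and `cheapest_searchable_tier` are hypothetical names, the relative costs take hot as 1.0 and warm as 0.10 (from the "90% cheaper" figure), and the cold tier is assumed not to be directly searchable because it needs minutes to rehydrate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierSpec:
    name: str
    # Typical query latency in ms; None means the tier is assumed
    # not to be directly searchable (it must be rehydrated first).
    typical_latency_ms: Optional[float]
    # Cost relative to hot (hot = 1.0); None where the text gives no figure.
    relative_cost: Optional[float]


# Ordered cheapest first, following the tier descriptions above.
TIERS = [
    TierSpec("cold", None, None),
    TierSpec("warm", 100.0, 0.10),
    TierSpec("hot", 10.0, 1.00),
]


def cheapest_searchable_tier(latency_budget_ms: float) -> Optional[TierSpec]:
    """Return the cheapest searchable tier that meets the latency budget."""
    for tier in TIERS:
        if tier.typical_latency_ms is None:
            continue  # skip tiers that cannot serve queries directly
        if tier.typical_latency_ms <= latency_budget_ms:
            return tier
    return None  # no tier is fast enough for this budget
```

    With these assumed figures, a 250ms latency budget resolves to the warm tier, a 50ms budget requires the hot tier, and a 5ms budget cannot be met by any tier.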

    How It Works

    Lifecycle rules define when collections move between tiers based on query frequency, age, or manual policy. For example, a collection that was hot last month but has not been queried for two weeks can move to warm automatically, where it stays searchable at ~100ms while costing 90% less. If query traffic returns, the collection rehydrates to hot automatically.
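
    A lifecycle rule of this kind might be evaluated roughly as follows. This is a minimal sketch under stated assumptions, not Mixpeek's rule engine: `LifecyclePolicy`, `CollectionStats`, and `next_tier` are hypothetical names, the 14-day idle threshold comes from the two-week example above, and the 180-day cold threshold and the promotion threshold of 100 queries per day are invented for illustration. Bringing a collection back from cold is left out, since it involves a rehydration step of several minutes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Tier(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


@dataclass
class LifecyclePolicy:
    # Hypothetical thresholds for illustration, not product defaults.
    demote_to_warm_after: timedelta = timedelta(days=14)
    demote_to_cold_after: timedelta = timedelta(days=180)
    promote_if_queries_last_day: int = 100


@dataclass
class CollectionStats:
    tier: Tier
    last_queried_at: datetime
    queries_last_day: int


def next_tier(stats: CollectionStats, policy: LifecyclePolicy, now: datetime) -> Tier:
    """Decide which tier a collection should occupy after this evaluation."""
    idle = now - stats.last_queried_at
    # Returning traffic promotes a warm collection back to hot.
    if stats.tier is Tier.WARM and stats.queries_last_day >= policy.promote_if_queries_last_day:
        return Tier.HOT
    # A hot collection that has gone quiet is demoted to warm.
    if stats.tier is Tier.HOT and idle >= policy.demote_to_warm_after:
        return Tier.WARM
    # A warm collection that stays idle much longer is archived.
    if stats.tier is Tier.WARM and idle >= policy.demote_to_cold_after:
        return Tier.COLD
    return stats.tier
```

    Running such a check on a schedule for every collection would reproduce the behavior described above: idle hot collections drift to warm, and warm collections with renewed traffic return to hot.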

    Why It Matters

    • At scale (100M+ vectors), always-hot storage is prohibitively expensive
    • Query traffic typically follows a power-law pattern: roughly 20% of collections serve 80% of queries
    • Tiering lets you index everything without paying hot-tier prices for cold data
    • Object storage providers (Backblaze, Tigris, R2, Wasabi) make the warm tier extremely affordable
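
    The savings can be estimated with simple arithmetic. The sketch below assumes, purely for illustration, that all collections are the same size, that the busiest 20% stay hot while the rest sit in warm, and that warm costs 90% less than hot. `blended_cost` is a hypothetical helper, and the result is a cost relative to an all-hot baseline of 1.0, not a price.

```python
def blended_cost(hot_fraction: float, warm_discount: float = 0.90) -> float:
    """Storage cost relative to keeping everything hot (all-hot = 1.0)."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    warm_fraction = 1.0 - hot_fraction
    # Hot data is billed at full price, warm data at the discounted price.
    return hot_fraction + warm_fraction * (1.0 - warm_discount)


relative = blended_cost(0.20)
print(f"{relative:.2f}")  # prints 0.28
```

    Under these assumptions the blended cost is 0.2 + 0.8 × 0.1 = 0.28 of the all-hot baseline, a reduction of about 72%, before counting anything moved to the cold tier.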

    Related Pages

    • MVS (Mixpeek Vector Store): /mvs
    • Architecture: /docs/overview/architecture
    • Blog: Why Vector Databases Aren't Enough - /blog/why-vector-databases-arent-enough