NEWVectors or files. Pick a path.Start →
    Mixpeek Vector Store

    Agent-native vector store on object storage

    Stream shard results, cancel stale searches, cap per-agent spend, and run dense + sparse + BM25 hybrid search from your own S3-compatible storage.

    BYO Storage

    Amazon S3Google Cloud StorageAzure Blob StorageBackblaze B2Cloudflare R2MinIOWasabiAmazon S3Google Cloud StorageAzure Blob StorageBackblaze B2Cloudflare R2MinIOWasabi

    Competitive single-thread latency, scale-to-zero idle cost, and capabilities no other store has.

    Cost proof

    And it's 10-80x cheaper at scale.

    Cost is the closer, not the opener. The model works because idle namespaces scale to zero and vectors persist on your object storage.

    Cost calculatorEstimate your monthly MVS bill based on vector count, dimensions, and usage. All pricing is pay-as-you-go with no upfront commitments.

    Vector dimensionsThe size of each embedding vector. Higher dimensions capture more nuance but use more storage. Common models: 384 (MiniLM), 768 (BERT), 1536 (OpenAI ada-002).
    Number of vectorsTotal documents stored across all namespaces. Each document is a vector embedding plus its metadata payload.
    1M1 shard
    1M10M100M1B10B
    StorageObject storage cost for persisting vector data and metadata. Priced at ~$0.023/GB/mo -- the same rate as S3 Standard.$0.01
    Hot CachePQ-compressed indexes and payload indexes kept in RAM for sub-10ms queries. Payloads are disk-backed (RocksDB) so hot cache is compact -- only ~60MB per million vectors.$2.50
    WritesCost of upsert operations. Includes WAL logging, index updates, and replication to object storage.$0.00
    QueriesSearch queries against your namespaces. Idle namespaces scale to zero -- you only pay when querying.$1/1M queries
    1 shardsShards partition your data across multiple Rust workers for parallel query execution. MVS auto-scales shards as your dataset grows.1,000 namespacesNamespaces are isolated collections within your account. Use them for multi-tenancy, A/B testing, or separating data by environment.
    WorkloadBenchmark scenario: dense ANN search over 768-dimensional vectors with top_k=10 on the configured number of shards.768 dimensions, 1M docs, ~500 MB
    p50Median latency -- 50% of queries complete faster than this. Benchmark: 50K vectors, 768d, concurrency=1.
    6ms50ms
    p9595th percentile -- only 5% of queries are slower. A good measure of consistent performance.
    12ms80ms
    p9999th percentile (tail latency) -- the worst 1% of queries. Critical for SLA guarantees and real-time apps.
    22ms120ms
    Warm namespaceData is cached in memory/SSD. Queries hit the hot cache and return in single-digit milliseconds. This is the default for actively queried namespaces.
    Cold namespaceCentroid index is memory-mapped on local NVMe for instant partition selection. Only the needed PQ codes are fetched from object storage on demand (~50ms). Idle namespaces scale to zero compute cost.
    Dense ANN, 1 shards, top_k=10, 50K vectors

    vs Competitors: 50K vectors, 768d, concurrency=1

    p50 Latency

    Milvus
    3.1ms
    Weaviate
    3.3ms
    MVS
    5.5ms
    Qdrant
    5.5ms
    pgvector
    11.4ms

    QPS (concurrency=10)

    Weaviate
    1,785
    Milvus
    1,728
    MVS
    459
    Qdrant
    360
    LanceDB
    307

    Recall@10

    Milvus
    26.1%
    pgvector
    21.0%
    Weaviate
    17.6%
    MVS
    17.1%
    Qdrant
    16.6%

    MVS stores vectors on your object storage (S3/GCS/B2) with PQ-compressed indexes in RAM. Competitive latency at single-thread, scale-to-zero idle cost, 10-80x cheaper than Pinecone/Weaviate at scale. Full benchmark methodology.

    View full benchmark methodology & results

    Pure Usage-Based Pricing

    No per-vector caps. No namespace limits. Pay only for what you use.

    $0.023

    per GB / month

    Storage

    S3 / GCS / B2 pass-through

    $25

    per GB / month

    Hot Cache

    PQ indexes in RAM, sub-10ms

    $1

    per 1M writes

    Writes

    Upsert, update, delete

    $1

    per 1M queries

    Queries

    Idle namespaces scale to zero

    How MVS compares at scale

    VectorsMVSQdrantPineconeWeaviate
    1MFree$120$500$73
    10M$50$460$2,700$730
    100M$299$1,255$25,000$7,300
    1B$1,999$12,500$26,000$73,000
    5B$7,999Contact salesContact salesContact sales
    10B$14,999Contact salesContact salesContact sales

    Competitor prices from public pricing calculators (768d, ~100 QPS). Qdrant = dedicated cluster; Pinecone = serverless read units; Weaviate = per-dimension pricing. MVS includes scale-to-zero; idle namespaces cost only storage. Benchmark repo.

    Support Tiers

    Same usage rates on every tier. Tiers gate support level, not features.

    Starter

    $0/mo

    First 1M vectors free

    • All search types (dense, sparse, BM25, hybrid)
    • Unlimited namespaces
    • Schema-on-write & adaptive indexes
    • Community support
    Most Popular

    Growth

    $50/mo minimum

    Usage applies toward minimum

    • Everything in Starter
    • Email support + SLA
    • Replication & backups
    • Priority support

    Enterprise

    Custom

    Volume discounts available

    • Everything in Growth
    • Dedicated shards & BYOC
    • SSO, HIPAA, audit logs
    • Dedicated support + SLA

    Managed path

    Need Mixpeek to extract the embeddings too?

    Start with your own vectors in MVS. When you want faces, scenes, transcripts, OCR, or other features generated for the same objects, move that workload to Managed without rebuilding your retrieval layer.

    What Managed adds

    • Feature extractors

      Vision, audio, text -- embeddings generated for you on ingest

    • Automatic indexing

      Upload a file, get searchable vectors without writing pipeline code

    • Pipeline orchestration

      Multi-step processing with branching, retries, and monitoring

    • Webhooks and alerts

      Real-time notifications on ingest completion, anomalies, and threshold triggers

    How to upgrade

    • Same namespace

      Your existing vectors, metadata, and indexes stay exactly where they are

    • No migration

      Managed builds on top of MVS -- your data is already in the right place

    • One click in Studio

      Toggle your namespace from Standalone to Managed in the dashboard

    • Incremental adoption

      Start with one extractor, add more as you need them -- no all-or-nothing switch

    Feature Comparison

    S3 Vectors is a cheap index. MVS is a database with agent-native retrieval. Different tools.

    CapabilityPineconeQdrantS3 VectorsMVS
    Database operations on your vectors
    Dense vector search (ANN)Approximate nearest neighbor search over high-dimensional embeddings. The baseline capability every vector store needs.
    Native dense + sparse + BM25 hybrid searchFuse semantic, sparse, and keyword relevance in one query plan. S3 Vectors is an index; MVS runs hybrid retrieval natively.Dense + sparseDense + sparseDense + sparse + BM25
    Semantic JOINs across namespacesJoin two namespaces by vector similarity, like a SQL JOIN but on embeddings. No denormalization or data duplication required.
    Aggregations (GROUP BY, COUNT, SUM, AVG)Run analytics directly on your vector store. Group documents by metadata fields and compute counts, averages, and sums without ETL.
    Cross-shard transactionsAtomic writes across multiple shards using two-phase commit, so multi-namespace changes remain all-or-nothing.
    Time-travel queriesQuery data as it existed at a past point in time by replaying the write-ahead log. Useful for debugging, auditing, and reproducibility.
    Object storage-native persistenceData lives in your object storage. S3 Vectors gives you a cheap index; MVS adds database operations on the same storage foundation.
    Built for agentic workloads
    Streaming partial resultsGet results as shards respond instead of waiting for every shard. Agents can evaluate early hits and decide whether to refine or cancel.
    Query cancellationCancel in-flight fan-out queries that an agent no longer needs. Freed shard work returns to the pool instead of burning through the loop.
    Per-agent budget limitsEnforce max queries, writes, and compute per agent or API key at the coordinator level. Prevent runaway autonomous loops from running up spend.
    Standing queriesRegister a persistent query that fires a webhook whenever a newly ingested document matches.
    Multi-stage retrieval pipelinesChain retrieval stages such as broad recall, filtering, joining, and reranking in one request instead of wiring round trips in application code.
    Query audit logFull audit trail of every query: who ran it, when, and what was returned. Agents need inspectable retrieval, not black-box calls.

    Quick Start

    Start searching in 60 seconds

    Generate embeddings with any provider. Upsert to MVS. Search.

    from openai import OpenAI
    from mixpeek import Mixpeek
    openai = OpenAI()
    mvs = Mixpeek(api_key="YOUR_KEY")
    # Generate embedding
    resp = openai.embeddings.create(
    model="text-embedding-3-small",
    input="red sports car on a mountain road"
    )
    embedding = resp.data[0].embedding
    # Upsert to MVS
    mvs.namespaces.documents.upsert(
    namespace="ns_demo",
    documents=[{
    "id": "doc_1",
    "dense_embedding": embedding,
    "metadata": {"source": "catalog", "type": "image"}
    }]
    )
    # Search
    results = mvs.namespaces.documents.search(
    namespace="ns_demo",
    queries=[{"vector": embedding, "top_k": 10}]
    )

    Works with any provider that outputs a float array. Bring your own model, your own dimensions.

    API Examples

    Capabilities you will not find in any other vector database.

    Write documents with dense, sparse, and metadata in a single call.

    from mixpeek import Mixpeek
    client = Mixpeek(api_key="YOUR_KEY")
    client.namespaces.upsert(
    namespace="products",
    documents=[
    {
    "id": "doc-001",
    "dense_embedding": [0.12, -0.34, ...], # 768-d
    "sparse_embedding": {"tokens": [1204, 879], "weights": [0.9, 0.4]},
    "metadata": {"category": "electronics", "price": 299.99},
    "text": "Noise-cancelling wireless headphones"
    }
    ]
    )

    How it works

    Watch a request flow through the architecture.

    Client

    SDK / REST API

    Ray Coordinator

    1 per namespace · consistent-hash routing

    gRPC fan-outQuery planning
    Rust Shard 0~8ms
    LIRE ANNTantivy BM25SparsePayload
    Rust Shard 1~8ms
    LIRE ANNTantivy BM25SparsePayload
    Cold Shard~50ms

    Snapshot on S3

    Your Object Storage

    Any S3-compatible object store

    Canonical storeWAL durabilityYou own the data

    Free Proof of Concept

    See it working on your data

    Book a 60-minute architecture review. We'll run MVS on your actual workload and benchmark it against your current vector database.

    ES

    Ethan Steininger

    Founder & CEO, Mixpeek

    In 60 minutes, you will get:

    Live migration of your vectors to MVS on your own object storage
    Side-by-side latency and cost comparison vs your current setup
    Multi-stage retrieval pipeline built for your use case
    Tiered storage plan: what stays hot, what moves to warm
    A concrete POC timeline with success criteria

    10-80x

    Lower cost range
    at scale

    < 1hr

    Typical migration
    for 100M vectors

    10B+

    Vectors supported
    per index

    Benchmark, stated plainly

    Competitive single-thread latency, scale-to-zero idle cost, and database operations the cheap object-storage indexes do not expose.

    Review the benchmark methodology

    Need Mixpeek to extract the embeddings too?

    Use MVS for your own vectors. When you need faces, scenes, transcripts, OCR, or multimodal embeddings generated from the same objects, move that workload to Managed without rebuilding retrieval.

    Already stitching extractors, indexes, and retrievers together? Start with MVS, then climb to Managed