    Now in Public Beta

    The vector database built on object storage

    Dense + sparse + BM25 hybrid search, aggregations, transactions, and automatic storage tiering. 10x cheaper than legacy vector DBs.

~8ms hot search, 50K+ writes/s, and 10K+ queries/s in production

Cost calculator

Estimate your monthly MVS bill based on vector count, dimensions, and usage. All pricing is pay-as-you-go with no upfront commitments.

Vector dimensions: The size of each embedding vector. Higher dimensions capture more nuance but use more storage. Common models: 384 (MiniLM), 768 (BERT), 1536 (OpenAI ada-002).
Number of vectors: Total documents stored across all namespaces, from 1M to 10B. Each document is a vector embedding plus its metadata payload.
Storage: Object storage cost for persisting vector data and metadata. Priced at ~$0.023/GB/mo -- the same rate as S3 Standard.
Memory: RAM used for hot caching frequently accessed vectors and indexes. Enables sub-10ms query latency on warm namespaces.
Writes: Cost of upsert operations. Includes WAL logging, index updates, and replication to object storage.
Queries: Search queries against your namespaces. First 100K queries per month are included free with every plan.
Shards: Shards partition your data across multiple Rust workers for parallel query execution. MVS auto-scales shards as your dataset grows.
Namespaces: Isolated collections within your account (up to 1,000). Use them for multi-tenancy, A/B testing, or separating data by environment.
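The storage line item above can be sanity-checked with back-of-envelope math. The sketch below uses the published ~$0.023/GB/mo object-storage rate and assumes raw float32 vectors (4 bytes per dimension) with no metadata or index overhead, so treat the result as a rough lower bound, not a quote:

```python
STORAGE_PER_GB_MONTH = 0.023  # published rate, same as S3 Standard


def monthly_storage_cost(num_vectors: int, dims: int) -> float:
    """Estimate monthly object-storage cost for raw float32 vectors."""
    raw_bytes = num_vectors * dims * 4      # 4 bytes per float32 dimension
    gigabytes = raw_bytes / (1024 ** 3)
    return gigabytes * STORAGE_PER_GB_MONTH


# 1M vectors at 768 dims is ~2.86 GB raw, well under a dime per month.
print(f"${monthly_storage_cost(1_000_000, 768):.2f}/mo")
```

Metadata payloads and index structures add to the real footprint, which is why the calculator asks for usage details beyond vector count and dimensions.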
Workload: Benchmark scenario -- dense ANN search over 768-dimensional vectors with top_k=10 on 1 shard (1M docs, ~500 MB).

Latency (warm / cold namespace):
p50 -- median; 50% of queries complete faster than this. Represents the typical user experience: 8ms / 543ms
p90 -- 90th percentile; only 10% of queries are slower. A good measure of consistent performance: 10ms / 612ms
p99 -- 99th percentile (tail latency); the worst 1% of queries. Critical for SLA guarantees and real-time apps: 35ms / 854ms

Warm namespace: Data is cached in memory/SSD. Queries hit the hot cache and return in single-digit milliseconds. This is the default for actively queried namespaces.
Cold namespace: Data lives in object storage and must be fetched on demand. First query warms the cache -- subsequent queries are fast. Ideal for rarely accessed data at minimal cost.

View full benchmark methodology & results
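Percentile latencies like the p50/p90/p99 figures above are computed from a sample of per-query timings. A minimal sketch using the nearest-rank method (one of several common percentile definitions; the MVS benchmark methodology may use a different interpolation):

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]


# Illustrative warm-namespace timings in milliseconds (made-up sample data).
latencies_ms = [6, 7, 8, 8, 9, 10, 12, 15, 22, 35]
p50 = percentile(latencies_ms, 50)
p90 = percentile(latencies_ms, 90)
p99 = percentile(latencies_ms, 99)
```

Note how a single slow outlier dominates p99 while barely moving p50 -- which is why tail latency gets its own row in the benchmark.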

    Scale Tiers

Vectors | Shards | RAM    | Obj. Storage | MVS/mo  | Savings
1M      | 1      | 30 MB  | 480 MB       | Free    | Free
10M     | 1      | 300 MB | 4.8 GB       | $80     | 68% (Qdrant $500, Pinecone $700, Weaviate $250)
100M    | 10     | 3.0 GB | 48 GB        | $800    | 77% (Qdrant $5,000, Pinecone $7,000, Weaviate $3,500)
1B      | 100    | 30 GB  | 480 GB       | $3,500  | 92% (Qdrant $75,000, Pinecone $80,000, Weaviate $45,000)
5B      | 500    | 75 GB  | 2.4 TB       | $15,000 | Only MVS
10B     | 1,000  | 150 GB | 4.8 TB       | $30,000 | Only MVS
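The tiers above scale linearly, implying roughly constant per-vector footprints: about 30 bytes of RAM and 480 bytes of object storage per vector (constants inferred from the table, not official sizing guidance). A sketch that reproduces the table rows from those constants:

```python
# Per-vector constants inferred from the scale-tier table (decimal MB/GB).
RAM_BYTES_PER_VECTOR = 30       # 1M vectors -> 30 MB, 1B -> 30 GB
STORAGE_BYTES_PER_VECTOR = 480  # 1M vectors -> 480 MB, 1B -> 480 GB


def tier_footprint(num_vectors: int) -> tuple[float, float]:
    """Return (RAM in MB, object storage in MB) for a given vector count."""
    ram_mb = num_vectors * RAM_BYTES_PER_VECTOR / 1e6
    storage_mb = num_vectors * STORAGE_BYTES_PER_VECTOR / 1e6
    return ram_mb, storage_mb


ram, storage = tier_footprint(10_000_000)  # the 10M tier: 300 MB RAM, 4.8 GB storage
```

The 16:1 storage-to-RAM ratio reflects the tiering model: only the hot working set lives in memory, while the full dataset stays in cheap object storage.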

    Feature Comparison

MVS vs the leading vector databases. Several of the capabilities below are exclusive to MVS.

Capability | Pinecone | Qdrant | Turbopuffer | MVS
    Search
Dense vector search (ANN): Approximate nearest neighbor search over high-dimensional embeddings. The foundation of semantic search -- find results by meaning, not keywords.
Sparse vector search: Search using sparse vectors like SPLADE or learned sparse embeddings. Captures keyword-level signals that dense vectors miss.
BM25 full-text search: Classic keyword search built on an inverted index. MVS uses Tantivy natively -- no workarounds or external engines needed. (Competitors: SPLADE workaround; MVS: native Tantivy.)
Multi-dense (ColBERT): Late-interaction retrieval that stores per-token embeddings for higher recall. Enables token-level matching without collapsing to a single vector.
Hybrid search (RRF/DBSF fusion): Combine dense, sparse, and keyword results into a single ranked list using Reciprocal Rank Fusion or Distribution-Based Score Fusion.
Multi-stage retrieval pipelines: Chain retrieval stages -- e.g. broad recall with ANN, then re-rank with a cross-encoder -- in a single query. Reduces latency vs round-trips.
Standing queries (push on match): Register a persistent query that fires a webhook whenever a newly ingested document matches. Useful for alerting, monitoring, and real-time feeds.
Semantic JOINs across namespaces: Join two namespaces by vector similarity -- like a SQL JOIN but on embeddings. No denormalization or data duplication required.
    Data Operations
Aggregation (GROUP BY, COUNT, SUM, AVG): Run analytics directly on your vector store. Group documents by metadata fields and compute counts, averages, sums -- no ETL to a data warehouse.
Cross-shard transactions (2PC): Atomic writes across multiple shards using two-phase commit. Ensures all-or-nothing consistency even across billion-scale datasets.
Optimistic concurrency (_version): Prevent write conflicts with version-based optimistic locking. Critical for multi-writer workloads where two processes might update the same document.
Change streams (WAL-tailing, SSE): Subscribe to real-time insert/update/delete events via Server-Sent Events. Build reactive pipelines without polling your database.
Time-travel queries (WAL replay): Query your data as it existed at a past point in time by replaying the write-ahead log. Useful for debugging, auditing, and reproducibility.
Document version history: Every mutation is versioned. Roll back a document to any prior state or diff two versions to see exactly what changed.
Query audit log: Full audit trail of every query executed -- who ran it, when, and what was returned. Essential for compliance and debugging in production.
    Reliability & Governance
Storage tiering (hot/cold/archive): Automatically move infrequently accessed data from memory/SSD to object storage. Cut costs without manual data management. (MVS: automatic, object storage-backed.)
Retention policies: Set TTLs on documents or namespaces. Data is automatically purged after the retention window -- no cron jobs or manual cleanup.
Namespace catalog (INFORMATION_SCHEMA): Discover all namespaces, their schemas, row counts, and storage usage via a system catalog. Like INFORMATION_SCHEMA in SQL databases.
Multi-tenant isolation (noisy neighbor): Resource isolation between tenants prevents one workload from starving others. Each namespace has independent rate limits and resource quotas.
Priority lanes (QoS scheduling): Assign CRITICAL/NORMAL/BACKGROUND/BULK priority to requests. Higher-priority queries get reserved compute slots and preempt lower-priority work in the shard queue.
Idempotent operations: Every write accepts an idempotency key. Retries from crashes or network timeouts are automatically deduplicated -- no duplicate documents, no double-counted aggregations.
Distributed execution traces: Full distributed trace for every query -- coordinator routing, per-shard timing, filter selectivity, index hits. Debug multi-hop requests across the entire fan-out path.
    Agentic Workloads
Streaming partial results (SSE): Get results as shards respond instead of waiting for all shards. Agents evaluate early hits and decide whether to refine or cancel -- the tight feedback loop pattern that defines agentic retrieval.
Query cancellation (cooperative termination): Cancel in-flight fan-out queries that are no longer needed. When an agent fires 5 parallel searches and gets an answer from the first, the other 4 are terminated at the shard level, freeing compute instantly.
Per-agent budget limits: Enforce max queries, writes, and compute per agent or API key at the coordinator level. Prevents runaway autonomous loops -- the specific failure mode where an LLM in a loop issues unbounded queries.
    Infrastructure
Object storage-native (no separate DB to manage): Data lives in your object storage (S3, GCS, Azure Blob). No separate database cluster to provision, back up, or scale -- just point MVS at your bucket.
Self-hosted option: Deploy MVS in your own VPC or on-prem. Full control over data residency, network policies, and infrastructure -- no vendor lock-in. (OSS)
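The hybrid-search row above names Reciprocal Rank Fusion, which merges ranked lists by summing reciprocal ranks rather than raw scores, so dense and keyword results fuse without score normalization. A minimal sketch of the standard RRF algorithm, independent of any MVS API:

```python
def rrf_fuse(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked result lists with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over every list it appears in;
    k=60 is the constant proposed in the original RRF paper.
    """
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


dense = ["doc-3", "doc-1", "doc-7"]  # ANN result order (illustrative ids)
bm25 = ["doc-1", "doc-9", "doc-3"]   # keyword result order
fused = rrf_fuse([dense, bm25])
# doc-1 and doc-3 appear in both lists, so they outrank doc-7 and doc-9.
```

Because only ranks matter, a document that places decently in several retrievers beats one that tops a single list -- the property that makes RRF a robust default for hybrid search.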

    API Examples

    Capabilities you will not find in any other vector database.

    Write documents with dense, sparse, and metadata in a single call.

from mixpeek import Mixpeek

client = Mixpeek(api_key="YOUR_KEY")

client.namespaces.upsert(
    namespace="products",
    documents=[
        {
            "id": "doc-001",
            "dense_embedding": [0.12, -0.34, ...],  # 768-d
            "sparse_embedding": {"tokens": [1204, 879], "weights": [0.9, 0.4]},
            "metadata": {"category": "electronics", "price": 299.99},
            "text": "Noise-cancelling wireless headphones",
        }
    ],
)
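The sparse_embedding payload pairs token ids with weights. Relevance between two such vectors is a dot product over their shared token ids -- a sketch of how sparse scoring works in general, not of MVS internals:

```python
def sparse_dot(a: dict, b: dict) -> float:
    """Dot product of two {tokens, weights} sparse vectors over shared token ids."""
    a_map = dict(zip(a["tokens"], a["weights"]))
    return sum(w * a_map.get(t, 0.0) for t, w in zip(b["tokens"], b["weights"]))


doc = {"tokens": [1204, 879], "weights": [0.9, 0.4]}      # stored document
query = {"tokens": [879, 3301], "weights": [0.7, 0.2]}    # hypothetical query
score = sparse_dot(doc, query)  # only token 879 overlaps: 0.4 * 0.7, ~0.28
```

Tokens absent from either vector contribute nothing, which is what lets sparse search capture exact keyword signals that dense embeddings smooth over.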

    Ready to scale to billions of vectors?

    Start with 1M vectors free. No credit card required. Deploy in minutes.