Vector Store - Mixpeek

Bring your own embeddings, upsert them directly, and query with dense, sparse, BM25, or hybrid search. No collections or extractors required.

Quickstart

Create a namespace

No schema needed: Mixpeek Vector Store (MVS) infers vector dimensions on first write.

curl -X POST "https://api.mixpeek.com/v1/namespaces/standalone" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"namespace_id": "product-search"}'

You can optionally pre-declare vector configs if you want to set a specific distance metric:

curl -X POST "https://api.mixpeek.com/v1/namespaces/standalone" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "namespace_id": "product-search",
    "vector_configs": [
      {"name": "text_embedding", "dimension": 1536, "metric": "dot"}
    ]
  }'
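
The same two requests can be assembled from Python. This is a minimal sketch using only the standard library: the helper name `build_create_namespace_request` is hypothetical, while the URL, headers, and body fields are copied from the curl examples above. It builds the request without sending it.

```python
import json
import urllib.request

API_BASE = "https://api.mixpeek.com/v1"  # base URL taken from the curl examples


def build_create_namespace_request(namespace_id, api_key, vector_configs=None):
    """Build (without sending) the POST request that creates a standalone namespace.

    Hypothetical helper: only the URL, headers, and body fields come from the docs.
    """
    body = {"namespace_id": namespace_id}
    if vector_configs:
        # Optional: pre-declare vectors, for example to pick a distance metric.
        body["vector_configs"] = vector_configs
    return urllib.request.Request(
        f"{API_BASE}/namespaces/standalone",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Schema-less namespace: dimensions are inferred on first write.
minimal = build_create_namespace_request("product-search", "YOUR_API_KEY")

# Namespace with a pre-declared 1536-dimensional vector using the dot metric.
declared = build_create_namespace_request(
    "product-search",
    "YOUR_API_KEY",
    vector_configs=[{"name": "text_embedding", "dimension": 1536, "metric": "dot"}],
)

# To send either request: urllib.request.urlopen(minimal)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before it reaches the API.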

Upsert documents with your vectors

curl -X POST "https://api.mixpeek.com/v1/namespaces/product-search/documents/upsert" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [{
      "document_id": "prod-001",
      "vectors": {"text_embedding": [0.12, -0.34, 0.56, "...1536 floats"]},
      "payload": {"title": "Wireless Headphones", "category": "audio", "price": 79.99}
    }]
  }'
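
The `"...1536 floats"` entry in the curl example is a placeholder, and a real request needs a full list of numbers. The sketch below shows one way to build the upsert body in Python and catch leftover placeholders early. The helper names `make_document` and `build_upsert_request` are hypothetical, and the client-side check is an assumption about what is useful, not a documented API requirement.

```python
import json
import urllib.request

API_BASE = "https://api.mixpeek.com/v1"  # base URL taken from the curl examples


def make_document(document_id, vectors, payload=None):
    """Return one upsert document, rejecting empty or non-numeric vectors."""
    for name, values in vectors.items():
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if not values or not numeric:
            raise ValueError(f"vector {name!r} must be a non-empty list of numbers")
    return {"document_id": document_id, "vectors": vectors, "payload": payload or {}}


def build_upsert_request(namespace_id, documents, api_key):
    """Build (without sending) the POST request for the documents/upsert endpoint."""
    return urllib.request.Request(
        f"{API_BASE}/namespaces/{namespace_id}/documents/upsert",
        data=json.dumps({"documents": documents}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


doc = make_document(
    "prod-001",
    {"text_embedding": [0.12, -0.34, 0.56]},  # truncated here for illustration only
    {"title": "Wireless Headphones", "category": "audio", "price": 79.99},
)
request = build_upsert_request("product-search", [doc], "YOUR_API_KEY")

# To send: urllib.request.urlopen(request)
```

In practice each vector would carry the full dimensionality of the embedding model (1536 in the example above), not three values.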

Create a retriever (one-time)

All queries go through retrievers, so the query model is the same whether you bring your own vectors or later promote to managed embedding. Create the retriever once; in a standalone namespace it takes a query vector you computed yourself (input_mode: vector). These requests are scoped to the namespace via the X-Namespace header.

curl -X POST "https://api.mixpeek.com/v1/retrievers" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: product-search" \
  -H "Content-Type: application/json" \
  -d '{
    "retriever_name": "product_search",
    "input_schema": {"query_vector": {"type": "array", "required": true}},
    "stages": [{
      "stage_name": "search",
      "stage_type": "filter",
      "config": {
        "stage_id": "feature_search",
        "parameters": {
          "searches": [{
            "feature_uri": "text_embedding",
            "query": {"input_mode": "vector", "value": "{{INPUT.query_vector}}"},
            "filters": {"must": [{"key": "category", "match": {"value": "audio"}}]},
            "top_k": 10
          }],
          "final_top_k": 10
        }
      }
    }]
  }'
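
The retriever definition is deeply nested, so it can help to generate it from a few parameters. The following sketch reproduces the body from the curl example. The function name `build_retriever_body` is hypothetical, and treating `filters` as optional is an assumption the example above does not confirm.

```python
import json


def build_retriever_body(retriever_name, feature_uri, top_k=10, must_filters=None):
    """Return a retriever definition with a single vector-search stage.

    Field names mirror the curl example; the helper itself is illustrative.
    """
    search = {
        "feature_uri": feature_uri,
        # The query vector is supplied at execution time through the template.
        "query": {"input_mode": "vector", "value": "{{INPUT.query_vector}}"},
        "top_k": top_k,
    }
    if must_filters:
        # Assumption: the filters block can be left out when no filter is needed.
        search["filters"] = {"must": must_filters}
    return {
        "retriever_name": retriever_name,
        "input_schema": {"query_vector": {"type": "array", "required": True}},
        "stages": [{
            "stage_name": "search",
            "stage_type": "filter",
            "config": {
                "stage_id": "feature_search",
                "parameters": {"searches": [search], "final_top_k": top_k},
            },
        }],
    }


body = build_retriever_body(
    "product_search",
    "text_embedding",
    must_filters=[{"key": "category", "match": {"value": "audio"}}],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the `-d` payload of the curl command above.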

Execute the retriever

Run the retriever with your query embedding, replacing {retriever_id} with the ID of the retriever you created. Each hit exposes document_id and payload fields.

curl -X POST "https://api.mixpeek.com/v1/retrievers/{retriever_id}/execute" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: product-search" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {"query_vector": [0.15, -0.28, 0.44, "...query embedding"]}
  }'
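
A Python equivalent of the execute call might look like the sketch below. The helper name `build_execute_request` and the `RETRIEVER_ID` value are placeholders. The response shape is not described here beyond the `document_id` and `payload` fields on each hit, so the sketch stops at building the request.

```python
import json
import urllib.request

API_BASE = "https://api.mixpeek.com/v1"  # base URL taken from the curl examples


def build_execute_request(retriever_id, namespace_id, query_vector, api_key):
    """Build (without sending) the POST request that executes a retriever."""
    body = {"inputs": {"query_vector": list(query_vector)}}
    return urllib.request.Request(
        f"{API_BASE}/retrievers/{retriever_id}/execute",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Namespace": namespace_id,  # scopes the call to one namespace
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_execute_request(
    "RETRIEVER_ID",          # placeholder: use the ID of your own retriever
    "product-search",
    [0.15, -0.28, 0.44],     # truncated query embedding, for illustration only
    "YOUR_API_KEY",
)

# To send: urllib.request.urlopen(request)
```

The query vector should come from the same embedding model that produced the stored vectors; otherwise similarity scores are not meaningful.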

Architecture

Standalone vs Managed

Every namespace runs in one of two modes. Start standalone and promote when you’re ready — no reindexing.

Aspect	Standalone	Managed
Query latency	Lower — no embedding at query time	+50-200ms for auto-embedding
Embedding cost	You pay your provider directly	Included in platform pricing
Model flexibility	Any model, any fine-tune	Bound to registered inference services
Write path	Direct upsert only	Collections auto-process + direct upsert
Search input	Pre-computed vectors, text (BM25), sparse	Also accepts raw text/URLs (auto-embedded)
Best for	Existing ML infra, low-latency, custom models	End-to-end processing, file pipelines

Start standalone if you already have embeddings. Promotion is additive — all existing data is preserved.

Features

All features work identically in standalone and managed modes.

Capability	Description
Dense search	Vector similarity with cosine, dot product, or euclidean distance
Sparse search	Sparse vector queries for learned sparse representations (SPLADE, etc.)
BM25 keyword search	Full-text search on payload fields via text indexes
Hybrid search	Combine dense + BM25 + sparse in one query with RRF or DBSF fusion
Metadata filtering	Filter on any payload field — combine with any search type
Payload indexes	Manual or adaptive — auto-created based on query patterns
Schema-on-write	Auto-create vector indexes on first upsert — no upfront declaration needed
Usage metrics	Per-namespace breakdowns of vectors, storage tiers, queries, and writes
Storage tiering	Hot, cold, archive tiers — see storage tiering
Namespace cloning	Clone namespaces for testing or environment branching
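
To make the hybrid search row concrete, here is a sketch of reciprocal rank fusion (RRF) as it is commonly defined: each document scores the sum of 1 / (k + rank) over every result list it appears in. This is a generic illustration, not Mixpeek's implementation. The function name `rrf_fuse`, the constant k = 60, and the sample document IDs are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=10):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    rankings: list of lists, each ordered best-first.
    Returns (document_id, score) pairs, highest fused score first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


dense_hits = ["prod-001", "prod-007", "prod-003"]  # hypothetical dense results
bm25_hits = ["prod-003", "prod-001", "prod-009"]   # hypothetical BM25 results

for doc_id, score in rrf_fuse([dense_hits, bm25_hits]):
    print(f"{doc_id}  {score:.5f}")
```

Because RRF uses ranks only, it can combine dense, sparse, and BM25 results without normalizing their raw scores, which live on different scales.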

Billing

MVS pricing is purely usage-based, with no per-vector caps and no namespace limits. Tiers gate support level, not features.

Resource	Price
Storage	$0.023 / GB / month
Hot cache	$25 / GB / month
Queries	$1 / 1M queries
Writes	$1 / 1M writes

Your first 1M vectors are free on the Starter tier.

Tier	Minimum	Support
Starter	$0/mo	Community
Growth	$50/mo (usage applies toward minimum)	Email + SLA
Enterprise	Custom	Dedicated + SSO + HIPAA
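
The tables above can be turned into a rough monthly estimate. The sketch below applies the listed unit prices and reads "usage applies toward minimum" as paying whichever is larger, usage or the tier minimum; that reading is an assumption. The function name `estimate_monthly_cost` is hypothetical, and the free first 1M vectors on Starter and custom Enterprise pricing are not modeled.

```python
# Unit prices copied from the resource table above (USD).
STORAGE_PER_GB_MONTH = 0.023
HOT_CACHE_PER_GB_MONTH = 25.0
PER_MILLION_QUERIES = 1.0
PER_MILLION_WRITES = 1.0

# Monthly minimums from the tier table; Enterprise is custom and omitted.
TIER_MINIMUM = {"starter": 0.0, "growth": 50.0}


def estimate_monthly_cost(storage_gb, hot_cache_gb, queries, writes, tier="starter"):
    """Return an estimated monthly bill in USD for the given usage and tier."""
    usage = (
        storage_gb * STORAGE_PER_GB_MONTH
        + hot_cache_gb * HOT_CACHE_PER_GB_MONTH
        + queries / 1_000_000 * PER_MILLION_QUERIES
        + writes / 1_000_000 * PER_MILLION_WRITES
    )
    # Assumption: usage counts toward the minimum, so the bill is the larger value.
    return max(usage, TIER_MINIMUM[tier])


# 100 GB stored, 1 GB hot cache, 5M queries, 2M writes:
# 2.30 + 25.00 + 5.00 + 2.00 = 34.30 USD of usage.
starter_cost = estimate_monthly_cost(100, 1, 5_000_000, 2_000_000, tier="starter")
growth_cost = estimate_monthly_cost(100, 1, 5_000_000, 2_000_000, tier="growth")
print(f"Starter: ${starter_cost:.2f}  Growth: ${growth_cost:.2f}")
```

In this example the hot cache dominates the usage charge, so cache size is the first parameter to vary when comparing configurations.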

All search features (dense, sparse, BM25, hybrid, adaptive indexes) are available on every tier. See the pricing calculator for cost estimates at scale. Track usage programmatically with the vector-backend usage endpoint (GET /v1/organizations/billing/usage/vector-backend) or view it in the Studio dashboard under Billing.

Next Steps

Namespaces: Vector indexes, metrics, BM25

Documents & Search: Upsert, query, manage

Promote: Standalone → managed

Ready to go beyond BYO vectors? Promote your standalone namespace to managed mode to unlock automatic embedding, file processing pipelines, and enrichment without reindexing. Your retrievers keep working unchanged; after promotion, the same retriever can auto-embed raw text instead of taking a pre-computed vector. See the migration guide for details.
