Bring your own embeddings, upsert them directly, and query with dense, sparse, BM25, or hybrid search. No collections or extractors required.

Quickstart

1. Create a namespace

No schema needed — MVS infers vector dimensions on first write.
curl -X POST "https://api.mixpeek.com/v1/namespaces/standalone" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"namespace_id": "product-search"}'
You can optionally pre-declare vector configs if you want to set a specific distance metric:
curl -X POST "https://api.mixpeek.com/v1/namespaces/standalone" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "namespace_id": "product-search",
    "vector_configs": [
      {"name": "text_embedding", "dimension": 1536, "metric": "dot"}
    ]
  }'
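The same request can be sketched in Python using only the standard library. This is a minimal sketch that assumes nothing beyond the endpoint, headers, and body fields shown in the curl examples above; the helper name `build_create_namespace_request` is hypothetical, and the request is built but not sent.

```python
import json
import os
import urllib.request

API_BASE = "https://api.mixpeek.com/v1"


def build_create_namespace_request(namespace_id, vector_configs=None, api_key=None):
    """Build (but do not send) the namespace-creation POST from step 1."""
    body = {"namespace_id": namespace_id}
    if vector_configs:
        # Optional pre-declared configs, same shape as the second curl example.
        body["vector_configs"] = vector_configs
    # Fall back to the same environment variable the curl examples use.
    key = api_key or os.environ.get("MIXPEEK_API_KEY", "")
    return urllib.request.Request(
        url=f"{API_BASE}/namespaces/standalone",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the body easy to inspect before anything reaches the API.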
2. Upsert documents with your vectors

curl -X POST "https://api.mixpeek.com/v1/namespaces/product-search/documents/upsert" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [{
      "document_id": "prod-001",
      "vectors": {"text_embedding": [0.12, -0.34, 0.56, "...1536 floats"]},
      "payload": {"title": "Wireless Headphones", "category": "audio", "price": 79.99}
    }]
  }'
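When upserting many documents, it can help to check vector lengths on the client and split the work into batches. The sketch below produces request bodies in the shape of the curl example above. The function name, the tuple layout of `docs`, and the `batch_size` default are illustrative assumptions, not documented limits.

```python
def build_upsert_batches(docs, vector_name, expected_dim, batch_size=100):
    """Yield upsert request bodies shaped like the curl example.

    docs: iterable of (document_id, vector, payload) tuples.
    batch_size is a hypothetical client-side choice, not an API limit.
    """
    batch = []
    for document_id, vector, payload in docs:
        # Catch dimension mismatches before they reach the API.
        if len(vector) != expected_dim:
            raise ValueError(
                f"{document_id}: expected {expected_dim} floats, got {len(vector)}"
            )
        batch.append({
            "document_id": document_id,
            "vectors": {vector_name: [float(x) for x in vector]},
            "payload": payload,
        })
        if len(batch) == batch_size:
            yield {"documents": batch}
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield {"documents": batch}
```

Each yielded dictionary can be serialized with `json.dumps` and posted to the documents/upsert endpoint shown above.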
3. Create a retriever (one-time)

All queries run through retrievers, whether you bring your own vectors or later promote to managed embedding. Create the retriever once; in a standalone namespace it takes the query vector you computed (input_mode: vector). Every request is scoped to the namespace via the X-Namespace header.
curl -X POST "https://api.mixpeek.com/v1/retrievers" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: product-search" \
  -H "Content-Type: application/json" \
  -d '{
    "retriever_name": "product_search",
    "input_schema": {"query_vector": {"type": "array", "required": true}},
    "stages": [{
      "stage_name": "search",
      "stage_type": "filter",
      "config": {
        "stage_id": "feature_search",
        "parameters": {
          "searches": [{
            "feature_uri": "text_embedding",
            "query": {"input_mode": "vector", "value": "{{INPUT.query_vector}}"},
            "filters": {"must": [{"key": "category", "match": {"value": "audio"}}]},
            "top_k": 10
          }],
          "final_top_k": 10
        }
      }
    }]
  }'
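The retriever body above hard-codes the vector name, filter, and result count. A small builder like the following makes those pieces parameters. It is a sketch that only reuses the fields from the curl example; the function name and the `must_match` argument are hypothetical, and whether the API accepts a search with no `filters` key is an assumption.

```python
def build_retriever_definition(name, feature_uri, must_match=None, top_k=10):
    """Return a retriever body shaped like the curl example in step 3.

    must_match: optional {payload_key: value} pairs turned into "must" filters.
    """
    search = {
        "feature_uri": feature_uri,
        # The query vector is supplied at execution time via the input schema.
        "query": {"input_mode": "vector", "value": "{{INPUT.query_vector}}"},
        "top_k": top_k,
    }
    if must_match:
        # Assumption: omitting "filters" entirely is valid when no filter is wanted.
        search["filters"] = {
            "must": [
                {"key": key, "match": {"value": value}}
                for key, value in must_match.items()
            ]
        }
    return {
        "retriever_name": name,
        "input_schema": {"query_vector": {"type": "array", "required": True}},
        "stages": [{
            "stage_name": "search",
            "stage_type": "filter",
            "config": {
                "stage_id": "feature_search",
                "parameters": {"searches": [search], "final_top_k": top_k},
            },
        }],
    }
```

Calling `build_retriever_definition("product_search", "text_embedding", {"category": "audio"})` reproduces the body sent by the curl command.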
4. Execute the retriever

Run the retriever with your query embedding, replacing {retriever_id} in the URL with the ID of the retriever you created in step 3. Each hit exposes document_id and payload fields.
curl -X POST "https://api.mixpeek.com/v1/retrievers/{retriever_id}/execute" \
  -H "Authorization: Bearer $MIXPEEK_API_KEY" \
  -H "X-Namespace: product-search" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {"query_vector": [0.15, -0.28, 0.44, "...query embedding"]}
  }'
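A Python sketch of the execute call and of reading the hits follows. The page states only that each hit exposes `document_id` and `payload`, so `summarize_hits` takes a plain list of hit dictionaries and does not guess where that list sits in the response. Both function names are hypothetical, and the request is built but not sent.

```python
import json
import urllib.request


def build_execute_request(retriever_id, query_vector, namespace, api_key):
    """Build (but do not send) the execute POST from step 4."""
    body = {"inputs": {"query_vector": list(query_vector)}}
    return urllib.request.Request(
        url=f"https://api.mixpeek.com/v1/retrievers/{retriever_id}/execute",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Namespace": namespace,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def summarize_hits(hits, payload_fields=("title", "price")):
    """Flatten hits into rows of document_id plus selected payload fields."""
    rows = []
    for hit in hits:
        payload = hit.get("payload", {})
        row = {"document_id": hit["document_id"]}
        for field in payload_fields:
            # Missing payload fields become None instead of raising.
            row[field] = payload.get(field)
        rows.append(row)
    return rows
```

The query vector should come from the same embedding model, and have the same dimension, as the vectors upserted in step 2.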

Architecture

Figure: Standalone vector store architecture. Your pipeline feeds vectors into Mixpeek's sharded storage with dense, BM25, and payload indexes.

Standalone vs Managed

Every namespace runs in one of two modes. Start standalone and promote when you’re ready — no reindexing.
| | Standalone | Managed |
|---|---|---|
| Query latency | Lower (no embedding at query time) | +50-200ms for auto-embedding |
| Embedding cost | You pay your provider directly | Included in platform pricing |
| Model flexibility | Any model, any fine-tune | Bound to registered inference services |
| Write path | Direct upsert only | Collections auto-process + direct upsert |
| Search input | Pre-computed vectors, text (BM25), sparse | Also accepts raw text/URLs (auto-embedded) |
| Best for | Existing ML infra, low-latency, custom models | End-to-end processing, file pipelines |
Start standalone if you already have embeddings. Promotion is additive — all existing data is preserved.

Features

All features work identically in standalone and managed modes.
| Capability | Description |
|---|---|
| Dense search | Vector similarity with cosine, dot product, or euclidean distance |
| Sparse search | Sparse vector queries for learned sparse representations (SPLADE, etc.) |
| BM25 keyword search | Full-text search on payload fields via text indexes |
| Hybrid search | Combine dense + BM25 + sparse in one query with RRF or DBSF fusion |
| Metadata filtering | Filter on any payload field; combine with any search type |
| Payload indexes | Manual or adaptive (auto-created based on query patterns) |
| Schema-on-write | Auto-create vector indexes on first upsert; no upfront declaration needed |
| Usage metrics | Per-namespace breakdowns of vectors, storage tiers, queries, and writes |
| Storage tiering | Hot, cold, archive tiers; see storage tiering |
| Namespace cloning | Clone namespaces for testing or environment branching |
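To illustrate what RRF fusion does in hybrid search, here is a generic sketch of reciprocal rank fusion: each document scores the sum of 1 / (k + rank) over every ranked list it appears in. This is the textbook formulation, not necessarily how Mixpeek implements it; the constant k = 60 is a common convention, and the function name is hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=10):
    """Fuse ranked lists of document IDs (best first) with reciprocal rank fusion.

    Returns (doc_id, score) pairs sorted by descending fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Higher-ranked documents contribute more; k damps the top ranks.
            scores[doc_id] += 1.0 / (k + rank)
    # Break score ties by document ID so the output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


dense_hits = ["a", "b", "c"]  # e.g. results of a dense search
bm25_hits = ["b", "d", "a"]   # e.g. results of a BM25 search
fused = rrf_fuse([dense_hits, bm25_hits])
```

Here `b` ranks first because it sits near the top of both lists, while `c` and `d` each appear in only one list and fall to the bottom. RRF uses only ranks, so the dense and BM25 scores never need to be put on a common scale.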

Billing

MVS pricing is purely usage-based: no per-vector caps, no namespace limits. Tiers gate support level, not features.
| Resource | Price |
|---|---|
| Storage | $0.023 / GB / month |
| Hot cache | $25 / GB / month |
| Queries | $1 / 1M queries |
| Writes | $1 / 1M writes |
Your first 1M vectors are free on the Starter tier.
| Tier | Minimum | Support |
|---|---|---|
| Starter | $0/mo | Community |
| Growth | $50/mo (usage applies toward minimum) | Email + SLA |
| Enterprise | Custom | Dedicated + SSO + HIPAA |
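The rates and minimums above can be combined into a rough monthly estimate. The sketch below makes several assumptions that the page does not confirm: the bill is the larger of usage and the tier minimum, the free first 1M vectors on Starter are ignored, and raw vector size is estimated as 4-byte floats with no index or payload overhead. All names are hypothetical.

```python
# Rates from the pricing table (USD).
STORAGE_PER_GB = 0.023
HOT_CACHE_PER_GB = 25.0
PER_MILLION_QUERIES = 1.0
PER_MILLION_WRITES = 1.0
# Enterprise is custom-priced, so it is intentionally absent.
TIER_MINIMUM = {"starter": 0.0, "growth": 50.0}


def raw_vector_gb(num_vectors, dimension, bytes_per_value=4):
    """Approximate raw vector size in GB (assumes float32, no overhead)."""
    return num_vectors * dimension * bytes_per_value / 1e9


def estimate_monthly_cost(storage_gb, hot_cache_gb, queries, writes, tier="starter"):
    """Estimate one month's bill; assumes usage counts toward the tier minimum."""
    if tier not in TIER_MINIMUM:
        raise ValueError(f"no public minimum for tier {tier!r}")
    usage = (
        storage_gb * STORAGE_PER_GB
        + hot_cache_gb * HOT_CACHE_PER_GB
        + queries / 1_000_000 * PER_MILLION_QUERIES
        + writes / 1_000_000 * PER_MILLION_WRITES
    )
    return max(usage, TIER_MINIMUM[tier])
```

For example, 100 GB of storage, 1 GB of hot cache, 5M queries, and 2M writes come to 2.30 + 25 + 5 + 2 = 34.30 USD of usage, so under these assumptions a Growth account would pay the 50 USD minimum.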
All search features (dense, sparse, BM25, hybrid, adaptive indexes) are available on every tier. See the pricing calculator for cost estimates at scale. Track usage programmatically with the vector-backend usage endpoint (GET /v1/organizations/billing/usage/vector-backend) or view it in the Studio dashboard under Billing.

Next Steps

Namespaces: Vector indexes, metrics, BM25

Documents & Search: Upsert, query, manage

Promote: Standalone → managed

Ready to go beyond BYO vectors? Promote your standalone namespace to managed mode to unlock automatic embedding, file processing pipelines, and enrichment without reindexing. Your retrievers keep working unchanged: after promotion, the same retriever can auto-embed raw text instead of taking a pre-computed vector. See the migration guide for details.