BYO Embeddings Vector Search

Bring pre-computed embeddings from any provider (OpenAI, Cohere, Together, etc.) and upsert them directly into MVS for instant vector search. No feature extractors, no pipelines -- just embeddings in, results out.

text

image

Single Tier

34.2K runs

"Find DevOps tutorials about container orchestration"

Why This Matters

Skip the managed pipeline entirely when you already have embeddings. MVS gives you production-grade vector search with filtering, hybrid queries, and multi-tenancy without lock-in to any embedding provider.

from openai import OpenAI
from mixpeek import Mixpeek

openai = OpenAI(api_key="your-openai-key")
mvs = Mixpeek(api_key="your-mvs-key")

NAMESPACE = "my-namespace"

# Generate embeddings with any provider
def embed(text: str) -> list[float]:
    resp = openai.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

# Upsert documents with pre-computed embeddings
documents = [
    {"text": "How to deploy a Kubernetes cluster", "category": "devops"},
    {"text": "Introduction to neural network architectures", "category": "ml"},
    {"text": "Building REST APIs with FastAPI", "category": "backend"},
]

for doc in documents:
    mvs.namespaces.documents.upsert(
        namespace=NAMESPACE,
        documents=[{
            "dense_embedding": embed(doc["text"]),
            "metadata": {"text": doc["text"], "category": doc["category"]}
        }]
    )

# Search with a query embedding
query = "how to set up container orchestration"
results = mvs.namespaces.documents.search(
    namespace=NAMESPACE,
    query={
        "dense_embedding": embed(query)
    },
    top_k=5
)

for doc in results:
    print(f"{doc['score']:.3f} | {doc['metadata']['text']}")
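The loop above issues one embedding request and one upsert per document. A minimal sketch of building the whole payload in one pass is shown here, assuming the `documents` argument accepts more than one entry per call (the snippet above only ever passes a one-element list, so this is not confirmed). `build_upsert_payload` and `fake_embed_batch` are hypothetical names used for illustration; the stand-in embedder exists only so the sketch runs without network access.

```python
from typing import Callable, Sequence


def build_upsert_payload(
    docs: Sequence[dict],
    embed_batch: Callable[[list[str]], list[list[float]]],
) -> list[dict]:
    """Embed every text in one call and pair each vector with its metadata."""
    texts = [d["text"] for d in docs]
    vectors = embed_batch(texts)
    if len(vectors) != len(texts):
        raise ValueError("embedding count does not match document count")
    return [
        {
            "dense_embedding": vec,
            "metadata": {"text": d["text"], "category": d["category"]},
        }
        for d, vec in zip(docs, vectors)
    ]


# Hypothetical stand-in embedder (NOT a real model): two numbers per text.
def fake_embed_batch(texts: list[str]) -> list[list[float]]:
    return [[float(len(t)), 1.0] for t in texts]


sample_docs = [
    {"text": "How to deploy a Kubernetes cluster", "category": "devops"},
    {"text": "Building REST APIs with FastAPI", "category": "backend"},
]
payload = build_upsert_payload(sample_docs, fake_embed_batch)
```

With the OpenAI client from the snippet above, `embed_batch` could wrap a single `openai.embeddings.create(model="text-embedding-3-small", input=texts)` call, since that endpoint accepts a list of strings as `input`. The resulting `payload` would then be passed as `documents=` to the same upsert call used earlier, again assuming it accepts multiple entries.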

Feature Extractors

Retriever Stages

limit

Truncate results to a maximum count with optional offset for pagination
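The truncation described above behaves like a slice over an already ranked result list. The following is a rough local sketch of that behavior, not the stage's actual configuration: the function name `limit_stage` and the parameter names `limit` and `offset` are assumptions chosen to match the description.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def limit_stage(results: Sequence[T], limit: int, offset: int = 0) -> list[T]:
    """Skip `offset` results, then keep at most `limit` of the rest."""
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must be non-negative")
    return list(results[offset : offset + limit])


# Paginating a ranked list two results at a time
ranked = ["doc-a", "doc-b", "doc-c", "doc-d", "doc-e"]
page_1 = limit_stage(ranked, limit=2)            # first page
page_2 = limit_stage(ranked, limit=2, offset=2)  # second page
```

Requesting a page past the end of the list simply returns fewer (or zero) results rather than raising an error.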

reduce

Documentation

MVS Overview

Document Upsert

Vector Search

Related Recipes & Resources

Explore these related resources to deepen your understanding and discover more features.

Recipe

Document Intelligence Search

Extract and search through PDFs, presentations, and documents. Combines OCR, layout analysis, and semantic search for comprehensive document retrieval.

Recipe

Multimodal Hybrid Search Pipeline

Combine vector search with keyword search (BM25) across text, images, and video for the most comprehensive multimodal retrieval system.

Recipe

Clinical Documentation Structuring

Production-grade pipeline for ingesting clinical documents, scanned charts, EHR exports, wound photos, and therapy notes, and structuring them into coded fields aligned with MDS 3.0, PDPM, and CMS audit requirements. Combines OCR, clinical NER, taxonomy classification, and hybrid retrieval to turn unstructured bedside documentation into queryable, auditable data.

Recipe

Hybrid BM25 + Dense Vector Search

Use MVS hybrid search to combine BM25 keyword matching with dense vector similarity. Get the precision of exact keyword matches and the recall of semantic understanding in a single query.

Recipe

Multimodal Search with MVS

Build multimodal search by embedding different content types (text, images, video frames) with your own models and searching across them in a single MVS namespace. Use CLIP or any multimodal embedding model for cross-modal retrieval.

Recipe

PDF Data Extraction Pipeline

Extract structured data from PDFs including tables, forms, and text. Convert unstructured documents into structured, queryable data.
