Together AI

Generate embeddings with Together AI's open-source models and upsert them directly to Mixpeek Vector Store (MVS)

Generate embeddings with 50+ open-source models hosted on Together AI (BGE, Llama, Mistral, and more), then upsert them directly to MVS. No GPUs to manage. Swap models by changing one parameter. Go from text to searchable vectors in three API calls.

Measurable impact from day one

What teams see after connecting Together AI to Mixpeek

GPU infrastructure cost

Together AI hosts the models: no GPU provisioning, no autoscaling, no model serving to manage

50+

Open-source models available

Access BGE, Llama, Mistral, and dozens of other embedding models from a single API

<5 min

Time to first search

Generate embeddings, upsert to MVS, configure a retriever, and run your first query in under 5 minutes

<100ms

End-to-end query latency

Together AI embedding generation plus MVS vector search in under 100ms for production workloads

Infrastructure to manage

No GPUs, no vector database ops, no index rebuilds: Together AI and MVS handle everything

1 param

To swap models

Change the model name in your Together AI call to test a new embedding model: same pipeline, same MVS collection schema

The Problem

Teams building semantic search need to generate embeddings, store them, index them, and wire up retrieval, and each of those is a separate infrastructure problem. Running open-source embedding models means provisioning GPUs, managing model versions, and handling autoscaling. Even after generating embeddings, you still need a vector database, an indexing pipeline, and a query layer. The result is a fragile stack of services that takes weeks to stand up and needs constant maintenance to keep running.

The Solution

Together AI handles embedding generation with hosted open-source models: no GPUs to manage, no model serving to configure. Mixpeek Vector Store handles everything after that: vector storage, indexing, metadata filtering, and retrieval. Generate embeddings with a single Together AI API call, upsert them to MVS, and query through Mixpeek retrievers. The entire pipeline from text to searchable vector takes three API calls and zero infrastructure setup.

Pipeline Architecture

Together AI Embedding

Model Selection

Choose from open-source embedding models hosted on Together AI: BGE, Llama embeddings, Mistral embed, and more. Generate vectors via a single API call with the OpenAI-compatible format.
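
As a rough illustration of what "OpenAI-compatible format" means in practice, the sketch below builds the HTTP request for a single embedding call without sending it. The endpoint URL and the helper names (build_embedding_request, parse_embedding_response) are assumptions made for this example, not something this page specifies; check the Together AI API reference before relying on them.

```python
import json
import urllib.request

# Assumed endpoint for Together AI's OpenAI-compatible embeddings API.
TOGETHER_EMBEDDINGS_URL = "https://api.together.xyz/v1/embeddings"


def build_embedding_request(text, model, api_key):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": model, "input": text}
    return urllib.request.Request(
        TOGETHER_EMBEDDINGS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embedding_response(body):
    """Extract the first vector from an OpenAI-style response body."""
    return json.loads(body)["data"][0]["embedding"]


request = build_embedding_request(
    "quarterly revenue presentation",
    "BAAI/bge-large-en-v1.5",
    "YOUR_TOGETHER_API_KEY",
)
# To send it: body = urllib.request.urlopen(request).read()
# then:       vector = parse_embedding_response(body)
```

Only the "model" field changes when you try a different embedding model; the rest of the request stays the same.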

Vector Upsert

MVS Collection

Upsert the embedding vector along with metadata (document ID, source, timestamps, custom fields) into a Mixpeek Vector Store collection. MVS handles indexing automatically.
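
A minimal sketch of assembling one record before upsert, assuming the id / values / metadata shape used in the Quick Start. The build_record helper and the indexed_at field are hypothetical names chosen for illustration; MVS may expect different metadata keys.

```python
from datetime import datetime, timezone


def build_record(doc_id, vector, source, **custom_fields):
    """Pair one embedding with its metadata (hypothetical helper)."""
    if not vector:
        raise ValueError("vector must not be empty")
    metadata = {
        "source": source,
        # Assumed field name: a UTC timestamp recorded at upsert time.
        "indexed_at": datetime.now(timezone.utc).isoformat(),
    }
    metadata.update(custom_fields)
    return {"id": doc_id, "values": list(vector), "metadata": metadata}


record = build_record(
    "doc_001", [0.12, -0.03, 0.88], source="earnings", quarter="Q1"
)
```

Keeping record construction in one place makes it easier to add or rename metadata fields later without touching the embedding or upsert calls.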

Index Configuration

Automatic Indexing

MVS builds and maintains vector indexes as you upsert. No manual index creation, no rebuild triggers: vectors are searchable within seconds of upsert.

Retriever Setup

Feature Search

Configure a Mixpeek retriever with feature search stages that query the MVS collection. Add metadata filters, full-text search, or reranking stages as needed.

Query Pipeline

Search API

Send a query to the Mixpeek retriever. The query is embedded using the same Together AI model, then searched against the MVS index with configurable similarity thresholds and filters.
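
To make the similarity threshold and metadata filters concrete, here is a small in-memory sketch of the same idea: filter records by metadata, score the rest by cosine similarity, drop anything below a minimum score, and return the top results. It illustrates the concept only and is not how MVS is implemented; the function names and the min_score parameter are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(records, query_vector, top_k=10, min_score=0.0, filters=None):
    """Return (id, score) pairs that pass the filters and the threshold."""
    filters = filters or {}
    hits = []
    for record in records:
        metadata = record["metadata"]
        # Skip records whose metadata does not match every filter.
        if any(metadata.get(key) != value for key, value in filters.items()):
            continue
        score = cosine_similarity(query_vector, record["values"])
        if score >= min_score:
            hits.append((record["id"], score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:top_k]


records = [
    {"id": "doc_001", "values": [1.0, 0.0], "metadata": {"source": "earnings"}},
    {"id": "doc_002", "values": [0.0, 1.0], "metadata": {"source": "earnings"}},
    {"id": "doc_003", "values": [1.0, 0.1], "metadata": {"source": "blog"}},
]
hits = search(records, [1.0, 0.0], min_score=0.5, filters={"source": "earnings"})
```

Here doc_003 is excluded by the filter and doc_002 by the threshold, so only doc_001 is returned.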

Model Iteration

A/B Testing

Swap Together AI models by changing a single parameter. Upsert to separate MVS collections and compare retrieval quality across models without changing infrastructure.
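
One simple way to compare retrieval quality across models is recall@k over a small set of labeled queries. The sketch below assumes you have already run the same queries against two collections and collected the ranked document IDs. The relevance labels, result lists, and helper names are made up for illustration.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


def mean_recall(results_by_query, relevant_by_query, k):
    """Average recall@k over every labeled query."""
    scores = [
        recall_at_k(results_by_query[query], relevant_by_query[query], k)
        for query in relevant_by_query
    ]
    return sum(scores) / len(scores)


# Hypothetical relevance labels and ranked results from two collections.
relevant = {"q1": ["doc_001", "doc_004"], "q2": ["doc_002"]}
results_model_a = {
    "q1": ["doc_001", "doc_003", "doc_004"],
    "q2": ["doc_005", "doc_002", "doc_001"],
}
results_model_b = {
    "q1": ["doc_001", "doc_004", "doc_003"],
    "q2": ["doc_002", "doc_001", "doc_003"],
}

comparison = {
    "model_a": mean_recall(results_model_a, relevant, k=2),
    "model_b": mean_recall(results_model_b, relevant, k=2),
}
```

With these made-up results, model_b places both relevant documents for q1 in the top two while model_a places only one, so model_b scores higher at k=2.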

Together AI Integration Deep Dive

Use the Together AI Python SDK or REST API to generate embeddings from any supported model: BGE-large, Llama-based embeddings, Mistral embed, or any new model Together adds to its platform. The response returns a vector array that you upsert directly to a Mixpeek Vector Store collection along with metadata (source document ID, timestamps, content type, custom fields). MVS indexes the vectors automatically and makes them available through Mixpeek retrievers configured with feature search stages.

Because Together AI uses the OpenAI-compatible format, you can swap models by changing a single model parameter, with no changes to the pipeline code. Vectors from different models are not comparable with each other, so upsert each model's output to its own MVS collection when you compare models.

For batch workloads, iterate over your corpus, call Together AI embeddings in parallel, and bulk upsert to MVS. Mixpeek handles deduplication, versioning, and lineage tracking so you always know which model and version produced each vector.
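
The batch pattern described above (chunk the corpus, embed chunks in parallel, bulk upsert) might look roughly like the following. The embed and upsert steps are passed in as plain callables so the sketch runs without network access; in a real pipeline they would wrap the Together AI and MVS clients. All helper names, the batch size, and the embedding_model metadata field are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_and_upsert(docs, embed_batch, upsert_batch, model,
                     batch_size=32, workers=4):
    """Embed (doc_id, text) pairs in parallel batches, then upsert each batch.

    embed_batch(texts) must return one vector per text, in order.
    upsert_batch(records) receives a list of id/values/metadata dicts.
    Returns the number of records upserted.
    """
    def process(batch):
        vectors = embed_batch([text for _, text in batch])
        return [
            # Assumed metadata field: tag each vector with its model name.
            {"id": doc_id, "values": vector,
             "metadata": {"embedding_model": model}}
            for (doc_id, _), vector in zip(batch, vectors)
        ]

    total = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map runs batches concurrently but yields results in order.
        for records in pool.map(process, chunked(docs, batch_size)):
            upsert_batch(records)
            total += len(records)
    return total


# Stand-ins for the real clients so the sketch runs offline.
stored = []

def fake_embed(texts):
    return [[float(len(text)), 1.0] for text in texts]

docs = [(f"doc_{i:03d}", "text " * (i + 1)) for i in range(5)]
count = embed_and_upsert(
    docs, fake_embed, stored.extend,
    model="BAAI/bge-large-en-v1.5", batch_size=2,
)
```

Threads suit this workload because the time is spent waiting on network calls rather than on local computation. Tune batch_size and workers against the rate limits of your account.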

Quick Start

together_ai_mvs.py

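from together import Together
from mixpeek import Mixpeek

MODEL = "BAAI/bge-large-en-v1.5"

# 1. Generate embedding with Together AI
# Together() reads the TOGETHER_API_KEY environment variable by default.
together_client = Together()
response = together_client.embeddings.create(
    model=MODEL,
    input="quarterly revenue presentation"
)
vector = response.data[0].embedding

# 2. Upsert to Mixpeek Vector Store
client = Mixpeek(api_key="YOUR_API_KEY")
client.vector_store.upsert(
    namespace="documents",
    vectors=[{
        "id": "doc_001",
        "values": vector,
        "metadata": {"source": "earnings", "quarter": "Q1"}
    }]
)

# 3. Search
# Embed the query with the same model used for the stored vectors.
query_response = together_client.embeddings.create(
    model=MODEL,
    input="Q1 earnings results"
)
query_vector = query_response.data[0].embedding

results = client.vector_store.search(
    namespace="documents",
    vector=query_vector,
    top_k=10,
    filters={"source": "earnings"}
)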

See the full API reference in the Vector Store docs.