The Mixpeek MCP server lets AI assistants like Claude manage your entire Mixpeek workflow — creating namespaces, uploading files, building search pipelines, and querying results — all through natural language.
What is MCP? The Model Context Protocol is an open standard that lets AI assistants connect to external tools and data sources. Instead of copy-pasting API calls, you describe what you want and Claude handles the rest.

Choose Your Server

MCP clients have context limits. Instead of loading all 43 tools, pick the scoped server that matches your workflow:

Ingestion Server

18 tools — Buckets, collections, and documents. Best for data pipelines and content upload workflows.
https://mcp.mixpeek.com/ingestion/mcp

Retrieval Server

11 tools — Retrievers, agents, and search. Best for RAG applications, search UIs, and agent workflows.
https://mcp.mixpeek.com/retrieval/mcp

Admin Server

14 tools — Namespaces, taxonomies, and clusters. Best for platform administration and enrichment.
https://mcp.mixpeek.com/admin/mcp

Full Platform

43 tools — Everything in one server. Best for power users who need all capabilities.
https://mcp.mixpeek.com/mcp
Need just one retriever? Use the Per-Retriever Server — it exposes a single typed search tool with parameters generated from your retriever’s input schema.

Setup

Add to your Claude Desktop or Claude Code config. Replace YOUR_API_KEY with your key from the Mixpeek dashboard.
{
  "mcpServers": {
    "mixpeek-ingestion": {
      "url": "https://mcp.mixpeek.com/ingestion/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      }
    }
  }
}
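MCP clients send tools over the wire as JSON-RPC 2.0 messages, so you can also exercise the hosted endpoint directly for debugging. The sketch below builds a `tools/call` request for the `list_buckets` tool from the Ingestion server; note this is a minimal sketch — the argument names are assumptions (check `tools/list` for the real schema), and a production MCP client also performs an `initialize` handshake and may track a session ID, which this omits.

```python
import json
import urllib.request

MCP_URL = "https://mcp.mixpeek.com/ingestion/mcp"


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 tools/call message as defined by the MCP spec."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def call_tool(api_key: str, tool_name: str, arguments: dict) -> bytes:
    """POST the message to the Streamable HTTP endpoint with a Bearer token."""
    req = urllib.request.Request(
        MCP_URL,
        data=json.dumps(build_tool_call(tool_name, arguments)).encode(),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```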

Stdio (local development)

Run the server locally with an optional --scope flag:
# Full server (default)
python -m mixpeek_mcp.main --transport stdio

# Scoped servers
python -m mixpeek_mcp.main --transport stdio --scope ingestion
python -m mixpeek_mcp.main --transport stdio --scope retrieval
python -m mixpeek_mcp.main --transport stdio --scope admin
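For local stdio use, a Claude Desktop entry might look like the sketch below — the server name and key value are placeholders, and it assumes the API key is read from the MIXPEEK_API_KEY environment variable as described under Authentication & Security.

```json
{
  "mcpServers": {
    "mixpeek-ingestion-local": {
      "command": "python",
      "args": ["-m", "mixpeek_mcp.main", "--transport", "stdio", "--scope", "ingestion"],
      "env": {
        "MIXPEEK_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}
```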

Ingestion Server — 18 tools

Manage buckets, collections, and documents. Use this server when building data ingestion pipelines.
Buckets store your raw files (videos, images, documents) before processing.
  • create_bucket — Create a new bucket for file storage and processing
  • list_buckets — List all buckets in a namespace
  • get_bucket — Get details of a specific bucket
  • update_bucket — Update bucket configuration
  • delete_bucket — Delete a bucket and all its objects
  • upload_object — Upload an object (file) to a bucket from a URL
Collections define how your data is processed — which feature extractor runs, what embeddings are generated. Each collection has exactly one feature extractor.
  • create_collection — Create a collection with a feature extractor, source config, and optional taxonomy/cluster/alert applications
  • list_collections — List all collections in a namespace
  • get_collection — Get collection details
  • update_collection — Update collection configuration
  • clone_collection — Clone an existing collection with optional overrides
  • trigger_collection — Trigger the processing pipeline on bucket objects
  • delete_collection — Delete a collection and all its documents
Available feature extractors:
  • text_extractor — Text embeddings (multilingual E5)
  • image_extractor — Image embeddings (CLIP, SigLIP)
  • multimodal_extractor — Text + image joint embeddings
  • face_identity_extractor — Face detection and recognition
  • document_graph_extractor — Document structure extraction
  • sentiment_classifier — Sentiment analysis
  • web_scraper — Web page content extraction
  • course_content_extractor — Video/course content processing
Documents are the processed records stored in your namespace with extracted features and embeddings.
  • create_document — Create a new document in a collection
  • list_documents — List documents in a collection with filters
  • get_document — Get a specific document by ID
  • update_document — Update a document’s data
  • delete_document — Delete a document from a collection
Example prompts:
"Create a bucket called marketing-videos, then create a collection with the
multimodal extractor and trigger processing"

"Upload this CSV to my data-imports bucket: https://example.com/products.csv"

Retrieval Server — 11 tools

Search and query your data. Use this server for RAG applications, search UIs, and agent workflows.
Retrievers are multi-stage search pipelines. Chain stages together to search, filter, rerank, and enrich results.
  • create_retriever — Create a multi-stage search pipeline
  • list_retrievers — List all retrievers in a namespace
  • get_retriever — Get retriever configuration and all stages
  • update_retriever — Update retriever metadata (name, description, tags)
  • clone_retriever — Clone an existing retriever with optional modifications
  • execute_retriever — Execute a retriever with inputs and get search results
  • delete_retriever — Delete a retriever
29+ available stages across 5 categories:
  • Filter — feature_search, attribute_filter, llm_filter, query_expand, agent_search
  • Sort — sort_relevance, sort_attribute, rerank, mmr, score_normalize
  • Reduce — limit, group_by, aggregate, summarize, sample, deduplicate, cluster
  • Apply — json_transform, api_call, web_search, sql_lookup, cross_compare, unwind, rag_prepare
  • Enrich — llm_enrich, document_enrich, taxonomy_enrich, code_execution, web_scrape
Stages support template variables: {{INPUT.field}}, {{DOC.field}}, {{STAGE.field}}, {{CONTEXT.field}}.
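As an illustration of how template variables slot into a stage definition, consider the fragment below. The field names (stage_id, parameters) are assumptions for the sketch — consult each stage's reference for the exact config shape:

```json
{
  "stages": [
    {
      "stage_id": "feature_search",
      "parameters": {
        "query": "{{INPUT.query}}"
      }
    },
    {
      "stage_id": "llm_enrich",
      "parameters": {
        "prompt": "Summarize {{DOC.title}} for the query {{INPUT.query}}"
      }
    }
  ]
}
```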
Agents provide conversational AI sessions with retriever-backed responses.
  • create_agent_session — Create a new conversational agent session
  • send_agent_message — Send a message to an agent session and get a response
  • get_agent_history — Get conversation history for an agent session
  • search_namespace — Search across all resources in a namespace (buckets, collections, retrievers, etc.)
Example prompts:
"Create a retriever that does a feature search on my product-demos collection,
reranks the top 50 down to 10 with Cohere, and adds a 2-sentence summary"

"Execute my product-search retriever with query 'red running shoes under $100'"
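The second prompt above translates into a single tools/call message. The argument names below ("retriever_id", "inputs") are illustrative — check the tool's schema via tools/list for the exact shape:

```python
# A JSON-RPC 2.0 tools/call message invoking execute_retriever.
execute_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "execute_retriever",
        "arguments": {
            "retriever_id": "ret_abc123",
            "inputs": {"query": "red running shoes under $100"},
        },
    },
}
```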

Admin Server — 14 tools

Manage namespaces, taxonomies, and clusters. Use this server for platform administration and data enrichment.
Namespaces are isolated workspaces. Each namespace maps to its own vector collection in Qdrant.
  • create_namespace — Create a new workspace for organizing collections and resources
  • list_namespaces — List all namespaces in your organization
  • get_namespace — Get namespace details by ID or name
  • update_namespace — Update namespace configuration
  • delete_namespace — Delete a namespace and all its resources
Taxonomies are hierarchical classification systems you can apply to documents.
  • create_taxonomy — Create a hierarchical classification taxonomy
  • list_taxonomies — List all taxonomies
  • get_taxonomy — Get taxonomy details
  • execute_taxonomy — Apply taxonomy classification to document data
  • delete_taxonomy — Delete a taxonomy
Clusters group similar documents together for discovery and organization.
  • create_cluster — Create a document clustering configuration
  • list_clusters — List all clusters
  • execute_cluster — Execute the clustering algorithm on a collection
  • delete_cluster — Delete a cluster configuration
Example prompts:
"Create a new namespace called production-catalog"

"Create an IAB taxonomy for content classification, then run it on my articles collection"

"Cluster the documents in my product-images collection into 10 groups"

Retriever Server

The Retriever MCP server is a lightweight server scoped to a single retriever. It reads your retriever’s input_schema at startup and generates a typed search tool whose parameters match exactly — so the AI assistant knows what inputs are available without any guesswork.

Tools

  • search — Execute the retriever. Parameters are generated from the retriever’s input_schema — including the correct field names, types, required flags, enums, and descriptions. Pagination parameters (page, page_size) are added automatically.
  • describe — Returns structured metadata: retriever ID, name, collections, input fields, and stage configuration.
  • explain — Returns a human-readable explanation of the pipeline: what each stage does, in what order.

How Dynamic Schema Works

When the server starts, it fetches your retriever’s configuration and converts its input_schema into a JSON Schema for the search tool. For example, if your retriever has:
{
  "input_schema": {
    "query": { "type": "text", "required": true, "description": "Search query" },
    "category": {
      "type": "string",
      "required": false,
      "enum": ["electronics", "clothing", "home"],
      "description": "Filter by category"
    }
  }
}
The search tool will expose query (required string) and category (optional enum) as typed parameters — plus page and page_size for pagination.
If your retriever’s input_schema has a field named page or page_size, the pagination parameters are automatically renamed to _pagination_page and _pagination_page_size to avoid conflicts.
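The conversion described above can be sketched as follows. This is a minimal version under stated assumptions: it maps the "text" type to JSON Schema "string" (passing other types through unchanged), copies enum/description, and applies the pagination rename rule — the server's actual type mapping may be richer.

```python
RESERVED = {"page", "page_size"}
TYPE_MAP = {"text": "string"}  # assumption: other types pass through unchanged


def to_json_schema(input_schema: dict) -> dict:
    """Convert a retriever input_schema into a JSON Schema for the search tool."""
    properties, required = {}, []
    for name, spec in input_schema.items():
        prop = {"type": TYPE_MAP.get(spec["type"], spec["type"])}
        if "description" in spec:
            prop["description"] = spec["description"]
        if "enum" in spec:
            prop["enum"] = spec["enum"]
        properties[name] = prop
        if spec.get("required"):
            required.append(name)
    # Add pagination params, renamed if the retriever already uses those names.
    conflict = bool(RESERVED & input_schema.keys())
    for p in ("page", "page_size"):
        key = f"_pagination_{p}" if conflict else p
        properties[key] = {"type": "integer"}
    return {"type": "object", "properties": properties, "required": required}
```

Running it on the example schema above yields a required string parameter `query`, an optional enum `category`, and integer `page`/`page_size` parameters.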

Setup

{
  "mcpServers": {
    "my-search": {
      "command": "mixpeek-mcp-retriever",
      "args": [
        "--retriever-id", "ret_abc123",
        "--namespace-id", "ns_xyz789",
        "--api-key", "YOUR_API_KEY"
      ]
    }
  }
}
Requires the mixpeek-mcp-retriever CLI installed:
pip install mixpeek  # includes the CLI

Example Conversation

Once connected, you can interact naturally:
You: "What does this retriever do?"
Claude: [calls describe] → "This is 'Product Search' — it searches your
products collection by text query with an optional category filter,
using feature search → attribute filter → reranking."

You: "Search for wireless headphones under electronics"
Claude: [calls search with query="wireless headphones", category="electronics"]
→ Returns top 10 matching products with scores and metadata.

You: "Explain the pipeline stages"
Claude: [calls explain] → "1. feature_search: embeds your query and finds
the top 50 matches. 2. attribute_filter: filters by category field.
3. rerank: reranks with Cohere down to the top 10."

Authentication & Security

All MCP servers use your existing Mixpeek API key with the same permissions as the REST API.
  • HTTP transport: Pass the API key in the Authorization: Bearer header. The server extracts it and injects it into every tool call.
  • Stdio transport: Set the MIXPEEK_API_KEY environment variable or pass api_key in tool arguments.
  • Same RBAC permissions as the REST API
  • Rate limiting per organization
  • Audit logging for all operations
  • TLS encryption on the hosted server
Keep your API key secure. Never commit keys to version control. For the Retriever Server, prefer environment variables over CLI arguments in production.

Architecture

┌─────────────────────────────────────────────────────────┐
│                   Claude / AI App                       │
└──────────────┬──────────────────────────────────────────┘
               │ MCP Protocol (Streamable HTTP / Stdio)

┌─────────────────────────────────────────────────────────┐
│              Mixpeek MCP Server                         │
│  ┌──────────┐ ┌───────────┐ ┌─────────┐ ┌──────────┐  │
│  │   Full   │ │ Ingestion │ │Retrieval│ │  Admin   │  │
│  │  / (43)  │ │ /ing (18) │ │/ret (11)│ │/adm (14) │  │
│  └────┬─────┘ └─────┬─────┘ └────┬────┘ └────┬─────┘  │
│       └──────────────┴────────────┴───────────┘        │
│                    Tool Handlers                        │
└──────────────┬──────────────────────────────────────────┘
               │                    ┌──────────────────┐
               │                    │ Retriever Server  │
               │                    │ (per-retriever)   │
               │                    │ 3 tools           │
               │                    └────────┬─────────┘
               └──────────┬─────────────────┘
                          │ Direct service calls

               ┌──────────────────────┐
               │   Mixpeek Services   │
               │ MongoDB · Qdrant     │
               │ Redis · S3 · Ray     │
               └──────────────────────┘
The scoped servers share the same codebase and tool handlers — scoping controls which tools are registered, not how they execute. Each scoped sub-app is mounted at its path prefix (/ingestion, /retrieval, /admin) while the full server handles root-level requests.
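The registration model can be illustrated with a sketch. The tool names and counts come from the tables on this page; the grouping structure itself is an assumption about how the server organizes registration internally:

```python
# Scope → registered tools. Scoping controls registration, not execution:
# every scope draws from the same shared handler set.
INGESTION = [
    "create_bucket", "list_buckets", "get_bucket", "update_bucket",
    "delete_bucket", "upload_object",
    "create_collection", "list_collections", "get_collection",
    "update_collection", "clone_collection", "trigger_collection",
    "delete_collection",
    "create_document", "list_documents", "get_document",
    "update_document", "delete_document",
]  # 18 tools
RETRIEVAL = [
    "create_retriever", "list_retrievers", "get_retriever",
    "update_retriever", "clone_retriever", "execute_retriever",
    "delete_retriever",
    "create_agent_session", "send_agent_message", "get_agent_history",
    "search_namespace",
]  # 11 tools
ADMIN = [
    "create_namespace", "list_namespaces", "get_namespace",
    "update_namespace", "delete_namespace",
    "create_taxonomy", "list_taxonomies", "get_taxonomy",
    "execute_taxonomy", "delete_taxonomy",
    "create_cluster", "list_clusters", "execute_cluster", "delete_cluster",
]  # 14 tools

SCOPES = {
    "ingestion": INGESTION,
    "retrieval": RETRIEVAL,
    "admin": ADMIN,
    "full": INGESTION + RETRIEVAL + ADMIN,  # 43 tools
}


def registered_tools(scope: str) -> list[str]:
    """Return the tool names registered for a given --scope value."""
    return SCOPES[scope]
```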

Troubleshooting

Server not connecting
  • Verify the URL is correct (e.g. https://mcp.mixpeek.com/ingestion/mcp)
  • Check that the Authorization header format is Bearer YOUR_API_KEY
  • Restart Claude Desktop or Claude Code after changing the config

Authentication errors
  • Verify your API key at mixpeek.com/dashboard
  • Check that the key has permissions for the namespace you’re accessing
  • Make sure there are no extra spaces in the key

Tool not found
  • You may be calling a tool on the wrong scoped server (e.g. execute_retriever on /ingestion)
  • Check GET /tools on the scoped endpoint to see the available tools
  • Use the full server (/) if you need all tools

Retriever Server won’t start
  • Ensure --retriever-id and --namespace-id are correct
  • Verify the API key has access to that namespace
  • Check that the retriever exists: GET /v1/retrievers/{id}

Empty search results
  • Confirm your collection has processed documents (not just uploaded files)
  • Check that the retriever’s feature_uri matches your collection’s extractor
  • Try a broader query or remove optional filters

Slow responses
  • Large file uploads take time proportional to file size and network speed
  • Multi-stage retrievers with LLM enrichment or reranking take more time
  • Check status.mixpeek.com for service issues

Next Steps