The Mixpeek MCP server lets AI assistants like Claude manage your entire Mixpeek workflow — creating namespaces, uploading files, building search pipelines, and querying results — all through natural language.
What is MCP? The Model Context Protocol is an open standard that lets AI assistants connect to external tools and data sources. Instead of copy-pasting API calls, you describe what you want and Claude handles the rest.

Choose Your Server

MCP clients have context limits. Instead of loading all 43 tools, pick the scoped server that matches your workflow:

Ingestion Server

18 tools — Buckets, collections, and documents. Best for data pipelines and content upload workflows.
https://mcp.mixpeek.com/ingestion/mcp

Retrieval Server

11 tools — Retrievers, agents, and search. Best for RAG applications, search UIs, and agent workflows.
https://mcp.mixpeek.com/retrieval/mcp

Admin Server

14 tools — Namespaces, taxonomies, and clusters. Best for platform administration and enrichment.
https://mcp.mixpeek.com/admin/mcp

Full Platform

43 tools — Everything in one server. Best for power users who need all capabilities.
https://mcp.mixpeek.com/mcp
Need just one retriever? Use the Per-Retriever Server — it exposes a single typed search tool with parameters generated from your retriever’s input schema.

Setup

Add to your Claude Desktop or Claude Code config. Replace YOUR_API_KEY with your key from the Mixpeek dashboard.
{
  "mcpServers": {
    "mixpeek-ingestion": {
      "url": "https://mcp.mixpeek.com/ingestion/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      }
    }
  }
}
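MCP clients send tools over the wire as JSON-RPC 2.0 messages, so you can also exercise the hosted endpoint directly for debugging. The sketch below builds a `tools/call` request for the `list_buckets` tool from the Ingestion server; note this is a minimal sketch — the argument names are assumptions (check `tools/list` for the real schema), and a production MCP client also performs an `initialize` handshake and may track a session ID, which this omits.

```python
import json
import urllib.request

MCP_URL = "https://mcp.mixpeek.com/ingestion/mcp"


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 tools/call message as defined by the MCP spec."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def call_tool(api_key: str, tool_name: str, arguments: dict) -> bytes:
    """POST the message to the Streamable HTTP endpoint with a Bearer token."""
    req = urllib.request.Request(
        MCP_URL,
        data=json.dumps(build_tool_call(tool_name, arguments)).encode(),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```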

Stdio (local development)

Run the server locally with an optional --scope flag:
# Full server (default)
python -m mixpeek_mcp.main --transport stdio

# Scoped servers
python -m mixpeek_mcp.main --transport stdio --scope ingestion
python -m mixpeek_mcp.main --transport stdio --scope retrieval
python -m mixpeek_mcp.main --transport stdio --scope admin
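For local stdio use, a Claude Desktop entry might look like the sketch below — the server name and key value are placeholders, and it assumes the API key is read from the MIXPEEK_API_KEY environment variable as described under Authentication & Security.

```json
{
  "mcpServers": {
    "mixpeek-ingestion-local": {
      "command": "python",
      "args": ["-m", "mixpeek_mcp.main", "--transport", "stdio", "--scope", "ingestion"],
      "env": {
        "MIXPEEK_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}
```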

Ingestion Server — 18 tools

Manage buckets, collections, and documents. Use this server when building data ingestion pipelines.
Buckets store your raw files (videos, images, documents) before processing.
  • create_bucket — Create a new bucket for file storage and processing
  • list_buckets — List all buckets in a namespace
  • get_bucket — Get details of a specific bucket
  • update_bucket — Update bucket configuration
  • delete_bucket — Delete a bucket and all its objects
  • upload_object — Upload an object (file) to a bucket from a URL
Collections define how your data is processed — which feature extractor runs, what embeddings are generated. Each collection has exactly one feature extractor.
  • create_collection — Create a collection with a feature extractor, source config, and optional taxonomy/cluster/alert applications
  • list_collections — List all collections in a namespace
  • get_collection — Get collection details
  • update_collection — Update collection configuration
  • clone_collection — Clone an existing collection with optional overrides
  • trigger_collection — Trigger the processing pipeline on bucket objects
  • delete_collection — Delete a collection and all its documents
Available feature extractors:
  • text_extractor — Text embeddings (multilingual E5)
  • image_extractor — Image embeddings (CLIP, SigLIP)
  • multimodal_extractor — Text + image joint embeddings
  • face_identity_extractor — Face detection and recognition
  • document_graph_extractor — Document structure extraction
  • sentiment_classifier — Sentiment analysis
  • web_scraper — Web page content extraction
  • course_content_extractor — Video/course content processing
Documents are the processed records stored in your namespace with extracted features and embeddings.
  • create_document — Create a new document in a collection
  • list_documents — List documents in a collection with filters
  • get_document — Get a specific document by ID
  • update_document — Update a document’s data
  • delete_document — Delete a document from a collection
Example prompts:
"Create a bucket called marketing-videos, then create a collection with the
multimodal extractor and trigger processing"

"Upload this CSV to my data-imports bucket: https://example.com/products.csv"

Retrieval Server — 11 tools

Search and query your data. Use this server for RAG applications, search UIs, and agent workflows.
Retrievers are multi-stage search pipelines. Chain stages together to search, filter, rerank, and enrich results.
  • create_retriever — Create a multi-stage search pipeline
  • list_retrievers — List all retrievers in a namespace
  • get_retriever — Get retriever configuration and all stages
  • update_retriever — Update retriever metadata (name, description, tags)
  • clone_retriever — Clone an existing retriever with optional modifications
  • execute_retriever — Execute a retriever with inputs and get search results
  • delete_retriever — Delete a retriever
29+ available stages across 5 categories:
  • Filter — feature_search, attribute_filter, llm_filter, query_expand, agent_search
  • Sort — sort_relevance, sort_attribute, rerank, mmr, score_normalize
  • Reduce — limit, group_by, aggregate, summarize, sample, deduplicate, cluster
  • Apply — json_transform, api_call, web_search, sql_lookup, cross_compare, unwind, rag_prepare
  • Enrich — llm_enrich, document_enrich, taxonomy_enrich, code_execution, web_scrape
Stages support template variables: {{INPUT.field}}, {{DOC.field}}, {{STAGE.field}}, {{CONTEXT.field}}.
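As an illustration of how template variables slot into a stage definition, consider the fragment below. The field names (stage_id, parameters) are assumptions for the sketch — consult each stage's reference for the exact config shape:

```json
{
  "stages": [
    {
      "stage_id": "feature_search",
      "parameters": {
        "query": "{{INPUT.query}}"
      }
    },
    {
      "stage_id": "llm_enrich",
      "parameters": {
        "prompt": "Summarize {{DOC.title}} for the query {{INPUT.query}}"
      }
    }
  ]
}
```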
Agents provide conversational AI sessions with retriever-backed responses.
  • create_agent_session — Create a new conversational agent session
  • send_agent_message — Send a message to an agent session and get a response
  • get_agent_history — Get conversation history for an agent session
  • search_namespace — Search across all resources in a namespace (buckets, collections, retrievers, etc.)
Example prompts:
"Create a retriever that does a feature search on my product-demos collection,
reranks the top 50 down to 10 with Cohere, and adds a 2-sentence summary"

"Execute my product-search retriever with query 'red running shoes under $100'"
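The second prompt above translates into a single tools/call message. The argument names below ("retriever_id", "inputs") are illustrative — check the tool's schema via tools/list for the exact shape:

```python
# A JSON-RPC 2.0 tools/call message invoking execute_retriever.
execute_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "execute_retriever",
        "arguments": {
            "retriever_id": "ret_abc123",
            "inputs": {"query": "red running shoes under $100"},
        },
    },
}
```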

Admin Server — 14 tools

Manage namespaces, taxonomies, and clusters. Use this server for platform administration and data enrichment.
Namespaces are isolated workspaces. Each namespace maps to its own vector collection in Qdrant.
  • create_namespace — Create a new workspace for organizing collections and resources
  • list_namespaces — List all namespaces in your organization
  • get_namespace — Get namespace details by ID or name
  • update_namespace — Update namespace configuration
  • delete_namespace — Delete a namespace and all its resources
Taxonomies are hierarchical classification systems you can apply to documents.
  • create_taxonomy — Create a hierarchical classification taxonomy
  • list_taxonomies — List all taxonomies
  • get_taxonomy — Get taxonomy details
  • execute_taxonomy — Apply taxonomy classification to document data
  • delete_taxonomy — Delete a taxonomy
Clusters group similar documents together for discovery and organization.
  • create_cluster — Create a document clustering configuration
  • list_clusters — List all clusters
  • execute_cluster — Execute the clustering algorithm on a collection
  • delete_cluster — Delete a cluster configuration
Example prompts:
"Create a new namespace called production-catalog"

"Create an IAB taxonomy for content classification, then run it on my articles collection"

"Cluster the documents in my product-images collection into 10 groups"

Retriever Server

The Retriever MCP server is a lightweight server scoped to a single retriever. It reads your retriever’s input_schema at startup and generates a typed search tool whose parameters match exactly — so the AI assistant knows what inputs are available without any guesswork.

Tools

  • search — Execute the retriever. Parameters are generated from the retriever’s input_schema — including the correct field names, types, required flags, enums, and descriptions. Pagination parameters (page, page_size) are added automatically.
  • describe — Returns structured metadata: retriever ID, name, collections, input fields, and stage configuration.
  • explain — Returns a human-readable explanation of the pipeline: what each stage does, in what order.

How Dynamic Schema Works

When the server starts, it fetches your retriever’s configuration and converts its input_schema into a JSON Schema for the search tool. For example, if your retriever has:
{
  "input_schema": {
    "query": { "type": "text", "required": true, "description": "Search query" },
    "category": {
      "type": "string",
      "required": false,
      "enum": ["electronics", "clothing", "home"],
      "description": "Filter by category"
    }
  }
}
The search tool will expose query (required string) and category (optional enum) as typed parameters — plus page and page_size for pagination.
If your retriever’s input_schema has a field named page or page_size, the pagination parameters are automatically renamed to _pagination_page and _pagination_page_size to avoid conflicts.
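The conversion described above can be sketched as follows. This is a minimal version under stated assumptions: it maps the "text" type to JSON Schema "string" (passing other types through unchanged), copies enum/description, and applies the pagination rename rule — the server's actual type mapping may be richer.

```python
RESERVED = {"page", "page_size"}
TYPE_MAP = {"text": "string"}  # assumption: other types pass through unchanged


def to_json_schema(input_schema: dict) -> dict:
    """Convert a retriever input_schema into a JSON Schema for the search tool."""
    properties, required = {}, []
    for name, spec in input_schema.items():
        prop = {"type": TYPE_MAP.get(spec["type"], spec["type"])}
        if "description" in spec:
            prop["description"] = spec["description"]
        if "enum" in spec:
            prop["enum"] = spec["enum"]
        properties[name] = prop
        if spec.get("required"):
            required.append(name)
    # Add pagination params, renamed if the retriever already uses those names.
    conflict = bool(RESERVED & input_schema.keys())
    for p in ("page", "page_size"):
        key = f"_pagination_{p}" if conflict else p
        properties[key] = {"type": "integer"}
    return {"type": "object", "properties": properties, "required": required}
```

Running it on the example schema above yields a required string parameter `query`, an optional enum `category`, and integer `page`/`page_size` parameters.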

Setup

{
  "mcpServers": {
    "my-search": {
      "command": "mixpeek-mcp-retriever",
      "args": [
        "--retriever-id", "ret_abc123",
        "--namespace-id", "ns_xyz789",
        "--api-key", "YOUR_API_KEY"
      ]
    }
  }
}
Requires the mixpeek-mcp-retriever CLI installed:
pip install mixpeek  # includes the CLI

Example Conversation

Once connected, you can interact naturally:
You: "What does this retriever do?"
Claude: [calls describe] → "This is 'Product Search' — it searches your
products collection by text query with an optional category filter,
using feature search → attribute filter → reranking."

You: "Search for wireless headphones under electronics"
Claude: [calls search with query="wireless headphones", category="electronics"]
→ Returns top 10 matching products with scores and metadata.

You: "Explain the pipeline stages"
Claude: [calls explain] → "1. feature_search: embeds your query and finds
the top 50 matches. 2. attribute_filter: filters by category field.
3. rerank: reranks with Cohere down to the top 10."

Authentication & Security

All MCP servers use your existing Mixpeek API key with the same permissions as the REST API.
  • HTTP transport: Pass the API key in the Authorization: Bearer header. The server extracts it and injects it into every tool call.
  • Stdio transport: Set the MIXPEEK_API_KEY environment variable or pass api_key in tool arguments.
  • Same RBAC permissions as the REST API
  • Rate limiting per organization
  • Audit logging for all operations
  • TLS encryption on the hosted server
Keep your API key secure. Never commit keys to version control. For the Retriever Server, prefer environment variables over CLI arguments in production.

Architecture

┌─────────────────────────────────────────────────────────┐
│                   Claude / AI App                       │
└──────────────┬──────────────────────────────────────────┘
               │ MCP Protocol (Streamable HTTP / Stdio)

┌─────────────────────────────────────────────────────────┐
│              Mixpeek MCP Server                         │
│  ┌──────────┐ ┌───────────┐ ┌─────────┐ ┌──────────┐  │
│  │   Full   │ │ Ingestion │ │Retrieval│ │  Admin   │  │
│  │  / (43)  │ │ /ing (18) │ │/ret (11)│ │/adm (14) │  │
│  └────┬─────┘ └─────┬─────┘ └────┬────┘ └────┬─────┘  │
│       └──────────────┴────────────┴───────────┘        │
│                    Tool Handlers                        │
└──────────────┬──────────────────────────────────────────┘
               │                    ┌──────────────────┐
               │                    │ Retriever Server  │
               │                    │ (per-retriever)   │
               │                    │ 3 tools           │
               │                    └────────┬─────────┘
               └──────────┬─────────────────┘
                          │ Direct service calls

               ┌──────────────────────┐
               │   Mixpeek Services   │
               │ MongoDB · Qdrant     │
               │ Redis · S3 · Ray     │
               └──────────────────────┘
The scoped servers share the same codebase and tool handlers — scoping controls which tools are registered, not how they execute. Each scoped sub-app is mounted at its path prefix (/ingestion, /retrieval, /admin) while the full server handles root-level requests.
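The registration model can be illustrated with a sketch. The tool names and counts come from the tables on this page; the grouping structure itself is an assumption about how the server organizes registration internally:

```python
# Scope → registered tools. Scoping controls registration, not execution:
# every scope draws from the same shared handler set.
INGESTION = [
    "create_bucket", "list_buckets", "get_bucket", "update_bucket",
    "delete_bucket", "upload_object",
    "create_collection", "list_collections", "get_collection",
    "update_collection", "clone_collection", "trigger_collection",
    "delete_collection",
    "create_document", "list_documents", "get_document",
    "update_document", "delete_document",
]  # 18 tools
RETRIEVAL = [
    "create_retriever", "list_retrievers", "get_retriever",
    "update_retriever", "clone_retriever", "execute_retriever",
    "delete_retriever",
    "create_agent_session", "send_agent_message", "get_agent_history",
    "search_namespace",
]  # 11 tools
ADMIN = [
    "create_namespace", "list_namespaces", "get_namespace",
    "update_namespace", "delete_namespace",
    "create_taxonomy", "list_taxonomies", "get_taxonomy",
    "execute_taxonomy", "delete_taxonomy",
    "create_cluster", "list_clusters", "execute_cluster", "delete_cluster",
]  # 14 tools

SCOPES = {
    "ingestion": INGESTION,
    "retrieval": RETRIEVAL,
    "admin": ADMIN,
    "full": INGESTION + RETRIEVAL + ADMIN,  # 43 tools
}


def registered_tools(scope: str) -> list[str]:
    """Return the tool names registered for a given --scope value."""
    return SCOPES[scope]
```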

Troubleshooting

Server not connecting
  • Verify the URL is correct (e.g. https://mcp.mixpeek.com/ingestion/mcp)
  • Check that the Authorization header format is Bearer YOUR_API_KEY
  • Restart Claude Desktop or Claude Code after changing the config

Authentication errors
  • Verify your API key at mixpeek.com/dashboard
  • Check that the key has permissions for the namespace you’re accessing
  • Make sure there are no extra spaces in the key

Tool not found
  • You may be calling a tool on the wrong scoped server (e.g. execute_retriever on /ingestion)
  • Check GET /tools on the scoped endpoint to see the available tools
  • Use the full server (/) if you need all tools

Retriever Server won’t start
  • Ensure --retriever-id and --namespace-id are correct
  • Verify the API key has access to that namespace
  • Check that the retriever exists: GET /v1/retrievers/{id}

Empty search results
  • Confirm your collection has processed documents (not just uploaded files)
  • Check that the retriever’s feature_uri matches your collection’s extractor
  • Try a broader query or remove optional filters

Slow responses
  • Large file uploads take time proportional to file size and network speed
  • Multi-stage retrievers with LLM enrichment or reranking take more time
  • Check status.mixpeek.com for service issues

Next Steps