The langchain-mixpeek package gives LangChain agents the ability to see video, hear audio, search images, and act on unstructured content, all through Mixpeek’s multimodal infrastructure.

Installation

pip install langchain-mixpeek

Quick Start

1. Search (Retriever)

from langchain_mixpeek import MixpeekRetriever

retriever = MixpeekRetriever(
    api_key="mxp_...",
    retriever_id="ret_abc123",
    namespace="my-namespace",
)
docs = retriever.invoke("find the red cup")
Each result is a LangChain Document with page_content and metadata (document_id, score, namespace).
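As a rough sketch of working with these results, the snippet below filters by score and prints a one-line summary per document. It uses a small stand-in class, Doc, so that it runs without network access; in real use the items come from retriever.invoke(...) and are LangChain Document objects. The helper names, the 0.5 threshold, and the sample values are made up for illustration, and the sketch assumes that a higher score means a better match.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (same two attributes)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def filter_by_score(docs, min_score):
    """Keep documents whose metadata["score"] is at least min_score."""
    return [d for d in docs if d.metadata.get("score", 0.0) >= min_score]


def summarize(doc, width=40):
    """One-line summary: document_id, score, and a truncated preview."""
    meta = doc.metadata
    preview = doc.page_content[:width]
    return f'{meta.get("document_id")} ({meta.get("score", 0.0):.2f}): {preview}'


# Illustrative data; real results come from retriever.invoke(...).
docs = [
    Doc("A red cup on a wooden table",
        {"document_id": "doc_1", "score": 0.91, "namespace": "my-namespace"}),
    Doc("A blue mug next to a laptop",
        {"document_id": "doc_2", "score": 0.42, "namespace": "my-namespace"}),
]

kept = filter_by_score(docs, min_score=0.5)
for doc in kept:
    print(summarize(doc))
```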

2. Agent Tool

from langchain_mixpeek import MixpeekRetriever

retriever = MixpeekRetriever(
    api_key="mxp_...",
    retriever_id="ret_abc123",
    namespace="my-namespace",
)

# One line: the retriever becomes an agent tool
tool = retriever.as_tool()

3. Full Toolkit (search + ingest + classify + cluster + alert)

from langchain_mixpeek import MixpeekToolkit
from langgraph.prebuilt import create_react_agent
from langchain_anthropic import ChatAnthropic

toolkit = MixpeekToolkit(
    api_key="mxp_...",
    namespace="my-namespace",
    bucket_id="bkt_abc123",
    collection_id="col_def456",
    retriever_id="ret_ghi789",
)

agent = create_react_agent(
    ChatAnthropic(model="claude-sonnet-4-20250514"),
    toolkit.get_tools(),
)

result = agent.invoke({
    "messages": [("user", "Scan these product URLs and alert me about counterfeits")]
})
The toolkit gives your agent 6 capabilities:
| Tool | What it does |
| --- | --- |
| mixpeek_search | Search video, images, audio, documents by natural language |
| mixpeek_ingest | Upload text, images, video, audio, PDFs, spreadsheets |
| mixpeek_process | Trigger feature extraction (embedding, OCR, transcription, face detection) |
| mixpeek_classify | Run taxonomy classification on documents |
| mixpeek_cluster | Group similar documents (kmeans, dbscan, hdbscan, etc.) |
| mixpeek_alert | Set up monitoring with webhook, Slack, or email notifications |

4. VectorStore (full pipeline)

from langchain_mixpeek import MixpeekVectorStore

store = MixpeekVectorStore(
    api_key="mxp_...",
    namespace="my-namespace",
    bucket_id="bkt_abc123",
    collection_id="col_def456",
    retriever_id="ret_ghi789",
)

# Ingest any content type
store.add_texts(["product description..."])
store.add_images(["https://example.com/photo.jpg"])
store.add_videos(["https://example.com/clip.mp4"])
store.add_audio(["https://example.com/recording.mp3"])
store.add_pdfs(["https://example.com/doc.pdf"])
store.add_excel(["https://example.com/data.xlsx"])

# Trigger processing (embedding, OCR, face detection, etc.)
store.trigger_processing()

# Search
docs = store.similarity_search("red cup on the table")

# Convert to agent tools anytime
tool = store.as_tool()
toolkit = store.as_toolkit()
retriever = store.as_retriever()

5. Search-Only (minimal config)

If you only need search, skip the bucket/collection config:
store = MixpeekVectorStore.from_retriever(
    api_key="mxp_...",
    namespace="my-namespace",
    retriever_id="ret_abc123",
)
docs = store.similarity_search("red cup")
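
One way to keep the two setups apart is to load the settings from environment variables and check which IDs each mode needs. The sketch below is only an illustration: the environment variable names and the load_config helper are hypothetical and are not defined by the package.

```python
import os

# Hypothetical environment variable names; the package does not define these.
ENV_NAMES = {
    "api_key": "MIXPEEK_API_KEY",
    "namespace": "MIXPEEK_NAMESPACE",
    "retriever_id": "MIXPEEK_RETRIEVER_ID",
    "bucket_id": "MIXPEEK_BUCKET_ID",
    "collection_id": "MIXPEEK_COLLECTION_ID",
}

SEARCH_KEYS = ("api_key", "namespace", "retriever_id")
INGEST_KEYS = ("bucket_id", "collection_id")


def load_config(env=None, search_only=True):
    """Build constructor keyword arguments from environment variables.

    Search-only mode needs api_key, namespace, and retriever_id; the full
    pipeline also needs bucket_id and collection_id.
    """
    env = os.environ if env is None else env
    required = SEARCH_KEYS if search_only else SEARCH_KEYS + INGEST_KEYS
    missing = [ENV_NAMES[k] for k in required if not env.get(ENV_NAMES[k])]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return {k: env[ENV_NAMES[k]] for k in required}


# Example with an explicit mapping instead of the real environment.
example_env = {
    "MIXPEEK_API_KEY": "mxp_...",
    "MIXPEEK_NAMESPACE": "my-namespace",
    "MIXPEEK_RETRIEVER_ID": "ret_abc123",
}
search_kwargs = load_config(example_env, search_only=True)
print(sorted(search_kwargs))
```

The resulting dictionary can then be passed on, for example as MixpeekVectorStore.from_retriever(**search_kwargs); with search_only=False the same helper fails early if the bucket or collection ID is missing.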

Configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | required | Mixpeek API key (mxp_...) |
| retriever_id | str | required | Retriever ID for search (ret_...) |
| namespace | str | required | Namespace to operate in |
| bucket_id | str | required* | Bucket for uploads (bkt_...) |
| collection_id | str | required* | Collection for processing (col_...) |
| top_k | int | 10 (retriever) / 5 (tool) | Max results |
| content_field | str | "text" | Field to use as page_content |
| filters | dict | None | Attribute filters (retriever only) |
*Required for ingest/processing. Not needed for search-only via from_retriever().
The content_field can reference any field in your retriever results — including enrichment fields like trend_insight or brand_alignment. If the field contains a dict with a text key, the text is automatically extracted.
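
The extraction rule can be pictured with a small function. This is a sketch of the behavior described above, not the package's actual implementation; the function name, the sample result shapes, and the empty-string fallback for a missing field are assumptions.

```python
def extract_page_content(result, content_field="text"):
    """Return the text for page_content from one raw retriever result.

    Illustrative only: mirrors the documented behavior, not the
    package's actual implementation.
    """
    value = result.get(content_field)
    # A dict with a "text" key is unwrapped to that text.
    if isinstance(value, dict) and "text" in value:
        value = value["text"]
    # Missing fields become an empty string (an assumption of this sketch).
    return "" if value is None else str(value)


# Made-up result shapes for illustration.
plain = {"text": "red cup on a table", "score": 0.9}
enriched = {"trend_insight": {"text": "mugs are trending", "confidence": 0.8}}

print(extract_page_content(plain))
print(extract_page_content(enriched, content_field="trend_insight"))
```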

Examples

Brand Protection Agent

An agent that scans marketplace listings and alerts on counterfeits:
from langchain_mixpeek import MixpeekToolkit
from langgraph.prebuilt import create_react_agent
from langchain_anthropic import ChatAnthropic

toolkit = MixpeekToolkit(
    api_key="mxp_...",
    namespace="brand-protection",
    bucket_id="bkt_...",
    collection_id="col_...",
    retriever_id="ret_...",
)

# Only give the agent the tools it needs
agent = create_react_agent(
    ChatAnthropic(model="claude-sonnet-4-20250514"),
    toolkit.get_tools(actions=["search", "ingest", "process", "alert"]),
    prompt="You are a brand protection agent. Scan products and flag counterfeits.",
)

result = agent.invoke({
    "messages": [("user", "Check if these 5 Amazon listings are selling counterfeit Stanley cups")]
})

RAG Chain

Standard retrieval-augmented generation:
from langchain_core.prompts import ChatPromptTemplate
from langchain_anthropic import ChatAnthropic
from langchain_mixpeek import MixpeekRetriever

retriever = MixpeekRetriever(
    api_key="mxp_...",
    retriever_id="ret_...",
    namespace="my-namespace",
)
llm = ChatAnthropic(model="claude-sonnet-4-20250514")

prompt = ChatPromptTemplate.from_template(
    "Answer using this context:\n{context}\n\nQuestion: {question}"
)

chain = {"context": retriever, "question": lambda x: x} | prompt | llm
response = chain.invoke("what happens at 2 minutes?")
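
In the chain above, the retriever's list of Documents is inserted into the prompt as its string representation, metadata included. A common LangChain pattern is to join the page_content values first. The helper below is a minimal sketch of that pattern; it uses stand-in objects with made-up text so it runs without the package installed.

```python
from types import SimpleNamespace


def format_docs(docs, separator="\n\n"):
    """Join the page_content of each document into one context string."""
    return separator.join(doc.page_content for doc in docs)


# Stand-ins with the same page_content attribute as LangChain Documents.
docs = [
    SimpleNamespace(page_content="0:00-1:30 The host introduces the product."),
    SimpleNamespace(page_content="1:30-2:30 A demo of the red cup begins."),
]

context = format_docs(docs)
print(context)
```

To use it in the chain, the "context" entry could become retriever | format_docs, since LCEL wraps a plain function when it is composed with a runnable.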

Multi-Retriever Agent

Different retrievers for different content types:
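from langchain_anthropic import ChatAnthropic
from langchain_mixpeek import MixpeekTool
from langgraph.prebuilt import create_react_agent

# Chat model shared by the agent
llm = ChatAnthropic(model="claude-sonnet-4-20250514")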

video_search = MixpeekTool(
    api_key="mxp_...",
    retriever_id="ret_video_archive",
    namespace="archive",
    name="search_video_archive",
    description="Search video archive for specific scenes, faces, or moments.",
)

image_search = MixpeekTool(
    api_key="mxp_...",
    retriever_id="ret_product_images",
    namespace="catalog",
    name="search_product_images",
    description="Search product image catalog by visual similarity.",
)

agent = create_react_agent(llm, [video_search, image_search])

Platform Features

The VectorStore exposes the full Mixpeek platform:

Taxonomies (document classification)

# Create a taxonomy
store.create_taxonomy(
    name="product-categories",
    config={
        "taxonomy_type": "flat",
        "retriever_id": "ret_...",
        "collection_id": "col_...",
        "input_mappings": [...],
        "enrichment_fields": [...],
    },
)

# List and execute
taxonomies = store.list_taxonomies()
results = store.execute_taxonomy("tax_abc123")

Clusters (unsupervised grouping)

# Create and run clustering
cluster = store.create_cluster(
    cluster_type="vector",
    vector_config={
        "algorithm": "kmeans",  # or dbscan, hdbscan, spectral, etc.
        "algorithm_params": {"n_clusters": 10},
    },
)
store.execute_cluster(cluster["cluster_id"])
groups = store.get_cluster_groups(cluster["cluster_id"])

Alerts (match notifications)

# Create an alert with webhook + Slack
store.create_alert(
    name="counterfeit-detection",
    notification_config={
        "channels": [
            {"channel_type": "webhook", "config": {"url": "https://..."}},
            {"channel_type": "slack", "channel_id": "#alerts"},
        ],
        "include_matches": True,
        "include_scores": True,
    },
)

# Check results
results = store.get_alert_results("alt_abc123")

Custom Plugins

# List deployed plugins
plugins = store.list_plugins()

# Check deployment status
status = store.get_plugin_status("plg_abc123")

# Test a realtime plugin
result = store.test_plugin("plg_abc123", inputs={"text": "hello"})

Tips

Selecting Toolkit Actions

Don’t give agents tools they don’t need. Use the actions argument of get_tools() to scope the toolkit:
# Search-only agent
toolkit.get_tools(actions=["search"])

# Ingest + search agent
toolkit.get_tools(actions=["search", "ingest", "process"])

# Full platform agent
toolkit.get_tools()  # all 6 tools
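
If several agents share one toolkit, it can help to keep the action lists in one place. The sketch below is a hypothetical convention, not part of the package: the role names and helpers are invented, and the "classify" and "cluster" action names are inferred from the tool names in the toolkit table rather than confirmed by the examples.

```python
# Action names as used in the examples above; "classify" and "cluster" are
# inferred from the tool names in the toolkit table.
ALL_ACTIONS = ("search", "ingest", "process", "classify", "cluster", "alert")

# Hypothetical roles for illustration.
ROLE_ACTIONS = {
    "reader": ("search",),
    "curator": ("search", "ingest", "process"),
    "admin": ALL_ACTIONS,
}


def actions_for(role):
    """Return the list of actions allowed for a role."""
    if role not in ROLE_ACTIONS:
        raise ValueError(f"unknown role: {role!r}")
    return list(ROLE_ACTIONS[role])


def expected_tool_names(actions):
    """Tool names that should result, following the mixpeek_<action> pattern."""
    return [f"mixpeek_{action}" for action in actions]


print(actions_for("curator"))
print(expected_tool_names(actions_for("reader")))
```

A call such as toolkit.get_tools(actions=actions_for("curator")) would then scope the agent, and comparing [t.name for t in tools] against expected_tool_names(...) gives a quick sanity check.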

Error Handling

All toolkit tools catch exceptions and return error strings instead of crashing the agent. The retriever, by contrast, raises exceptions as usual, so direct retriever calls need their own error handling.
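
The two behaviors can be sketched in plain Python. This illustrates the general pattern only; the helper names and the error string format are assumptions, not the package's own code or output.

```python
import functools


def errors_as_text(func):
    """Wrap a function so exceptions come back as a string for the agent."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # broad on purpose: the agent loop must survive
            return f"Error: {type(exc).__name__}: {exc}"
    return wrapper


def safe_search(search, query, default=None):
    """Call a retriever-style function and fall back to a default on failure."""
    try:
        return search(query)
    except Exception:
        return [] if default is None else default


# Hypothetical failing backend used only to exercise the helpers.
def flaky_search(query):
    raise TimeoutError("upstream timed out")


print(errors_as_text(flaky_search)("red cup"))
print(safe_search(flaky_search, "red cup"))
```

For a real retriever, the guarded call would look like safe_search(retriever.invoke, "find the red cup").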

Token Efficiency

Set top_k to limit results. Every extra result adds tokens to the agent's context, and large result sets often do not improve answer quality. Start with top_k=5.
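
To get a feel for the cost, the sketch below estimates the context size for different top_k values. It relies on the rough rule of thumb of about four characters per token for English text, which is an approximation and not how any particular model's tokenizer counts; the result texts are made up.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate; about 4 characters per token is a common
    rule of thumb for English text, not an exact count."""
    return -(-len(text) // chars_per_token)  # ceiling division


def context_cost(texts, top_k):
    """Estimated tokens used by the first top_k result texts."""
    return sum(estimate_tokens(t) for t in texts[:top_k])


# Made-up results, 400 characters each.
texts = ["x" * 400 for _ in range(20)]

for k in (5, 10, 20):
    print(k, context_cost(texts, k))
```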

Next Steps

MCP Server

Connect Claude directly via the Model Context Protocol

OpenAI Function Calling

Wire Mixpeek into OpenAI assistants

Feature Extractors

15+ extractors: text, image, video, audio, face, PDF, web scraper

Python SDK

Full SDK reference