
    Multimodal Knowledge Base

    Consolidate documents, videos, images, and audio into a single searchable knowledge base with RAG capabilities. Supports natural language Q&A across all content types, with citations linking back to the exact source document, video timestamp, or image.

    Modalities: text, video, image, audio
    Status: Production · 2.1K runs
    from mixpeek import Mixpeek
    from openai import OpenAI

    client = Mixpeek(api_key="YOUR_API_KEY")
    openai = OpenAI(api_key="YOUR_OPENAI_KEY")

    # Create collections for each content type
    docs_col = client.collections.create(
        namespace_id="ns_your_namespace",
        name="knowledge_docs",
        extractors=["document-graph-extractor", "text-extractor"]
    )
    videos_col = client.collections.create(
        namespace_id="ns_your_namespace",
        name="knowledge_videos",
        extractors=["multimodal-extractor", "text-extractor"]
    )

    # Build a unified retriever spanning all collections,
    # referencing the IDs returned when the collections were created
    retriever = client.retrievers.create(
        namespace_id="ns_your_namespace",
        name="knowledge_base",
        collection_ids=[docs_col["collection_id"], videos_col["collection_id"]],
        stages=[
            {"type": "feature_search", "top_k": 50},
            {"type": "rerank", "top_k": 10},
            {"type": "rag_prepare"}
        ]
    )

    # Ask a question across all content
    question = "What is our company policy on remote work?"
    results = client.retrievers.execute(
        retriever_id=retriever["retriever_id"],
        query={"text": question}
    )

    # Number each retrieved chunk so the LLM can cite its sources
    context = "\n".join(
        f"[{i + 1}] {doc['text']} (Source: {doc['root_object_id']}, Type: {doc['modality']})"
        for i, doc in enumerate(results["results"])
    )

    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer from this knowledge base:\n{context}"},
            {"role": "user", "content": question}
        ]
    )
    print(response.choices[0].message.content)
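    The intro mentions citations that link back to a source document or video timestamp. As a minimal sketch of how the context-building step above could surface timestamps, assuming video results carry a hypothetical start_time field (this field name is an assumption, not confirmed by the Mixpeek API):

    ```python
    def format_citation(i, doc):
        """Format one retrieved chunk as a numbered citation.

        Assumes each result dict carries 'text', 'root_object_id', and
        'modality' (as in the example above); 'start_time' is a hypothetical
        field for video results and may differ in the real API.
        """
        source = doc["root_object_id"]
        if doc.get("modality") == "video" and "start_time" in doc:
            # Point the citation at the exact moment in the video
            source += f" @ {doc['start_time']:.0f}s"
        return f"[{i + 1}] {doc['text']} (Source: {source}, Type: {doc['modality']})"

    # Example with mock results standing in for results["results"]
    mock = [
        {"text": "Remote work is allowed 3 days/week.",
         "root_object_id": "obj_policy", "modality": "text"},
        {"text": "HR all-hands discussion of the hybrid policy.",
         "root_object_id": "obj_video1", "modality": "video", "start_time": 312.0},
    ]
    print("\n".join(format_citation(i, d) for i, d in enumerate(mock)))
    ```

    Swapping this helper into the context-building step keeps the prompt format identical for text results while enriching video ones.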

    Feature Extractors

    Used in this recipe: document-graph-extractor, text-extractor, multimodal-extractor

    Retriever Stages

    rerank: Rerank documents using cross-encoder models for accurate relevance.
    sort
    summarize: Condense multiple documents into a summary using an LLM.
    reduce
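    Combining the stages listed above follows the same shape as the stages list in the code example earlier on this page. A minimal sketch of a pipeline that reranks and then summarizes, where the parameter name top_k is an assumption carried over from that example:

    ```python
    # Hypothetical retriever stage pipeline built from the stages listed above.
    # Stage names come from this page; "top_k" is assumed from the earlier example.
    stages = [
        {"type": "feature_search", "top_k": 50},  # broad recall across collections
        {"type": "rerank", "top_k": 10},          # cross-encoder relevance reranking
        {"type": "summarize"},                    # condense the top results with an LLM
    ]

    # Sanity check: each narrowing stage keeps no more results than the one before it
    ks = [s["top_k"] for s in stages if "top_k" in s]
    assert ks == sorted(ks, reverse=True)
    ```

    The same list would be passed as the stages argument to client.retrievers.create, as shown in the code example above.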