Hierarchical Taxonomy Classification
Assigns content to multi-level taxonomies using embedding-based classification. Taxonomies are reusable control planes that define organizational structure.
"Show all educational tutorial videos classified under safe content with high confidence"
Why This Matters
Taxonomies are infrastructure, not model outputs. Once defined, they enable consistent classification, compliance tagging, and structured filtering across all content.
from mixpeek import Mixpeek

client = Mixpeek(api_key="your-api-key")

# Define taxonomy structure
taxonomy = client.taxonomies.create(
    taxonomy_name="content_classification",
    hierarchy={
        "safe_content": {
            "educational": ["tutorial", "documentary"],
            "entertainment": ["comedy", "music"]
        },
        "review_required": {
            "ambiguous": ["political", "news"]
        }
    }
)

# Classify content
result = client.collections.classify(
    collection_id="my-collection",
    taxonomy_id=taxonomy.id,
    confidence_threshold=0.75
)

# Filter by taxonomy in retriever
results = client.retrievers.execute(
    retriever_id="filtered-retriever",
    inputs={
        "query_text": "educational videos",
        "taxonomy_path": "safe_content.educational"
    }
)
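To make the idea of embedding-based classification against a hierarchy more concrete, here is a minimal, library-free sketch. It is not Mixpeek's implementation: the helper names (flatten_hierarchy, cosine, classify) and the hand-made three-dimensional vectors are assumptions chosen for illustration, and a real system would take its embeddings from a feature extractor. The sketch flattens the nested taxonomy into dotted paths, scores a content embedding against each label embedding with cosine similarity, and returns no label when the best score falls below the confidence threshold.

```python
import math


def flatten_hierarchy(node, prefix=""):
    """Yield a dotted path for every leaf label in a nested taxonomy."""
    if isinstance(node, dict):
        for name, child in node.items():
            yield from flatten_hierarchy(child, f"{prefix}{name}.")
    else:  # a list of leaf labels
        for leaf in node:
            yield f"{prefix}{leaf}"


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(embedding, label_embeddings, confidence_threshold=0.75):
    """Return (path, score) for the closest label, or None below the threshold."""
    scored = ((path, cosine(embedding, vec)) for path, vec in label_embeddings.items())
    path, score = max(scored, key=lambda item: item[1])
    return (path, score) if score >= confidence_threshold else None


hierarchy = {
    "safe_content": {
        "educational": ["tutorial", "documentary"],
        "entertainment": ["comedy", "music"],
    },
    "review_required": {"ambiguous": ["political", "news"]},
}
paths = list(flatten_hierarchy(hierarchy))

# Hypothetical toy embeddings for three of the leaf labels (assumption).
label_embeddings = {
    "safe_content.educational.tutorial": [1.0, 0.0, 0.0],
    "safe_content.entertainment.music": [0.0, 1.0, 0.0],
    "review_required.ambiguous.news": [0.0, 0.0, 1.0],
}

confident = classify([0.9, 0.1, 0.0], label_embeddings)
uncertain = classify([1.0, 1.0, 1.0], label_embeddings)
```

In this sketch, confident holds the tutorial path with its similarity score, while uncertain is None because the vector sits equally far from every label. Treating low-confidence items as unlabeled is one possible design choice; routing them to a review branch of the taxonomy is another.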
Retrieval Flow
Filter by taxonomy labels
Sort by confidence score
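The two steps above can be sketched in plain Python. This is an illustration only: the document field names taxonomy_path and confidence, and the function filter_and_sort, are hypothetical and not taken from Mixpeek's schema. A document matches when its path equals the requested path or sits beneath it in the hierarchy, and matches are returned with the highest confidence first.

```python
def filter_and_sort(documents, taxonomy_path, min_confidence=0.0):
    """Keep documents at or below taxonomy_path, ordered by confidence (descending)."""
    prefix = taxonomy_path + "."
    matches = [
        doc
        for doc in documents
        if (doc["taxonomy_path"] == taxonomy_path
            or doc["taxonomy_path"].startswith(prefix))
        and doc["confidence"] >= min_confidence
    ]
    return sorted(matches, key=lambda doc: doc["confidence"], reverse=True)


# Hypothetical classified documents (field names are assumptions).
documents = [
    {"id": "a", "taxonomy_path": "safe_content.educational.tutorial", "confidence": 0.81},
    {"id": "b", "taxonomy_path": "safe_content.entertainment.music", "confidence": 0.95},
    {"id": "c", "taxonomy_path": "safe_content.educational.documentary", "confidence": 0.92},
    {"id": "d", "taxonomy_path": "safe_content.educational.tutorial", "confidence": 0.60},
]

results = filter_and_sort(documents, "safe_content.educational", min_confidence=0.75)
result_ids = [doc["id"] for doc in results]
```

Matching on the path plus a trailing dot avoids accidentally including a sibling whose name merely starts with the same characters.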
Feature Extractors
Image Embedding
Generate visual embeddings for similarity search and clustering
Text Embedding
Extract semantic embeddings from documents, transcripts and text content
Video Embedding
Generate vector embeddings for video content
Retriever Stages
attribute filter
Filter documents by metadata attributes
sort
Sort documents by field values
