Dataset QA, Audit & Drift Detection
Detects bias, gaps, duplication, and distribution shifts using baseline clustering snapshots. This is where infrastructure buyers lean in hard.
Modalities: video, image, text
Production
29.0K runs
Why This Matters
Drift detection is ongoing operational monitoring, not a one-time audit. Baseline cluster snapshots become the reference against which data quality is measured over time.
from mixpeek import Mixpeek

client = Mixpeek(api_key="your-api-key")

# Create baseline cluster snapshot
baseline = client.analytics.cluster(
    collection_id="training_data",
    algorithm="hdbscan",
    snapshot_id="baseline_2024_q1"
)

# Detect drift in new data
drift_report = client.analytics.drift_detection(
    collection_id="training_data",
    baseline_snapshot="baseline_2024_q1",
    current_period={
        "start": "2024-10-01",
        "end": "2024-12-31"
    },
    alert_threshold=0.15
)

# Find outliers (potential novelty or data quality issues)
outliers = client.retrievers.execute(
    retriever_id="outlier-search",
    inputs={
        "drift_score_min": 0.8,
        "cluster_id": None  # Unassigned to any cluster
    }
)
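To make the idea of a drift score against a baseline snapshot concrete, here is a minimal standalone sketch. It does not describe how Mixpeek computes drift internally. It assumes each baseline cluster is summarized by a centroid (for example, the mean of its members), assigns every embedding to its nearest centroid, and measures the Jensen-Shannon divergence between the baseline and current cluster shares. The names nearest_centroid, cluster_shares, js_divergence, and drift_score are hypothetical, and reusing 0.15 as the cutoff simply mirrors the alert_threshold value in the recipe.

```python
import math
from collections import Counter


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def cluster_shares(points, centroids):
    """Fraction of `points` assigned to each centroid, in centroid order."""
    if not points:
        raise ValueError("points must not be empty")
    counts = Counter(nearest_centroid(p, centroids) for p in points)
    return [counts.get(i, 0) / len(points) for i in range(len(centroids))]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    def kl(a, b):
        # Terms with a == 0 contribute nothing; b is the mixture, so b > 0 whenever a > 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mixture = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


def drift_score(baseline_points, current_points, centroids):
    """Divergence between baseline and current cluster-share distributions."""
    return js_divergence(
        cluster_shares(baseline_points, centroids),
        cluster_shares(current_points, centroids),
    )


# Toy data: two assumed baseline centroids in 2-D (hypothetical values).
centroids = [(0.0, 0.0), (10.0, 10.0)]
baseline_points = [(0, 1), (1, 0), (9, 10), (10, 9)]       # split evenly across clusters
current_points = [(9, 9), (10, 10), (11, 10), (10, 11)]    # everything moved to cluster 1

ALERT_THRESHOLD = 0.15  # mirrors alert_threshold in the recipe; an assumption here
score = drift_score(baseline_points, current_points, centroids)
if score > ALERT_THRESHOLD:
    print(f"drift alert: score={score:.3f}")
```

Comparing cluster shares rather than raw embeddings keeps the score bounded between 0 and 1 and makes a fixed threshold easier to interpret. Other measures, such as the population stability index, would fit the same structure.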
Feature Extractors
Image Embedding
Generate visual embeddings for similarity search and clustering
752K runs
Text Embedding
Extract semantic embeddings from documents, transcripts and text content
827K runs
Video Embedding
Generate vector embeddings for video content
610K runs
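The embeddings these extractors produce are what make duplication checks possible. As a rough illustration, and not the platform's implementation, the sketch below flags near-duplicate items by pairwise cosine similarity over plain Python lists. The function names and the 0.98 threshold are assumptions chosen for the example. The pairwise loop is quadratic, so a real dataset would need an approximate nearest-neighbor index instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_near_duplicates(embeddings, threshold=0.98):
    """Return (i, j, similarity) for every pair whose similarity meets the threshold."""
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            similarity = cosine_similarity(embeddings[i], embeddings[j])
            if similarity >= threshold:
                pairs.append((i, j, similarity))
    return pairs


# Toy embeddings (hypothetical): items 0 and 1 point in the same direction.
embeddings = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]]
duplicates = find_near_duplicates(embeddings)
for i, j, similarity in duplicates:
    print(f"items {i} and {j} look duplicated (similarity={similarity:.2f})")
```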
Retriever Stages
Enrichment Resources
Clustering
Analytics
