Feature Extraction
Multi-tier feature extraction decomposes content into searchable components: embeddings, transcripts, detected objects, OCR text, scene boundaries, and more. These components are the foundation for all downstream retrieval and analysis.
"Find meeting recordings where someone discusses quarterly roadmap near a whiteboard"
Why This Matters
Raw media is unsearchable. Feature extraction transforms video, images, and audio into structured, queryable representations that power every other recipe.
import requests

API_URL = "https://api.mixpeek.com"
headers = {"Authorization": "Bearer YOUR_API_KEY", "X-Namespace": "your-namespace"}

# Create collection with feature extractor
collection = requests.post(f"{API_URL}/v1/collections", headers=headers, json={
    "collection_name": "enriched_media",
    "source": {"type": "bucket", "bucket_id": "raw-media"},
    "feature_extractor": {
        "feature_extractor_name": "multimodal_extractor",
        "version": "v1",
        "input_mappings": {"video": "source_video"},
        "parameters": {"enable_transcription": True}
    }
}).json()

# Index content - extraction happens automatically
requests.post(f"{API_URL}/v1/buckets/raw-media/objects", headers=headers, json={
    "blobs": [{"property": "source_video", "url": "s3://bucket/meeting.mp4"}]
})

# Search across all extracted features
results = requests.post(
    f"{API_URL}/v1/retrievers/enriched-search/execute",
    headers=headers,
    json={"query": {"text": "quarterly roadmap discussion"}}
).json()

# Access extracted features directly
for doc in results["documents"]:
    print(f"Transcript: {doc.get('transcript', '')[:200]}...")
    print(f"Feature URI: {doc['feature_address']}")
Feature Extractors
Image Embedding
Generate visual embeddings for similarity search and clustering
Video Embedding
Generate vector embeddings for video content
Audio Transcription
Transcribe audio content to text
Text Embedding
Extract semantic embeddings from documents, transcripts and text content
Object Detection
Identify and locate objects within images with bounding boxes
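The embedding extractors listed here produce vectors that are typically compared by similarity. As a rough illustration only, the sketch below ranks a few toy vectors by cosine similarity using the standard library. The function names (`cosine_similarity`, `top_k`), the file names, and the 3-dimensional vectors are hypothetical and chosen for this example; they are not the service's API, and the metric actually used by the service is not stated in this page.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query, embeddings, k=2):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy embeddings; real extractor output has far more dimensions.
toy_embeddings = {
    "whiteboard.jpg": [0.9, 0.1, 0.0],
    "meeting_room.jpg": [0.8, 0.3, 0.1],
    "beach.jpg": [0.0, 0.2, 0.9],
}
nearest = top_k([1.0, 0.0, 0.0], toy_embeddings, k=2)
```

In this toy data, `nearest` holds the two images whose vectors point most nearly in the same direction as the query. Clustering works on the same principle, grouping items whose vectors are close to each other.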
Retriever Stages
feature search
Search and filter documents by vector similarity using feature embeddings
attribute filter
Filter documents by metadata attribute values using boolean logic
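To make "boolean logic" over metadata concrete, here is a minimal sketch of how a nested filter could be evaluated locally and then combined with a similarity score. The expression format (`AND`, `OR`, `NOT`, `field`/`op`/`value`), the `matches` helper, and the sample documents are all assumptions made for illustration; they are not the retriever's actual filter schema.

```python
def matches(doc, expr):
    """Evaluate a nested boolean filter expression against one document."""
    if "AND" in expr:
        return all(matches(doc, sub) for sub in expr["AND"])
    if "OR" in expr:
        return any(matches(doc, sub) for sub in expr["OR"])
    if "NOT" in expr:
        return not matches(doc, expr["NOT"])
    # Leaf condition: compare one metadata field to a value.
    actual = doc.get(expr["field"])
    if expr["op"] == "eq":
        return actual == expr["value"]
    if expr["op"] == "gte":
        return actual is not None and actual >= expr["value"]
    raise ValueError(f"unsupported operator: {expr['op']}")

# Hypothetical documents; "score" stands in for a precomputed similarity score.
docs = [
    {"id": "a", "type": "video", "duration_s": 1800, "score": 0.91},
    {"id": "b", "type": "image", "duration_s": None, "score": 0.95},
    {"id": "c", "type": "video", "duration_s": 45, "score": 0.88},
]

# Keep only videos that run at least ten minutes.
filter_expr = {"AND": [
    {"field": "type", "op": "eq", "value": "video"},
    {"field": "duration_s", "op": "gte", "value": 600},
]}

kept = [d for d in docs if matches(d, filter_expr)]
ranked = sorted(kept, key=lambda d: d["score"], reverse=True)
```

Filtering before ranking keeps the highest-scoring image (`b`) out of the results, which is the point of pairing an attribute filter with a similarity search: the filter narrows the candidate set and the similarity score orders what remains.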
