Cutsio
A collaborative video editing platform for professional post-production teams working with raw cinema camera footage across commercials, film, and episodic television.
-99%
Footage Search Time
Fully Automated
Raw Format Support
-86%
Editor Time on Search
Fully Automated
Face Identification
The Challenge
Professional video editors working with RED R3D and ARRI RAW footage had no way to search their media libraries visually. Each project involved thousands of hours of raw clips spread across cloud storage, and finding the right shot meant scrubbing through footage manually or relying on sparse, hand-typed notes. Editors spent 30-40% of their time just looking for clips instead of cutting. Converting raw cinema camera formats for preview added another bottleneck, since tools like DaVinci Resolve had to run locally on each editor's machine before footage could even be reviewed.
Pipeline Architecture
End-to-end flow from raw footage ingestion to visual search, with full audit trail at every stage.
Mux Selective Sync
Filters assets by passthrough metadata (e.g. vi:1 flag). Cutsio controls eligibility per asset; Mixpeek indexes only matching assets. Webhook + reconciliation keep the index in sync when flags change.
RAW Video Conversion
A custom extractor (v3.1.0) passes standard video through untouched and transcodes only RED R3D (via REDline) and ARRI RAW (via ART CMD) to H.264 server-side, so DaVinci Resolve is not needed.
Multimodal Decomposition
Splits video into segments with timestamps, extracting scene-level embeddings for composition, subjects, motion, color palette, and on-screen text.
Face Identity Search
Runs face detection (SCRFD) and identity embedding (ArcFace) on each video segment. Upload a photo of a person and find every moment they appear.
Visual Search
Editors search by reference frame, natural language, or both. Results ranked by scene similarity with sub-second latency.
Audit Trail
Every ingestion, conversion, and search is logged. Producers see what was processed, when, and how, with full lineage.
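As a rough illustration of what "full lineage" could mean in practice, the sketch below models audit events that point back to the event they were derived from. The `AuditEvent` type, its fields, and the `lineage` helper are hypothetical names chosen for this example; the source does not describe Mixpeek's actual audit schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    """One logged step. All field names here are illustrative assumptions."""
    event_id: str
    stage: str                       # e.g. "ingestion", "conversion", "search"
    asset_id: str
    parent_id: Optional[str] = None  # the event this one was derived from


def lineage(events: dict[str, AuditEvent], event_id: str) -> list[str]:
    """Follow parent links back to the root and return stages oldest-first."""
    chain = []
    current = events.get(event_id)
    while current is not None:
        chain.append(current.stage)
        current = events.get(current.parent_id) if current.parent_id else None
    return list(reversed(chain))


log = {
    "e1": AuditEvent("e1", "ingestion", "asset-42"),
    "e2": AuditEvent("e2", "conversion", "asset-42", parent_id="e1"),
    "e3": AuditEvent("e3", "decomposition", "asset-42", parent_id="e2"),
}
print(lineage(log, "e3"))  # ['ingestion', 'conversion', 'decomposition']
```

Storing only a parent pointer per event keeps each log entry small while still letting a producer reconstruct how any indexed segment was produced.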
Mux Integration Deep Dive
Mixpeek's Mux connector goes beyond basic asset syncing: it supports metadata-driven selective indexing, real-time webhook updates, and automatic reconciliation. The integration was co-designed with Cutsio so that Cutsio remains the source of truth, Mux carries the sync hint, and Mixpeek handles the indexing.
Selective Sync via Metadata Filters
Sync configs support metadata_filters on any Mux asset field, including passthrough. Cutsio sets a flag on each asset (e.g. vi:1) and Mixpeek indexes only the assets that match, keeping the index lean and cost-efficient.
{ "field": "passthrough", "operator": "contains", "value": "vi:1" }
Real-Time Webhook Updates
When an asset's passthrough changes, the video.asset.updated webhook fires and Mixpeek re-evaluates filters immediately. There is no waiting for the next scheduled sync; flag changes are reflected in seconds.
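The webhook behavior described above might look something like the following sketch. It assumes the event carries a `type` string and a `data` object holding the asset's `id` and `passthrough`, and it assumes that ineligible assets are unindexed immediately (that is, reconciliation is enabled). `handle_mux_event`, `has_vi_flag`, and the in-memory `index` set are hypothetical stand-ins, not Mixpeek's implementation.

```python
from typing import Callable


def handle_mux_event(event: dict,
                     is_eligible: Callable[[dict], bool],
                     index: set[str]) -> str:
    """Apply one webhook event to the index and report what happened."""
    if event.get("type") != "video.asset.updated":
        return "ignored"
    asset = event.get("data", {})
    asset_id = asset.get("id")
    if not asset_id:
        return "ignored"
    if is_eligible(asset):
        if asset_id in index:
            return "unchanged"
        index.add(asset_id)
        return "indexed"
    if asset_id in index:
        index.discard(asset_id)
        return "unindexed"
    return "skipped"


def has_vi_flag(asset: dict) -> bool:
    """Eligibility rule from the case study: passthrough contains vi:1."""
    return "vi:1" in (asset.get("passthrough") or "")


index: set[str] = set()
flag_on = {"type": "video.asset.updated",
           "data": {"id": "a1", "passthrough": "vi:1"}}
flag_off = {"type": "video.asset.updated",
            "data": {"id": "a1", "passthrough": "vi:0"}}
print(handle_mux_event(flag_on, has_vi_flag, index))   # indexed
print(handle_mux_event(flag_off, has_vi_flag, index))  # unindexed
```

Returning a small status string per event makes each decision easy to record in an audit log.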
Automatic Reconciliation
Enable reconcile_on_sync and assets that no longer match filters (e.g. vi:0) are automatically unindexed, both on scheduled syncs and via webhook. No orphaned data, no manual cleanup.
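Conceptually, reconciliation is a set difference between what is currently indexed and what is currently eligible. The sketch below shows that idea only; the `reconcile` function and its inputs are hypothetical, and a real sync would page through the Mux asset list rather than take it as an in-memory list.

```python
from typing import Callable


def reconcile(indexed_ids: set[str],
              assets: list[dict],
              is_eligible: Callable[[dict], bool]) -> tuple[set[str], set[str]]:
    """Return (ids to index, ids to unindex) for one sync pass.

    Anything indexed but no longer eligible is unindexed. That includes
    assets that have disappeared from the listing entirely.
    """
    eligible_ids = {a["id"] for a in assets if is_eligible(a)}
    to_index = eligible_ids - indexed_ids
    to_unindex = indexed_ids - eligible_ids
    return to_index, to_unindex


assets = [
    {"id": "a1", "passthrough": "vi:1"},
    {"id": "a2", "passthrough": "vi:0"},  # flag was switched off
    {"id": "a3", "passthrough": "vi:1"},  # newly flagged
]
to_index, to_unindex = reconcile(
    {"a1", "a2"},
    assets,
    lambda a: "vi:1" in (a.get("passthrough") or ""),
)
print(sorted(to_index), sorted(to_unindex))  # ['a3'] ['a2']
```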
Metadata vs. File Patterns
metadata_filters and include_patterns are completely separate. File patterns match on filenames/paths, while metadata filters operate on Mux asset-level fields. Use both together or independently.
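To make the distinction concrete, here is a minimal sketch that evaluates the two mechanisms independently. Only the `contains` operator appears in the source, so it is the only one handled. Glob-style syntax for include_patterns and requiring both checks to pass when both are configured are assumptions of this example, and `matches_metadata` and `should_index` are hypothetical helper names.

```python
from fnmatch import fnmatch
from typing import Optional


def matches_metadata(asset: dict, filters: list[dict]) -> bool:
    """Every filter must match the named Mux asset field."""
    for f in filters:
        actual = str(asset.get(f["field"]) or "")
        if f["operator"] == "contains":
            if f["value"] not in actual:
                return False
        else:
            # Other operators are not documented in the case study.
            raise ValueError(f"unsupported operator: {f['operator']}")
    return True


def should_index(asset: dict,
                 path: str,
                 metadata_filters: Optional[list[dict]] = None,
                 include_patterns: Optional[list[str]] = None) -> bool:
    """Apply each configured check on its own; unset checks are skipped."""
    if metadata_filters and not matches_metadata(asset, metadata_filters):
        return False
    if include_patterns and not any(fnmatch(path, p) for p in include_patterns):
        return False
    return True


filters = [{"field": "passthrough", "operator": "contains", "value": "vi:1"}]
asset = {"id": "a1", "passthrough": "proj:7;vi:1"}
print(should_index(asset, "dailies/A001.mov", filters, ["dailies/*"]))  # True
print(should_index(asset, "archive/A001.mov", filters, ["dailies/*"]))  # False
```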
Cutsio's Production Flow
User uploads video
Cutsio app
Eligibility check
Plan / settings
Set passthrough flag
vi:1 on Mux asset
Selective sync
Mixpeek filters
Visual search ready
< 2 min end-to-end
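The "set passthrough flag" step in the flow above could be sketched as follows. Because the documented filter uses `contains`, this example assumes passthrough may carry other tokens alongside the flag and joins them with `;`, which is purely an illustrative convention. `set_vi_flag` is a hypothetical helper, and the eligibility decision itself (plan and settings) is left to the caller.

```python
import re

MAX_PASSTHROUGH = 255  # Mux documents a 255-character limit on passthrough

_FLAG = re.compile(r"\bvi:[01]\b")


def set_vi_flag(passthrough: str, eligible: bool) -> str:
    """Return passthrough with the vi flag set, replacing any existing one."""
    flag = "vi:1" if eligible else "vi:0"
    if _FLAG.search(passthrough):
        updated = _FLAG.sub(flag, passthrough, count=1)
    elif passthrough:
        updated = f"{passthrough};{flag}"  # ';' separator is an assumption
    else:
        updated = flag
    if len(updated) > MAX_PASSTHROUGH:
        raise ValueError("passthrough would exceed the length limit")
    return updated


print(set_vi_flag("", True))             # vi:1
print(set_vi_flag("proj:7;vi:0", True))  # proj:7;vi:1
print(set_vi_flag("proj:7", False))      # proj:7;vi:0
```

The resulting string would then be written to the Mux asset, after which the webhook and selective sync take over.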
The Solution
Mixpeek's pipeline connects directly to Cutsio's Mux video infrastructure through a selective sync that filters which assets get indexed based on Mux asset-level passthrough metadata. Cutsio sets a flag on each asset (e.g. vi:1) to control eligibility, and Mixpeek's metadata_filters index only the assets that match. When the flag changes, a video.asset.updated webhook re-evaluates the filters in real time, and reconcile_on_sync unindexes assets that no longer qualify.
A custom video conversion extractor (v3.1.0) passes standard video formats through untouched and transcodes only RED R3D and ARRI RAW, using the built-in REDline and ARRI Reference Tool CMD (ART CMD) respectively, so DaVinci Resolve is not needed. Converted footage flows into a multimodal decomposition collection that splits video into timestamped segments and extracts scene-level embeddings for composition, subjects, motion, color palette, and on-screen text. A face_identity_extractor@v1 then runs on each segment, detecting faces with SCRFD and embedding them with ArcFace.
Two retrievers serve the index: a visual search retriever for text and image queries, and a face search retriever that lets editors upload a photo of a person and find every video moment where they appear.
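The face search retriever can be pictured as a nearest-neighbor lookup over per-segment face embeddings. The sketch below uses cosine similarity, the usual comparison for ArcFace-style embeddings, on toy 3-dimensional vectors. The segment dictionary layout, the `find_appearances` name, and the 0.5 threshold are illustrative assumptions, not values from the source.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_appearances(query: list[float],
                     segments: list[dict],
                     threshold: float = 0.5) -> list[tuple[float, float, float]]:
    """Return (start, end, score) for segments whose best face match clears
    the threshold, highest score first."""
    hits = []
    for seg in segments:
        score = max((cosine(query, face) for face in seg["faces"]), default=0.0)
        if score >= threshold:
            hits.append((seg["start"], seg["end"], score))
    return sorted(hits, key=lambda h: h[2], reverse=True)


segments = [
    {"start": 0.0, "end": 5.0, "faces": [[0.9, 0.1, 0.0]]},
    {"start": 5.0, "end": 10.0, "faces": [[0.0, 1.0, 0.0]]},
    {"start": 10.0, "end": 15.0, "faces": []},  # no faces detected
]
for start, end, score in find_appearances([1.0, 0.0, 0.0], segments):
    print(f"{start:.0f}s to {end:.0f}s, similarity {score:.2f}")
```

A production retriever would use an approximate nearest-neighbor index rather than a linear scan, but the matching criterion is the same.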
Implementation
Cutsio's pipeline runs as three chained collections: the first handles raw format detection and transcoding (passing standard video through, transcoding R3D and ARRI RAW to MP4), the second performs multimodal feature extraction on the converted output, and the third runs face detection and identity embedding on each video segment. A Mux selective sync uses metadata_filters on the asset passthrough field to control which assets are indexed, so Cutsio remains the source of truth, Mux carries the sync hint, and Mixpeek handles the indexing. The video.asset.updated webhook re-evaluates filters when passthrough changes, and reconcile_on_sync ensures assets that lose eligibility are automatically unindexed. The custom conversion extractor (v3.1.0) eliminated the external DaVinci Resolve dependency by using RED's and ARRI's native CLI tools, which are pre-installed in the Mixpeek engine base image. End-to-end, a raw R3D file uploaded to Mux becomes visually and facially searchable within minutes.
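The first collection's routing decision (pass standard video through, transcode only raw formats) can be sketched as below. Detecting the format from the file extension is a simplification assumed for this example, as are the `.r3d` and `.ari` mappings and the `plan_conversion` name. The sketch only names the tool that would run and deliberately does not construct REDline or ART CMD command lines.

```python
from pathlib import PurePosixPath

# Extension-to-tool mapping; an illustrative assumption, not the extractor's
# actual detection logic.
RAW_TOOLS = {
    ".r3d": "REDline",
    ".ari": "ART CMD",
}


def plan_conversion(path: str) -> dict:
    """Decide whether a file is passed through or transcoded to MP4."""
    p = PurePosixPath(path)
    tool = RAW_TOOLS.get(p.suffix.lower())
    if tool is None:
        # Standard video is left untouched.
        return {"action": "passthrough", "tool": None, "output": path}
    return {
        "action": "transcode",
        "tool": tool,
        "output": str(p.with_suffix(".mp4")),
    }


print(plan_conversion("reel/A001_C002.R3D"))
print(plan_conversion("reel/interview.mov"))
```

Keeping the decision separate from the transcode itself means the same plan can be written to the audit trail before any conversion work starts.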
Results
Before and after Mixpeek
-99%
Footage Search Time
Fully Automated
Raw Format Support
-86%
Editor Time on Search
Fully Automated
Face Identification
Customer testimonial
"Our editors used to lose half their day hunting for the right take. Now they describe what they need and the exact frame comes back in seconds. The fact that it handles RED and ARRI raw natively, without DaVinci in the loop, was the unlock."