Media & Entertainment
Startup

Cutsio

A collaborative video editing platform for professional post-production teams working with raw cinema camera footage across commercials, film, and episodic television.

Footage Search Time: -99%

Raw Format Support: Fully Automated

Editor Time on Search: -86%

Face Identification: Fully Automated

The Challenge

Professional video editors working with RED R3D and ARRI RAW footage had no way to search their media libraries visually. Each project involved thousands of hours of raw clips spread across cloud storage, and finding the right shot meant scrubbing through footage manually or relying on sparse, hand-typed notes. Editors spent 30-40% of their time just looking for clips instead of cutting. Converting raw cinema camera formats for preview added another bottleneck, since tools like DaVinci Resolve had to run locally on each editor's machine before footage could even be reviewed.

Pipeline Architecture

End-to-end flow from raw footage ingestion to visual search, with full audit trail at every stage.

Mux Selective Sync

Filters assets by passthrough metadata (e.g. vi:1 flag). Cutsio controls eligibility per asset; Mixpeek indexes only matching assets. Webhook + reconciliation keep the index in sync when flags change.

RAW Video Conversion

Custom extractor (v3.1.0) passes standard video through untouched. It transcodes only RED R3D (via REDline) and ARRI RAW (via ARRI Reference Tool CMD) to H.264 server-side, with no DaVinci Resolve needed.

Multimodal Decomposition

Splits video into segments with timestamps, extracting scene-level embeddings for composition, subjects, motion, color palette, and on-screen text.

Face Identity Search

Runs face detection (SCRFD) and identity embedding (ArcFace) on each video segment. Upload a photo of a person and find every moment they appear.

Visual Search

Editors search by reference frame, natural language, or both. Results ranked by scene similarity with sub-second latency.

Audit Trail

Every ingestion, conversion, and search is logged. Producers see what was processed, when, and how, with full lineage.
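The lineage idea behind the audit trail can be illustrated with a small append-only log in which each event points at the event that produced its input. This is a minimal sketch, not Mixpeek's actual audit schema: the `AuditEvent` class, its field names, and the stage labels are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditEvent:
    """One logged step. All field names here are hypothetical."""
    event_id: str
    stage: str                       # e.g. "ingest", "convert", "decompose"
    asset_id: str
    parent_id: Optional[str] = None  # event that produced this step's input
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def lineage(events: List[AuditEvent], event_id: str) -> List[str]:
    """Walk parent links back to the root and return stages oldest-first."""
    by_id = {e.event_id: e for e in events}
    chain: List[str] = []
    current = by_id.get(event_id)
    while current is not None:
        chain.append(current.stage)
        current = by_id.get(current.parent_id) if current.parent_id else None
    return list(reversed(chain))


log = [
    AuditEvent("e1", "ingest", "asset-1"),
    AuditEvent("e2", "convert", "asset-1", parent_id="e1"),
    AuditEvent("e3", "decompose", "asset-1", parent_id="e2"),
]
print(lineage(log, "e3"))
```

Storing only a parent pointer keeps each record small while still letting a producer reconstruct how any indexed segment came to exist.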

Mux Integration Deep Dive


Mixpeek's Mux connector goes beyond basic asset syncing — it supports metadata-driven selective indexing, real-time webhook updates, and automatic reconciliation. The integration was co-designed with Cutsio so that Cutsio remains the source of truth, Mux carries the sync hint, and Mixpeek handles the indexing.

Selective Sync via Metadata Filters


Sync configs support metadata_filters on any Mux asset field, including passthrough. Cutsio sets a flag on each asset (e.g. vi:1) and Mixpeek only indexes assets that match — keeping the index lean and cost-efficient.

{ "field": "passthrough", "operator": "contains", "value": "vi:1" }
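To make the filter's effect concrete, here is a minimal sketch of how such a rule could be evaluated against asset records. It assumes `contains` is a plain substring test. The `matches_filter` helper and the sample asset dictionaries are hypothetical and are not part of the Mixpeek or Mux APIs.

```python
from typing import Any, List, Mapping


def matches_filter(asset: Mapping[str, Any], flt: Mapping[str, str]) -> bool:
    """Evaluate one metadata filter against an asset (hypothetical evaluator)."""
    value = str(asset.get(flt["field"]) or "")
    if flt["operator"] == "contains":
        # Assumption: "contains" means a simple substring match.
        return flt["value"] in value
    raise ValueError(f"unsupported operator: {flt['operator']}")


VI_FILTER = {"field": "passthrough", "operator": "contains", "value": "vi:1"}

assets: List[dict] = [
    {"id": "a1", "passthrough": "vi:1"},
    {"id": "a2", "passthrough": "vi:0"},
    {"id": "a3"},  # no passthrough set
]
eligible = [a["id"] for a in assets if matches_filter(a, VI_FILTER)]
print(eligible)
```

If `contains` really is a substring match, a passthrough value such as `vi:10` would also satisfy the filter, so flag values are safest when no flag is a prefix of another.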

Real-Time Webhook Updates


When an asset's passthrough changes, the video.asset.updated webhook fires and Mixpeek re-evaluates filters immediately. No waiting for the next scheduled sync — flag changes are reflected in seconds.
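The webhook behaviour can be sketched as a small handler that re-checks eligibility whenever an update event arrives. This is an illustration only: the event is assumed to carry a `type` string and a `data` object with `id` and `passthrough`, the in-memory `indexed` set stands in for the real index, and `handle_mux_event` is a hypothetical name.

```python
from typing import Any, Dict, Set

FLAG = "vi:1"  # eligibility flag used in the case study


def handle_mux_event(event: Dict[str, Any], indexed: Set[str]) -> str:
    """Re-evaluate one asset and return 'index', 'unindex', or 'noop'."""
    if event.get("type") != "video.asset.updated":
        return "noop"  # other event types are ignored in this sketch
    asset = event.get("data", {})
    asset_id = asset["id"]
    eligible = FLAG in (asset.get("passthrough") or "")
    if eligible and asset_id not in indexed:
        indexed.add(asset_id)
        return "index"
    if not eligible and asset_id in indexed:
        indexed.discard(asset_id)
        return "unindex"
    return "noop"


indexed: Set[str] = set()
flag_on = {"type": "video.asset.updated", "data": {"id": "a1", "passthrough": "vi:1"}}
flag_off = {"type": "video.asset.updated", "data": {"id": "a1", "passthrough": "vi:0"}}
first = handle_mux_event(flag_on, indexed)
repeat = handle_mux_event(flag_on, indexed)
removed = handle_mux_event(flag_off, indexed)
print(first, repeat, removed)
```

Making the handler idempotent, so a repeated event is a no-op, matters because webhook deliveries can be retried.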

Automatic Reconciliation


Enable reconcile_on_sync and assets that no longer match filters (e.g. vi:0) are automatically unindexed — both on scheduled syncs and via webhook. No orphaned data, no manual cleanup.
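Conceptually, reconciliation is a set comparison between what is currently indexed and what currently matches the filter. The sketch below shows that comparison under simplifying assumptions: `reconcile` is a hypothetical helper, assets are plain dictionaries, and the filter is reduced to a substring check on `passthrough`.

```python
from typing import Iterable, List, Mapping, Set, Tuple


def reconcile(
    mux_assets: Iterable[Mapping[str, str]],
    indexed_ids: Set[str],
    flag: str = "vi:1",
) -> Tuple[List[str], List[str]]:
    """Return (ids to index, ids to unindex) for one sync pass."""
    matching = {a["id"] for a in mux_assets if flag in (a.get("passthrough") or "")}
    to_index = sorted(matching - indexed_ids)    # newly eligible assets
    to_unindex = sorted(indexed_ids - matching)  # no longer eligible: remove
    return to_index, to_unindex


current_assets = [
    {"id": "a1", "passthrough": "vi:1"},
    {"id": "a2", "passthrough": "vi:0"},  # flag was turned off
    {"id": "a3", "passthrough": "vi:1"},
]
to_index, to_unindex = reconcile(current_assets, indexed_ids={"a1", "a2"})
print(to_index, to_unindex)
```

In this sketch an indexed asset that no longer exists in Mux at all would also land in `to_unindex`, which is one way orphaned entries can be avoided.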

Metadata vs. File Patterns


metadata_filters and include_patterns are completely separate. File patterns match on filenames/paths, while metadata filters operate on Mux asset-level fields. Use both together or independently.
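The separation between the two mechanisms can be sketched as two independent predicates that are combined only at the end. This example assumes `include_patterns` behave like shell-style globs and that each asset record exposes a `path` key. Both are assumptions made for illustration, and the helper names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, Mapping


def path_included(path: str, include_patterns: List[str]) -> bool:
    """File-pattern check: looks only at the filename or path."""
    return not include_patterns or any(fnmatchcase(path, p) for p in include_patterns)


def metadata_included(asset: Mapping[str, str], metadata_filters: List[Mapping[str, str]]) -> bool:
    """Metadata check: looks only at asset-level fields (substring 'contains')."""
    return all(f["value"] in (asset.get(f["field"]) or "") for f in metadata_filters)


def should_index(asset, include_patterns, metadata_filters) -> bool:
    """Both checks must pass. An empty list disables that check."""
    return path_included(asset.get("path", ""), include_patterns) and metadata_included(
        asset, metadata_filters
    )


filters = [{"field": "passthrough", "operator": "contains", "value": "vi:1"}]
patterns = ["dailies/*.mov"]
clip = {"path": "dailies/day01/A001.mov", "passthrough": "vi:1"}
print(should_index(clip, patterns, filters))
```

Passing an empty list for either argument reduces the decision to the other check alone, which mirrors using the two options independently.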

Cutsio's Production Flow

1. User uploads video (Cutsio app)
2. Eligibility check (plan / settings)
3. Set passthrough flag (vi:1 on Mux asset)
4. Selective sync (Mixpeek filters)
5. Visual search ready (< 2 min end-to-end)

The Solution

Mixpeek's pipeline connects directly to Cutsio's Mux video infrastructure through a selective sync that filters which assets get indexed based on Mux asset-level passthrough metadata. Cutsio sets a flag on each asset (e.g. vi:1) to control eligibility, and Mixpeek's metadata_filters index only the assets that match. When the flag changes, a video.asset.updated webhook re-evaluates filters in real time, and reconcile_on_sync cleans up assets that no longer qualify.

A custom video conversion extractor (v3.1.0) passes standard video formats through untouched and transcodes only RED R3D and ARRI RAW, using the built-in REDline and ARRI Reference Tool CMD respectively, so no DaVinci Resolve is needed. Converted footage flows into a multimodal decomposition collection that splits video into segments with timestamps, extracting scene-level embeddings for composition, subjects, motion, color palette, and on-screen text. A face_identity_extractor@v1 then runs on each segment to detect faces (SCRFD) and embed identities (ArcFace).

Two retrievers serve the index: a visual search retriever for text and image queries, and a face search retriever that lets editors upload a photo of a person and find every video moment where they appear.
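The retrieval step behind a face or scene search can be illustrated with a nearest-neighbour ranking over segment embeddings. The sketch below uses cosine similarity, which is the usual way ArcFace-style embeddings are compared. The toy 3-dimensional vectors, the `search_segments` helper, and the `min_score` threshold are assumptions for illustration, not values from Cutsio's deployment.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search_segments(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 3,
    min_score: float = 0.5,  # hypothetical cut-off for a "match"
) -> List[Tuple[str, float]]:
    """Rank indexed segments by similarity to the query embedding."""
    scored = [(seg_id, cosine(query, emb)) for seg_id, emb in index.items()]
    hits = [(seg_id, score) for seg_id, score in scored if score >= min_score]
    return sorted(hits, key=lambda h: h[1], reverse=True)[:top_k]


# Toy index: segment id -> embedding (real embeddings have far more dimensions).
index = {
    "clip_A@00:12": [0.9, 0.1, 0.0],
    "clip_B@03:40": [0.0, 1.0, 0.0],
    "clip_C@01:05": [0.7, 0.7, 0.0],
}
results = search_segments([1.0, 0.0, 0.0], index)
print([seg_id for seg_id, _ in results])
```

A production system would typically replace the linear scan with an approximate nearest-neighbour index, but the ranking principle is the same.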

Implementation

Cutsio's pipeline runs as three chained collections. The first handles raw format detection and transcoding (passing standard video through, transcoding R3D and ARRI RAW to MP4), the second performs multimodal feature extraction on the converted output, and the third runs face detection and identity embedding on each video segment.

A Mux selective sync uses metadata_filters on the asset passthrough field to control which assets are indexed: Cutsio remains the source of truth, Mux carries the sync hint, and Mixpeek handles the indexing. The video.asset.updated webhook re-evaluates filters when passthrough changes, and reconcile_on_sync ensures assets that lose eligibility are automatically unindexed.

The custom conversion extractor (v3.1.0) eliminated the external DaVinci Resolve dependency by using RED's and ARRI's native CLI tools, which are pre-installed in the Mixpeek engine base image. End to end, a raw R3D file uploaded to Mux becomes searchable by scene and by face within minutes.
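The three-stage chaining can be sketched as plain function composition, with the first stage deciding whether a file needs transcoding. This is a structural illustration only. Detecting raw formats by file extension (`.r3d`, `.ari`), the function names, and passing scene cut points in as an argument are all assumptions, and no RED or ARRI tool is actually invoked.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Optional

# Assumption: raw formats are recognised by extension and mapped to a tool name.
RAW_TRANSCODERS: Dict[str, str] = {
    ".r3d": "REDline",
    ".ari": "ARRI Reference Tool CMD",
}


def convert(path: str) -> dict:
    """Stage 1: pass standard video through; mark raw files for MP4 transcode."""
    p = PurePosixPath(path)
    tool: Optional[str] = RAW_TRANSCODERS.get(p.suffix.lower())
    if tool is None:
        return {"source": path, "video": path, "transcoded_with": None}
    return {"source": path, "video": str(p.with_suffix(".mp4")), "transcoded_with": tool}


def decompose(doc: dict, cuts: List[float]) -> dict:
    """Stage 2: split into timestamped segments between consecutive cut points."""
    segments = [{"start": s, "end": e} for s, e in zip(cuts, cuts[1:])]
    return {**doc, "segments": segments}


def identify_faces(doc: dict) -> dict:
    """Stage 3: attach a (here empty) list of face embeddings to each segment."""
    return {**doc, "segments": [{**seg, "faces": []} for seg in doc["segments"]]}


def run_pipeline(path: str, cuts: List[float]) -> dict:
    """Chain the three stages; each consumes the previous stage's output."""
    return identify_faces(decompose(convert(path), cuts))


raw = run_pipeline("reel/A001_C002.R3D", [0.0, 4.5, 9.0])
std = run_pipeline("promo.mov", [0.0, 2.0])
print(raw["video"], raw["transcoded_with"], len(raw["segments"]))
```

Because each stage only reads the previous stage's output, a stage can be re-run or versioned without touching the others, which is the main benefit of chaining collections this way.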

Results

Before and after Mixpeek

Footage Search Time: -99%
Before: 35 min avg
After: < 15 sec

Raw Format Support: Fully Automated
Before: Manual DaVinci conversion
After: Native R3D + ARRI decode

Editor Time on Search: -86%
Before: 35% of workday
After: 5% of workday

Face Identification: Fully Automated
Before: Manual scrubbing
After: Upload photo → instant results
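The two headline percentages follow directly from the before and after figures. The short check below recomputes them, treating "< 15 sec" as exactly 15 seconds, which makes the search-time figure a conservative lower bound on the improvement. `percent_change` is an illustrative helper name.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent (negative = reduction)."""
    return (after - before) / before * 100


# Footage search time: 35 minutes (2100 s) down to at most 15 s.
search_time_change = percent_change(35 * 60, 15)

# Editor time on search: 35% of the workday down to 5%.
workday_share_change = percent_change(35, 5)

print(round(search_time_change), round(workday_share_change))
```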

Customer testimonial

"Our editors used to lose half their day hunting for the right take. Now they describe what they need and the exact frame comes back in seconds. The fact that it handles RED and ARRI raw natively, without DaVinci in the loop, was the unlock."

Rish Chandna

Founder, Cutsio

video search
raw footage
post-production
RED R3D
ARRI RAW
visual search
face search
media asset management
metadata filtering
selective sync
