text_extractor if you need semantic search over the recovered text.
View extractor details at api.mixpeek.com/v1/collections/features/extractors/scrolling_text_extractor_v1 or fetch them programmatically with GET /v1/collections/features/extractors/{feature_extractor_id}.
Pipeline Steps
- Sample frames: extract frames at `fps` frames per second.
- Phase correlation: scan `strip_height`-pixel strips to measure per-frame pixel shift and detect motion.
- Classify bands: a band counts as scrolling when its shift exceeds `min_shift_px` and at least `consistency_ratio` of frame pairs agree.
- Crop: crop each detected band with `pad` pixels of padding above and below.
- Stitch: reconstruct the full scrolling content as a panorama image per band.
- VLM OCR: read the panorama with a vision language model (Gemini).
- Output: combined text plus per-band metadata (axis, direction, shift).
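The band classification step can be illustrated with a minimal sketch. This is not the extractor's actual implementation: the function name `classify_band`, the use of the median as the "typical" shift, the inclusive threshold, and the rule that a frame pair "agrees" when its shift has the same sign as the median and meets `min_shift_px` are all assumptions made for illustration. Only the two parameter names and their defaults come from this page.

```python
from statistics import median


def classify_band(shifts, min_shift_px=2.0, consistency_ratio=0.6):
    """Decide whether one strip is scrolling (illustrative sketch only).

    shifts: signed pixel shifts, one per consecutive frame pair, as a
    phase-correlation step might report them.
    Assumption: a pair "agrees" when its shift points the same way as the
    median shift and is at least min_shift_px in magnitude.
    """
    if not shifts:
        return {"scrolling": False, "direction": 0,
                "shift_per_frame": 0.0, "agreement": 0.0}

    typical = median(shifts)
    sign = 1 if typical >= 0 else -1

    # Count frame pairs that move far enough in the dominant direction.
    agreeing = sum(1 for s in shifts if s * sign >= min_shift_px)
    agreement = agreeing / len(shifts)

    scrolling = abs(typical) >= min_shift_px and agreement >= consistency_ratio
    return {
        "scrolling": scrolling,
        "direction": sign if scrolling else 0,
        "shift_per_frame": abs(typical) if scrolling else 0.0,
        "agreement": agreement,
    }


# A ticker moving about 4 px per frame pair, with one noisy measurement.
ticker = [-4.1, -3.9, -4.0, 0.2, -4.2]
# Camera jitter: large shifts, but no consistent direction.
jitter = [5.0, -5.0, 5.0, -5.0, 0.0]

print(classify_band(ticker))
print(classify_band(jitter))
```

Under these assumptions, raising `consistency_ratio` makes the check stricter about noisy frame pairs, while lowering `min_shift_px` lets slower text qualify.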
When to Use
| Use Case | Description |
|---|---|
| News tickers | Recover the full crawl text from a horizontally scrolling ticker |
| End credits | Capture vertically scrolling credit rolls |
| Compliance disclaimers | Extract fast-scrolling legal/disclaimer banners for audit |
| Sports/finance banners | Read scrolling score or price strips |
When NOT to Use
| Scenario | Recommended Alternative |
|---|---|
| Static on-screen text / captions | text_extractor on transcription, or a frame OCR extractor |
| Spoken-word transcription | A transcription/audio extractor |
| Semantic search over recovered text | Chain text_extractor on the scrolling_text field |
| Non-video inputs | This extractor is video-only |
Input Schema
| Field | Type | Required | Description |
|---|---|---|---|
| `video` | string | Yes | URL or path to the video file. Populated from `input_mappings`. |
Output Schema
| Field | Type | Description |
|---|---|---|
| `scrolling_text` | string \| null | Combined, deduplicated text from all detected scrolling bands |
| `scroll_bands` | object[] \| null | Per-band details: `axis`, `direction`, `shift_per_frame`, `text` |
| `bands_detected` | integer \| null | Number of scrolling text bands detected in the video |
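Because every output field is nullable, consumers need to guard against `None` before iterating. The sketch below shows one way to do that. The helper name `summarize_output` is hypothetical, and the sample values for `axis` and `direction` ("horizontal", "left") are invented for illustration; this page does not list the actual enum values.

```python
def summarize_output(doc):
    """Return readable lines describing one extractor output payload."""
    # All three documented fields may be null, so fall back to empty values.
    bands = doc.get("scroll_bands") or []
    count = doc.get("bands_detected") or 0

    lines = [f"{count} scrolling band(s) detected"]
    for index, band in enumerate(bands, start=1):
        text = (band.get("text") or "").strip()
        lines.append(
            f"band {index}: axis={band.get('axis')} "
            f"direction={band.get('direction')} "
            f"shift_per_frame={band.get('shift_per_frame')} "
            f"chars={len(text)}"
        )
    return lines


# Hypothetical payload; field names follow the Output Schema table above.
sample = {
    "scrolling_text": "MARKETS CLOSE HIGHER",
    "scroll_bands": [
        {
            "axis": "horizontal",      # assumed value
            "direction": "left",       # assumed value
            "shift_per_frame": 4.0,
            "text": "MARKETS CLOSE HIGHER",
        }
    ],
    "bands_detected": 1,
}

for line in summarize_output(sample):
    print(line)
```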
Parameters
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| `fps` | float | 5.0 | 1.0–30.0 | Frame sampling rate. Higher values improve detection for fast-scrolling text but increase processing time |
| `strip_height` | integer | 40 | 10–200 | Height (px) of each scanning strip used for phase correlation. Should roughly match the scrolling text band height |
| `min_shift_px` | float | 2.0 | 0.5–20.0 | Minimum per-frame pixel shift to consider a strip ‘scrolling’. Lower detects slower text; higher filters noise |
| `consistency_ratio` | float | 0.6 | 0.3–1.0 | Fraction of frame pairs that must show consistent shift for a band to count as scrolling (0.6 = 60%) |
| `pad` | integer | 8 | 0–50 | Pixel padding above/below the detected band when cropping for stitching |
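A client-side check against the documented ranges can catch bad values before a request is sent. The following is a minimal sketch: the names `PARAM_SPECS` and `resolve_parameters` are hypothetical, and treating the range bounds as inclusive is an assumption, since the table does not say either way.

```python
# (type, default, minimum, maximum) copied from the Parameters table.
PARAM_SPECS = {
    "fps": (float, 5.0, 1.0, 30.0),
    "strip_height": (int, 40, 10, 200),
    "min_shift_px": (float, 2.0, 0.5, 20.0),
    "consistency_ratio": (float, 0.6, 0.3, 1.0),
    "pad": (int, 8, 0, 50),
}


def resolve_parameters(overrides=None):
    """Merge overrides with defaults and validate each value's range."""
    overrides = dict(overrides or {})
    unknown = set(overrides) - set(PARAM_SPECS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")

    resolved = {}
    for name, (kind, default, low, high) in PARAM_SPECS.items():
        value = overrides.get(name, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number")
        if kind is int and int(value) != value:
            raise TypeError(f"{name} must be an integer")
        value = kind(value)
        # Assumption: both ends of the documented range are allowed.
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}-{high}")
        resolved[name] = value
    return resolved


print(resolve_parameters({"fps": 10, "min_shift_px": 1.0}))
```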
Configuration Examples
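As a rough illustration, the sketch below assembles two configurations, one tuned for a fast news ticker and one for slow end credits. The wrapper keys `feature_extractor_name`, `version`, and `parameters`, the source field name `video_url`, and the specific parameter values are assumptions chosen for illustration and are not confirmed by this page. Only the `video` input, `input_mappings`, and the parameter names and ranges are documented here. Check the extractor details endpoint for the actual request shape.

```python
import json


def build_extractor_config(parameters, video_field="video_url"):
    """Assemble an extractor config block (wrapper key names are assumed)."""
    return {
        "feature_extractor_name": "scrolling_text_extractor",  # assumed key
        "version": "v1",                                       # assumed key
        # "video" is the documented input; "video_url" is a hypothetical
        # source field name in your own data.
        "input_mappings": {"video": video_field},
        "parameters": parameters,
    }


# Fast horizontal ticker: sample more frames and require a larger shift.
news_ticker = build_extractor_config(
    {"fps": 10.0, "strip_height": 40, "min_shift_px": 3.0}
)

# Slow vertical credits: accept smaller shifts, demand more consistency.
end_credits = build_extractor_config(
    {"fps": 5.0, "strip_height": 80, "min_shift_px": 1.0,
     "consistency_ratio": 0.7}
)

print(json.dumps(news_ticker, indent=2))
print(json.dumps(end_credits, indent=2))
```

Parameters left out of a configuration would presumably fall back to the defaults listed in the Parameters table.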
Performance & Costs
| Metric | Value |
|---|---|
| Cost | 30 credits per minute of video (frame extraction + stitching + VLM OCR) |
| External API | Google Gemini (VLM OCR) |
| Tradeoff | Higher fps improves fast-scroll accuracy at the cost of processing time |
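The cost and the `fps` tradeoff are easy to estimate ahead of time. This sketch assumes credits are prorated linearly by duration; actual billing may round to whole minutes, which this page does not specify. The function names are hypothetical.

```python
CREDITS_PER_MINUTE = 30  # from the Performance & Costs table


def estimate_credits(duration_seconds):
    """Estimate credits for one video, assuming linear proration."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return CREDITS_PER_MINUTE * duration_seconds / 60


def sampled_frames(duration_seconds, fps=5.0):
    """Approximate number of frames sampled at the given fps."""
    return int(duration_seconds * fps)


print(estimate_credits(90))      # 45.0
print(sampled_frames(90))        # 450
print(sampled_frames(90, 10.0))  # 900
```

Doubling `fps` doubles the number of frames to process while the credit cost, as documented, depends only on video length.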
Vector Index
This extractor produces payload-only output and no vector index. The recovered text lives in the `scrolling_text` field. To make it semantically searchable, run `text_extractor` against `scrolling_text`.
Limitations
- Video only: Accepts video inputs exclusively.
- No embedding: Output is payload-only; semantic search requires chaining a text extractor.
- Band-height sensitivity: `strip_height` should approximate the actual band height for reliable detection.
- VLM dependency: OCR quality depends on Gemini VLM availability and panorama clarity.

