Scrolling Text Extractor

Built-in extractor names are deprecated aliases: collections are now created by picking features. This pipeline is selected with features: ["onscreen_text"]. Existing feature_extractor configs keep working; see the migration guide.

Extract on-screen text from your own video

Create a managed namespace and run this pipeline on your own files — recover scrolling and static on-screen text, OCR it, and make it searchable in minutes.

Runnable reference for this extractor — inputs, parameters, output fields, embedding models, and copy-paste examples. Auto-generated from the live registry.

The scrolling text extractor recovers scrolling or marquee text from video — tickers, lower-third banners, end credits, and legal disclaimers — that no single frame ever shows in full. It detects scrolling bands via phase correlation, stitches frames panorama-style to reconstruct the complete text, then OCRs the panorama with a vision language model (Gemini). Output is payload-only (no vector); pair it with text_extractor if you need semantic search over the recovered text.

View extractor details at api.mixpeek.com/v1/collections/features/extractors/scrolling_text_extractor_v1 or fetch programmatically with GET /v1/collections/features/extractors/{feature_extractor_id}.
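As a rough sketch of the programmatic fetch, the snippet below builds the endpoint URL from the path above and issues the GET with Python's standard library. The https:// scheme, the bearer-token Authorization header, and both helper names are assumptions made for illustration; check the API reference for the actual authentication scheme.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.mixpeek.com"  # assumption: HTTPS on the host named above


def extractor_details_url(feature_extractor_id: str, base_url: str = BASE_URL) -> str:
    """Build the details URL for one extractor (path taken from the text above)."""
    extractor_id = quote(feature_extractor_id, safe="")
    return f"{base_url}/v1/collections/features/extractors/{extractor_id}"


def fetch_extractor_details(feature_extractor_id: str, api_key: str) -> dict:
    """GET the extractor details and decode the JSON body.

    Assumption: the API accepts a bearer token in the Authorization header.
    """
    request = urllib.request.Request(
        extractor_details_url(feature_extractor_id),
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling fetch_extractor_details("scrolling_text_extractor_v1", api_key) would then return the registry entry as a dictionary, provided the assumptions above hold.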

Pipeline Steps

1. Sample frames: sample the video at the rate set by fps (frames per second).
2. Phase correlation: scan strips strip_height pixels tall to measure the pixel shift between consecutive sampled frames and detect motion.
3. Classify bands: a band counts as scrolling when its shift exceeds min_shift_px and at least consistency_ratio of frame pairs agree.
4. Crop: crop each detected band with pad pixels of padding above and below.
5. Stitch: reconstruct the full scrolling content as one panorama image per band.
6. VLM OCR: read each panorama with a vision language model (Gemini).
7. Output: combined text plus per-band metadata (axis, direction, shift).
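The band classification step can be illustrated with a small sketch. It is not the extractor's actual implementation: the function name is hypothetical, and it assumes that a frame pair "agrees" when its shift is at least min_shift_px in magnitude and points the same way as most other qualifying pairs. The defaults mirror the documented parameter defaults.

```python
from typing import Sequence


def is_scrolling(
    shifts: Sequence[float],
    min_shift_px: float = 2.0,
    consistency_ratio: float = 0.6,
) -> bool:
    """Decide whether one strip scrolls, given its signed shift per frame pair.

    Assumption: a pair agrees when |shift| >= min_shift_px and its sign
    matches the dominant direction among the qualifying pairs.
    """
    if not shifts:
        return False
    forward = sum(1 for s in shifts if s >= min_shift_px)
    backward = sum(1 for s in shifts if s <= -min_shift_px)
    agreeing = max(forward, backward)
    return agreeing / len(shifts) >= consistency_ratio


ticker = [-6.5, -6.3, -6.4, 0.1, -6.4]  # steady leftward motion, one noisy pair
static = [0.2, -0.1, 0.0, 0.3, -0.2]    # jitter below the threshold

print(is_scrolling(ticker))  # 4 of 5 pairs agree, above the 0.6 ratio
print(is_scrolling(static))  # no pair reaches min_shift_px
```

Taking the dominant direction keeps a strip with back-and-forth motion (for example camera shake) from being classified as a scroll, even when every individual shift is large.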

When to Use

Use Case	Description
News tickers	Recover the full crawl text from a horizontally scrolling ticker
End credits	Capture vertically scrolling credit rolls
Compliance disclaimers	Extract fast-scrolling legal/disclaimer banners for audit
Sports/finance banners	Read scrolling score or price strips

When NOT to Use

Scenario	Recommended Alternative
Static on-screen text / captions	`text_extractor` on transcription, or a frame OCR extractor
Spoken-word transcription	A transcription/audio extractor
Semantic search over recovered text	Chain `text_extractor` on the `scrolling_text` field
Non-video inputs	This extractor is video-only

Input Schema

Field	Type	Required	Description
`video`	string	Yes	URL or path to the video file. Populated from `input_mappings`.

{
  "video": "s3://my-bucket/clips/newscast.mp4"
}

Supported input types: VIDEO.

Output Schema

Field	Type	Description
`scrolling_text`	string | null	Combined, deduplicated text from all detected scrolling bands
`scroll_bands`	object[] | null	Per-band details: `axis`, `direction`, `shift_per_frame`, `text`
`bands_detected`	integer | null	Number of scrolling text bands detected in the video

{
  "scrolling_text": "BREAKING: Markets rally as inflation cools ...",
  "bands_detected": 1,
  "scroll_bands": [
    {
      "axis": "horizontal",
      "direction": "right_to_left",
      "shift_per_frame": 6.4,
      "text": "BREAKING: Markets rally as inflation cools ..."
    }
  ]
}
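Because every output field is nullable, consumers should guard against missing values. The sketch below reads a payload shaped like the example above and produces one summary line per band; the function name and the summary format are illustrative choices, not part of the API.

```python
from typing import Any, Dict, List


def summarize_bands(document: Dict[str, Any]) -> List[str]:
    """Return one human-readable line per detected band.

    Output fields may be null, so a missing or null scroll_bands
    is treated as an empty list.
    """
    bands = document.get("scroll_bands") or []
    lines = []
    for index, band in enumerate(bands):
        lines.append(
            f"band {index}: {band.get('axis')} {band.get('direction')}, "
            f"{band.get('shift_per_frame')} px/frame: {band.get('text')}"
        )
    return lines


example = {
    "scrolling_text": "BREAKING: Markets rally as inflation cools ...",
    "bands_detected": 1,
    "scroll_bands": [
        {
            "axis": "horizontal",
            "direction": "right_to_left",
            "shift_per_frame": 6.4,
            "text": "BREAKING: Markets rally as inflation cools ...",
        }
    ],
}

for line in summarize_bands(example):
    print(line)
```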

Parameters

Parameter	Type	Default	Range	Description
`fps`	float	`5.0`	1.0–30.0	Frame sampling rate. Higher values improve detection for fast-scrolling text but increase processing time
`strip_height`	integer	`40`	10–200	Height (px) of each scanning strip used for phase correlation. Should roughly match the scrolling text band height
`min_shift_px`	float	`2.0`	0.5–20.0	Minimum per-frame pixel shift to consider a strip ‘scrolling’. Lower detects slower text; higher filters noise
`consistency_ratio`	float	`0.6`	0.3–1.0	Fraction of frame pairs that must show consistent shift for a band to count as scrolling (0.6 = 60%)
`pad`	integer	`8`	0–50	Pixel padding above/below the detected band when cropping for stitching
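A client-side check against the table above can catch out-of-range values before a config is submitted. This is a minimal sketch: the helper is hypothetical, and it assumes the documented ranges are inclusive at both ends, which the table does not state.

```python
from typing import Any, Dict, List, Tuple

# (type, minimum, maximum, default) copied from the parameter table above.
PARAMETER_SPECS: Dict[str, Tuple[type, float, float, Any]] = {
    "fps": (float, 1.0, 30.0, 5.0),
    "strip_height": (int, 10, 200, 40),
    "min_shift_px": (float, 0.5, 20.0, 2.0),
    "consistency_ratio": (float, 0.3, 1.0, 0.6),
    "pad": (int, 0, 50, 8),
}


def validate_parameters(parameters: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the values fit the table."""
    problems = []
    for name, value in parameters.items():
        spec = PARAMETER_SPECS.get(name)
        if spec is None:
            problems.append(f"{name}: unknown parameter")
            continue
        kind, low, high, _default = spec
        # Booleans are ints in Python, so reject them explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a number")
        elif kind is int and not isinstance(value, int):
            problems.append(f"{name}: expected an integer")
        elif not low <= value <= high:  # assumption: inclusive bounds
            problems.append(f"{name}: {value} is outside {low}-{high}")
    return problems


print(validate_parameters({"fps": 15.0, "strip_height": 30, "min_shift_px": 5.0}))
print(validate_parameters({"fps": 60.0, "pad": 4.5}))
```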

Configuration Examples

Defaults only (empty parameters):

{
  "feature_extractor": {
    "feature_extractor_name": "scrolling_text_extractor",
    "version": "v1",
    "input_mappings": {
      "video": "video_url"
    },
    "parameters": {}
  }
}

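Tuned for fast-scrolling text in a thin band (higher fps, shorter strips, stricter shift threshold):

{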
  "feature_extractor": {
    "feature_extractor_name": "scrolling_text_extractor",
    "version": "v1",
    "input_mappings": {
      "video": "video_url"
    },
    "parameters": {
      "fps": 15.0,
      "strip_height": 30,
      "min_shift_px": 5.0
    }
  }
}

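Tuned for slow-scrolling text in a tall band (lower fps, taller strips, looser shift and consistency thresholds):

{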
  "feature_extractor": {
    "feature_extractor_name": "scrolling_text_extractor",
    "version": "v1",
    "input_mappings": {
      "video": "video_url"
    },
    "parameters": {
      "fps": 3.0,
      "strip_height": 120,
      "min_shift_px": 1.0,
      "consistency_ratio": 0.5
    }
  }
}
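When configs are generated in code rather than written by hand, a small builder keeps the fixed fields in one place. The sketch below reproduces the shape of the examples above; the function name is hypothetical, and video_url is simply the source field used in those examples.

```python
import json
from typing import Any, Dict, Optional


def build_extractor_config(
    parameters: Optional[Dict[str, Any]] = None,
    video_field: str = "video_url",
) -> Dict[str, Any]:
    """Assemble a feature_extractor block shaped like the examples above."""
    return {
        "feature_extractor": {
            "feature_extractor_name": "scrolling_text_extractor",
            "version": "v1",
            "input_mappings": {"video": video_field},
            # Copy so later changes to the caller's dict do not leak in.
            "parameters": dict(parameters or {}),
        }
    }


fast_band = build_extractor_config(
    {"fps": 15.0, "strip_height": 30, "min_shift_px": 5.0}
)
print(json.dumps(fast_band, indent=2))
```

Omitting the parameters argument yields the empty parameters object from the first example, which leaves every setting at its default.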

Performance & Costs

Metric	Value
Cost	Billed per video minute (frame extraction + stitching + VLM OCR) — see Billing & Pricing; rates come from `GET /v1/billing/pricing`
External API	Google Gemini (VLM OCR)
Tradeoff	Higher `fps` improves fast-scroll accuracy at the cost of processing time
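The tradeoff can be made concrete with some back-of-the-envelope arithmetic. The sketch assumes that shift is measured between consecutive sampled frames, so text moving at a constant speed in pixels per second shifts by speed / fps pixels per pair; both helper names are illustrative. Under that assumption, raising fps multiplies the number of frames to process and also raises the slowest speed that still reaches min_shift_px.

```python
def sampled_frames(duration_s: float, fps: float) -> int:
    """Approximate number of frames sampled from a clip of the given length."""
    return int(duration_s * fps)


def min_detectable_speed(fps: float, min_shift_px: float) -> float:
    """Slowest scroll speed (px/s) whose per-pair shift reaches min_shift_px.

    Assumption: per-pair shift = speed / fps, so speed >= fps * min_shift_px.
    """
    return fps * min_shift_px


for fps, min_shift_px in [(5.0, 2.0), (15.0, 5.0), (3.0, 1.0)]:
    frames = sampled_frames(60, fps)
    speed = min_detectable_speed(fps, min_shift_px)
    print(f"fps={fps}: {frames} frames per minute, scrolls slower than {speed} px/s are ignored")
```

The three pairs are the default settings and the two tuned configurations shown earlier: a 60-second clip yields 300, 900, and 180 sampled frames respectively.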

Vector Index

This extractor produces payload-only output — no vector index. The recovered text lives in the scrolling_text field. To make it semantically searchable, run text_extractor against scrolling_text.

Limitations

Video only: Accepts video inputs exclusively.
No embedding: Output is payload-only; semantic search requires chaining a text extractor.
Band-height sensitivity: strip_height should approximate the actual band height for reliable detection.
VLM dependency: OCR quality depends on Gemini VLM availability and panorama clarity.
