> ## Documentation Index
> Fetch the complete documentation index at: https://docs.mixpeek.com/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Scrolling Text Extractor

> Extract scrolling/marquee text from video using phase-correlation band detection, panoramic stitching, and VLM OCR

<Card title="View on GitHub" icon="github" href="https://github.com/mixpeek/mixpeek-extractors/blob/main/extractors/scrolling_text_extractor/README.md" horizontal>
  Runnable reference for this extractor — inputs, parameters, output fields, embedding models, and copy-paste examples. Auto-generated from the live registry.
</Card>

The scrolling text extractor recovers **scrolling or marquee text** from video — tickers, lower-third banners, end credits, and legal disclaimers — that no single frame ever shows in full. It detects scrolling bands via phase correlation, stitches frames panorama-style to reconstruct the complete text, then OCRs the panorama with a vision language model (Gemini). Output is payload-only (no vector); pair it with `text_extractor` if you need semantic search over the recovered text.

<Note>
  View extractor details at [api.mixpeek.com/v1/collections/features/extractors/scrolling\_text\_extractor\_v1](https://api.mixpeek.com/v1/collections/features/extractors/scrolling_text_extractor_v1) or fetch programmatically with `GET /v1/collections/features/extractors/{feature_extractor_id}`.
</Note>

## Pipeline Steps

1. **Sample frames** — extract frames at `fps` frames per second.
2. **Phase correlation** — scan `strip_height`-pixel strips to measure per-frame pixel shift and detect motion.
3. **Classify bands** — a band counts as scrolling when shift exceeds `min_shift_px` and at least `consistency_ratio` of frame pairs agree.
4. **Crop** — crop each detected band with `pad` pixels of padding above and below.
5. **Stitch** — reconstruct the full scrolling content as a panorama image per band.
6. **VLM OCR** — read the panorama with a vision language model (Gemini).
7. **Output** — combined text plus per-band metadata (axis, direction, shift).

## When to Use

| Use Case                   | Description                                                      |
| -------------------------- | ---------------------------------------------------------------- |
| **News tickers**           | Recover the full crawl text from a horizontally scrolling ticker |
| **End credits**            | Capture vertically scrolling credit rolls                        |
| **Compliance disclaimers** | Extract fast-scrolling legal/disclaimer banners for audit        |
| **Sports/finance banners** | Read scrolling score or price strips                             |

## When NOT to Use

| Scenario                            | Recommended Alternative                                     |
| ----------------------------------- | ----------------------------------------------------------- |
| Static on-screen text / captions    | `text_extractor` on transcription, or a frame OCR extractor |
| Spoken-word transcription           | A transcription/audio extractor                             |
| Semantic search over recovered text | Chain `text_extractor` on the `scrolling_text` field        |
| Non-video inputs                    | This extractor is video-only                                |

## Input Schema

| Field   | Type   | Required | Description                                                     |
| ------- | ------ | -------- | --------------------------------------------------------------- |
| `video` | string | **Yes**  | URL or path to the video file. Populated from `input_mappings`. |

```json theme={null}
{
  "video": "s3://my-bucket/clips/newscast.mp4"
}
```

Supported input types: **VIDEO**.

## Output Schema

| Field            | Type              | Description                                                      |
| ---------------- | ----------------- | ---------------------------------------------------------------- |
| `scrolling_text` | string \| null    | Combined, deduplicated text from all detected scrolling bands    |
| `scroll_bands`   | object\[] \| null | Per-band details: `axis`, `direction`, `shift_per_frame`, `text` |
| `bands_detected` | integer \| null   | Number of scrolling text bands detected in the video             |

```json theme={null}
{
  "scrolling_text": "BREAKING: Markets rally as inflation cools ...",
  "bands_detected": 1,
  "scroll_bands": [
    {
      "axis": "horizontal",
      "direction": "right_to_left",
      "shift_per_frame": 6.4,
      "text": "BREAKING: Markets rally as inflation cools ..."
    }
  ]
}
```

## Parameters

| Parameter           | Type    | Default | Range    | Description                                                                                                        |
| ------------------- | ------- | ------- | -------- | ------------------------------------------------------------------------------------------------------------------ |
| `fps`               | float   | `5.0`   | 1.0–30.0 | Frame sampling rate. Higher values improve detection for fast-scrolling text but increase processing time          |
| `strip_height`      | integer | `40`    | 10–200   | Height (px) of each scanning strip used for phase correlation. Should roughly match the scrolling text band height |
| `min_shift_px`      | float   | `2.0`   | 0.5–20.0 | Minimum per-frame pixel shift to consider a strip 'scrolling'. Lower detects slower text; higher filters noise     |
| `consistency_ratio` | float   | `0.6`   | 0.3–1.0  | Fraction of frame pairs that must show consistent shift for a band to count as scrolling (0.6 = 60%)               |
| `pad`               | integer | `8`     | 0–50     | Pixel padding above/below the detected band when cropping for stitching                                            |

## Configuration Examples

<CodeGroup>
  ```json Default Detection theme={null}
  {
    "feature_extractor": {
      "feature_extractor_name": "scrolling_text_extractor",
      "version": "v1",
      "input_mappings": {
        "video": "video_url"
      },
      "parameters": {}
    }
  }
  ```

  ```json Fast Ticker (high fps, thin band) theme={null}
  {
    "feature_extractor": {
      "feature_extractor_name": "scrolling_text_extractor",
      "version": "v1",
      "input_mappings": {
        "video": "video_url"
      },
      "parameters": {
        "fps": 15.0,
        "strip_height": 30,
        "min_shift_px": 5.0
      }
    }
  }
  ```

  ```json Slow Credits Roll (tall band, sensitive) theme={null}
  {
    "feature_extractor": {
      "feature_extractor_name": "scrolling_text_extractor",
      "version": "v1",
      "input_mappings": {
        "video": "video_url"
      },
      "parameters": {
        "fps": 3.0,
        "strip_height": 120,
        "min_shift_px": 1.0,
        "consistency_ratio": 0.5
      }
    }
  }
  ```
</CodeGroup>

## Performance & Costs

| Metric           | Value                                                                     |
| ---------------- | ------------------------------------------------------------------------- |
| **Cost**         | 30 credits per minute of video (frame extraction + stitching + VLM OCR)   |
| **External API** | Google Gemini (VLM OCR)                                                   |
| **Tradeoff**     | Higher `fps` improves fast-scroll accuracy at the cost of processing time |

## Vector Index

This extractor produces **payload-only output** — no vector index. The recovered text lives in the `scrolling_text` field. To make it semantically searchable, run `text_extractor` against `scrolling_text`.

## Limitations

* **Video only**: Accepts video inputs exclusively.
* **No embedding**: Output is payload-only; semantic search requires chaining a text extractor.
* **Band-height sensitivity**: `strip_height` should approximate the actual band height for reliable detection.
* **VLM dependency**: OCR quality depends on Gemini VLM availability and panorama clarity.

## Related

* [Feature Extractors Overview](/processing/feature-extractors)
* [Text Extractor](/processing/extractors/text)
* [Universal Extractor](/processing/extractors/universal)
