    HF · Scene Captioning · Apache 2.0

    Keye-VL-8B-Preview

    by Kwai-Keye

    Short-video VLM with temporal precision via 3D positional encoding

    38K downloads/month
    8B parameters
    Identifiers
    Model ID
    Kwai-Keye/Keye-VL-8B-Preview
    Feature URI
    mixpeek://video_extractor@v1/kwai_keye_vl_8b_v1
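
The Feature URI above appears to follow a `scheme://name@version/feature` shape. The sketch below splits it into parts for logging or validation. The field names (`extractor`, `version`, `feature`) and the helper `parse_feature_uri` are assumptions made for illustration and are not part of the Mixpeek SDK.

```python
from typing import NamedTuple


class FeatureURI(NamedTuple):
    # Field names are illustrative guesses at what each URI segment means.
    extractor: str
    version: str
    feature: str


def parse_feature_uri(uri: str) -> FeatureURI:
    """Split a 'mixpeek://<extractor>@<version>/<feature>' string into parts."""
    scheme, sep, rest = uri.partition("://")
    if not sep or scheme != "mixpeek":
        raise ValueError(f"not a mixpeek:// URI: {uri!r}")
    head, slash, feature = rest.partition("/")
    extractor, at, version = head.partition("@")
    if not (slash and at and extractor and version and feature):
        raise ValueError(f"malformed feature URI: {uri!r}")
    return FeatureURI(extractor, version, feature)


parts = parse_feature_uri("mixpeek://video_extractor@v1/kwai_keye_vl_8b_v1")
print(parts.extractor, parts.version, parts.feature)
```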

    Overview

    Keye-VL is a vision-language model (VLM) engineered for short-form video understanding that retains general vision-language abilities. Built by Kuaishou, operator of one of the world's largest short-video platforms, it uses 3D RoPE to process text, images, and video in a unified way, with a one-to-one correspondence between position encoding and absolute time.

    Keye-VL is trained on more than 600B tokens with an emphasis on video, targeting the dominant content format of the modern internet: short clips.

    Architecture

    An 8B-parameter model built on the Qwen3-8B language model and a SigLIP vision encoder. It uses 3D RoPE (Rotary Position Embedding) for unified spatial-temporal encoding, which enables precise temporal grounding in video content.
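
To make the idea of 3D rotary encoding tied to absolute time concrete, here is a minimal pure-Python sketch. It is not Keye-VL's implementation: the `ticks_per_second` mapping, the equal three-way split of the vector, and the `base` value are all assumptions chosen for illustration. It shows two things: each patch token gets a (time, row, column) position whose time index is derived from the frame's timestamp, and each axis rotates its own slice of the vector.

```python
import math


def position_ids(timestamps_s, grid_h, grid_w, ticks_per_second=2.0):
    """Assign (t, h, w) to every patch; t comes from absolute time, not frame order.

    ticks_per_second is a hypothetical resolution for the temporal axis.
    """
    ids = []
    for ts in timestamps_s:
        t = round(ts * ticks_per_second)
        for h in range(grid_h):
            for w in range(grid_w):
                ids.append((t, h, w))
    return ids


def rotate_3d(vec, pos, base=10000.0):
    """Rotary embedding with one third of the vector per axis (t, h, w)."""
    if len(vec) % 6 != 0:
        raise ValueError("vector length must be divisible by 6")
    section = len(vec) // 3
    out = []
    for axis, p in enumerate(pos):
        chunk = vec[axis * section:(axis + 1) * section]
        for i in range(0, section, 2):
            theta = p * base ** (-i / section)  # lower frequency for later pairs
            c, s = math.cos(theta), math.sin(theta)
            x, y = chunk[i], chunk[i + 1]
            out.extend((x * c - y * s, x * s + y * c))
    return out


# Frames sampled at uneven times still land on a shared absolute time axis.
ids = position_ids([0.0, 0.5, 2.0], grid_h=1, grid_w=2)
print(ids)
```

Because each pair is a plain 2D rotation, the dot product between two rotated vectors depends only on the difference of their positions, which is the property that lets attention reason about how far apart two moments are.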

    Mixpeek SDK Integration

    mixpeek.ingest.from_url(
        url="s3://ugc/short-clip.mp4",
        collection="short_videos",
        feature_extractors=[{
            "type": "caption",
            "model": "mixpeek://video_extractor@v1/kwai_keye_vl_8b_v1"
        }]
    )

    Capabilities

    • Short-video understanding
    • Temporal grounding
    • Image understanding
    • Video QA
    • Scene classification
    • Action recognition

    Use Cases on Mixpeek

    • Short-form video content analysis
    • Social media video indexing
    • Ad creative understanding
    • UGC content moderation

    Performance

    Input Size: Variable
    GPU Latency: ~150 ms per clip (A100)
    GPU Throughput: ~7 clips/sec
    GPU Memory: Model dependent
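
The latency and throughput figures above are consistent with each other if clips are processed one at a time on a single GPU: 1000 ms / 150 ms is about 6.7 clips per second, which rounds to the quoted ~7. The sketch below applies that arithmetic to rough capacity planning. The helper names and the one-million-clip backlog are hypothetical, and real throughput will depend on clip length, batching, and hardware.

```python
def clips_per_second(latency_ms: float) -> float:
    """Sequential throughput implied by a per-clip latency."""
    return 1000.0 / latency_ms


def hours_to_process(num_clips: int, throughput_cps: float, gpus: int = 1) -> float:
    """Wall-clock hours for a backlog, assuming throughput scales linearly with GPUs."""
    return num_clips / (throughput_cps * gpus) / 3600.0


implied = clips_per_second(150)          # from the ~150 ms latency figure
backlog_hours = hours_to_process(1_000_000, 7.0)  # using the ~7 clips/sec figure
print(f"{implied:.2f} clips/sec implied; {backlog_hours:.1f} h for 1M clips on one GPU")
```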

    Specification

    Framework: HF
    Organization: Kwai-Keye
    Feature: Scene Captioning
    Output: text
    Modalities: video, image
    Retriever: Semantic Search
    Parameters: 8B
    License: Apache 2.0
    Downloads/mo: 38K

    Build a pipeline with Keye-VL-8B-Preview

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
