    HF, Scene Captioning, Apache 2.0

    WorldSeek-Omni-2B-Preview

    by WorldSeek-AI

    Compact any-to-any omni model for text, image, video, and audio perception

    Downloads: 21/month
    Parameters: 2B (preview)
    Identifiers
    Model ID
    WorldSeek-AI/WorldSeek-Omni-2B-Preview
    Feature URI
    mixpeek://video_extractor@v1/worldseek_omni_2b_preview_v1
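The Feature URI above appears to pack three pieces of information into one string. The sketch below splits it with a regular expression, assuming the layout is `mixpeek://<extractor>@<version>/<feature>`. That layout is inferred from this single example and is not a documented grammar; `FeatureUri` and `parseFeatureUri` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical shape for this sketch; not a type exported by any Mixpeek SDK.
interface FeatureUri {
  extractor: string; // e.g. "video_extractor"
  version: string;   // e.g. "v1"
  feature: string;   // e.g. "worldseek_omni_2b_preview_v1"
}

// Assumed layout, inferred from the one URI shown on this page:
//   mixpeek://<extractor>@<version>/<feature>
const FEATURE_URI_PATTERN = /^mixpeek:\/\/([^@\/]+)@([^\/]+)\/(.+)$/;

// Returns the three parts, or null when the string does not fit the assumed layout.
function parseFeatureUri(uri: string): FeatureUri | null {
  const match = FEATURE_URI_PATTERN.exec(uri);
  if (match === null) {
    return null;
  }
  return { extractor: match[1], version: match[2], feature: match[3] };
}

const parsed = parseFeatureUri(
  "mixpeek://video_extractor@v1/worldseek_omni_2b_preview_v1"
);
console.log(parsed);
```

A regular expression is used here instead of the built-in `URL` parser because a generic URL parser would likely treat the part before `@` as user info, which is probably not the intended reading of this scheme.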

    Overview

    WorldSeek Omni 2B Preview is a compact any-to-any model that accepts text, image, video, and audio inputs. It is built on Qwen language-model and Qwen3-ASR (automatic speech recognition) components and is positioned for general multimodal understanding rather than a single isolated extraction task.

    On Mixpeek, it is relevant for agent perception workflows that need one compact model to inspect a retrieved image, listen to a clip, or reason over a short video segment before deciding the next tool call.
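The perceive-then-decide loop described above can be sketched roughly as follows. Everything here is hypothetical: the `Evidence`, `Observation`, and `NextAction` types, the `Perceive` callback that stands in for the model call, and the keyword-matching policy are all invented for illustration and are not part of Mixpeek or the model.

```typescript
// All names in this sketch are hypothetical and exist only for illustration.
type Modality = "text" | "image" | "audio" | "video";

interface Evidence {
  modality: Modality;
  url: string;
}

interface Observation {
  evidence: Evidence;
  description: string; // text produced by the perception model
}

type NextAction =
  | { tool: "answer"; summary: string }
  | { tool: "retrieve_more"; query: string };

// Stand-in for the perception model; a real agent would call a hosted model here.
type Perceive = (evidence: Evidence) => Promise<string>;

// Runs the perception callback over each retrieved item, one at a time.
async function observe(items: Evidence[], perceive: Perceive): Promise<Observation[]> {
  const observations: Observation[] = [];
  for (const evidence of items) {
    observations.push({ evidence, description: await perceive(evidence) });
  }
  return observations;
}

// Toy policy: answer if any description mentions the keyword, otherwise retrieve more.
function decideNextAction(observations: Observation[], keyword: string): NextAction {
  const needle = keyword.toLowerCase();
  const hit = observations.find((o) => o.description.toLowerCase().includes(needle));
  return hit
    ? { tool: "answer", summary: hit.description }
    : { tool: "retrieve_more", query: keyword };
}

// Usage with a stub in place of the model.
const stubPerceive: Perceive = async (e) => `stub description of ${e.modality} at ${e.url}`;

observe(
  [
    { modality: "image", url: "https://example.com/screenshot.png" },
    { modality: "audio", url: "https://example.com/clip.wav" },
  ],
  stubPerceive
).then((observations) => console.log(decideNextAction(observations, "error dialog")));
```

Keeping the model behind a single callback means the same decision logic can be exercised with a stub in tests and with a real model call in production.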

    Architecture

    The model is a transformer-based any-to-any model with Qwen and Qwen3-ASR base components. Its model card is tagged for text, image, video, and audio and is licensed under Apache 2.0.

    Mixpeek SDK Integration

    import { Mixpeek } from "mixpeek";

    // Replace "API_KEY" with your own key.
    const mx = new Mixpeek({ apiKey: "API_KEY" });

    // Ingest one video and request scene captions from this model.
    // Top-level await requires an ES module context.
    await mx.collections.ingest({
      collection_id: "agent-observations",
      source: { url: "https://example.com/support-call-with-screen-share.mp4" },
      feature_extractors: [{
        feature: "scene_caption",
        model: "WorldSeek-AI/WorldSeek-Omni-2B-Preview"
      }]
    });
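When several files need ingesting, it may help to build the request objects first and filter out files the feature is unlikely to accept. The sketch below mirrors the request shape from the snippet above and keeps only URLs that look like video or image files, the two modalities listed in the Specification section. The `IngestRequest` interface, the helper names, and the extension lists are assumptions made for this example; they are not published Mixpeek types or rules, and no SDK call is made.

```typescript
// Request shape copied from the ingestion snippet on this page;
// it is an illustration, not a published Mixpeek type.
interface IngestRequest {
  collection_id: string;
  source: { url: string };
  feature_extractors: { feature: string; model: string }[];
}

const MODEL_ID = "WorldSeek-AI/WorldSeek-Omni-2B-Preview";

// Assumed extension lists for the two modalities named in the specification.
const SUPPORTED_EXTENSIONS: Record<"video" | "image", string[]> = {
  video: [".mp4", ".mov", ".webm"],
  image: [".png", ".jpg", ".jpeg"],
};

// Guesses the modality from the file extension, ignoring any query string or fragment.
function detectModality(url: string): "video" | "image" | null {
  const path = url.replace(/[?#].*$/, "").toLowerCase();
  for (const modality of ["video", "image"] as const) {
    if (SUPPORTED_EXTENSIONS[modality].some((ext) => path.endsWith(ext))) {
      return modality;
    }
  }
  return null;
}

// Builds one request per supported URL and silently skips the rest.
function buildIngestRequests(collectionId: string, urls: string[]): IngestRequest[] {
  return urls
    .filter((url) => detectModality(url) !== null)
    .map((url) => ({
      collection_id: collectionId,
      source: { url },
      feature_extractors: [{ feature: "scene_caption", model: MODEL_ID }],
    }));
}

const requests = buildIngestRequests("agent-observations", [
  "https://example.com/support-call-with-screen-share.mp4",
  "https://example.com/screenshot.PNG?size=large",
  "https://example.com/notes.txt",
]);
console.log(requests.length); // 2: the .txt file is skipped
```

Each resulting object could then be passed to the ingest call shown earlier. Extension sniffing is a cheap heuristic; checking the content type reported by the server would be more reliable.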

    Capabilities

    • Text, image, video, and audio input support
    • Compact 2B-class omni model
    • Any-to-any task framing
    • Apache 2.0 license

    Use Cases on Mixpeek

    • Agent perception over retrieved mixed-media evidence
    • Audio-visual content inspection after retrieval
    • Compact multimodal reasoning for short clips and screenshots

    Specification

    Framework: HF
    Organization: WorldSeek-AI
    Feature: Scene Captioning
    Output: text
    Modalities: video, image
    Retriever: Semantic Search
    Parameters: 2B preview
    License: Apache 2.0
    Downloads/mo: 21

    Research Paper

    WorldSeek Omni 2B Preview

    arxiv.org

    Build a pipeline with WorldSeek-Omni-2B-Preview

    Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.
