
    AI Model Hub

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    9,588 models available

    Showing 7,849-7,872 of 9,588 models

    Object Detection

    James332/cppe5_use_data_finetuning

    104
    transformers
    Object Detection

    Charliesgt/pollencounter_detr_resnet50_benchmark

    104
    transformers
    Object Detection

    Francis51/detr-finetuned-VOC-v1

    104
    transformers
    Object Detection

    Hsueh1001/try

    104
    transformers
    Image Segmentation

    saberzl/SIDA-13B-description

    104
    3
    Image Segmentation

    JaesonGu/segformer-b0-finetuned-segments-chargers-2-15

    104
    transformers
    Image Segmentation

    SatwikKambham/segformer-b0-finetuned-suim

    104
    transformers
    Text To Video

    guoyww/animatediff-sparsectrl-rgb

    104
    8
    diffusers
    Text To Video

    Khanbby/HunyuanVideo

    104
    1
    Audio Classification

    Kibalama/urban_sounds_classification_Model

    104
    transformers
    Audio Classification

    tiantiaf/voxlect-english-dialect-whisper-large-v3

    104
    2
    transformers
    Text To Audio

    mradermacher/CiSiMi-GGUF

    104
    transformers
    Video Classification

    Nikeytas/videomae-crime-detector-ultra-v1

    103
    1
    Audio To Audio

    hahmadraz/sepformer-libri4mix

    103
    4
    speechbrain
    Visual Question Answering

    gaianet/MiniCPM-Llama3-V-2_5-GGUF

    103
    3
    Visual Question Answering

    google/pix2struct-widget-captioning-base

    103
    6
    transformers
    Document Question Answering

    Chan-yeong/donut-sroie-company-sample-demo

    103
    Object Detection

    nsugianto/tblstructrecog_finetuned_tbltransstrucrecog_v2_s1_253s

    103
    transformers
    Object Detection

    openvision/yolo26-n

    103
    3
    yolov26
    Object Detection

    Charliesgt/pollen_detr

    103
    transformers
    Image Feature Extraction

    plhery/mobileclip2-onnx

    103
    3
    transformers.js
    Image Feature Extraction

    timm/sam2_hiera_tiny.fb_r896

    103
    timm
    Video Classification

    nateraw/videomae-base-finetuned-ucf101

    103
    2
    transformers
    Text To Video

    Kevin-thu/StoryMem

    103
    93
    Page 328 of 400