HF Text Embeddings
sentence-transformers/all-MiniLM-L6-v2
1024-dim vector ↓ 195.7M
HF Visual Embeddings
openai/clip-vit-large-patch14
768-dim vector ↓ 28.6M
HF Audio Embeddings
laion/clap-htsat-fused
512-dim vector ↓ 20.7M
HF Speaker Diarization
pyannote/speaker-diarization-3.1
speaker segments ↓ 10.9M
HF Text Embeddings
BAAI/bge-m3
1024-dim vector ↓ 8.2M
HF Text Embeddings
BAAI/bge-large-en-v1.5
1024-dim vector ↓ 7.1M
HF Transcription
distil-whisper/distil-large-v3
text + timestamps ↓ 4.8M
HF Transcription
openai/whisper-large-v3
text + timestamps ↓ 4.7M
HF Segmentation
facebook/sam-vit-huge
mask + label ↓ 3.2M
HF Table Extraction
microsoft/table-transformer-detection
table JSON ↓ 3.0M
HF Visual Embeddings
facebook/dinov2-large
768-dim vector ↓ 2.8M
HF Scene Captioning
Qwen/Qwen3-VL-8B-Instruct
text ↓ 2.8M
HF Text Embeddings
Qwen/Qwen3-VL-Embedding-2B
1024-dim vector ↓ 2.4M
HF Text Embeddings
Qwen/Qwen3-Embedding-0.6B
1024-dim vector ↓ 2.1M
HF Scene Captioning
google/gemma-4-4b-it
text ↓ 1.9M
HF Scene Captioning
google/paligemma2-3b-mix-448
text ↓ 1.8M
HF Segmentation
facebook/sam2.1-hiera-large
mask + label ↓ 1.8M
HF Scene Captioning
OpenGVLab/InternVL3-8B
text ↓ 1.6M
HF Visual Embeddings
jinaai/jina-embeddings-v4
768-dim vector ↓ 1.5M
HF Object Detection
IDEA-Research/grounding-dino-base
bbox + label ↓ 1.5M
HF Text Embeddings
nomic-ai/nomic-embed-text-v2-moe
1024-dim vector ↓ 1.4M
HF Depth Estimation
depth-anything/Depth-Anything-V2-Large
depth map ↓ 1.4M
HF Scene Captioning
microsoft/Florence-2-large
text ↓ 1.3M
HF Visual Embeddings
google/siglip-base-patch16-224
768-dim vector ↓ 1.2M
HF Visual Embeddings
google/siglip2-giant-opt-patch16-384
768-dim vector ↓ 1.2M
HF Visual Embeddings
laion/CLIP-ViT-bigG-14-laion2B-39B-b160k
768-dim vector ↓ 890K
HF OCR
lightonai/LightOnOCR-2-1B
text + bbox ↓ 730K
HF Visual Embeddings
BAAI/EVA02-CLIP-L-14-336
768-dim vector ↓ 620K
HF Object Detection
google/owlvit-large-patch14
bbox + label ↓ 580K
HF Document Structure
microsoft/layoutlmv3-base
structure tokens ↓ 565K
HF OCR
microsoft/trocr-large-printed
text + bbox ↓ 554K
HF OCR
zai-org/GLM-OCR
text + bbox ↓ 520K
HF Depth Estimation
apple/DepthPro
depth map ↓ 520K
HF Scene Captioning
Salesforce/blip2-opt-2.7b
text ↓ 516K
PyTorch Visual Embeddings
facebook/dinov3-large
768-dim vector ↓ 450K
HF Object Detection
roboflow/rf-detr-base
bbox + label ↓ 420K
NeMo Transcription
nvidia/parakeet-tdt-0.6b-v3
text + timestamps ↓ 420K
PyTorch Segmentation
facebook/sam3
mask + label ↓ 420K
HF Visual Embeddings
apple/AIMv2-large-patch14-native
768-dim vector ↓ 380K
PyTorch Object Detection
AILab-CVC/YOLO-World-L
bbox + label ↓ 320K
HF Code Extraction
microsoft/codebert-base
code + language ↓ 261K
HF Object Detection
facebook/detr-resnet-50
bbox + label ↓ 246K
HF Document Structure
naver-clova-ix/donut-base
structure tokens ↓ 216K
HF Face Detection
isidentical/auraface-v1
face embedding ↓ 180K
HF Transcription
usefulsensors/moonshine-streaming-medium
text + timestamps ↓ 180K
PyTorch Anomaly Detection
amazon/patchcore-resnet50
anomaly score + map ↓ 180K
HF Code Extraction
Salesforce/codet5p-110m-embedding
code + language ↓ 154K
HF Audio Embeddings
facebook/encodec_24khz
512-dim vector ↓ 112K
HF Object Detection
hustvl/yolos-tiny
bbox + label ↓ 107K
HF Transcription
facebook/wav2vec2-large-960h
text + timestamps ↓ 37K
C++/Python Vector Indexing
facebook/faiss
index + results ↓ 39.5K★
PyTorch Object Detection
ultralytics/yolov8n
bbox + label ↓ -
PyTorch Object Detection
ultralytics/yolo11n
bbox + label ↓ -
PyTorch Object Detection
ultralytics/yolo26n
bbox + label ↓ -
HF Face Detection
deepinsight/retinaface-r50
face embedding ↓ -
HF Face Detection
timesformer/facenet-pytorch
face embedding ↓ -
PyTorch OCR
PaddlePaddle/paddleocr
text + bbox ↓ -
PyTorch Segmentation
netflix/void-model
mask + label ↓ -
C++/Python Vector Indexing
google/scann
index + results ↓ -