CLIP-ViT-bigG-14-laion2B-39B-b160k
by laion
Open-source CLIP trained on 2B image-text pairs at giant scale
laion/CLIP-ViT-bigG-14-laion2B-39B-b160k
mixpeek://image_extractor@v1/laion_openclip_bigG_v1
Overview
OpenCLIP is an open-source reproduction of CLIP by the LAION and ML Foundations community. This ViT-bigG/14 variant was trained on LAION-2B (about 2 billion image-text pairs) and reaches 80.1% ImageNet zero-shot top-1 accuracy, the figure reported in the Benchmarks table, surpassing OpenAI's original CLIP models.
On Mixpeek, OpenCLIP provides the highest-accuracy open-weight visual embeddings for text-to-image and image-to-image retrieval at scale.
Architecture
Vision Transformer (ViT-bigG/14) image encoder with ~1.8B parameters, paired with a text encoder. Trained with a contrastive objective on the LAION-2B dataset for 39B samples seen. Both encoders project into a shared 1280-dimensional embedding space, so image and text vectors can be compared directly.
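Because image and text vectors live in the same space, retrieval reduces to ranking by cosine similarity. The following is a minimal sketch of that idea, not Mixpeek's implementation: `cosineSimilarity` and `topK` are hypothetical helper names, and the 3-dimensional vectors are toy stand-ins for real 1280-dimensional embeddings.

```typescript
// Cosine similarity between two embeddings of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface Scored {
  id: string;
  score: number;
}

// Brute-force ranking of stored embeddings against a query embedding.
function topK(
  query: number[],
  items: { id: string; embedding: number[] }[],
  k: number
): Scored[] {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data (assumption): real embeddings from this model have 1280 dimensions.
const images = [
  { id: "cat.jpg", embedding: [0.9, 0.1, 0.0] },
  { id: "car.jpg", embedding: [0.0, 0.2, 0.9] },
  { id: "dog.jpg", embedding: [0.7, 0.6, 0.1] },
];
const textQuery = [1.0, 0.0, 0.0]; // stands in for an embedded text prompt
const ranked = topK(textQuery, images, 2);
```

The same ranking works for image-to-image search by passing an image embedding as the query. A linear scan like this is only practical for small collections; at scale, an approximate nearest-neighbor index would normally replace it.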
Mixpeek SDK Integration
import { Mixpeek } from "mixpeek";

const mx = new Mixpeek({ apiKey: "API_KEY" });

// Managed: create a collection over a bucket; Mixpeek runs this model's extractor
const collection = await mx.collections.create({
  namespace_id: "my-namespace",
  collection_name: "my-collection",
  source: { type: "bucket", bucket_ids: ["bkt_your_bucket"] },
  feature_extractor: {
    feature_extractor_name: "image_embedding",
    version: "v1",
    parameters: { model_id: "laion/CLIP-ViT-bigG-14-laion2B-39B-b160k" },
  },
});
Capabilities
- 80.1% ImageNet zero-shot top-1 accuracy
- Trained on 2B open image-text pairs
- 1280-dimensional dense embeddings
- Strongest open-weight CLIP variant
- ViT backbone in this checkpoint; the wider OpenCLIP family also includes ConvNeXt backbones
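Zero-shot accuracy comes from comparing an image embedding against the embeddings of text prompts, one per class, and taking a softmax over the scaled similarities. The sketch below illustrates that scoring step under stated assumptions: `zeroShotClassify` and `l2Normalize` are hypothetical helpers, the `logitScale` of 100 is an assumed temperature (CLIP-style models learn this value during training), and the vectors are toy 3-dimensional stand-ins.

```typescript
// Scale a vector to unit length so dot products become cosine similarities.
function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

// Softmax over scaled cosine similarities between one image and N label prompts.
function zeroShotClassify(
  imageEmbedding: number[],
  labelEmbeddings: Record<string, number[]>,
  logitScale = 100 // assumed temperature, not read from the model
): Record<string, number> {
  const img = l2Normalize(imageEmbedding);
  const labels = Object.keys(labelEmbeddings);
  const logits = labels.map((label) => {
    const txt = l2Normalize(labelEmbeddings[label]);
    return logitScale * txt.reduce((s, x, i) => s + x * img[i], 0);
  });
  // Subtract the max logit before exponentiating for numerical stability.
  const maxLogit = Math.max(...logits);
  const exps = logits.map((z) => Math.exp(z - maxLogit));
  const total = exps.reduce((s, x) => s + x, 0);
  const probs: Record<string, number> = {};
  labels.forEach((label, i) => {
    probs[label] = exps[i] / total;
  });
  return probs;
}

// Toy example: the image vector points roughly the same way as the "cat" prompt.
const prediction = zeroShotClassify([0.8, 0.6, 0.0], {
  "a photo of a cat": [0.9, 0.4, 0.1],
  "a photo of a car": [0.1, 0.2, 0.9],
});
```

With a large temperature the softmax is sharp, so even a modest gap in cosine similarity yields a confident prediction.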
Benchmarks
| Dataset | Metric | Score | Source |
|---|---|---|---|
| ImageNet zero-shot | Top-1 Accuracy | 80.1% | Schuhmann et al., 2022 — Table 9 |
| VTAB+ (avg 35 tasks) | Accuracy | 75.3% | Schuhmann et al., 2022 — Table 10 |
Performance
2.5B total parameters, including the ~1.8B in the vision encoder: the largest open CLIP variant
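The parameter count and embedding width translate directly into rough memory and storage figures. The following back-of-the-envelope sketch assumes decimal gigabytes, counts weights only (no activations or framework overhead), and uses hypothetical helper names; the precisions shown are common choices, not a statement of how Mixpeek serves the model.

```typescript
// Bytes per stored value for two common precisions.
const BYTES_PER_VALUE = { float32: 4, float16: 2 } as const;
type Precision = keyof typeof BYTES_PER_VALUE;

// Memory needed to hold the model weights alone, in decimal GB.
function weightMemoryGB(paramCount: number, precision: Precision): number {
  return (paramCount * BYTES_PER_VALUE[precision]) / 1e9;
}

// Raw storage for a set of dense vectors, ignoring index overhead.
function embeddingStorageGB(
  vectorCount: number,
  dims: number,
  precision: Precision
): number {
  return (vectorCount * dims * BYTES_PER_VALUE[precision]) / 1e9;
}

const fp16Weights = weightMemoryGB(2.5e9, "float16"); // 2.5e9 * 2 bytes = 5 GB
const fp32Weights = weightMemoryGB(2.5e9, "float32"); // 2.5e9 * 4 bytes = 10 GB
const millionVectors = embeddingStorageGB(1e6, 1280, "float32"); // 5.12 GB
```

Each 1280-dimensional float32 vector occupies 5,120 bytes, so vector storage grows linearly with collection size.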
Research Paper
Reproducible Scaling Laws for Contrastive Language-Image Learning
arxiv.org
Build a pipeline with CLIP-ViT-bigG-14-laion2B-39B-b160k
Add this model to a processing pipeline alongside other extractors. Combine with retrieval stages for end-to-end search.