Molmo2-O-7B

Name: Molmo2-O-7B
Author: allenai

177K downloads/month

26 likes

Identifier

Model ID

allenai/Molmo2-O-7B

Tags

transformers, safetensors, molmo2, image-text-to-text, multimodal, olmo, molmo, conversational, custom_code, en, dataset:allenai/Molmo2-Cap, dataset:allenai/Molmo2-VideoCapQA, dataset:allenai/Molmo2-VideoSubtitleQA, dataset:allenai/Molmo2-AskModelAnything, dataset:allenai/Molmo2-VideoPoint, dataset:allenai/Molmo2-VideoTrack, dataset:allenai/Molmo2-MultiImageQA, dataset:allenai/Molmo2-SynMultiImageQA, dataset:allenai/Molmo2-MultiImagePoint, base_model:allenai/Olmo-3-7B-Instruct, base_model:finetune:allenai/Olmo-3-7B-Instruct, license:apache-2.0, region:us

Use Molmo2-O-7B on Mixpeek

Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval in Mixpeek Studio.

How It Runs on Mixpeek

On Mixpeek, Molmo2-O-7B runs as a managed extractor inside a processing pipeline. Point it at a bucket of image-text-to-text data, and Mixpeek handles GPU provisioning, batching, retries, and writing the outputs to a vector store you can query.

Extractor outputs land in the Mixpeek Vector Store (MVS), where you can combine them with retrieval, reranking, and filter stages to build end-to-end search and agent-perception pipelines, with no model-serving infrastructure to maintain.
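
To make the batching, retry, and write-to-store steps described above concrete, here is a minimal conceptual sketch in plain Python. It is not Mixpeek's API or implementation: `batched`, `run_extractor`, the `extract` callable, and the dictionary standing in for a vector store are all hypothetical names chosen for illustration, and retrying on `RuntimeError` is an assumption.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_extractor(
    items: Iterable[str],
    extract: Callable[[List[str]], List[str]],  # hypothetical model call
    store: Dict[str, str],                      # stand-in for a vector store
    batch_size: int = 2,
    max_retries: int = 3,
) -> None:
    """Run `extract` over batches, retrying failed batches, and store outputs."""
    for batch in batched(items, batch_size):
        for attempt in range(1, max_retries + 1):
            try:
                outputs = extract(batch)
                break  # batch succeeded, stop retrying
            except RuntimeError:
                if attempt == max_retries:
                    raise  # give up after the final attempt
        # Write one output per input item into the store.
        for item, output in zip(batch, outputs):
            store[item] = output
```

In a managed setup, the point of the description above is that this loop, along with GPU provisioning, is handled by the platform rather than by your own code.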