168 downloads/month
8 likes
Identifier
Model ID: DAMO-NLP-SG/VideoLLaMA3-2B-Image
Tags: transformers, safetensors, videollama3_qwen2, text-generation, multi-modal, large-language-model, video-language-model, visual-question-answering, custom_code, en, dataset:lmms-lab/LLaVA-OneVision-Data, dataset:allenai/pixmo-docs, dataset:HuggingFaceM4/Docmatix, dataset:lmms-lab/LLaVA-Video-178K, dataset:ShareGPT4Video/ShareGPT4Video, arxiv:2501.13106, arxiv:2406.07476, arxiv:2306.02858, base_model:Qwen/Qwen2.5-1.5B-Instruct, base_model:finetune:Qwen/Qwen2.5-1.5B-Instruct, license:apache-2.0, region:us
Use VideoLLaMA3-2B-Image on Mixpeek
Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.
Specification
Organization: DAMO-NLP-SG
Task: Visual Question Answering
Library: transformers
License: apache-2.0
Downloads/mo: 168
Likes: 8
View on HuggingFace
See model card, files, and community discussion