Identifier
Model ID
DAMO-NLP-SG/VideoLLaMA2.1-7B-AV
Tags
transformers, safetensors, videollama2_qwen2, text-generation, Audio-visual Question Answering, Audio Question Answering, multimodal large language model, visual-question-answering, en, dataset:lmms-lab/ClothoAQA, dataset:Loie/VGGSound, arxiv:2406.07476, arxiv:2306.02858, license:apache-2.0, endpoints_compatible, region:us
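Several of the tags above carry a `key:value` prefix (`dataset:`, `arxiv:`, `license:`, `region:`), while the rest are plain labels. The sketch below shows one way to group them into structured metadata. The helper name `group_tags` and the list of recognized prefixes are assumptions made for illustration, inferred only from the tags shown on this listing. They are not part of any Mixpeek or Hugging Face API.

```python
from collections import defaultdict

# Tags copied from the listing above.
TAGS = [
    "transformers", "safetensors", "videollama2_qwen2", "text-generation",
    "Audio-visual Question Answering", "Audio Question Answering",
    "multimodal large language model", "visual-question-answering", "en",
    "dataset:lmms-lab/ClothoAQA", "dataset:Loie/VGGSound",
    "arxiv:2406.07476", "arxiv:2306.02858",
    "license:apache-2.0", "endpoints_compatible", "region:us",
]

# Assumption: only these prefixes mark key:value tags on this listing.
KNOWN_PREFIXES = ("dataset", "arxiv", "license", "region")


def group_tags(tags):
    """Hypothetical helper: split prefixed tags by key, keep the rest as plain labels."""
    grouped = defaultdict(list)
    for tag in tags:
        key, sep, value = tag.partition(":")
        if sep and key in KNOWN_PREFIXES:
            grouped[key].append(value)
        else:
            grouped["plain"].append(tag)
    return dict(grouped)


metadata = group_tags(TAGS)
```

With the tags grouped this way, `metadata["dataset"]` holds the two training datasets and `metadata["arxiv"]` holds the two paper identifiers, which is convenient when filtering or indexing models by those fields.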
Use VideoLLaMA2.1-7B-AV on Mixpeek
Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.
Specification
Organization: DAMO-NLP-SG
Task: Visual Question Answering
Library: transformers
License: apache-2.0
Downloads/mo: 1K
Likes: 16
View on HuggingFace
See model card, files, and community discussion