775 downloads/month
17 likes
Identifier
Model ID: sbintuitions/sarashina2.2-vision-3b
Tags
transformers, safetensors, sarashina2_vision, text-generation, multimodal, vision-language, image-to-text, custom_code, ja, en, arxiv:2404.07824, arxiv:2403.19454, arxiv:2410.17250, arxiv:2007.00398, arxiv:2104.12756, base_model:sbintuitions/sarashina2.2-3b-instruct-v0.1, base_model:finetune:sbintuitions/sarashina2.2-3b-instruct-v0.1, license:mit, region:us
Use sarashina2.2-vision-3b on Mixpeek
Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.
Specification
Organization: sbintuitions
Task: Image To Text
Library: transformers
License: MIT
Downloads/mo: 775
Likes: 17
View on HuggingFace
See model card, files, and community discussion