    Image To Text, transformers, mit

    sarashina2.2-vision-3b

    by sbintuitions

    775 downloads/month
    17 likes
    Identifier
    Model ID
    sbintuitions/sarashina2.2-vision-3b

    Tags

    transformers, safetensors, sarashina2_vision, text-generation, multimodal, vision-language, image-to-text, custom_code, ja, en, arxiv:2404.07824, arxiv:2403.19454, arxiv:2410.17250, arxiv:2007.00398, arxiv:2104.12756, base_model:sbintuitions/sarashina2.2-3b-instruct-v0.1, base_model:finetune:sbintuitions/sarashina2.2-3b-instruct-v0.1, license:mit, region:us

    Use sarashina2.2-vision-3b on Mixpeek

    Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.
