NEWAgents can now see video via MCP.Try it now →
    Models/Image Text To Text/HuggingFaceTB/SmolVLM-500M-Instruct
    Image Text To Texttransformersapache-2.0

    SmolVLM-500M-Instruct

    by HuggingFaceTB

    92Kdl/month
    192likes
    Identifier
    Model ID
    HuggingFaceTB/SmolVLM-500M-Instruct

    Tags

    transformersonnxsafetensorsidefics3image-text-to-textconversationalendataset:HuggingFaceM4/the_cauldrondataset:HuggingFaceM4/Docmatixarxiv:2504.05299base_model:HuggingFaceTB/SmolLM2-360M-Instructbase_model:quantized:HuggingFaceTB/SmolLM2-360M-Instructlicense:apache-2.0endpoints_compatibleregion:us

    Use SmolVLM-500M-Instruct on Mixpeek

    Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.

    Open Pipeline Builder