    Step3-VL-10B

    by stepfun-ai

    273K downloads/month
    405 likes
    Identifier
    Model ID
    stepfun-ai/Step3-VL-10B

    Tags

    safetensors, step_robotics, image-text-to-text, conversational, custom_code, arxiv:2601.09668, base_model:stepfun-ai/Step3-VL-10B-Base, base_model:finetune:stepfun-ai/Step3-VL-10B-Base, license:apache-2.0, region:us

    Use Step3-VL-10B on Mixpeek

    Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.
