    Visual Question Answering, transformers, agpl-3.0

    donut-refexp-combined-v1

    by ivelin

    Identifier
    Model ID
    ivelin/donut-refexp-combined-v1

    Tags

    transformers, pytorch, vision-encoder-decoder, image-text-to-text, ui refexp, visual-question-answering, en, dataset:ivelin/rico_refexp_combined, license:agpl-3.0, endpoints_compatible, region:us

    Use donut-refexp-combined-v1 on Mixpeek

    Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.
