    Visual Question Answering, transformers, apache-2.0

    pix2struct-ai2d-base

    by google

    Identifier
    Model ID
    google/pix2struct-ai2d-base

    Tags

    transformers, pytorch, safetensors, pix2struct, image-text-to-text, visual-question-answering, en, fr, ro, de, multilingual, arxiv:2210.03347, license:apache-2.0, region:us
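
Since the tags list transformers and pytorch, the model can presumably be loaded through the Hugging Face transformers Pix2Struct classes. The following is a minimal sketch, not an official recipe: the helper names `format_question` and `answer_diagram_question` are hypothetical, and the numbered-options prompt layout is an assumption about how multiple-choice diagram questions are phrased for this checkpoint. It also assumes `transformers`, `torch`, and `Pillow` are installed and that the weights can be downloaded.

```python
MODEL_ID = "google/pix2struct-ai2d-base"  # model ID as listed on this page


def format_question(question, options):
    """Join a question with numbered answer options.

    The "(1) ... (2) ..." layout is an assumed prompt format, not a
    documented requirement of the model.
    """
    numbered = " ".join(f"({i}) {opt}" for i, opt in enumerate(options, start=1))
    return f"{question} {numbered}"


def answer_diagram_question(image_path, question, options, max_new_tokens=32):
    """Run visual question answering on a single diagram image (hypothetical helper)."""
    # Heavy imports stay local so format_question works without them installed.
    from PIL import Image
    from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

    processor = Pix2StructProcessor.from_pretrained(MODEL_ID)
    model = Pix2StructForConditionalGeneration.from_pretrained(MODEL_ID)

    image = Image.open(image_path).convert("RGB")
    prompt = format_question(question, options)

    # The processor takes the image together with the question text.
    inputs = processor(images=image, text=prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True)


# Example call (requires a local image file and a model download):
# answer_diagram_question("diagram.jpg", "What does the label 15 represent?",
#                         ["lava", "core", "tunnel", "ash cloud"])
```

Loading the processor and model inside the function keeps the sketch short; in a real pipeline they would typically be loaded once and reused across calls.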

    Use pix2struct-ai2d-base on Mixpeek

    Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.
