    Image To Text, transformers

    vit-gpt2-coco-en

    by ydshieh

    Identifier
    Model ID
    ydshieh/vit-gpt2-coco-en

    Tags

    transformers, pytorch, tf, jax, tensorboard, safetensors, vision-encoder-decoder, image-text-to-text, image-to-text, endpoints_compatible, region:us

    Use vit-gpt2-coco-en on Mixpeek

    Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.
