    Text To Audio Models

    Browse AI models for multimodal decomposition and recomposition pipelines. Any model can be plugged into your extractors.

    200 models available

    Showing 1-24 of 200 models

    Text To Audio

    facebook/musicgen-medium

    1.4M
    158
    transformers
    Text To Audio

    facebook/musicgen-small

    108K
    484
    transformers
    Text To Audio

    ACE-Step/Ace-Step1.5

    46K
    723
    transformers
    Text To Audio

    facebook/musicgen-large

    44K
    526
    transformers
    Text To Audio

    stabilityai/stable-audio-open-1.0

    21K
    1,450
    stable-audio-tools
    Text To Audio

    ixxan/mms-tts-uig-script_arabic-UQSpeech

    11K
    3
    transformers
    Text To Audio

    ACE-Step/acestep-5Hz-lm-4B

    8K
    46
    transformers
    Text To Audio

    facebook/musicgen-melody-large

    7K
    32
    transformers
    Text To Audio

    ACE-Step/acestep-v15-xl-turbo

    7K
    137
    transformers
    Text To Audio

    ACE-Step/acestep-5Hz-lm-0.6B

    7K
    13
    transformers
    Text To Audio

    slseanwu/MIDI-LLM_Llama-3.2-1B

    6K
    29
    transformers
    Text To Audio

    eustlb/higgs-audio-v2-generation-3B-base

    6K
    1
    transformers
    Text To Audio

    facebook/musicgen-melody

    5K
    251
    transformers
    Text To Audio

    ACE-Step/acestep-v15-xl-sft

    5K
    70
    transformers
    Text To Audio

    ACE-Step/acestep-captioner

    4K
    46
    transformers
    Text To Audio

    espnet/fastspeech2_conformer

    4K
    7
    transformers
    Text To Audio

    OpenMOSS-Team/MOSS-SoundEffect

    4K
    49
    Text To Audio

    HeartMuLa/HeartMuLa-oss-3B-happy-new-year

    4K
    27
    Text To Audio

    declare-lab/mustango

    4K
    41
    transformers
    Text To Audio

    stabilityai/stable-audio-open-small

    3K
    254
    stable-audio-tools
    Text To Audio

    Marvis-AI/marvis-tts-250m-v0.1

    3K
    73
    transformers
    Text To Audio

    riffusion/riffusion-model-v1

    3K
    648
    diffusers
    Text To Audio

    ACE-Step/acestep-v15-base

    3K
    60
    transformers
    Text To Audio

    ACE-Step/acestep-v15-sft

    3K
    49
    transformers