    Reinforcement Learning Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    215 models available

    Showing 97–120 of 215 models

    Reinforcement Learning

    mradermacher/Vero-MiMo-7B-i1-GGUF

    445
    2
    transformers
    Reinforcement Learning

    ValueFX9507/Tifa-Deepsex-14b-CoT-Q8

    438
    186
    transformers
    Reinforcement Learning

    PaulVialard/ppo-Huggy

    430
    ml-agents
    Reinforcement Learning

    mradermacher/LongTraceRL-4B-GGUF

    421
    transformers
    Reinforcement Learning

    mradermacher/BetaCeti-Beta-4B-Prime1-i1-GGUF

    419
    transformers
    Reinforcement Learning

    thiagobelin/ppo-LunarLander-v3

    418
    stable-baselines3
    Reinforcement Learning

    mradermacher/SocialR1-8B-i1-GGUF

    412
    1
    transformers
    Reinforcement Learning

    mradermacher/AutoBM-Seed-Coder-8B-R-GGUF

    403
    transformers
    Reinforcement Learning

    yhyk1971/Tifa-Deepsex-14b-CoT-Q8

    396
    transformers
    Reinforcement Learning

    rezvan98/trading-agent-rl

    392
    stable-baselines3
    Reinforcement Learning

    sb3/ppo-PongNoFrameskip-v4

    390
    1
    stable-baselines3
    Reinforcement Learning

    Adilbai/stock-trading-rl-agent

    376
    147
    stable-baselines3
    Reinforcement Learning

    mradermacher/CIM-Qwen2.5-VL-7B-GGUF

    376
    1
    transformers
    Reinforcement Learning

    mradermacher/ReForm-SFT-3B-i1-GGUF

    375
    transformers
    Reinforcement Learning

    mradermacher/StarPO-1.7B-i1-GGUF

    375
    transformers
    Reinforcement Learning

    persadian/CropSeek-LLM

    370
    3
    transformers
    Reinforcement Learning

    MooreThreads/MusaCoder-27B

    368
    37
    Reinforcement Learning

    AllIllusion/LunarLander-v3

    366
    stable-baselines3
    Reinforcement Learning

    mradermacher/RLinf-math-7B-i1-GGUF

    361
    1
    transformers
    Reinforcement Learning

    kerenOrr/ppo-LunarLander-v2

    359
    stable-baselines3
    Reinforcement Learning

    mradermacher/LongWriter-Zero-32B-i1-GGUF

    358
    2
    transformers
    Reinforcement Learning

    ZhenghaiXue/Qwen2.5-7B-SimpleTIR

    358
    1
    Reinforcement Learning

    mradermacher/HER-32B-absolute-heresy-i1-GGUF

    352
    1
    transformers
    Reinforcement Learning

    jasonyandell/zeb-42

    349
    Page 5 of 9