
    Reinforcement Learning Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    215 models available


    Reinforcement Learning

    ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q8

    549
    200
    transformers
    Reinforcement Learning

    mradermacher/VPR-Minesweeper-GGUF

    543
    transformers
    Reinforcement Learning

    mradermacher/winning-wedding-planner-7b-GGUF

    540
    1
    transformers
    Reinforcement Learning

    mradermacher/Agent-STAR-RL-3B-i1-GGUF

    534
    transformers
    Reinforcement Learning

    mradermacher/KnowRL-Nemotron-1.5B-GGUF

    534
    transformers
    Reinforcement Learning

    sb3/ppo-BreakoutNoFrameskip-v4

    532
    stable-baselines3
    Reinforcement Learning

    mradermacher/ChineseErrorCorrector4-4B-GGUF

    531
    transformers
    Reinforcement Learning

    mradermacher/Agent-STAR-RL-1.5B-i1-GGUF

    526
    transformers
    Reinforcement Learning

    mradermacher/TinyResearcher-GGUF

    510
    transformers
    Reinforcement Learning

    mradermacher/Vero-Qwen25-7B-GGUF

    501
    transformers
    Reinforcement Learning

    mradermacher/Tifa-Deepsex-14b-CoT-i1-GGUF

    501
    14
    transformers
    Reinforcement Learning

    mradermacher/AgentHijack-Agent-GGUF

    500
    1
    transformers
    Reinforcement Learning

    mradermacher/ProtoCycle-7B-GGUF

    499
    1
    transformers
    Reinforcement Learning

    Sepolian/qwen2.5-0.5B-math

    492
    transformers
    Reinforcement Learning

    Open-Reasoner-Zero/Open-Reasoner-Zero-1.5B

    488
    transformers
    Reinforcement Learning

    Ding-Qiang/ppo-CarRacing-v2

    481
    stable-baselines3
    Reinforcement Learning

    mradermacher/Arctic-AWM-8B-i1-GGUF

    474
    transformers
    Reinforcement Learning

    mradermacher/Pluto-i1-GGUF

    473
    transformers
    Reinforcement Learning

    sb3/ppo-CartPole-v1

    472
    stable-baselines3
    Reinforcement Learning

    PKU-Alignment/beaver-7b-v3.0

    469
    safe-rlhf
    Reinforcement Learning

    mradermacher/Wolf-Rayet-2B-Prime3-i1-GGUF

    469
    transformers
    Reinforcement Learning

    Open-Reasoner-Zero/Open-Reasoner-Zero-7B

    456
    34
    transformers
    Reinforcement Learning

    mradermacher/ToolOmni-Qwen3-4B-GGUF

    449
    1
    transformers
    Reinforcement Learning

    mradermacher/MINT-empathy-Qwen3-1.7B-GGUF

    446
    transformers