
    Reinforcement Learning Models

    Browse AI models for multimodal decomposition and recomposition pipelines — plug any model into your extractors.

    215 models available

    Showing 124 of 215 models

    Reinforcement Learning

    HumanCompatibleAI/ppo-seals-CartPole-v0

    45K
    17
    stable-baselines3
    Reinforcement Learning

    HumanCompatibleAI/ppo-Pendulum-v1

    19K
    6
    stable-baselines3
    Reinforcement Learning

    mradermacher/AReaL-SEA-235B-A22B-i1-GGUF

    15K
    transformers
    Reinforcement Learning

    TianheWu/VisualQuality-R1-7B

    9K
    13
    Reinforcement Learning

    mradermacher/Aryabhata-2.0-i1-GGUF

    5K
    1
    transformers
    Reinforcement Learning

    mradermacher/SpatialThinker-30B-i1-GGUF

    4K
    transformers
    Reinforcement Learning

    mradermacher/GoLongRL-4B-i1-GGUF

    3K
    transformers
    Reinforcement Learning

    mradermacher/TinyResearcher-i1-GGUF

    3K
    transformers
    Reinforcement Learning

    mradermacher/Vero-Qwen35-9B-Base-i1-GGUF

    3K
    transformers
    Reinforcement Learning

    mradermacher/ChineseErrorCorrector4-4B-i1-GGUF

    3K
    transformers
    Reinforcement Learning

    mradermacher/Vero-Qwen35-9B-i1-GGUF

    3K
    transformers
    Reinforcement Learning

    mradermacher/Reflector-Internalizing-Safety-Llama-3.1-8B-RL-i1-GGUF

    2K
    1
    transformers
    Reinforcement Learning

    sb3/ppo-LunarLanderContinuous-v2

    2K
    stable-baselines3
    Reinforcement Learning

    mradermacher/Aryabhata-2.0-GGUF

    2K
    transformers
    Reinforcement Learning

    mradermacher/LongTraceRL-30B-i1-GGUF

    2K
    transformers
    Reinforcement Learning

    mradermacher/DeepHermes-Egregore-v1-RLAIF-8b-Atropos-i1-GGUF

    2K
    transformers
    Reinforcement Learning

    mradermacher/GALAX-i1-GGUF

    2K
    transformers
    Reinforcement Learning

    sb3/sac-BipedalWalkerHardcore-v3

    2K
    stable-baselines3
    Reinforcement Learning

    mradermacher/LongTraceRL-4B-i1-GGUF

    2K
    transformers
    Reinforcement Learning

    ValueFX9507/Tifa-Deepsex-14b-CoT-GGUF-Q4

    1K
    838
    transformers
    Reinforcement Learning

    nicklashansen/newt

    1K
    2
    Reinforcement Learning

    infly/inf-retriever-v1-pro

    1K
    7
    Reinforcement Learning

    mradermacher/MediX-R1-8B-i1-GGUF

    1K
    1
    transformers
    Reinforcement Learning

    mradermacher/MediX-R1-2B-i1-GGUF

    1K
    transformers