    beaver-7b-v1.0-reward

    by PKU-Alignment

    958 downloads/month
    17 likes
    Identifier
    Model ID
    PKU-Alignment/beaver-7b-v1.0-reward

    Tags

    safe-rlhf, safetensors, llama, reinforcement-learning-from-human-feedback, reinforcement-learning, beaver, safety, ai-safety, deepspeed, rlhf, alpaca, en, dataset:PKU-Alignment/PKU-SafeRLHF, arxiv:2302.13971, arxiv:2307.04657, arxiv:2310.12773, region:us

    Use beaver-7b-v1.0-reward on Mixpeek

    Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval, all through the Mixpeek pipeline builder.
