469 downloads/month
Identifier
Model ID
PKU-Alignment/beaver-7b-v3.0
Tags
safe-rlhf, safetensors, llama, reinforcement-learning-from-human-feedback, reinforcement-learning, rlhf, safety, ai-safety, deepspeed, beaver, alpaca, en, dataset:PKU-Alignment/PKU-SafeRLHF, arxiv:2302.13971, arxiv:2307.04657, arxiv:2310.12773, region:us
Use beaver-7b-v3.0 on Mixpeek
Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval in Mixpeek Studio.
How It Runs on Mixpeek
On Mixpeek, beaver-7b-v3.0 runs as a managed extractor inside a processing pipeline. Point a bucket of reinforcement learning data at it, and Mixpeek handles GPU provisioning, batching, retries, and writing the outputs into a vector store you can query.
Extractor outputs land in the Mixpeek Vector Store (MVS), where you can combine them with retrieval, reranking, and filter stages to build end-to-end search and agent-perception pipelines, with no model-serving infrastructure to maintain.
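To make the batching, retry, and write-to-store steps described above concrete, here is a minimal sketch in plain Python. It is not Mixpeek's API or implementation: `batched`, `run_with_retries`, `run_pipeline`, and `fake_extractor` are hypothetical names, the extractor is a stub standing in for the model, and an ordinary dictionary stands in for the vector store.

```python
import time
from typing import Callable, Dict, Iterator, List, Sequence

Vector = List[float]
Extractor = Callable[[List[str]], List[Vector]]


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def run_with_retries(
    extractor: Extractor,
    batch: List[str],
    max_attempts: int = 3,
    backoff_seconds: float = 0.0,
) -> List[Vector]:
    """Call the extractor, retrying on any exception up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return extractor(batch)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            time.sleep(backoff_seconds * attempt)  # linear backoff between tries
    raise ValueError("max_attempts must be at least 1")


def run_pipeline(
    items: Sequence[str],
    extractor: Extractor,
    store: Dict[str, Vector],
    batch_size: int = 2,
) -> Dict[str, Vector]:
    """Batch the items, extract with retries, and write each output to the store."""
    for batch in batched(items, batch_size):
        outputs = run_with_retries(extractor, batch)
        for item, vector in zip(batch, outputs):
            store[item] = vector
    return store


def fake_extractor(batch: List[str]) -> List[Vector]:
    """Hypothetical stand-in for the model: one 1-D 'vector' per input text."""
    return [[float(len(text))] for text in batch]


if __name__ == "__main__":
    print(run_pipeline(["a", "bb", "ccc"], fake_extractor, {}, batch_size=2))
```

In a managed service these steps run on the provider's side; the sketch only shows why batching and retries are kept separate from the extractor itself, so that a transient failure re-runs one batch instead of the whole job.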
Specification
Organization: PKU-Alignment
Task: Reinforcement Learning
Library: safe-rlhf
Downloads/mo: 469
View on HuggingFace
See model card, files, and community discussion