Model ID: PKU-Alignment/beaver-7b-unified-cost
Tags: safe-rlhf, safetensors, llama, reinforcement-learning-from-human-feedback, reinforcement-learning, beaver, safety, ai-safety, deepspeed, rlhf, alpaca, en, dataset:PKU-Alignment/PKU-SafeRLHF, arxiv:2302.13971, arxiv:2307.04657, arxiv:2310.12773, region:us
Use beaver-7b-unified-cost on Mixpeek
Build multimodal processing pipelines with this model and others. Extract features, run inference, and set up retrieval in Mixpeek Studio.
How It Runs on Mixpeek
On Mixpeek, beaver-7b-unified-cost runs as a managed extractor inside a processing pipeline. Point a bucket of reinforcement learning data at it, and Mixpeek handles GPU provisioning, batching, retries, and writing the outputs into a vector store you can query.
Extractor outputs land in the Mixpeek Vector Store (MVS), where you can combine them with retrieval, reranking, and filter stages to build end-to-end search and agent-perception pipelines without maintaining any model-serving infrastructure.
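The flow described above (batching inputs, retrying failed batches, and writing outputs to a store) can be sketched generically in Python. This is only an illustration of the pattern, not Mixpeek's API: every name here (`batched`, `run_with_retries`, `process`, `fake_cost_scorer`) is hypothetical, the store is a plain dict, and the scorer is a stand-in that returns text length rather than real model output.

```python
from typing import Callable, Dict, Iterator, List, Sequence

Scorer = Callable[[Sequence[str]], List[float]]


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_with_retries(fn: Scorer, batch: Sequence[str], max_attempts: int = 3) -> List[float]:
    """Call fn(batch), retrying on any exception up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(batch)
        except Exception:
            # Re-raise once the final attempt has failed.
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")


def process(texts: Sequence[str], score_batch: Scorer,
            store: Dict[str, float], batch_size: int = 2) -> Dict[str, float]:
    """Score texts batch by batch and write each result into the store."""
    for batch in batched(texts, batch_size):
        scores = run_with_retries(score_batch, batch)
        for text, score in zip(batch, scores):
            store[text] = score
    return store


def fake_cost_scorer(batch: Sequence[str]) -> List[float]:
    # Hypothetical stand-in for the cost model: returns text length,
    # NOT a real safety cost score.
    return [float(len(text)) for text in batch]
```

In a real deployment the scorer would call the hosted model and the dict would be replaced by the managed store; the sketch only shows how the three responsibilities fit together.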
Specification
Organization: PKU-Alignment
Task: Reinforcement Learning
Library: safe-rlhf
Downloads/mo: 848
Likes: 2