Deploy model to Ray object store
Pre-load model weights into the Ray object store for fast access by plugins.
This operation:
- Downloads the model archive from S3
- Deserializes weights based on format (safetensors, pytorch, etc.)
- Stores weights in Ray object store for zero-copy sharing
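The format-dependent deserialization step can be pictured as a dispatch on the archive's file extension. The following is a minimal sketch, not the service's actual code: the function name `detect_weight_format` and the extension table are assumptions made for illustration, and the real implementation may detect formats differently.

```python
from pathlib import PurePosixPath

# Hypothetical extension table (an assumption for this sketch);
# the real service may recognise other formats or inspect file contents.
FORMAT_BY_EXTENSION = {
    ".safetensors": "safetensors",
    ".pt": "pytorch",
    ".pth": "pytorch",
}


def detect_weight_format(key: str) -> str:
    """Map an object key or file name to a weight format name."""
    suffix = PurePosixPath(key).suffix.lower()
    if suffix not in FORMAT_BY_EXTENSION:
        raise ValueError(f"unsupported weight format: {suffix or '<none>'}")
    return FORMAT_BY_EXTENSION[suffix]
```

A caller would pick the matching deserializer from the returned name before handing the weights to the object store.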
After deployment, plugins can load the model instantly using:
from engine.models.loader import load_namespace_model
weights = load_namespace_model("model_id")
Note: This step is optional. Models are also loaded on demand when a plugin first requests them; use this endpoint to pre-warm the cache.
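The difference between pre-warming and on-demand loading can be illustrated with a small in-process cache. This is only a sketch of the pattern: the `WeightCache` class and its loader callback are hypothetical, and a plain dictionary stands in for the Ray object store.

```python
from typing import Any, Callable, Dict


class WeightCache:
    """Toy cache: loads a model once, then serves it from memory."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader             # hypothetical function that fetches weights
        self._store: Dict[str, Any] = {}  # stand-in for the Ray object store
        self.load_count = 0               # number of slow loads performed

    def get(self, model_id: str) -> Any:
        # On-demand path: the first request pays the loading cost.
        if model_id not in self._store:
            self._store[model_id] = self._loader(model_id)
            self.load_count += 1
        return self._store[model_id]

    def prewarm(self, model_id: str) -> None:
        # Pre-warm path: pay the loading cost before any plugin asks.
        self.get(model_id)


cache = WeightCache(lambda model_id: {"id": model_id})
cache.prewarm("model_id")          # analogous to calling this endpoint
weights = cache.get("model_id")    # served from the cache, no second load
```

Either path ends with the same cached weights; pre-warming only moves the loading cost ahead of the first plugin request.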
POST
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
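As a sketch of how the header is attached, the snippet below builds (but does not send) a POST request using Python's standard library. The URL is a placeholder, since the endpoint path is not shown here, and the helper name `build_deploy_request` is hypothetical; whether the endpoint expects a request body is also not stated, so none is included.

```python
import urllib.request


def build_deploy_request(url: str, token: str) -> urllib.request.Request:
    """Build a POST request carrying the Bearer authentication header."""
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder URL and token (assumptions); substitute your real values.
request = build_deploy_request("https://api.example.com/placeholder", "my-token")
```

Passing the resulting object to `urllib.request.urlopen` would send the request.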
Response
Successful Response
Response model for model deployment.
Deploying a model pre-loads the weights into the Ray object store, making them available for zero-copy access by plugins.

