Why Self-Host Mixpeek?
Organizations with strict data governance, regulatory requirements, or performance demands choose self-hosted deployment for complete control.
Complete Data Sovereignty
Your data never leaves your infrastructure. All processing, embedding, and indexing happens within your own VPC or on-premise environment.
Regulatory Compliance
Meet HIPAA, SOC 2, GDPR, FedRAMP, and industry-specific compliance requirements by keeping data within your controlled perimeter.
Network Latency Elimination
Process data locally without round-trips to external APIs. Achieve sub-millisecond embedding lookups and real-time processing throughput.
Full Configuration Control
Customize GPU allocation, model selection, scaling parameters, and pipeline configuration to match your exact workload requirements.
Deployment Architecture
Four steps to running Mixpeek on your own infrastructure.
Deploy Mixpeek Engine
Run the Mixpeek processing engine on your Kubernetes cluster or bare-metal GPU servers. The engine handles model serving, feature extraction, and batch orchestration via Ray.
Connect Your Storage
Point Mixpeek at your existing object storage -- S3, GCS, MinIO, or any S3-compatible endpoint. Data stays in your buckets; Mixpeek reads and processes in place.
Configure Pipelines
Define collections with feature extractors for your content types. Choose embedding models, classification taxonomies, and extraction parameters through the API.
Index and Retrieve
Processed embeddings are stored in your Qdrant instance. Build retriever pipelines with filter, search, and rerank stages -- all running within your infrastructure.
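The four steps above end with a retriever pipeline of filter, search, and rerank stages. The sketch below shows one plausible shape for such a pipeline definition; the stage types come from the description above, but the payload structure, field names, and parameter values are illustrative assumptions, not the actual Mixpeek schema.

```python
# Hypothetical retriever pipeline definition. The stage types (filter,
# search, rerank) mirror the steps described above; the exact field
# names and parameters are assumptions for illustration only.
retriever = {
    "retriever_name": "contract-search",
    "collection_ids": ["contracts"],
    "stages": [
        # Narrow candidates by metadata before vector search
        {"type": "filter", "params": {"field": "department", "value": "legal"}},
        # Dense vector search against the self-hosted Qdrant instance
        {"type": "search", "params": {"model": "e5-large-v2", "limit": 50}},
        # Re-score the surviving candidates with a cross-encoder
        {"type": "rerank", "params": {"model": "cross-encoder", "limit": 10}},
    ],
}

# Stages run in order; each stage consumes the previous stage's output.
stage_order = [s["type"] for s in retriever["stages"]]
print(stage_order)  # ['filter', 'search', 'rerank']
```

Because every stage executes inside your cluster, the candidate set shrinks at each step without any data crossing your network perimeter.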
Security & Compliance
Self-hosted Mixpeek gives you full control over your security posture. No data leaves your perimeter, and you manage access, encryption, and audit logging on your terms.
- All data processed and stored within your infrastructure perimeter
- No external API calls for model inference or embedding generation
- Full audit logging of all pipeline operations and data access
- Role-based access control for API endpoints and namespaces
- Encryption at rest and in transit using your own key management
- Air-gapped deployment option for classified or highly sensitive environments
Full Platform Capabilities
Self-hosted deployments include the complete Mixpeek feature set -- the same capabilities available in Mixpeek Cloud.
Multimodal Processing
Extract features from text, images, video, audio, and documents using the same pipeline architecture as Mixpeek Cloud.
- Vision transformers for image and video
- Speech recognition for audio content
- OCR and layout analysis for documents
- Custom model plugins for domain-specific tasks
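The routing from content type to extractor family listed above can be pictured as a simple dispatch table. This is an illustrative sketch only; the extractor names echo the bullets above, but the routing logic itself is an assumption, not Mixpeek's internal dispatch.

```python
# Illustrative mapping of content families to extractor families,
# echoing the bullet list above. Names are assumptions for illustration.
EXTRACTOR_FOR_TYPE = {
    "image": "vision_transformer",
    "video": "vision_transformer",
    "audio": "speech_recognition",
    "document": "ocr_layout",
    "text": "text_embedding",
}

def pick_extractor(mime_type: str) -> str:
    """Map a MIME type to an extractor family, defaulting to text."""
    family = mime_type.split("/", 1)[0]
    if family in ("image", "video", "audio"):
        return EXTRACTOR_FOR_TYPE[family]
    if mime_type == "application/pdf":
        return EXTRACTOR_FOR_TYPE["document"]
    return EXTRACTOR_FOR_TYPE["text"]

print(pick_extractor("video/mp4"))        # vision_transformer
print(pick_extractor("application/pdf"))  # ocr_layout
```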
Vector Storage
Store and query embeddings in your own Qdrant deployment with full control over collection configuration and scaling.
- Dense, sparse, and multi-vector support
- Configurable indexing parameters
- Horizontal scaling across nodes
- Snapshot and backup management
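A Qdrant collection supporting the dense, sparse, and tunable-index features listed above is configured with a body like the following, mirroring Qdrant's `PUT /collections/{name}` REST API. The vector names and HNSW values here are illustrative choices, not Mixpeek defaults.

```python
# Sketch of a Qdrant collection body with named dense and sparse vectors.
# Shape follows Qdrant's REST API; the names "dense" and "text-sparse"
# and the HNSW values are illustrative, not Mixpeek defaults.
collection_config = {
    "vectors": {
        # Named dense vector: 768 dims, matching the namespace example below
        "dense": {"size": 768, "distance": "Cosine"},
    },
    # Sparse vectors (e.g. SPLADE-style term weights) get their own section
    "sparse_vectors": {
        "text-sparse": {},
    },
    # HNSW indexing parameters are tunable per collection
    "hnsw_config": {"m": 16, "ef_construct": 100},
}

print(sorted(collection_config))  # ['hnsw_config', 'sparse_vectors', 'vectors']
```

Since the Qdrant instance is yours, you can also adjust these parameters, take snapshots, and add nodes on your own schedule.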
Distributed Processing
Ray-based distributed compute handles batch processing across your GPU cluster with automatic scaling and fault recovery.
- Multi-GPU model serving
- Automatic batch orchestration
- Stalled job detection and recovery
- Configurable concurrency limits
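The stalled-job detection mentioned above boils down to a heartbeat check: a job that has gone quiet for longer than its timeout window is flagged for recovery. The sketch below illustrates the idea; the field names and five-minute threshold are assumptions, not the engine's actual schema.

```python
import time

# Minimal sketch of heartbeat-based stalled-job detection. Field names
# and the timeout value are illustrative assumptions.
STALL_TIMEOUT_S = 300  # assume 5 minutes without a heartbeat means stalled

def find_stalled(jobs: list[dict], now: float) -> list[str]:
    """Return IDs of running jobs whose last heartbeat is too old."""
    return [
        j["id"]
        for j in jobs
        if j["status"] == "running" and now - j["last_heartbeat"] > STALL_TIMEOUT_S
    ]

now = time.time()
jobs = [
    {"id": "batch-1", "status": "running", "last_heartbeat": now - 10},
    {"id": "batch-2", "status": "running", "last_heartbeat": now - 900},
    {"id": "batch-3", "status": "done", "last_heartbeat": now - 900},
]
print(find_stalled(jobs, now))  # ['batch-2']
```

In practice the engine would requeue `batch-2` on a healthy worker rather than failing the whole batch.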
Self-Hosted vs Managed Cloud
Choose the deployment model that fits your requirements.
| Feature | Self-Hosted | Managed Cloud |
|---|---|---|
| Data Location | Your infrastructure exclusively | Mixpeek Cloud (multi-tenant or dedicated) |
| Compliance | You control the full compliance posture | SOC 2 Type II, GDPR compliant |
| GPU Management | You provision and manage GPU resources | Fully managed by Mixpeek |
| Scaling | Manual or custom autoscaling on your cluster | Automatic scaling based on workload |
| Updates | You control upgrade timing and versions | Continuous updates with zero downtime |
| Network Latency | Local network -- sub-millisecond | Internet round-trip to cloud endpoints |
Same API, Your Infrastructure
Deploy with Helm, then use the same Mixpeek SDK and API you already know. Just point to your internal endpoint.
```bash
# Deploy Mixpeek Engine on your Kubernetes cluster
helm repo add mixpeek https://charts.mixpeek.com
helm install mixpeek-engine mixpeek/engine \
  --set storage.endpoint=s3://your-bucket \
  --set qdrant.host=qdrant.internal:6334 \
  --set gpu.enabled=true
```

```python
# Configure via the same Mixpeek SDK
from mixpeek import Mixpeek

# Point to your self-hosted instance
client = Mixpeek(
    api_key="YOUR_INTERNAL_KEY",
    base_url="https://mixpeek.internal.yourcompany.com",
)

# Create a namespace and collection -- same API as cloud
namespace = client.namespaces.create(
    namespace_name="internal-documents",
    vector_config={"dimensions": 768},
)

collection = client.collections.create(
    namespace_id=namespace.id,
    collection_name="contracts",
    extractors=[
        {"type": "text_embedding", "model": "e5-large-v2"},
        {"type": "ocr", "model": "doctr"},
    ],
)
```

Frequently Asked Questions
What infrastructure do I need to self-host Mixpeek?
Mixpeek self-hosted runs on Kubernetes (GKE, EKS, AKS, or on-premise) with GPU nodes for model inference. Minimum requirements include a Kubernetes cluster with at least one GPU node (NVIDIA T4 or better), a Qdrant instance for vector storage, and S3-compatible object storage. The exact GPU count depends on your processing volume and latency requirements.
Which GPU types does the self-hosted engine support?
Mixpeek supports NVIDIA GPUs including T4, A10G, L4, A100, and H100. The engine automatically detects available VRAM and configures model batching accordingly. For high-throughput deployments, A100 or H100 GPUs are recommended. CPU-only inference is supported for lighter workloads like text embedding and classification.
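The VRAM-aware batching described above can be pictured as: fit the model first, then fill most of the remaining memory with batch items. The sketch below illustrates that arithmetic; the per-GPU memory figures are public hardware specs, but the 80% headroom factor and per-item cost are rough assumptions, and the engine's real heuristics are not shown here.

```python
# Illustrative VRAM-aware batch sizing. VRAM figures are hardware specs;
# the 0.8 headroom factor and per-item cost are assumptions.
GPU_VRAM_GB = {"T4": 16, "A10G": 24, "L4": 24, "A100": 80, "H100": 80}

def batch_size_for(gpu: str, model_gb: float, per_item_gb: float) -> int:
    """Fit the model, then fill ~80% of the remaining VRAM with batch items."""
    free = GPU_VRAM_GB[gpu] - model_gb
    return max(1, int(free * 0.8 / per_item_gb))

# A 5 GB model with roughly 0.5 GB of activations per item:
print(batch_size_for("T4", 5.0, 0.5))    # 17
print(batch_size_for("A100", 5.0, 0.5))  # 120
```

This is why the higher-VRAM A100/H100 cards are recommended for high-throughput deployments: larger batches amortize per-batch overhead across more items.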
How does licensing work for self-hosted deployments?
Self-hosted Mixpeek is licensed per-node with annual contracts. The license includes access to all feature extractors, retriever pipeline capabilities, and API endpoints. Custom model plugins and priority support are included. Contact our sales team for pricing based on your deployment size.
Can I run Mixpeek in an air-gapped environment?
Yes. Mixpeek supports fully air-gapped deployment. All model weights, container images, and dependencies can be pre-loaded into your environment. No outbound internet access is required for operation. Model updates are delivered as offline packages for manual deployment.
How are updates and upgrades handled?
Self-hosted customers receive container image updates through a private registry or offline delivery. You control when to apply updates and can test in a staging environment before promoting to production. Rolling updates on Kubernetes ensure zero downtime during upgrades.
What support is included with self-hosted deployments?
Self-hosted licenses include deployment assistance, architecture review, and ongoing technical support. Enterprise plans include a dedicated solutions engineer, custom model integration support, and SLA-backed response times. Training and documentation are provided for your operations team.
Can I use both self-hosted and managed cloud together?
Yes. Hybrid deployments are supported where sensitive data is processed on your self-hosted infrastructure while less sensitive workloads use Mixpeek Cloud. Both environments use the same API and SDK, making migration between them straightforward.
How does self-hosted Mixpeek integrate with my existing ML infrastructure?
Mixpeek's plugin system lets you register custom feature extractors that call your existing model endpoints. The engine integrates with your GPU cluster through Ray, and the API layer connects to your existing storage and database infrastructure. Monitoring data can be exported to your observability stack via standard endpoints.
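The plugin idea above amounts to registering a callable that wraps your existing model endpoint. The sketch below shows one common registry pattern; the decorator, registry, and extractor names are hypothetical illustrations, not Mixpeek's actual plugin interface.

```python
# Hypothetical plugin-registry sketch for custom feature extractors.
# The decorator, registry, and names are illustrative assumptions.
from typing import Callable

EXTRACTOR_REGISTRY: dict[str, Callable[[bytes], list[float]]] = {}

def register_extractor(name: str):
    """Decorator that registers a custom feature extractor by name."""
    def wrap(fn):
        EXTRACTOR_REGISTRY[name] = fn
        return fn
    return wrap

@register_extractor("acme_contract_embedder")
def contract_embedder(payload: bytes) -> list[float]:
    # A real plugin would POST the payload to your internal model
    # endpoint here; this stub returns a fixed-size zero vector.
    return [0.0] * 768

print("acme_contract_embedder" in EXTRACTOR_REGISTRY)  # True
```

The engine would then invoke registered extractors during pipeline runs exactly like the built-in ones, so your in-house models slot into the same collections and retrievers.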
