    Your Infrastructure, Your Control

    Self-Hosted Multimodal Data Warehouse

    Deploy the full Mixpeek multimodal data warehouse on your own infrastructure: the same tiered storage economics and multi-stage retrieval pipelines, with complete data sovereignty and regulatory compliance.

    Why Self-Host Mixpeek?

    Organizations with strict data governance, regulatory requirements, or performance demands choose self-hosted deployment for complete control.

    Complete Data Sovereignty

    Your data never leaves your infrastructure. All processing, embedding, and indexing happens within your own VPC or on-premise environment.

    Regulatory Compliance

    Meet HIPAA, SOC 2, GDPR, FedRAMP, and industry-specific compliance requirements by keeping data within your controlled perimeter.

    Network Latency Elimination

    Process data locally without round-trips to external APIs. Achieve sub-millisecond embedding lookups and real-time processing throughput.

    Full Configuration Control

    Customize GPU allocation, model selection, scaling parameters, and pipeline configuration to match your exact workload requirements.

    Deployment Architecture

    Four steps to running Mixpeek on your own infrastructure.

    1. Deploy Mixpeek Engine

    Run the Mixpeek processing engine on your Kubernetes cluster or bare-metal GPU servers. The engine handles model serving, feature extraction, and batch orchestration via Ray.

    2. Connect Your Storage

    Point Mixpeek at your existing object storage -- S3, GCS, MinIO, or any S3-compatible endpoint. Data stays in your buckets; Mixpeek reads and processes in place.

    3. Configure Pipelines

    Define collections with feature extractors for your content types. Choose embedding models, classification taxonomies, and extraction parameters through the API.

    4. Index and Retrieve

    Processed embeddings are stored in your Qdrant instance. Build retriever pipelines with filter, search, and rerank stages -- all running within your infrastructure.
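
    The filter, search, and rerank stages compose like a small pipeline. The sketch below is a stdlib-only illustration of that three-stage pattern, not the Mixpeek SDK; all names and scoring choices are hypothetical.

```python
from dataclasses import dataclass

# Illustrative three-stage retriever pipeline: filter -> search -> rerank.
# Stdlib-only sketch of the concept; not the actual Mixpeek API.

@dataclass
class Doc:
    doc_id: str
    tags: set
    embedding: list
    score: float = 0.0

def filter_stage(docs, required_tag):
    # Stage 1: drop documents that fail metadata predicates before any vector math.
    return [d for d in docs if required_tag in d.tags]

def search_stage(docs, query, top_k=10):
    # Stage 2: score remaining docs by dot-product similarity to the query vector.
    for d in docs:
        d.score = sum(a * b for a, b in zip(query, d.embedding))
    return sorted(docs, key=lambda d: d.score, reverse=True)[:top_k]

def rerank_stage(docs, boost_tag, boost=1.5):
    # Stage 3: a cheap reranking heuristic (here: boost a preferred tag).
    for d in docs:
        if boost_tag in d.tags:
            d.score *= boost
    return sorted(docs, key=lambda d: d.score, reverse=True)

corpus = [
    Doc("a", {"contract"}, [1.0, 0.0]),
    Doc("b", {"contract", "signed"}, [0.8, 0.2]),
    Doc("c", {"invoice"}, [0.9, 0.1]),
]
results = rerank_stage(search_stage(filter_stage(corpus, "contract"), [1.0, 0.0]), "signed")
print([d.doc_id for d in results])  # → ['b', 'a']
```

    Each stage narrows or reorders the candidate set before the next runs, which is why cheap metadata filters come first and expensive rerankers last.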

    Security & Compliance

    Self-hosted Mixpeek gives you full control over your security posture. No data leaves your perimeter, and you manage access, encryption, and audit logging on your terms.

    • All data processed and stored within your infrastructure perimeter
    • No external API calls for model inference or embedding generation
    • Full audit logging of all pipeline operations and data access
    • Role-based access control for API endpoints and namespaces
    • Encryption at rest and in transit using your own key management
    • Air-gapped deployment option for classified or highly sensitive environments

    Compliance Ready

    HIPAA: Supported
    SOC 2: Supported
    GDPR: Supported
    FedRAMP: Supported
    ISO 27001: Supported

    Full Platform Capabilities

    Self-hosted deployments include the complete Mixpeek feature set -- the same capabilities available in Mixpeek Cloud.

    Multimodal Processing

    Extract features from text, images, video, audio, and documents using the same pipeline architecture as Mixpeek Cloud.

    • Vision transformers for image and video
    • Speech recognition for audio content
    • OCR and layout analysis for documents
    • Custom model plugins for domain-specific tasks
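
    Routing content to the right per-modality extractor is essentially MIME-type dispatch. A minimal stdlib sketch of that idea follows; the mapping and function names are illustrative, not part of the Mixpeek engine.

```python
import mimetypes

# Illustrative dispatch of files to extractor categories by MIME type.
# Mapping and names are hypothetical examples of per-modality routing.
EXTRACTOR_FOR_TYPE = {
    "image": "vision_transformer",
    "video": "vision_transformer",
    "audio": "speech_recognition",
    "text": "text_embedding",
    "application/pdf": "ocr_layout",
}

def pick_extractor(path):
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        return "unknown"
    if mime in EXTRACTOR_FOR_TYPE:
        return EXTRACTOR_FOR_TYPE[mime]
    # Fall back to the major type ("image/png" -> "image").
    return EXTRACTOR_FOR_TYPE.get(mime.split("/")[0], "unknown")

print(pick_extractor("scan.pdf"))   # → ocr_layout
print(pick_extractor("call.wav"))   # → speech_recognition
print(pick_extractor("frame.png"))  # → vision_transformer
```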

    Vector Storage

    Store and query embeddings in your own Qdrant deployment with full control over collection configuration and scaling.

    • Dense, sparse, and multi-vector support
    • Configurable indexing parameters
    • Horizontal scaling across nodes
    • Snapshot and backup management
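
    Dense and sparse vectors are scored differently at query time. The stdlib sketch below illustrates both similarity computations conceptually; it is not Qdrant's implementation.

```python
import math

# Minimal sketch of the two scoring modes a vector store supports.
def dense_cosine(a, b):
    # Dense vectors: every dimension populated; cosine similarity compares angles.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def sparse_dot(a, b):
    # Sparse vectors: {dimension_index: weight} maps; only shared indices
    # contribute, which suits keyword-style (lexical) signals.
    return sum(w * b[i] for i, w in a.items() if i in b)

print(round(dense_cosine([1.0, 0.0, 1.0], [1.0, 1.0, 1.0]), 3))  # → 0.816
print(sparse_dot({3: 0.5, 17: 1.2}, {17: 0.8, 99: 0.3}))         # → 0.96
```

    Multi-vector support means a single stored point can carry several such vectors (e.g. one dense and one sparse) and be scored by either.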

    Distributed Processing

    Ray-based distributed compute handles batch processing across your GPU cluster with automatic scaling and fault recovery.

    • Multi-GPU model serving
    • Automatic batch orchestration
    • Stalled job detection and recovery
    • Configurable concurrency limits
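
    The batching-with-a-concurrency-cap pattern can be sketched with the standard library. Mixpeek uses Ray for this inside the engine; `ThreadPoolExecutor` here is only a stand-in to show the shape of batch orchestration.

```python
from concurrent.futures import ThreadPoolExecutor

# Conceptual sketch of batch orchestration with a concurrency limit.
# Ray performs this in the real engine; this stand-in shows the pattern only.

def embed_batch(batch):
    # Placeholder "model call": each item becomes a tiny fake vector.
    return [(item, [float(len(item))]) for item in batch]

def process(items, batch_size=4, max_workers=2):
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    results = []
    # max_workers caps in-flight batches, like a configurable concurrency limit.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch_result in pool.map(embed_batch, batches):
            results.extend(batch_result)
    return results

out = process([f"doc-{i}" for i in range(10)])
print(len(out))  # → 10
```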

    Self-Hosted vs Managed Cloud

    Choose the deployment model that fits your requirements.

    Feature         | Self-Hosted                                  | Managed Cloud
    Data Location   | Your infrastructure exclusively              | Mixpeek Cloud (multi-tenant or dedicated)
    Compliance      | You control the full compliance posture      | SOC 2 Type II, GDPR compliant
    GPU Management  | You provision and manage GPU resources       | Fully managed by Mixpeek
    Scaling         | Manual or custom autoscaling on your cluster | Automatic scaling based on workload
    Updates         | You control upgrade timing and versions      | Continuous updates with zero downtime
    Network Latency | Local network -- sub-millisecond             | Internet round-trip to cloud endpoints

    Same API, Your Infrastructure

    Deploy with Helm, then use the same Mixpeek SDK and API you already know. Just point to your internal endpoint.

    deploy_engine.sh
    # Deploy Mixpeek Engine on your Kubernetes cluster
    helm repo add mixpeek https://charts.mixpeek.com
    helm install mixpeek-engine mixpeek/engine \
      --set storage.endpoint=s3://your-bucket \
      --set qdrant.host=qdrant.internal:6334 \
      --set gpu.enabled=true

    self_hosted_setup.py
    # Configure via the same Mixpeek SDK
    from mixpeek import Mixpeek
    
    # Point to your self-hosted instance
    client = Mixpeek(
        api_key="YOUR_INTERNAL_KEY",
        base_url="https://mixpeek.internal.yourcompany.com"
    )
    
    # Create a namespace and collection -- same API as cloud
    namespace = client.namespaces.create(
        namespace_name="internal-documents",
        vector_config={"dimensions": 768}
    )
    
    collection = client.collections.create(
        namespace_id=namespace.id,
        collection_name="contracts",
        extractors=[
            {"type": "text_embedding", "model": "e5-large-v2"},
            {"type": "ocr", "model": "doctr"},
        ]
    )

    Frequently Asked Questions

    What infrastructure do I need to self-host Mixpeek?

    Mixpeek self-hosted runs on Kubernetes (GKE, EKS, AKS, or on-premise) with GPU nodes for model inference. Minimum requirements include a Kubernetes cluster with at least one GPU node (NVIDIA T4 or better), a Qdrant instance for vector storage, and S3-compatible object storage. The exact GPU count depends on your processing volume and latency requirements.

    Which GPU types does the self-hosted engine support?

    Mixpeek supports NVIDIA GPUs including T4, A10G, L4, A100, and H100. The engine automatically detects available VRAM and configures model batching accordingly. For high-throughput deployments, A100 or H100 GPUs are recommended. CPU-only inference is supported for lighter workloads like text embedding and classification.

    How does licensing work for self-hosted deployments?

    Self-hosted Mixpeek is licensed per-node with annual contracts. The license includes access to all feature extractors, retriever pipeline capabilities, and API endpoints. Custom model plugins and priority support are included. Contact our sales team for pricing based on your deployment size.

    Can I run Mixpeek in an air-gapped environment?

    Yes. Mixpeek supports fully air-gapped deployment. All model weights, container images, and dependencies can be pre-loaded into your environment. No outbound internet access is required for operation. Model updates are delivered as offline packages for manual deployment.

    How are updates and upgrades handled?

    Self-hosted customers receive container image updates through a private registry or offline delivery. You control when to apply updates and can test in a staging environment before promoting to production. Rolling updates on Kubernetes ensure zero downtime during upgrades.

    What support is included with self-hosted deployments?

    Self-hosted licenses include deployment assistance, architecture review, and ongoing technical support. Enterprise plans include a dedicated solutions engineer, custom model integration support, and SLA-backed response times. Training and documentation are provided for your operations team.

    Can I use both self-hosted and managed cloud together?

    Yes. Hybrid deployments are supported where sensitive data is processed on your self-hosted infrastructure while less sensitive workloads use Mixpeek Cloud. Both environments use the same API and SDK, making migration between them straightforward.

    How does self-hosted Mixpeek integrate with my existing ML infrastructure?

    Mixpeek's plugin system lets you register custom feature extractors that call your existing model endpoints. The engine integrates with your GPU cluster through Ray, and the API layer connects to your existing storage and database infrastructure. Monitoring data can be exported to your observability stack via standard endpoints.
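
    A plugin system like this typically boils down to a named registry of callables. The sketch below shows that general mechanism in plain Python; the decorator, registry, and extractor names are hypothetical, and the real Mixpeek plugin API may differ.

```python
# Illustrative plugin-registry pattern for custom feature extractors.
# All names here are hypothetical; this shows the mechanism, not Mixpeek's API.

EXTRACTOR_REGISTRY = {}

def register_extractor(name):
    # Decorator that records a callable under a name the engine can look up.
    def wrap(fn):
        EXTRACTOR_REGISTRY[name] = fn
        return fn
    return wrap

@register_extractor("internal_ner")
def internal_ner(text):
    # Stand-in for a call to your own model endpoint:
    # here, "entities" are just title-cased tokens.
    return [tok for tok in text.split() if tok.istitle()]

def run_extractor(name, payload):
    # The engine invokes a registered extractor by name during a pipeline run.
    return EXTRACTOR_REGISTRY[name](payload)

print(run_extractor("internal_ner", "Acme signed with Globex in march"))  # → ['Acme', 'Globex']
```

    Registering by name keeps pipeline configuration declarative: a collection can reference `"internal_ner"` the same way it references a built-in extractor.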

    Deploy Mixpeek On Your Infrastructure

    Get complete data sovereignty with the full Mixpeek multimodal AI platform running on your own servers. Talk to our team about self-hosted deployment.