
    ML Infrastructure Engineer (Ray & Inference Focus)

Engineering · Remote · Full-time · $180,000 - $250,000

    Join our team to build and scale the ML infrastructure that powers Mixpeek's multimodal data processing platform. You'll be responsible for developing robust and high-performance data pipelines using Ray, deploying and optimizing inference services (e.g., Triton, TVM), and enhancing search relevance for our cutting-edge retrieval systems.

    Responsibilities

    • Design, build, and optimize large-scale ML data pipelines using Ray for multimodal data processing
    • Develop, deploy, and manage high-performance inference services using frameworks like NVIDIA Triton Inference Server or Apache TVM
    • Develop and optimize feature extraction workflows for various media types, leveraging distributed computing
    • Implement MLOps practices for model deployment, monitoring, and lifecycle management with a focus on Ray-based workflows
    • Build scalable infrastructure for model training, fine-tuning, and inference
    • Collaborate with data scientists and ML engineers to productionize models and enhance their performance
    • Tune and optimize relevance for retrieval systems, including vector search and hybrid search approaches
    • Implement monitoring, logging, and alerting for ML systems, data pipelines, and inference services
    • Optimize retrieval performance for large-scale feature stores and vector databases

    Requirements

    • 5+ years of experience in ML infrastructure, MLOps, or data engineering with a focus on production systems
    • Strong proficiency in Python and experience with ML frameworks (PyTorch, TensorFlow)
    • Deep experience with Ray (Ray Core, Ray Data, Ray Tune, Ray Serve) for building distributed applications
    • Hands-on experience with inference serving frameworks (e.g., NVIDIA Triton Inference Server, Apache TVM, KServe)
    • Experience with cloud platforms (AWS, GCP, Azure) and their ML/data services
    • Knowledge of containerization and orchestration (Docker, Kubernetes)
    • Understanding of distributed systems, scalable architectures, and data-intensive applications
    • Experience with feature stores, vector databases (e.g., Qdrant, Weaviate, Pinecone), and search relevance tuning
    • Strong problem-solving, system design, and communication skills

    Benefits

    • Competitive salary and equity
    • Remote-first work environment
    • Health, dental, and vision insurance
    • 401(k) matching
    • Unlimited PTO
    • Learning and development budget
    • Home office setup allowance