ML Infrastructure Engineer (Ray & Inference Focus)
Engineering | Remote | Full-time | $180,000 - $250,000
Join our team to build and scale the ML infrastructure that powers Mixpeek's multimodal data processing platform. You'll develop robust, high-performance data pipelines with Ray, deploy and optimize inference services (e.g., Triton, TVM), and improve search relevance for our retrieval systems.
Responsibilities
- Design, build, and optimize large-scale ML data pipelines using Ray for multimodal data processing
- Develop, deploy, and manage high-performance inference services using frameworks like NVIDIA Triton Inference Server or Apache TVM
- Develop and optimize feature extraction workflows for various media types, leveraging distributed computing
- Implement MLOps practices for model deployment, monitoring, and lifecycle management with a focus on Ray-based workflows
- Build scalable infrastructure for model training, fine-tuning, and inference
- Collaborate with data scientists and ML engineers to productionize models and enhance their performance
- Tune relevance and optimize retrieval systems, including vector search and hybrid search approaches
- Implement monitoring, logging, and alerting for ML systems, data pipelines, and inference services
- Optimize retrieval performance for large-scale feature stores and vector databases
Requirements
- 5+ years of experience in ML infrastructure, MLOps, or data engineering with a focus on production systems
- Strong proficiency in Python and experience with ML frameworks (PyTorch, TensorFlow)
- Deep experience with Ray (Ray Core, Ray Data, Ray Tune, Ray Serve) for building distributed applications
- Hands-on experience with inference serving and model optimization frameworks (e.g., NVIDIA Triton Inference Server, Apache TVM, KServe)
- Experience with cloud platforms (AWS, GCP, Azure) and their ML/data services
- Knowledge of containerization and orchestration (Docker, Kubernetes)
- Understanding of distributed systems, scalable architectures, and data-intensive applications
- Experience with feature stores, vector databases (e.g., Qdrant, Weaviate, Pinecone), and search relevance tuning
- Strong problem-solving, system design, and communication skills
Benefits
- Competitive salary and equity
- Remote-first work environment
- Health, dental, and vision insurance
- 401(k) matching
- Unlimited PTO
- Learning and development budget
- Home office setup allowance