ML Infrastructure Engineer
Engineering | San Francisco | Full-time | $200,000 - $280,000
About the Role
Join our ML Infrastructure team to build the systems that train, deploy, and serve our AI models at scale. You'll work at the intersection of machine learning and systems engineering.
What You Will Do
- Build and maintain ML training pipelines and infrastructure
- Design model serving systems for low-latency inference
- Implement monitoring and observability for ML systems
- Optimize GPU utilization and reduce training costs
- Develop tools for model versioning and experiment tracking
- Collaborate with researchers to productionize new models
What We Are Looking For
- 4+ years of experience in ML engineering or infrastructure
- Strong proficiency in Python and experience with PyTorch or JAX
- Experience with ML training frameworks and distributed training
- Familiarity with model serving and inference tooling (TensorRT, ONNX, vLLM)
- Experience with Kubernetes and container orchestration
- Understanding of ML fundamentals and neural network architectures
Nice to Have
- Experience with LLM fine-tuning and deployment
- Background in systems programming (C++, Rust)
- Experience with multi-GPU and multi-node training
- Contributions to ML open-source projects
Benefits
- Competitive salary and meaningful equity
- Premium health, dental, and vision insurance
- Unlimited PTO with an encouraged minimum
- Hybrid work with SF office access
- $5,000 annual learning & development budget
- Conference attendance and speaking opportunities