ML Infrastructure Engineer

Engineering | San Francisco | Full-time | $200,000 - $280,000

About the Role

Join our ML Infrastructure team to build the systems that train, deploy, and serve our AI models at scale. You'll work at the intersection of machine learning and systems engineering.

What You Will Do

  • Build and maintain ML training pipelines and infrastructure
  • Design model serving systems for low-latency inference
  • Implement monitoring and observability for ML systems
  • Optimize GPU utilization and reduce training costs
  • Develop tools for model versioning and experiment tracking
  • Collaborate with researchers to productionize new models

What We Are Looking For

  • 4+ years of experience in ML engineering or infrastructure
  • Strong proficiency in Python and experience with PyTorch or JAX
  • Experience with ML training frameworks and distributed training
  • Familiarity with model serving (TensorRT, ONNX, vLLM)
  • Experience with Kubernetes and container orchestration
  • Understanding of ML fundamentals and neural network architectures

Nice to Have

  • Experience with LLM fine-tuning and deployment
  • Background in systems programming (C++, Rust)
  • Experience with multi-GPU and multi-node training
  • Contributions to ML open-source projects

Benefits

  • Competitive salary and meaningful equity
  • Premium health, dental, and vision insurance
  • Unlimited PTO with an encouraged minimum
  • Hybrid work with SF office access
  • $5,000 annual learning & development budget
  • Conference attendance and speaking opportunities
