Staff Machine Learning Engineer, Training Runtime Performance

Nuro, Inc.

235k - 352k USD/year

Office

Mountain View, California (HQ)

Full Time

Who We Are

Nuro is a self-driving technology company on a mission to make autonomy accessible to all. Founded in 2016, Nuro is building the world’s most scalable driver, combining cutting-edge AI with automotive-grade hardware. Nuro licenses its core technology, the Nuro Driver™, to support a wide range of applications, from robotaxis and commercial fleets to personally owned vehicles. With technology proven over years of self-driving deployments, Nuro gives automakers and mobility platforms a clear path to AVs at commercial scale, empowering a safer, richer, and more connected future.

About The Role

We are seeking a highly experienced Staff Machine Learning Engineer to join our ML Infrastructure team, focusing on optimizing training runtime efficiency and input pipelines for model training, evaluation, and distillation workloads. In this role, you will enable models to train faster and more efficiently, accelerating our self-driving roadmap for commercial and personal mobility.

About The Work

In this role, you will contribute to the overall ML Infrastructure strategy, particularly in areas related to runtime efficiency and goodput. Your areas of responsibility will include:

Collaborate with ML practitioners and other infrastructure teams to understand their needs and integrate optimized input pipelines seamlessly into their workflows.
Detect, diagnose, and resolve performance bottlenecks across training, eval, and model distillation workflows.
Optimize training performance and resource utilization, and ensure consistent, reproducible model training outcomes.
Optimize input data pipelines to increase runtime goodput, ensuring accelerators maximize their "time on task" and minimize idle cycles.
Champion best practices for robust, reproducible, and debuggable ML experimentation.

About You

B.S./M.S./Ph.D. in Computer Science, Electrical Engineering, or related technical field (or equivalent experience).
4+ years of professional experience in ML infrastructure, distributed training, or ML systems engineering, scaling models on multi-node, multi-accelerator clusters.
Understanding of training, evaluation, and distillation workflows for billion-parameter models.
Expert-level knowledge in distributed systems and (remote) Python.
Strong skills in profiling, debugging, and optimizing quantized workloads.
Experience with ML compilers and strategies to reduce startup overhead.
Familiarity with model distillation and efficient inference workflows.

Bonus Points

Previous contributions to open source ML infra projects or research publications in ML systems.
Hands-on experience with Foundation Model infrastructure.
High proficiency in C++, distributed systems, and ML framework internals (e.g., NCCL, Horovod, DeepSpeed, Ray).

At Nuro, your base pay is one part of your total compensation package. For this position, the reasonably expected base pay range is between $235,030 and $352,290 for the level at which this job has been scoped. Your base pay will depend on several factors, including your experience, qualifications, education, location, and skills. In the event that you are considered for a different level, a higher or lower pay range would apply. This position is also eligible for an annual performance bonus, equity, and a competitive benefits package.

At Nuro, we celebrate differences and are committed to a diverse workplace that fosters inclusion and psychological safety for all employees. Nuro is proud to be an equal opportunity employer and expressly prohibits any form of workplace discrimination based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other legally protected characteristics.