Senior DevOps Engineer - AI and AV Infrastructure
NVIDIA.com
Office
China, Shanghai
Full Time
NVIDIA has become the platform upon which every new AI-powered application is built. From healthcare research applications to autonomous vehicles and voice-recognition systems, the need for advanced perception and cognitive capabilities is exploding, and NVIDIA is right at the center of this revolution. We are seeking a motivated Senior DevOps Engineer to join our Autonomous Vehicle Infrastructure organization, focusing on building, deploying, and operating validation platforms at scale. In this role, you will work with internal teams and external partners to integrate distributed systems, manage large-scale data pipelines, and operationalize next-generation validation workflows for autonomous driving.
This role offers a chance to start from the ground up: standing up new vendor-provided platforms, validating integration paths, and ensuring infrastructure is reliable, secure, and production-ready. You will combine hands-on engineering, infrastructure deployment, and workflow automation to help scale our AV validation ecosystem.
What You’ll Be Doing:
- Deploy and operationalize vendor-provided platforms in our service cloud, starting with proof-of-concept environments to validate dependencies, workflows, and performance.
- Build and maintain distributed infrastructure that supports large-scale log ingestion, data processing, and scenario validation.
- Automate workflows and pipelines using Python, Bash, and Bazel to ensure reproducibility, efficiency, and reliable distributed execution.
- Integrate simulation and drive logs (e.g., parquet, world model data) with validation platforms, ensuring seamless end-to-end coverage analysis.
- Provide visualization and reporting capabilities to surface validation results, coverage metrics, and actionable insights for developers and stakeholders.
- Define and manage access controls, monitoring, and security policies to ensure compliance while enabling smooth collaboration across internal and vendor teams.
- Partner closely with internal teams and external vendors to troubleshoot issues, refine SLAs, and continuously improve operational reliability and scalability.
What We Need To See:
- BS/MS in Computer Science, Engineering, or another STEM-related field (or equivalent experience).
- 5+ years of professional experience in infrastructure, distributed systems, or platform engineering.
- Hands-on experience with Linux systems, Kubernetes/Docker, and CI/CD pipelines.
- Strong scripting/development skills in Python and Bash, plus exposure to C++ and/or Go.
- Familiarity with Bazel build/test automation frameworks.
- Experience in data/log ingestion workflows and distributed compute/storage systems.
- Strong debugging, problem-solving, and communication skills to work across internal and vendor teams.
Ways to Stand Out from the Crowd:
- Prior experience with scenario-based validation platforms or AV simulation ecosystems. Experience with Foretellix is an added advantage.
- Background in large-scale distributed systems or GPU/CPU cluster deployments. Strong knowledge of logging/monitoring/alerting frameworks (Prometheus, Grafana, ELK stack, etc.).
- Experience working directly with external vendors to integrate platforms and operationalize SLAs.
- Contributions to open-source projects in infrastructure automation, data pipelines, or validation tooling.
- Proactive use of AI/ML techniques to accelerate log analysis, coverage metrics, or integration workflows.
With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us, and our engineering teams are growing fast in some of the hottest innovative fields: Deep Learning, Artificial Intelligence, and Autonomous Vehicles. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science-fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for great people like you to help us with the next wave of validation and tooling for autonomous driving solutions. If you're passionate about autonomous vehicles, we would love to hear from you!
September 24, 2025