
Technical Lead - System Validation Architect
Graphcore
Posted about 2 hours ago
About us
Graphcore is one of the world’s leading innovators in Artificial Intelligence compute.
It is developing hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry.
As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.
Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation.
Job Summary
We are seeking a Technical Lead – System Validation Architect to lead the architecture and execution of Linux-based validation frameworks for Arm-based data center SoCs. This role will define validation strategy, test coverage, and methodology across CPU, memory, interconnect, and high-speed I/O subsystems. You will provide technical leadership in validation architecture, automation, benchmarking, and debug to ensure robust system quality and scalability.
The Team
The Systems Validation Architecture team is responsible for defining and enabling scalable validation methodologies for Graphcore’s next-generation AI compute platforms. The team collaborates closely with hardware, firmware, and systems engineering groups to deliver comprehensive validation coverage and high-quality system enablement.
Responsibilities and Duties
- Define end-to-end validation strategy and coverage model:
  - Functional, stress, performance, and corner-case testing
- Translate hardware specifications into structured, parameterized test plans
- Guide the team in:
  - Selecting appropriate tools
  - Defining workload models and parameter configurations
- Establish standards for:
  - Test case definition (parameters, metrics, pass/fail criteria)
  - Result validation and reporting
- Apply multi-core and parallel programming expertise, including workload scaling and CPU affinity management
- Review Python-based automation, orchestration, and analysis
- Collaborate with hardware, firmware, and system teams to debug issues
Candidate Profile
Essential:
- Strong knowledge of Arm SoC architecture and Linux systems.
- 8+ years of experience in system validation, performance engineering, or low-level systems development.
- Deep understanding of CPU architecture, cache coherency, memory systems (DDR, HBM, NUMA), and high-speed I/O technologies such as PCIe.
- Proven ability to define validation strategies, coverage models, and validation methodologies.
- Hands-on experience using and tuning benchmarking tools such as stress-ng, fio, and iperf.
- Strong Python programming skills for process automation, system coordination, and data examination.
- Experience working with performance analysis software including perf and PMU counters.
- Strong analytical and problem-solving skills, and the ability to collaborate in multi-functional environments.
Desirable:
- Experience working with large-scale or data center systems.
- Strong programming skills in C/C++ and Python for system-level development.