Debug Validation Engineer — Multiple Levels

Graphcore | 4h ago
United Kingdom | Onsite | Full-time
About us
  • At Graphcore, we’re building the future of AI compute.
  • We’re a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale.
  • As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem.
  • To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world.
  • We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence.
Job Summary
  • Reporting to the Senior Director of Post Silicon Validation, the Debug Validation Engineer will drive post-silicon debug and validation activities for next-generation AI compute silicon and systems. You will lead teams passionate about identifying, reproducing, analyzing, and resolving complex silicon, firmware, and system-level issues during bring-up, characterization, and product readiness. This role combines deep technical debugging expertise with strong cross-functional collaboration across multiple engineering fields.
The Team
  • The Post-Silicon Debug and Validation team manages bring-up, fault diagnosis, and validation of Graphcore silicon and systems. Our team participates throughout the entire product lifecycle, supporting initial silicon bring-up, subsystem validation, system integration, and production readiness tasks. We coordinate closely with hardware, firmware, software, and systems teams to examine complex failures, develop debug strategies, and advance validation infrastructure.
Responsibilities and Duties
  • Lead post-silicon debugging and validation efforts for AI compute silicon and platform technologies
  • Contribute to debug and validation activities across multiple projects and milestones
  • Analyze and address intricate silicon, firmware, software, and system-level problems during bring-up and validation
  • Develop structured debug methodologies and failure analysis processes to improve issue resolution efficiency
  • Work in close partnership with architecture, RTL, firmware, software, and systems engineering groups to determine root causes and carry out corrective measures
  • Drive debug of CPU, memory, interconnect, and high-speed I/O subsystems under functional, stress, and workload conditions
  • Develop and improve automated debug, regression, and validation infrastructure using Python and related technologies
  • Analyze logs, traces, telemetry, and hardware data to isolate and characterize system failures and performance issues
  • Support development of validation tests, debug tooling, and custom diagnostics to improve coverage and observability
  • Define validation metrics, debug workflows, and reporting standards to ensure consistent and repeatable analysis
  • Communicate technical risks, status, and recommendations clearly to engineering leadership and cross-functional collaborators
  • Support silicon readiness reviews and contribute to product quality and release decisions
  • Contribute to continuous improvement of debug methodologies, validation infrastructure, and engineering workflows
Candidate Profile
Essential:
  • Strong experience in bare metal environments
  • Strong understanding of SoC and platform architectures
  • Expertise in debug infrastructure and post-silicon debug methodologies
  • Strong programming skills in Python, C, or debug scripting languages such as CMM, or equivalent experience
  • Highly motivated self-starter with a collaborative and team-oriented approach
  • Ability to work across teams and programming languages to uncover root causes of deep and complex issues
  • Experience of the post-silicon validation process applied in digital ASIC environments
  • Strong Linux and Python experience
  • Outstanding communication skills and the ability to collaborate effectively to solve complex problems
  • Excellent problem-solving, analytical, and diagnostic skills
  • Deep knowledge of scan, DFT, JTAG, and trace infrastructure
  • Strong debug skills including fault tree analysis, failure isolation, fishbone methodologies, and system-level debug techniques
  • Capability to operate autonomously on technically intricate debug and validation tasks spanning hardware, firmware, and software areas
Desirable:
  • Understanding of DFT flows from insertion through post-silicon validation
  • Experience developing tooling for parsing and analyzing debug data, including scan dump parsing
  • Driver-level experience with one or more of the following technologies: PCIe; Ethernet; memory technologies including LPDDR, DDR, and HBM; peripheral interfaces such as I2C, I3C, and SPI
  • Experience using CoreSight and similar debug infrastructure including CTI, ETx, DStream, JLink, Lauterbach, ATB, and STM, or equivalent experience
  • Strong understanding of mixed-signal components like PLLs, high-speed PHYs, and IC control/communication protocols
  • Experience with Arm CPU architectures, system IP, and associated debug tooling
  • Experience with AMBA protocols
  • Understanding of ML applications and associated workloads
  • Experience in characterization, failure analysis, test development, statistical analysis, and customer support
Benefits
  • In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar!
  • We welcome people of different backgrounds and experiences; we’re committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.

Required skills

Python, C, Linux, debug scripting, SoC architecture, post-silicon validation, debug infrastructure, fault tree analysis, failure isolation, system-level debug, PCIe, Ethernet, memory technologies, I2C, JTAG
Posted on JobRush — the end-to-end AI job-search platform.