
Hardware Data Center Operations Technician, Senior
Crusoe
Posted about 3 hours ago
Crusoe is on a mission to accelerate the abundance of energy and intelligence. As the only vertically integrated AI infrastructure company built from the ground up, we own and operate each layer of the stack — from electrons to tokens — to power the world's most ambitious AI workloads. When you join Crusoe, you join a team that is building the future, faster.
We're in the midst of the greatest industrial revolution of our time. The demand for AI compute is boundless, and power is a bottleneck. We're solving that — with an energy-first approach that makes AI infrastructure better for the world and faster for the people innovating with AI.
We're looking for problem-solving, opportunity-finding teammates with a sense of urgency, who believe in the scale of our ambition and thrive on a path not fully paved — people who want to grow their careers alongside a team of experts across energy, manufacturing, data center construction, and cloud services.
If you want to do the most meaningful work of your career, help our customers and partners advance their AI strategies, and be part of a high-performing team that believes in each other, come build with us at Crusoe.
About This Role:
This role is critical to ensuring industry-leading reliability and uptime for our cloud platform, directly impacting our ability to deliver innovative solutions to our customers. You'll be involved in exciting projects, from supporting the burn-in/stress testing of new hardware to troubleshooting complex server issues and collaborating with vendors. The ideal candidate is a highly skilled and experienced technician with a deep understanding of server hardware, a passion for problem-solving, and a commitment to maintaining peak performance in a fast-paced environment. This is a full-time position.
What You’ll Be Working On:
Troubleshooting & Repair: Diagnose and resolve hardware failures in complex GPU-based servers (both air- and liquid-cooled), ensuring minimal downtime.
Hardware Testing & Qualification: Collaborate with the Infrastructure Systems team to support burn-in/stress testing of new hardware and resolve any issues that arise. Support the qualification of new hardware.
Vendor Management: Open and manage support tickets with hardware vendors, serve as the datacenter liaison for vendor support personnel, and maintain a hardware issue tracker.
Inventory Management: Maintain an accurate spares inventory and replenish stock as needed to ensure quick repairs.
Deployment Support: Assist the Cloud Deployments team with racking and cabling servers, contributing to the efficient expansion of our infrastructure.
Documentation & Communication: Maintain detailed records of hardware issues and resolutions, and communicate effectively with internal teams and vendors.
Physical Demands: Work in a physically challenging environment (sound/vibration/thermal) and be able to lift 50 lbs.
On-Call Support: Provide occasional after-hours support to address critical issues.
What You’ll Bring to the Team:
Server Hardware Expertise: Possess significant experience diagnosing and repairing complex GPU-based servers (both air- and liquid-cooled).
Technical Proficiency: Demonstrate a deep understanding of server hardware, BMC-based manageability, BIOS settings, and firmware deployment.
Datacenter Experience: Have four or more years of hands-on experience working in a datacenter environment.
Networking Knowledge: Familiarity with InfiniBand switches and network topology.
Linux Skills: Basic Linux system administration skills.
Problem-Solving Abilities: Excellent analytical and problem-solving skills to effectively troubleshoot hardware issues.
Communication Skills: Strong organizational, time management, and communication skills.
Education: Associate's degree or equivalent experience in an IT-related field.
Bonus Points:
Experience with other high-performance computing (HPC) technologies.
Relevant certifications (e.g., CompTIA Server+, CCNA).
Experience with scripting languages (e.g., Python, Bash).
Knowledge of datacenter infrastructure management (DCIM) tools.
Experience working in a fast-growing startup environment.
Familiarity with various cooling systems used in data centers.
Experience with liquid cooling systems.
Benefits:
Crusoe offers a comprehensive benefits package designed to support well-being and financial security. This includes full social security coverage, contributions to provident, trade union, and pension funds, with options for additional pensions. Employees also have optional access to Global Life Insurance and private health insurance. Crusoe provides generous leave policies, including maternity, paternity, parental, and sick leave, ensuring you have the support you need at every stage of life.
Compensation:
Compensation will be paid on a salaried or hourly basis.