About Mindbeam

We are building next-generation AI infrastructure for open source and enterprise. Our work is deeply research-oriented, and we are passionate about developing ground-breaking innovations that take state-of-the-art AI applications to the next level.

Mission

Push the boundaries of performance by developing custom kernels and low-level optimizations for next-generation AI workloads.

Role Expectations

• Design and implement custom GPU/accelerator kernels to maximize performance.

• Profile, benchmark, and optimize critical ML workloads.

• Collaborate with researchers to translate algorithmic advances into efficient, production-ready code.

• Stay current with advancements in accelerator hardware and platforms (CUDA, ROCm, TPU) to inform kernel design.

• Document and share best practices for low-level optimization.

Background

• Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent experience.

• 2+ years of experience in GPU programming, parallel computing, or systems-level optimization.

• Strong coding skills in C++, CUDA, or similar languages.

• Familiarity with ML frameworks and their low-level backends.

• Experience optimizing workloads for distributed and heterogeneous compute environments.

• Comfort with profiling tools and performance diagnostics.

About You

You are detail-oriented, performance-obsessed, and excited by the challenge of squeezing out every ounce of compute efficiency. You enjoy working at the intersection of algorithms and hardware, and you thrive in a collaborative environment where bold ideas are encouraged.

Machine Learning Engineer - Kernels
