Cloud Systems & Resource Orchestration - Member of Technical Staff

Posted about 1 month ago

About Us

Artificial intelligence scaled on a bet: that bigger models, more identical chips, and more data would keep delivering. As problems grow more complex and the requirements of intelligence more diverse, that bet is breaking down. The next era belongs to heterogeneous intelligence: diverse models on diverse chips, each with distinct strengths, co-evolving into systems with capabilities unreachable by any single model or accelerator.

Callosum is the Intelligent Systems company. We built the infrastructure to make that possible. Our co-evolution engine optimises simultaneously across workflows, agents, and silicon. We launched in early 2026, showing orders-of-magnitude improvements in performance and a shift in the cost-performance frontier that no single chip or model provider can deliver.

We believe intelligence comes from the system, not the model.

We are scientists and engineers solving what others consider impossible. If you thrive on hard problems and are passionate about and energised by the scale of the challenge, we'd love to hear from you.

About the Role

Callosum believes that orders-of-magnitude improvements in AI systems will come through application-aware orchestration across heterogeneous hardware. We are building that vision: infrastructure that reaches beyond GPUs and treats the full landscape of compute as a unified, co-evolving system. Current orchestration stacks were built for a homogeneous world: naive to the strengths of new chips and blind to the demands of modern multi-agent workflows.

This role defines how Callosum addresses this problem at the cloud and cluster level, transforming a fragmented compute ecosystem into a unified, exploitable resource pool. We are building a new orchestration paradigm that understands accelerator-specific constraints and capabilities. Your work is what makes heterogeneous compute intelligent at scale: every chip placed precisely and allocated efficiently in a stack that is resource-aware and diversity-native.

What You’ll Build

  • Design and build multi-cloud orchestration systems that abstract provider-specific differences behind a unified deployment and scheduling layer

  • Extend Kubernetes, particularly Dynamic Resource Allocation (DRA), to be aware of heterogeneous accelerator topologies and capabilities and of multi-agent AI workflows

  • Implement intelligent load balancing and placement strategies across cloud providers, regions, and hardware types

  • Build control plane systems that enable efficient allocation and management of heterogeneous accelerator capacity while preserving the ability to exploit hardware-specific strengths

  • Collaborate with an Accelerator Systems Software engineer to surface low-level scheduling primitives into the orchestration layer

What You Bring

  • Strong experience with Kubernetes internals - custom controllers, schedulers, device plugins, CRDs, and the DRA framework

  • Experience building or operating multi-cloud infrastructure, with a detailed understanding of the networking, storage, and compute differences between major providers

  • Familiarity with GPU/accelerator resource management in cluster environments (e.g. MIG, time-slicing, device plugins, topology-aware scheduling)

  • Experience with infrastructure-as-code, fleet management, and the reliability engineering required to keep large-scale heterogeneous systems running

Job details
Workplace: Office
Location: London
Experience: SE

Callosum - Co-evolving chips & intelligence

Key team members

Dominic Jacquesson MBE

Danyal Akarca

Jonathan Cornford

Jascha Achterberg
