Data Engineer

Posted 9 days ago

Remote, London

What We're Looking For

Diffractive Labs is building the infrastructure to accelerate the discovery of novel magnetic materials through closed-loop AI and physical experimentation. As our Data Engineer, you will own the data architecture that powers this research.

This role requires a builder who understands both large-scale machine learning pipelines and the messy reality of physical lab data. You will serve as the critical link between experimental results generated at the bench and the models evaluating them, ensuring our research team always has the exact, high-quality datasets required to push the frontier of materials science. We are looking for someone with a rigorous, experimental mindset who thrives in an interdisciplinary environment and operates with a high degree of technical ownership.

What You'll Do

  • Drive the overarching data architecture across our training stack, mapping out data requirements with ML researchers and evaluating new external sources to fill knowledge gaps.

  • Design and deploy the ingestion pipelines that capture physical experimental data directly from our wet lab instruments and feed it seamlessly into our model training workflows.

  • Construct robust, reproducible systems for processing, standardizing, and versioning diverse scientific corpora, creating a highly reliable foundation for the research team.

  • Develop custom evaluation datasets and reinforcement learning environments specifically calibrated for the properties and behaviors of magnetic materials.

  • Build internal tooling that allows machine learning researchers and physical scientists to effectively query, inspect, and audit the data feeding into pretraining, midtraining, and RL runs.

  • Continuously integrate emerging techniques in synthetic data generation, data selection, and data-efficient training into our production systems.

Skills & Qualifications

  • 3+ years of engineering experience focused on large-scale data pipelines, ideally within an applied ML, scientific, or LLM training environment.

  • High proficiency in Python and modern workflow orchestration frameworks (e.g., Dagster, Airflow, Prefect, or similar).

  • Demonstrated experience with dataset lineage, versioning, and reproducibility tooling (such as DVC, Delta Lake, or custom equivalents).

  • A track record of collaborating directly with machine learning researchers, translating complex modeling needs into scalable pipeline architecture and back again.

  • Strong DevOps fundamentals, including hands-on experience with containerization (Docker, Kubernetes) and CI/CD deployment.

Nice to Have

  • Prior experience processing and structuring data from physical laboratory instrumentation, computational simulations, or multimodal scientific sources.

  • A background in curating datasets for domain-specific continued pretraining or instruction tuning.

  • An academic or practical background in physics, materials science, or chemistry.

Why Join Us

You are building the foundation for breakthroughs in magnetic materials that will directly influence the future of energy and computing hardware.

Diffractive is building the AI Material Scientist that autonomously learns from real-world experimentation to push the boundaries of scientific discovery. We're early, moving fast, and working on problems that genuinely matter.

You'll join a small, high-calibre team where your work has real impact from day one. We're London-based with a flexible approach to how and where you work. We offer a competitive salary, generous equity, and benefits. You'll have a real stake in what you build and in the company's overall success.

How to Apply

If you're excited about this role and believe you could thrive in it, we encourage you to apply even if you don't match every part of the job description.

Diffractive is an equal opportunities employer. We are committed to creating an inclusive environment for all employees and welcome applications from people of all backgrounds, experiences, and identities.

If you require any adjustments or accommodations at any point during the interview process, please let us know; we will be happy to help.

Hit the apply button below to submit your application. We are looking forward to hearing from you!

Job details
Workplace: Remote
Location: London
Diffractive Labs

Building the AI Scientist, an autonomous system that learns from real-world experimentation to accelerate materials discovery and scientific research. Creating AI-driven laboratory automation and intelligent discovery systems.

Employees: 1

Key team members

Adam Bell
