About Patronus AI

Patronus AI is a frontier lab developing simulation research and infrastructure to accelerate progress toward human-aligned AGI. We are on a mission to simulate all of the world’s intelligence.

We are the team behind some of the earliest and most influential research in AI evaluation, including FinanceBench, Lynx, SimpleSafetyTests, CopyrightCatcher, and Humanity’s Last Exam. Our team consists of former AI researchers and engineers from companies like Meta AI, Amazon AGI, and Google. Our customers include foundation model labs and Fortune 500 enterprises like Adobe. We are backed by top-tier investors including Lightspeed Venture Partners, Notable Capital, Stanford University, Noam Brown, and Gokul Rajaram.

Responsibilities

As a Strategic Project Lead at Patronus AI, you will lead the delivery of high-quality simulations that define how AI systems are trained, evaluated, and improved. You will work at the intersection of reinforcement learning, scalable oversight, and real-world workflow simulation, building environments and simulation data that directly influence how frontier models are developed, stress-tested, and deployed.

This is a highly autonomous role. You will lead a team in building simulations of impactful real-world workflows, owning project execution, quality standards, and customer alignment from requirements through delivery. You will work across reward design, tool simulations, behavior analysis, QA processes, and automated review tooling, helping set the standard for robust, high-quality environments.

Your work will inform how frontier labs design, train, and improve the next generation of agents for long-horizon tasks, advancing our progress toward safe, human-aligned general intelligence.

In this role, you will:

Lead end-to-end delivery of high-quality simulations and environments for impactful real-world workflows, from translating customer requirements into concrete build targets through final delivery.
Stay close to the details and maintain a clear view of open tasks, delivery gaps, quality risks, technical blockers, and timeline confidence.
Serve as the primary delivery interface with customers, aligning on sample tasks, incorporating feedback, communicating progress, and managing expectations week to week.
Partner with technical, research, and QA leads to prioritize what to build, improve, or review next based on delivery needs, model behavior, quality gaps, and customer requirements.
Define and uphold objective quality standards for simulations, tasks, environments, and deliverables, ensuring each shipment meets requirements for correctness, difficulty, diversity, volume, and readiness. Build and maintain automated QA and review tooling that upholds these standards.
Analyze model behavior, task quality, and failure modes to understand what separates strong environments from weak ones, then translate those insights into better reward design, task generation, QA processes, and review workflows.
Turn learnings from strategic projects into scalable systems, processes, and tooling that improve how Patronus builds, evaluates, and delivers simulation environments for frontier AI systems.

Qualifications

“The number one qualification to succeed in this machine learning course is gumption” - John Lafferty, CS Professor at Yale

Above all, we look for a proactive mindset, willingness to learn, relentless drive, and passion for engineering and product. You are a great fit if you have a background in the following:

BS, MS, or equivalent experience in Computer Science, Machine Learning, Engineering, Mathematics, or another technical / quantitative field.
Strong technical fluency, including comfort using AI tools, writing or reviewing code, and analyzing data or model outputs.
Excellent organization and execution skills, with the ability to manage tasks, timelines, quality reviews, customer requirements, and cross-functional stakeholders.
Strong eye for quality and detail, with a bias toward catching edge cases, inconsistencies, and subtle failure modes.
Clear written and verbal communication skills, including the ability to translate customer needs into concrete technical requirements.
Good character, integrity, and respect for others!

Benefits

Competitive salary and equity packages
15 days of paid vacation per annum
Parental & sick leave
Health, dental, and vision insurance plans
401(k) plan + matching
In-office lunch & dinner
Sponsored personal tax accounting
Whoop band
Monthly meal stipend
Monthly health and wellness stipend
Equinox membership
Fun global offsites!

Patronus AI is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.

