Applied AI Research Engineer

Posted about 4 hours ago

Remote, Boston Hub

About Code Metal

Code Metal is redefining code translation for mission-critical industries, helping defense partners move more quickly and reliably from algorithm to silicon. Our platform accelerates deployment of DSP, RF, communications, and embedded signal processing algorithms onto heterogeneous compute targets, including GPUs, FPGAs, ASICs, and edge SoCs. We also support automotive, aerospace, and semiconductor partners deploying complex algorithms onto constrained hardware with speed and rigor.

About the Role

We're building next-generation AI systems that help military planners explore, compare, and evaluate operational courses of action. Our work combines frontier language models, simulation, planning, and verification into human-in-the-loop decision-support systems for defense applications. As an Applied AI Research Engineer, you’ll focus on human-machine teaming and agentic AI to build systems that allow warfighters, planners, analysts, and decision-makers to explore operational choices with speed, confidence, and control.

This role focuses on designing and building agentic AI systems – not chatbots. You'll develop multi-agent workflows, fine-tune and evaluate models, build retrieval pipelines, experiment with post-training techniques, and integrate AI with simulation and planning software. You'll work closely with AI researchers, software engineers, and defense experts to turn research ideas into production-ready capabilities. The goal is to make complex planning, wargaming, adjudication, and analysis workflows faster, more explainable, and more trustworthy.

Research Areas of Interest

An incomplete list of ongoing and near-term directions:

  • Human-machine teaming for AI-assisted course-of-action development, comparison, critique, refinement, and operational decision support

  • Agentic planning systems that integrate language models with simulation, doctrine retrieval, external tools, structured outputs, and deterministic verification

  • Adapting and optimizing foundation models through fine-tuning, post-training, distillation, reinforcement learning, and rigorous evaluation for planning and decision-support tasks

  • Multi-agent AI systems for Red/Blue planning, control-cell support, adjudication, branch-and-sequel analysis, and collaborative planning workflows

  • Building reliable AI systems using self-correction, structured reasoning, constraint-aware generation, verification, and robust tool use

  • Learning from human expertise through planner feedback, preferences, approvals, synthetic data generation, and human-in-the-loop improvement

  • Trustworthy AI for high-consequence applications, with an emphasis on explainability, provenance, traceability, auditability, uncertainty estimation, and model behavior analysis

What You’ll Do

  • Design and build agentic AI systems for planning, decision support, and human-machine teaming

  • Develop AI pipelines that integrate foundation models, retrieval, simulation, external tools, and deterministic software

  • Design, run, and analyze experiments to evaluate model and agent performance, reliability, traceability, latency, cost, and user trust

  • Fine-tune, distill, and evaluate foundation models for domain-specific planning, reasoning, and decision-support tasks

  • Build datasets, retrieval pipelines, automated benchmarks, and experiment infrastructure to support continuous model improvement and reproducible research

  • Partner with software engineers to transition research prototypes into scalable AI services

  • Collaborate with domain experts to translate operational workflows into AI-enabled capabilities while ensuring AI outputs remain explainable, reviewable, and under human control

Why Code Metal?

  • Mission with impact: Build AI systems that help users reason through high-consequence operational decisions.

  • AI beyond demos: Work on systems where models are paired with software, verification, simulation, guardrails, and human oversight.

  • Greenfield research: Explore ambitious ideas in GenAI, RL, agentic workflows, evaluation, and human-machine teaming.

  • Small-team velocity: Move quickly from research question to prototype to user-facing capability.

  • Real users: See your work tested by planners, analysts, engineers, and operational stakeholders.

Must-Have Credentials

  • Bachelor's or Master's degree in Computer Science, Machine Learning, Engineering, Mathematics, Physics, or a related technical field, or equivalent practical experience.

  • 3+ years building AI, machine learning, or applied research systems.

  • Strong Python engineering skills.

  • Experience with PyTorch and modern LLM tooling (Transformers, vLLM, Hugging Face, etc.).

  • Experience building or deploying agentic AI systems, tool-calling workflows, or multi-step reasoning pipelines.

  • Experience fine-tuning, evaluating, or serving language models.

  • Experience with retrieval-augmented generation, embeddings, vector search, or knowledge retrieval systems.

  • Strong understanding of experiment design, benchmarking, and model evaluation.

  • Ability to move quickly from research prototype to production-quality implementation.

  • Eligible to obtain a U.S. security clearance.

Benefits

  • Pay depends on experience, but we strive to be at the upper end of the salary range

  • Health care plan with 100% premium coverage, including medical, dental, and vision

  • 401(k) with 5% matching

  • Paid time off (uncapped vacation, plus sick leave and public holidays)

  • Flexible hybrid or remote work arrangement

  • Relocation assistance for qualifying employees

Wage Transparency - The salary range for this role is not a guarantee of compensation or salary, as the final offer amount may vary based on factors including, but not limited to, individual proficiency, skills, experience, and location.


We are an equal opportunity employer. US Citizenship may be required for certain project assignments involving security clearance.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Job details
Workplace: Remote
Location: Boston Hub