
AI Scientist
Noon
Posted about 14 hours ago
About Noon
We are on a mission to reinvent how designers work in the AI era. We’re backed by top investors including First Round, Chemistry, Homebrew, Scribble and senior leaders from OpenAI, Meta, Google, Ramp, Stripe and more. We’re building the next-generation AI design tool for product teams.
About the Role
We're hiring an AI Scientist to decide what AI we need and how to build it: when to fine-tune, distill, route, prompt, or just call an API. You'll own our AI architecture, evaluation, retrieval, and agent design, and run the experiments that tell us what's actually true for our product. Your core job: stop us from spending six months on the wrong thing.
The frontier moves weekly. You'll read the papers, benchmark what matters, and tell us 3 to 6 months early when something changes our roadmap, and when something we're worried about is just hype.
What You'll Do
AI architecture and the reasoning behind it: fine-tune vs. RAG vs. prompt vs. frontier API, SLM vs. large model, in-house vs. vendor inference. Make the calls, document them, revisit as the landscape shifts.
Experiments. Turn "we think X is better" into "we have evidence X is better."
The evaluation framework: benchmarks, execution-based verifiers, LLM-as-judge, behavioral regression, eval data.
Prompt, context, and retrieval layers across every AI feature.
Agent flow and tool design: orchestration, tool taxonomy, contracts.
Model adaptation when experiments call for it: data curation, synthetic data, SFT, LoRA/QLoRA, DPO/RLHF, deployment.
The loyal skeptic role: audit what we ship and flag where we're over- or under-engineering.
Must-Have Requirements
8+ years engineering, 3+ deep in LLMs and modern ML.
Track record of structured experiments that drove real architectural decisions.
You've designed eval frameworks for generative models, not just used benchmarks.
Strong data instincts: acquisition, curation, synthetic generation.
Solid grounding in fine-tuning (SFT, LoRA/QLoRA, distillation, preference optimization) and the judgment to know when each is the right call. Production experience is a plus; deep current understanding is the bar.
Built agentic systems with tool calling and designed retrieval pipelines.
Deep familiarity with the current LLM landscape and a track record of calling shifts early.
Nice to Have
Published research, open-source, or public writing in LLM/ML.
Multimodal, code-generation, or structured-output experience.
Synthetic data generation at scale.
Shipped AI inside a product, not just research.
Benefits
Base salary: $300,000-$400,000