Senior Applied Data Scientist, Retrieval and Semantic Systems

Instructure

Posted about 19 hours ago

At Instructure, we believe in the power of people to grow and succeed throughout their lives. Our goal is to amplify that power by creating intuitive products that simplify learning and personal development, facilitate meaningful relationships, and inspire people to go further in their education and careers.
We do this by giving smart, creative, passionate people opportunities to create awesome. And that's where you come in:

Our team builds AI-native capabilities, reusable AI systems, and shared infrastructure that power multiple products and workflows across the platform.

We are looking for a Senior Applied Data Scientist to own retrieval and semantic systems end to end, as a core, reusable capability that multiple AI products depend on. You will own the full retrieval vertical: vector store selection and operation, indexing and refresh pipelines, semantic and hybrid retrieval, reranking, and the evaluation systems that prove relevance is good and stays good. You will own retrieval-specific architecture and its day-to-day operation, while our infrastructure owner provides the underlying cloud, cluster, and CI substrate and our AI Platform engineers provide the general MLOps and service scaffolding you build on.

You will work closely with product, engineering, and research partners to turn advanced AI ideas into reliable product capabilities used at scale.

Important note on scope: This is a deep individual-contributor specialist role. We are looking for someone who has owned a retrieval system in production, not someone who has only used a vector database in a prototype. Retrieval evaluation is central to this role: if you cannot measure relevance and catch regressions before they reach users, the system is not done.

What You'll Do

  • Design, build, and ship production retrieval systems that power AI product capabilities across multiple products

  • Own vector store selection and operation, including scalability, latency, reliability, cost, and multi-tenant design

  • Build and operate indexing and refresh pipelines: chunking, embedding generation, backfills, deletes, and versioned indices

  • Implement semantic and hybrid retrieval: embeddings, similarity search, lexical and vector combination, metadata filtering, and reranking

  • Own retrieval evaluation as a first-class system: gold sets, offline relevance metrics, slice analysis, drift detection, and regression gates that block bad changes from shipping

  • Make and defend the core tradeoffs of the domain: relevance against latency against cost against operational complexity

  • Partner with AI Platform and infrastructure engineers on deployment, observability, and reliability, and with product and research partners on relevance requirements

What You'll Need

  • 6+ years of experience building and shipping production machine learning or applied AI systems

  • Proven ownership of a retrieval system in production, including vector store selection and operation

  • Strong Python skills and experience building services and APIs (for example, FastAPI or similar)

  • Solid grounding in embeddings, approximate nearest neighbor search, and retrieval and ranking systems

  • Experience designing indexing and refresh strategies, with data quality controls and safe backfills

  • Demonstrated ability to define and run retrieval evaluation: building gold sets, choosing relevance metrics, analyzing failures by slice, and preventing regressions

  • Strong tradeoff judgment across relevance, latency, cost, and operational complexity

It Would Be a Bonus If You Had

  • Experience with hybrid retrieval (lexical and vector), learning to rank, or domain-specific reranking

  • Experience integrating graph-structured context or knowledge graphs into retrieval

  • Experience with evaluation and observability for LLM and retrieval systems, including drift, failure analysis, and regression prevention

  • Experience with AWS-native retrieval and indexing architectures

  • Experience in edtech, content, curriculum, or skills modeling

Growth & Impact

In this role, you will own retrieval and semantic search as a core differentiator that many AI products build on. You will set how retrieval is architected, operated, and evaluated at Instructure, and you will be the person accountable for relevance being good, measurable, and durable as the system and its content evolve.

Why Join Us

Join us and help shape the future of education by turning cutting-edge AI into reliable product capabilities.

At Instructure, we're on a mission to help educators and students learn together, anytime, anywhere, and however works best. You'll join our research-driven team tackling education's biggest challenges with cutting-edge technology. Our projects have included making sense of unstructured feedback, applying large language models to save teachers' time and improve student experiences, classifying partner networks for smarter recommendations, and detecting fraud to protect resources for real learners.

We value diversity, creativity, and passion, and invest in our teams through mentorship, hack weeks, internal conferences, and a culture where innovation thrives. Here, you'll have the chance to build the next generation of LMS features that make a real impact on students and teachers, and do it in a collaborative, supportive environment that encourages experimentation and growth.

Get in on all the awesome at Instructure!

We offer competitive, meaningful benefits in every country where we operate. While they vary by location, here's a general idea of what you can expect:

  • Competitive compensation, plus participation in our ownership program for all full-time employees, because everyone should have a stake in our success.

  • Flexible work culture. Remote, hybrid, and in-office arrangements vary by role, team, and location.

  • Generous time off, including local holidays and our annual “Dim the Lights” period in late December, when teams are encouraged to step back and recharge based on departmental needs.

  • Comprehensive wellness programs and mental health support

  • Learning and development resources, including professional development tools and tuition reimbursement, to support your growth

  • The technology and tools you need to do your best work

  • Motivosity employee recognition program

  • A culture rooted in inclusivity, support, and meaningful connection

We believe in hiring great people and treating them right. The more diverse we are, the better our ideas and outcomes.

Instructure is an Equal Opportunity Employer. We comply with applicable employment and anti-discrimination laws in every country where we operate.

Job details

Workplace: Hybrid

Location: Budapest, Hungary

Experience: SE
