AI Production Engineer
Siili Solutions
Office
Tampere, Finland
Full Time
Make Your Story Real
At Siili, we don't just build AI – we run it. We've seen firsthand that the capabilities needed to build a compelling proof of concept are fundamentally different from those required to operate AI reliably at scale. That's why we're building our Managed AI practice: to bridge the gap between "it works in the PoC" and "it works in our business."
As an AI Production Engineer, your work focuses on operating AI systems in production — not building experimental models or prototypes. You will ensure that existing AI applications stay reliable, accurate, and cost-efficient over time. You'll work as part of our Managed Services business line, alongside developers and experts in data, automation, and cloud. Your focus will be on production AI: building AI systems, monitoring model performance, detecting drift, optimizing inference, and responding when things break.
What You'll Do:
- Monitor and maintain production AI systems, ensuring reliability, performance, and quality under SLAs.
- Implement end-to-end AI solutions and take them through to production deployment.
- Implement model monitoring solutions: drift detection, performance tracking, and automated alerting (a minimal drift-check sketch follows this list).
- Optimize LLM applications for production: latency, throughput, token usage, and cost efficiency.
- Build and maintain evaluation frameworks to catch quality degradation before users do.
- Troubleshoot AI-specific incidents: hallucinations, embedding quality issues, retrieval failures, and unexpected model behavior.
- Implement retraining pipelines and manage model versioning in production environments.
- Optimize RAG systems: chunking strategies, embedding refresh, vector database performance, and retrieval quality.
- Collaborate with client teams to improve their AI systems based on production insights.
- Document operational runbooks and contribute to Managed AI best practices.
- Drive continuous operations improvements by identifying recurring issues, automating routine tasks, and ensuring long-term system reliability.
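To give a concrete flavour of the drift detection mentioned in the list above, here is a minimal, illustrative sketch (not Siili's actual tooling): it compares a training-time feature snapshot against a recent production window using a two-sample Kolmogorov-Smirnov test. The feature data, window sizes, and alerting threshold are hypothetical and chosen only for illustration.

```python
# Illustrative drift check: compare a reference (training-time) feature window
# against a recent production window, column by column. All data and thresholds
# below are made up for the example.
import numpy as np
from scipy.stats import ks_2samp

P_VALUE_THRESHOLD = 0.01  # illustrative alerting threshold


def detect_feature_drift(reference: np.ndarray, production: np.ndarray) -> dict:
    """Run a two-sample KS test per feature column and flag likely drift."""
    report = {}
    for column in range(reference.shape[1]):
        statistic, p_value = ks_2samp(reference[:, column], production[:, column])
        report[column] = {
            "statistic": float(statistic),
            "p_value": float(p_value),
            "drifted": p_value < P_VALUE_THRESHOLD,
        }
    return report


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.normal(0.0, 1.0, size=(5_000, 3))  # training-time snapshot
    production = reference.copy()
    production[:, 0] += 0.4                            # simulate drift in one feature
    for feature, result in detect_feature_drift(reference, production).items():
        print(f"feature {feature}: {result}")
```

In practice a check like this would run on a schedule against live feature logs and feed the automated alerting described above.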
What We're Looking For:
We are looking for an AI Production Engineer who understands what it takes to run AI in production – not just build it.
We hope you recognize yourself in most of these:
- 2+ years of experience in AI/ML engineering, with hands-on production deployment and operations experience.
- Strong understanding of ML concepts and the ways models fail in production: drift, degradation, edge cases.
- Experience with LLM applications in production: prompt management, evaluation, monitoring, and optimization.
- Knowledge of MLOps practices: CI/CD for ML, model versioning, A/B testing, and staged rollouts.
- Familiarity with RAG architectures and their operational challenges: embeddings, vector databases, retrieval tuning.
- Hands-on experience with monitoring and observability tools.
- Ability to diagnose and resolve production issues under pressure.
- Good communication skills – able to explain AI behavior to non-technical stakeholders and write clear incident reports.
- Curiosity about why models behave unexpectedly, and the drive to prevent it from happening again.
- Fluent in Finnish and English.
Relevant Technologies:
- LLMs and LLM middleware: GPT, Claude, open-source models, LiteLLM, vLLM
- Agent technologies: MCP, agent/workflow frameworks (LangGraph, LlamaIndex, Haystack, Magentic, Strands)
- Platforms, e.g. Azure AI Foundry, Amazon Bedrock, Snowflake and Databricks
- Cloud: Azure, AWS, GCP
- Vector databases, e.g. Pinecone, pgvector
- Monitoring: Prometheus, Grafana, Datadog, LLM-specific observability (Langfuse, Phoenix); a minimal instrumentation sketch follows this list.
- Python
- Containers and orchestration basics
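As an illustration of the observability side of this stack, the hedged sketch below instruments a hypothetical LLM call with Prometheus metrics for latency and token usage. The metric names, labels, port, and the call_llm stub are invented for the example; a real service would wrap its actual model client and layer Grafana dashboards or an LLM-specific tool such as Langfuse on top.

```python
# Illustrative production instrumentation of an LLM endpoint using prometheus_client.
# call_llm() is a stand-in for a real model client; metric names and labels are examples.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

LLM_LATENCY = Histogram(
    "llm_request_latency_seconds", "End-to-end LLM request latency", ["model"]
)
LLM_TOKENS = Counter(
    "llm_tokens_total", "Tokens consumed by LLM requests", ["model", "kind"]
)


def call_llm(prompt: str) -> dict:
    """Stand-in for a real model call; returns fake token counts."""
    time.sleep(random.uniform(0.05, 0.2))
    return {"prompt_tokens": len(prompt.split()), "completion_tokens": random.randint(20, 200)}


def handle_request(prompt: str, model: str = "example-model") -> dict:
    with LLM_LATENCY.labels(model=model).time():        # record request latency
        response = call_llm(prompt)
    LLM_TOKENS.labels(model=model, kind="prompt").inc(response["prompt_tokens"])
    LLM_TOKENS.labels(model=model, kind="completion").inc(response["completion_tokens"])
    return response


if __name__ == "__main__":
    start_http_server(8000)  # exposes /metrics for Prometheus to scrape
    while True:              # simulate steady traffic for the demo
        handle_request("Summarise the latest incident report.")
```

Metrics like these are what make the cost, latency, and token-usage optimization work described earlier measurable rather than guesswork.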
What We Offer:
- A technically strong and collaborative community: You'll work alongside experienced data professionals who support each other, share insights, and aim for excellence in everything we do. Our culture values respect, autonomy, and peer-driven improvement.
- Deep AI production expertise: You'll develop specialized skills in AI operations that most engineers never get – because most companies don't have production AI systems yet. You’ll be shaping how AI is run in the real world.
- Real-world impact, not just buzzwords: You'll help deliver and run scalable, meaningful Data & AI solutions with clear ties to business strategy – not just experimental prototypes.
- Continuous learning, backed by top partnerships: We invest in your growth through mentoring, training, and hands-on access to the latest tools via our partnerships with Microsoft, Databricks, Nvidia and more.
- Practical benefits: Flexible hybrid work, well-being support, hobby clubs, daycare support for sick children, and a share option program.
- Company with a mission: We have a clear vision and purpose, and unique opportunity for you to make your mark as we shape the Data & AI story for a leading IT service company.
- Mental well-being at the core: We know that thriving at work requires more than technical skills. That’s why we offer support through services like Auntie and Focus Tiger, provide in-house coaching, and ensure that all our leaders have completed Brain-Friendly Leadership training with the NeuroLeadership Institute. You’ll have access to tools and support that help you take care of your mind as well as your career – because real impact starts with well-being.
- A culture shaped by Joy, Ambition, Responsibility and a Humane touch: At Siili, we believe that great results stem from meaningful work, psychological safety, and the courage to act responsibly. You’ll be trusted, supported and encouraged to grow as your authentic self – while contributing to a more sustainable digital future.
Ready to "make AI real"?
At Siili, we don’t conduct interviews; we have meetings. These meetings ensure that both our goals and expectations are aligned, laying the foundation for a promising future together.
We welcome applicants from diverse backgrounds, including different ages, genders, cultures, and minority groups.
Please let us know if we can adapt our meeting to suit your needs. We strive to consider individual needs and neurodiversity in our recruitment process, making meetings as comfortable as possible for everyone.
Apply today, and join us on this exciting journey!
At Siili, you're not just building models. You’re making AI real – in production, at scale, with impact.
