This job was posted more than 40 days ago and might be expired.
Remote, São Paulo, State of São Paulo, Brazil

We are seeking a skilled AI/MLOps Engineer to join the innovative team at 99x Brazil. In this role, you will be responsible for designing, deploying, and maintaining scalable machine learning infrastructure and pipelines that enable rapid development and reliable deployment of AI models. You will work closely with data scientists, engineers, and product managers to ensure seamless integration of AI capabilities into production systems.

You will play a crucial part in automating ML workflows, monitoring model performance, and optimizing resource utilization in cloud environments. Join us to help drive the future of AI-powered solutions in a fast-paced, collaborative environment.

Responsibilities

  • Design and maintain monitoring and observability solutions for AI applications and ML pipelines
  • Track logs, metrics, and traces using tools such as CloudWatch, Datadog, or similar platforms
  • Develop evaluation and testing frameworks for prompts, models, and AI workflows
  • Perform regression testing and quality validation for LLM-based systems
  • Manage prompt experimentation, versioning, and A/B testing processes
  • Debug AI workflows, including model outputs, orchestration pipelines, and infrastructure failures
  • Support deployment, scaling, and maintenance of AI/ML infrastructure in production environments
  • Collaborate with engineering and product teams to improve system reliability and performance
  • Analyze production data and user feedback to drive continuous improvement of AI systems
  • Contribute to operational best practices, documentation, and incident response processes

Requirements

  • Experience with DevOps, SRE, MLOps, or AI infrastructure engineering
  • Strong understanding of monitoring and observability concepts
  • Hands-on experience with tools such as Datadog, CloudWatch, Grafana, Prometheus, or similar
  • Experience supporting AI/ML or LLM-based applications in production
  • Familiarity with prompt engineering, model evaluation, and experimentation workflows
  • Knowledge of cloud platforms such as AWS, Azure, or Google Cloud
  • Experience troubleshooting distributed systems and production pipelines
  • Proficiency in Python, scripting, or automation tooling
  • Strong analytical and problem-solving skills
  • Excellent communication and collaboration abilities

Nice to Have

  • Experience with LLM orchestration frameworks
  • Familiarity with vector databases and RAG architectures
  • Experience with CI/CD pipelines for ML systems
  • Knowledge of Kubernetes, Docker, and infrastructure-as-code tools
  • Experience with AI governance, security, or compliance practices

Benefits

  • Your choice of employment model: CLT, PJ, or Cooperativa
  • Resources to grow and learn on the job, including online courses, mentoring, and latest-generation laptops
  • A fully remote work environment with flexible working hours
  • A bonus for every referral we hire

Job details
Workplace: Remote
Location: São Paulo, State of São Paulo, Brazil
99x Brazil (formerly Nextly)

99x helps ambitious companies innovate and scale with expert teams, global delivery, and AI‑enhanced tools for digital products, web solutions, and software development.

Employees: 740
Industry: IT Services and IT Consulting
Headquarters: Oslo
Specialties: Software Product Engineering, Mobile Enablement, User Experience, Product Quality Automation, and Lean Startup-as-a-Service

Key team members

Odd Sverre Østlie
Eiliv Mæhle Liljevik
Trine-Lise Jensen
Mano Sekaram
