This job was posted more than 40 days ago and might be expired.

AI Evaluation Engineer - Mathematics & Algorithms

Posted about 2 months ago

Remote, Brazil

About Us

Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions. With a strong background in software engineering and leadership, we help companies build high-performing teams by matching them with professionals who truly fit their needs.

Role overview

We are looking for a highly analytical and computationally strong professional with a solid research background in mathematics or quantitative fields.

In this role, you will design advanced benchmark tasks for multi-agent AI systems, focusing on complex mathematical reasoning, algorithmic problem-solving, and verifiable computational outputs. You will contribute by crafting challenging problems, building validation systems, and structuring tasks that require decomposition into coordinated sub-solutions.

Commitment required: 8 hours per day, including 4 hours of overlap with PST.

Employment type: Contractor assignment (no medical/paid leave)

Duration of contract: 4 weeks+

Location: Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Indonesia, Kenya, Nigeria, Turkey, Vietnam

Interview: take-home assessment (60 min) + short interview

Responsibilities

  • Design and build multi-agent benchmark tasks requiring multi-step mathematical reasoning and algorithmic problem-solving
  • Create complex, decomposable problems across domains such as:
    • Competition mathematics
    • Numerical analysis
    • Combinatorial optimization
    • Statistical inference
  • Develop verification scripts to validate:
    • Numerical outputs (with tolerance thresholds)
    • Proof correctness and logical steps
    • Algorithmic outputs and constraints
  • Write clear, structured problem statements with precise notation and defined outputs
  • Design task decomposition strategies for parallel or multi-agent execution
  • Implement computational solutions and validation pipelines using Python
  • Work with containerized environments (Docker) for reproducibility and evaluation
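
As a rough illustration of the "numerical outputs (with tolerance thresholds)" item above, a verification script might look like the following minimal sketch. The function names `verify_numeric` and `trapezoid`, the tolerance defaults, and the sample task are hypothetical and chosen only for illustration; they are not taken from the role's actual tooling.

```python
import math


def verify_numeric(submitted, expected, rel_tol=1e-9, abs_tol=1e-12):
    """Return True when `submitted` matches `expected` within tolerance.

    Hypothetical helper: the tolerance defaults are assumptions, not
    values prescribed by any particular benchmark.
    """
    # Reject NaN and infinities outright so they can never pass a check.
    if not math.isfinite(submitted):
        return False
    return math.isclose(submitted, expected, rel_tol=rel_tol, abs_tol=abs_tol)


def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal subintervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h


# Sample task (assumed): estimate the integral of x^2 over [0, 1],
# whose exact value is 1/3.
estimate = trapezoid(lambda x: x * x, 0.0, 1.0, 1000)

# The same answer can pass a loose threshold and fail a strict one,
# so the threshold has to be part of the task statement.
print(verify_numeric(estimate, 1 / 3, rel_tol=1e-6))
print(verify_numeric(estimate, 1 / 3, rel_tol=1e-9))
```

Stating the tolerance in the problem statement keeps the task verifiable: the grader and the solver agree in advance on what counts as a correct answer.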

Requirements

  • 5+ years in mathematics, quantitative research, or computational science
  • Strong Python skills for scientific computing (NumPy, SciPy, SymPy or similar)
  • Experience solving or designing complex mathematical / algorithmic problems
  • Ability to create precise, verifiable outputs (no subjective problems)
  • Experience with mathematical proofs or formal reasoning
  • Familiarity with AI benchmarks or evaluation frameworks (e.g., SWE-bench)
  • Comfortable working with Docker environments
  • Solid understanding of numerical methods (precision, convergence, error bounds)

Gramian Consulting Group

Gramian Consulting is your partner for accessing the engineering capabilities you need—delivered in the model that fits your business, from staff augmentation and talent recruiting to Build-Operate-Transfer (BOT). We combine the perspective of a software engineer, the rigor of a technical recruiter, and the vision of a business builder, so you get experts who understand your challenges and deliver results the right way. This blend is our signature advantage in providing top-quality services, fast and reliably.

Key team members

Emmanuel Yawson

Pauline Perry

Emad Hassan

Aleksandra Šarac
