Master Thesis Project - 2026

Modulai.com

Office

Göteborg, Sweden

Full Time

1. Reinforcement Learning For Large Language Models (LLMs)

Background & Description

Modulai is offering a master’s thesis opportunity focused on applying Reinforcement Learning (RL) to improve the capabilities of large language models (LLMs). Reinforcement learning has been pivotal in aligning LLMs with human preferences. Recent work shows that its potential extends further, enabling models to acquire advanced problem-solving strategies and adapt to complex tasks.

Recent advancements highlight the transformative role of RL in LLM post-training:

DeepSeekMath explored how reinforcement learning can enable models to handle multi-step mathematical reasoning. It also introduced a novel RL method, Group Relative Policy Optimization (GRPO).

Tulu 3 introduced a family of fully-open post-trained models, leveraging Supervised Fine-tuning (SFT), Direct Preference Optimization (DPO), and a novel technique dubbed Reinforcement Learning with Verifiable Rewards (RLVR).

ReTool introduced reinforcement learning for tool use, showing how LLMs can learn to combine text-based reasoning with code interpreters for complex tasks.

This project aims to investigate RL approaches for improving LLMs in specialized domains (such as reasoning and tool use). You will explore open-weight models, implement RL methods inspired by the latest research, and evaluate how reinforcement learning impacts model capabilities. Through this work, you will contribute to the growing understanding of how RL can shape the next generation of LLMs.

ML Techniques and Tools

Open-weight LLMs
Reinforcement learning for LLMs
Python, PyTorch, Git, Hugging Face

References

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models: https://arxiv.org/abs/2402.03300

Tulu 3: Pushing Frontiers in Open Language Model Post-Training: https://arxiv.org/abs/2411.15124

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs: https://arxiv.org/abs/2504.11536

2. Vision-Language-Action Models For Intelligent Robotics Control (Stockholm)

Background & Description

We also offer a master's thesis project in the emerging field of Vision-Language-Action (VLA) models for robotics. VLA models unify computer vision, natural language processing, and robotic control into end-to-end systems, enabling robots to understand visual scenes, interpret human instructions, and execute tasks without manual programming.

Recent research (e.g., Liang et al., 2024) shows that VLA models can perform complex tasks such as “pick up the red mug from the cluttered table.” This thesis invites students to explore and advance these models, contributing to one of the most actively researched directions in AI-powered robotics.

The project scope will be flexible and tailored to the student’s interests and research findings. Students will work with state-of-the-art robotic hardware and GPU clusters, and will receive guidance from experts in AI and robotics.

ML Techniques and Tools

Python, PyTorch, Git, Hugging Face
Vision-language-action models (multi-modal AI)
Computer vision and natural language processing methods
Real-time control systems and robotic integration

References

Liang et al., Vision-Language-Action Models for Robotics, 2024: https://arxiv.org/abs/2406.09246

3. Open Application Within Applied Machine Learning

Applied Machine Learning projects encompass a wide range of domains, including healthcare, finance, natural language processing, computer vision, and more. This open application invites students to propose projects aligned with their interests and career goals. Do you have an idea? Let us know by describing it in your application.

Required Skills

You are finishing a master's degree in machine learning, or a master's degree in another field that includes courses in machine learning and programming.

Please Include The Following In Your Application:

*Suitable candidates will be invited to one interview before a final decision is made.
The last date for application is the 31st of October, but the process will close earlier if suitable candidates apply.

About Modulai

Modulai’s clients range from startups to multinational companies. What they all share is that machine learning is central to how they operate, compete, and create value.

Our services range from advisory projects and feasibility studies to end-to-end development and refinement of machine learning systems and products.

We use state-of-the-art techniques, always focusing on maximizing business impact, delivering solutions in areas such as credit risk, fraud detection, dynamic pricing, recommendation systems, computer vision, natural language processing, opportunity spotting, logistics optimization, up-sell, cross-sales, smart building optimization, predictive maintenance, and route planning.

Other

When doing a master's thesis project at Modulai, you are invited to all team activities, such as daily stand-ups, weekly learning breakfasts, and monthly AWs. We look forward to having you as part of our team!