Summer Intern 2026 - Software (Alarm system monitoring)
ALMA Observatory.com
Office
Vitacura, Región Metropolitana, CL
Internship
Description
Title: Alarm system for the Monitoring Data System (MDS) and its supporting services
Purpose:
Study alternatives and implement an alarm system with a variety of alarms for the MDS in its Kubernetes environment, deployed with Terraform (or, alternatively, through the Rancher user interface). Also study alternatives and implement an alarm system for the ActiveMQ and Monitoring-Consumer services (supporting services of the MDS), which run on virtual machines and are deployed with Docker Compose. Finally, study possible improvements to the MDS Data Gap Detector service. Every specific objective must be thoroughly documented.
Description:
The Monitoring Data System (MDS) is a complex system responsible for storing the monitoring data of the ALMA devices and making it available to users. The system is delicate, and the Software Team is currently required to monitor several aspects of it: the status of the Kubernetes cluster and the worker nodes that compose it, and the status of the pods deployed in the cluster (the most important being those related to the database, which is implemented in Cassandra). Alarms are to be implemented for these aspects. The MDS also has supporting services: Monitoring-Consumer, which formats the monitoring data so that it can be inserted into the Cassandra database, and ActiveMQ, the queue service that queues the formatted monitoring data. These services must also be monitored, with alarms sent when the servers go down or when resource usage, such as RAM and disk space, is too high. Finally, a service called Data Gap Detector checks whether the monitoring data queried through the MDS API contains any data gaps and reports when they occur. This application could be improved, and the possible improvements need to be evaluated.
Objectives:
General Objective:
Implement a variety of alarms to be deployed in the different environments related to the MDS. These alarms will alert the maintainers of those systems to possible problems with resource usage and the status of the services.
Specific Objectives:
Implement and thoroughly document the alarm system for the MDS, to live in its Kubernetes environment, using the infrastructure-as-code tool Terraform.
Investigate and thoroughly document different alternatives for implementing an alarm system for the MDS's supporting applications, which live in virtual machines and are implemented using Docker containers.
Implement and thoroughly document an alarm system that monitors the status of the supporting applications of the MDS.
Study possible improvements for the Data Gap Detector service, related to the MDS.
Deliverables:
The expected deliverables are:
Terraform scripts with the implementation of a variety of alarms for the MDS in its testing environment, as well as for its "Failover" environment.
Documentation of the investigation of different alternatives for the implementation of an alarm system for the supporting applications of the MDS.
Implementation and documentation of the alarm system that monitors the status of the supporting applications of the MDS.
Documentation of possible improvements for the Data Gap Detector service.
Deadline for applications: Friday, October 24th, 2025 at 12 pm Chile time.
Requirements
Student Technical Background
Required:
Basic knowledge of Linux operating systems and the command line.
Understanding of containerization concepts (e.g., Docker).
Ability to read and write technical documentation in English.
Desirable:
Previous experience with Docker Compose.
Previous experience with Kubernetes.
Knowledge of Terraform.
October 2, 2025