Cloud Engineer (Senior)

Singtel

Posted 3 days ago

About this role

Be a part of something BIG!

 

We are seeking a highly skilled, detail-driven Cloud Engineer (Senior) to ensure the reliability, availability, and operational excellence of cloud platforms supporting mission-critical telco services. The role includes leading on-call incident response, driving service recovery, and improving operational resilience through automation and continuous improvement.

 

Make an Impact by:

 

Cloud Operations & Reliability

  • Operate and maintain production cloud platforms to meet telco-grade availability and performance targets.

  • Proactively identify operational risks and prevent incidents.

  • Ensure operational readiness of cloud platforms for 24x7 support.

 

On-Call & Incident Leadership

  • Participate in and lead scheduled on-call rotations.

  • Act as incident lead for high-severity or complex cloud incidents.

  • Drive service restoration within agreed SLAs and MTTR targets.

  • Coordinate incident response across cloud, network, security, and application teams.

  • Ensure clear and timely communication to stakeholders.

 

Incident, Problem & Change Management

  • Perform root cause analysis and implement corrective and preventive actions.

  • Reduce incident recurrence through operational improvements.

  • Review and approve standard operational changes within delegated authority.

 

Automation & Continuous Improvement 

  • Improve on-call effectiveness through automation, self-healing, and alert optimization.

  • Enhance runbooks and operational documentation based on on-call learnings.

  • Drive operational readiness for new services prior to production release.

 

Security, Compliance & Governance

  • Ensure on-call actions comply with security, regulatory, and change controls.
  • Support audits and vulnerability remediation related to cloud operations.

 

Mentorship & Collaboration

  • Provide on-call guidance and escalation support to junior engineers.
  • Share operational best practices and lessons learned across teams.

Incident Management

  • Managing major incidents impacting customer-critical telco services.

  • Balancing rapid service recovery with strict change and security controls.

  • Troubleshooting complex hybrid or network-integrated cloud environments.

 

Decision-Making Authority

  • Lead on-call incident response and recovery actions.
  • Approve and implement low-risk operational changes.
  • Recommend improvements to on-call processes, tooling, and architecture.
  • Escalate major risks, outages, and compliance issues.

 

Skills for Success:

  • Bachelor’s degree in IT, Computer Science, Engineering, or equivalent experience.
  • 3–6 years of experience in cloud operations or infrastructure roles.

  • Strong hands-on experience with AWS, Azure, or GCP in production environments.

  • Proficiency in IaC tools such as Terraform, Bicep, and CloudFormation to standardize configuration of cloud resources.
  • Proven ability to monitor, troubleshoot, and resolve complex cloud platform issues, leveraging logs, metrics, and alerts across multi-cloud environments.
  • Solid understanding of cloud networking, security, and IAM.

  • Experience with ITSM processes and on-call operations.

 

Your career growth starts here. Apply Now!

 

Job details

Workplace

Office

Location

Kuala Lumpur, Malaysia

Job type

Full Time
