Site Reliability Engineer (Junior/Middle)
EPOS.com
Office
Ho Chi Minh City, Ho Chi Minh City, Vietnam
Full Time
About Us
Established in 2009, Floating Cube Studios - EPOS Vietnam serves as the technical hub of EPOS Singapore, a leading provider of cutting-edge Point-of-Sale (POS) and SaaS solutions. Backed by Ant International, a global leader in digital payments and financial technology, we play a pivotal role in developing innovative, scalable, and user-centric digital solutions that power EPOS products. Our technologies enable thousands of SMEs in Singapore to digitize and grow their operations through cost-effective and reliable platforms, and we have ongoing plans to expand across Asia.
At Floating Cube Studios - EPOS Vietnam, we cultivate a collaborative and dynamic culture driven by innovation and a passion for transforming businesses through technology. Join us to shape the future of digital solutions across Asia and beyond!
Key Contributions
- Design and operate scalable, resilient, and distributed systems across on-premises and/or cloud environments.
- Manage resource provisioning, utilization, capacity planning, and cost optimization.
- Build and maintain observability systems (metrics, logs, traces) to ensure high availability and fast incident detection.
- Analyze performance issues and drive improvements in reliability, latency, and system efficiency.
- Partner with development teams to enhance stability, performance, and deployment quality.
- Develop automation tools, pipelines, and APIs to streamline operational workflows.
- Containerize and automate applications/services to improve consistency and deployment speed.
- Implement security, compliance, and configuration-management standards across environments.
- Build and maintain CI/CD pipelines to ensure reliable and repeatable software releases.
- Perform incident response, troubleshooting, root-cause analysis, and maintain clear operational documentation.
Requirements
- Bachelor’s degree in Computer Science, Information Technology, Software Development, or a related field.
Must Have:
- At least 2 years of hands-on experience in a relevant SRE role.
- Solid experience running Kubernetes and cloud platforms (AWS/GCP/Azure) at scale.
- Strong proficiency with containers and orchestration technologies (Docker, Kubernetes).
- Skilled in Infrastructure as Code and automation tooling (Terraform, Ansible).
- Strong hands-on experience building and maintaining monitoring, logging, and alerting systems, including microservices observability and log analysis (Prometheus, Grafana, ELK Stack, ClickHouse, or similar).
- Practical skills in at least one scripting language (Python, Bash, or Go) for automation and tooling.
- Familiarity with CI/CD pipelines and the ability to maintain/integrate tools such as GitLab CI, Jenkins, or GitHub Actions.
- Good to fluent English communication skills are mandatory.
Benefits
We are a multinational, product-driven company specializing in proprietary POS solutions — developing in-house and delivering directly to our worldwide customers.
- Recognition & Rewards:
- Performance Bonus (subject to the company’s business results and the employee’s performance evaluation)
- Biannual Performance Review and Salary Adjustment
- Comprehensive Insurance Coverage:
- Full government public insurance contributions based on gross salary
- Premium health insurance
- Annual health check
- Clear career development and growth structure; Training sessions and Learning workshops
- 14 days of annual leave, plus one additional day for each year of service
- Laptop/MacBook and top-notch facilities are provided based on each role
- Agile/Scrum-based internal workflows for efficient and collaborative development
- Company trips, parties and regular team-building activities; Weekly happy hour, coffee, snacks, and board games
- Overseas travel opportunities based on individual performance and the policies for each evaluation period
Working Environment & Culture
- International Workplace: English-speaking environment
- Positive and Open-Minded Culture: Engineers are encouraged to propose innovative solutions that enhance productivity and code quality
- 1-on-1 Mentorship: Monthly coffee sessions with managers offer personalized feedback, goal setting, and career development opportunities
- Flexible Working Hours: Support for work-life balance and individual productivity
