Senior Site Reliability Engineering Manager
SHEIN.com
230k - 235k USD/year
Office
United States
Full Time
Job Responsibilities
Full-time or part-time: Full-time
Job title: Senior Manager, Site Reliability Engineering
Job Location: 777 S. Alameda St., 2nd Floor, Los Angeles, CA 90021 (and other U.S. locations)
Job Description:
Maintain a 24x7 production environment with a high level of service availability. Perform quality reviews and manage operational issues. Define and drive a culture of operational excellence across infrastructure and engineering through mechanisms such as SLAs, process, and monitoring. Provide leadership and direction to engineers who are responsible for break-fix, uptime, and reliability for core services, distribution, network elements, and related interfaces. Provide people-care management for team members, including hiring, setting and monitoring annual performance plans, coaching, and career development; ensure that proper knowledge and career development tools are in place to support ongoing team member development. Set clear expectations and create a positive work environment based on accountability, in collaboration with the other engineering teams. Work with other engineering managers to grow our culture of automation and reliability. Continuously improve our 24x7 on-call and incident management process and lead blameless post-mortems.
Supervises 2 subordinates – Site Reliability Engineers.
Minimum Education & Experience Requirements:
Master’s degree in Electrical Engineering, Electronic and Information Engineering, Computer Science, a related field, or a foreign equivalent, plus 3 years of progressively responsible postbaccalaureate experience in the job offered or in any engineering-related job title.
Applicant must possess at least 3 years of experience in the following: (1) Linux system administration, including network and DNS management, performance tuning, resource optimization, and security practices; (2) cloud technology, including AWS and Azure, for scalable computing and globally distributed data management; (3) architecting and maintaining robust data pipelines using serverless computing platforms such as AWS Lambda and Azure Functions; (4) object storage services such as AWS S3 and Azure Blob Storage; (5) designing and managing cloud network architectures, including Virtual Private Clouds (VPC), subnets, load balancing, and DNS configurations, to ensure secure and efficient network traffic flow across cloud services; (6) site reliability engineering methodologies; (7) the configuration and maintenance of EMR, Hadoop, Kafka, and Elasticsearch clusters, ensuring performance and stability; (8) data warehousing and real-time processing frameworks; and (9) Hadoop management platforms, including the HDP stack, in order to streamline management and operational stacks. Telecommuting permitted.
Compensation for this role: $230,239 – $235,239/year
October 1, 2025