Principal Site Reliability Engineer

Posted about 8 hours ago

Office · Vancouver · SE · 260k - 275k CAD
### Why Join Saviynt
 
• Work on a mission-critical SaaS platform used by global enterprises
• Solve complex reliability challenges at scale
• Influence architecture and engineering culture at a company level
• Competitive compensation, benefits, and growth opportunities
 
 
### Security & Compliance
 
This role requires compliance with Saviynt’s information security and privacy policies, including annual security training.
### What You Will Be Doing
In this pivotal role, you will be instrumental in designing, building, and maintaining the shared infrastructure services and platforms that our product and application teams depend on.
 
You will focus on creating reusable, reliable, and scalable solutions that abstract away complexity, enabling other teams to focus on their core business logic and deliver features faster in a multi-cloud environment.
 
Design and build core platform components and shared infrastructure services that other development teams will integrate with and leverage to deploy and operate their applications
 
Architect, implement, and manage highly available and scalable Kubernetes platforms as a service for internal consumers
 
Develop robust, internal-facing tools and automation for infrastructure provisioning and management primarily using Go (Golang)
 
Architect and optimize foundational solutions within Cloud environments (AWS, Azure, etc.), focusing on creating reusable patterns and modules for other teams
 
Design and implement shared Event-Driven Architecture components and messaging platforms using technologies like Kafka or Google Pub/Sub that product teams can easily utilize
 
Develop and maintain robust CI/CD pipelines (e.g., GitLab CI and ArgoCD) as a service, providing standardized and automated deployment workflows for various development teams
 
Design and build resilient Distributed Systems components that serve as building blocks for other applications, focusing on reliability, fault tolerance, and performance
 
Manage and optimize our shared infrastructure across Multi-Region Cloud Environments, ensuring that platform services are globally available and performant for all consumers
 
Establish and enhance centralized Observability and Monitoring platforms and tools that provide self-service insights for consuming teams
 
Define and implement clear, well-documented RESTful API designs for the infrastructure services you build, ensuring ease of integration for internal clients
 
Implement and manage Service Mesh (e.g., Envoy, Istio) capabilities, providing traffic management, security, and policy enforcement as a shared platform for services
 
Design, implement, and optimize highly available Relational Database services or shared data platforms for broad organizational use
 
Collaborate closely with product development teams to understand their infrastructure needs and pain points, providing technical guidance and support
 
Participate in on-call rotations to support the critical shared infrastructure you build
### What You Bring
9+ years of experience in an Infrastructure Development, Platform Engineering, or Site Reliability Engineering role, with a strong focus on building tools and services for other engineers
 
Deep expertise with Kubernetes in production environments, particularly in providing it as a platform (i.e., single-tenant and multi-tenant deployment architectures)
 
Strong programming skills in Go (Golang) and Python, with experience building robust, maintainable backend services and automation
 
Extensive hands-on experience with at least one major Cloud Provider (AWS, GCP, or Azure); multi-cloud experience is a strong plus, especially in building abstractions over them
 
Proven experience designing and implementing Event-Driven Architecture and message queuing systems (e.g., Kafka, RabbitMQ, NATS) as shared services
 
Solid understanding of, and practical experience with, CI/CD pipeline tools (especially GitLab CI), plus experience establishing automated delivery processes for other teams
 
Demonstrable experience designing and operating Distributed Systems, with an understanding of patterns for creating reliable, shared components
 
Familiarity with Multi-Region Cloud Environments and strategies for building globally distributed and highly available platforms
 
Proficiency in establishing and utilizing comprehensive Observability and Monitoring platforms (e.g., Prometheus, Grafana, ELK stack, Datadog) for shared infrastructure
 
Strong experience with RESTful API design principles and building well-documented, consumable APIs
 
Knowledge of Service Mesh concepts and practical experience with solutions like Istio in a platform context
 
Hands-on experience with Relational Databases (e.g., MySQL, PostgreSQL), ideally in managing them as a service
 
Excellent communication skills and the ability to clearly articulate complex technical concepts to both technical and non-technical audiences
 
A strong customer-centric mindset, treating internal development teams as your primary customers
 
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience, is required
Job details
Workplace: Office
Location: Vancouver
Experience: SE
Salary: 260k - 275k CAD per year

Saviynt delivers enterprise control over AI, securing every identity across the organization, including human, non-human, and AI, so businesses can build and deploy AI innovation with complete confidence.

Key team members

Kevin Spurway

Fredrik Hörnell

Hemendra Rana

Evelyn Acosta Behrendt
