
Head of Support & Service Reliability Engineering
Sycurio
Posted 3 days ago
We are seeking a Head of Support & Service Reliability to lead and evolve our global support function into a proactive, platform-integrated reliability capability.
This role is an exciting and dynamic opportunity for an outcome-focused individual. Sycurio is at a critical inflection point as we transition from a single-tenant architecture to a multi-tenant SaaS platform, which requires a fundamental shift from reactive ticket handling to systemic reliability, observability, and customer experience management at scale.
You will own the end-to-end operational integrity of the platform, ensuring availability, performance, and customer trust, while partnering closely with Engineering, Product, and customer-facing teams. You will also be a key contributor to our GRR goal of 90%+.
Sycurio employs a strategic managed service provider that supplies the people, tooling, and day-to-day execution across all support tiers. The Head of Support sets the standards, governs vendor performance, and ensures every aspect of the support experience, from incident response to customer satisfaction, meets enterprise-grade expectations.
Service Reliability & Platform Stability
- Own platform availability, performance, and reliability across all tenants
- Reduce incident frequency, severity, and blast radius
- Establish and drive Service Reliability Engineering (SRE) principles
- Ensure scalability and operational readiness of a multi-tenant platform
Incident Management & Response
- Implement and lead a structured incident management framework (P1–P4)
- Act as executive owner of major incidents (P1/P2)
- Drive improvements in:
  - Mean Time to Detect (MTTD)
  - Mean Time to Resolve (MTTR)
- Ensure clear, consistent internal and external communication during incidents
Observability & Monitoring
- Define and implement a comprehensive observability strategy, including:
  - Technical telemetry (infrastructure, application, APIs)
  - Business telemetry (transactions, payment success rates, usage)
  - End-to-end customer journey visibility
- Ensure issues are detected proactively rather than reported by customers
- Partner with Product and Engineering to embed telemetry into the platform
Support Operations (L1–L3)
- Lead global support teams ensuring high-quality, SLA-driven case management
- Define and enforce support processes, tooling, and performance standards
- Improve key metrics:
  - First response time
  - Resolution time
  - Reopen rate
  - Escalation quality
Platform Operations & Change Management
- Oversee operational aspects of the platform, including:
  - Release management and deployment safety, ensuring all releases are observable, reversible, and low-risk
  - Change control processes
  - Environment consistency across staging and production
- Own the visibility and continuous improvement of delivery and recovery performance using the DORA metrics, in partnership with Engineering
Issue Management & Root Cause Discipline
- Establish rigorous Root Cause Analysis (RCA) standards
- Identify and eliminate systemic issues (not just symptom fixes)
- Track and reduce recurring incidents
- Feed insights into Product and Engineering roadmaps
Customer Experience & Commercial Alignment
- Align support with Customer Success and Sales
- Ensure coordinated communication during incidents
- Protect customer relationships during critical events
- Introduce tenant-aware impact assessment (ARR, strategic accounts, regulatory exposure)
- Support enterprise-grade expectations for transparency and reliability
Cross-functional Leadership
- Act as the bridge between:
  - Engineering
  - Product
  - Customer Delivery / Success
- Embed supportability and operational readiness into:
  - Pre-sales (Stage 4/5 governance)
  - Product development
  - Deployment processes
Managed Service Governance
- Chair regular operational reviews and quarterly business reviews with the managed service leadership team
- Own the managed service scorecard: defining KPIs, reviewing performance data, and driving accountability for misses
- Manage contract compliance, SLA adherence, and commercial exposure from managed service underperformance
- Lead continuous improvement programs jointly with the managed service provider, including tooling upgrades, process redesigns, and training investments
- Maintain an escalation path for systemic or persistent managed service failure, up to and including remediation planning
Required
- 10+ years in Support, Platform Operations, or SRE leadership roles
- Proven experience in multi-tenant SaaS and legacy environments
- Strong understanding of:
  - Distributed systems
  - Incident management at scale
  - Observability frameworks
- Track record of building and scaling high-performing operational teams
- Experience in outsourced or hybrid operational models