STN Inc

NOC Engineer / NOC Lead

Posted 12 days ago

Infrastructure operations · shared across customers

Reports to: Manager, NOC (or Director, Service Operations)

Location: Remote (US) with assigned shift; rotating coverage

Department: Infrastructure & DC Operations / Network Engineering

Position summary

The NOC Engineer operates STN's 24/7 monitoring and first-response capability for GPU One (GPUaaS) infrastructure. The role triages alerts, executes documented runbooks, and coordinates with on-call specialists during incidents to protect customer SLAs.

Key responsibilities

  • Monitor infrastructure alerts, customer SLA dashboards, and system health around the clock (24/7)

  • Triage incidents and engage on-call SREs, Network, Hardware, or Field Engineering as needed

  • Execute documented runbooks for common platform, network, and hardware issues

  • Manage the incident lifecycle including initial customer notification and status updates

  • Coordinate planned maintenance and change windows with internal teams and customers

  • Update status pages and customer-facing communications during incidents

  • Maintain shift handoff documentation and active-incident logs

  • Handle the support ticket queue, including resolution of Tier 1 tickets

  • Contribute to continuous improvement of monitoring coverage, alert quality, and runbooks

  • Work rotating shifts including nights, weekends, and holidays

Required qualifications

  • 3+ years in a NOC, SOC, or IT operations function

  • Hands-on experience with monitoring tools (Datadog, Prometheus, Grafana, PagerDuty, or equivalent)

  • Strong Linux and basic networking fundamentals

  • Excellent written and verbal communication, particularly under pressure

  • Willingness and ability to work rotating shifts including overnight coverage

Preferred qualifications

  • GPU, HPC, or large-scale cloud infrastructure background

  • ITIL Foundation certification

  • Demonstrated on-call and major-incident response experience

  • Scripting skills (Python, Bash) for runbook automation

Job details

Workplace: Remote
Location: Remote
Experience: SE

Secure, production-grade GPU cloud for AI teams. SOC 2 & HIPAA compliant with 99.999% uptime, no noisy neighbors, and expert human support.

Employees: 83
Industry: IT Services and IT Consulting
Headquarters: Pleasanton, California
Founded: 2016
Specialties: Managed Services, SOC2 Certified, Cyber Security, Risk Assessments, HIPAA, Compliance, Managed SIEM, Backup, Recovery, Incident Response, Ransomware Prevention, Penetration Testing, Social Engineering, Network Engineering, and VAR Reseller

Key team members

Sabur Mian

Christopher Chua

Trevor Walker

Tom Genn

