Senior Software Engineer/SRE - Automated Disaster Recovery

Bloomberg

160k - 240k USD/year

Office

New York

Full Time

Location: New York
Business Area: Engineering and CTO
Ref #: 10045491

Description & Requirements

The Team:

We are the Platform Database Services Disaster Recovery as a Service (DRaaS) SRE team, charged with administering end-to-end disaster recovery testing of Bloomberg's datacenters, covering the numerous services that support the applications making up Bloomberg’s line of products. On any given day we're inventing, engineering, developing, building, coding, troubleshooting and maintaining a wide range of tools, monitors, frameworks, interfaces, protocols, solutions and best practices around Disaster Recovery. These components stitch together a robust suite of automated and self-healing systems that manage the services that the Platform Database Services SRE team provides to the rest of the firm.

What's in it for you:

You will be part of a team that works to help meet company- and regulatory-defined Disaster Testing standards. You will manage and develop solutions that support various disaster recovery tools, building these applications to integrate the services they provide into the Bloomberg operational environment as well as Bloomberg products. This in-house tooling suite is required to test the clusters and managed services that reside in our datacenters and nodesites in an automated, scalable and self-driven fashion, complete with the accompanying metrics and transparency tools required by internal and external clients. Tooling is expected to be written with end-to-end unit testing and continuous integration to provide the highest level of stability.

We have product ownership and "the classic SRE responsibilities" such as system tuning, performance analysis, and defining and following availability targets such as SLAs, SLOs and SLIs. We also have immediate access to the experts who design and code the Bloomberg-specific components, APIs and methods used by and supporting the disaster recovery infrastructure. You’ll gain insight into and access to the lowest levels of how Bloomberg applications interact with each other and with their runtime environments, for the purposes of both in-depth troubleshooting and enhancing stability, reliability, performance and feature set.

You'll need to have:

  • 4+ years of experience in Python and/or TypeScript

  • A degree in Computer Science, Engineering or similar field of study or equivalent work experience

  • 5+ years of experience with Unix, Unix tools and shell scripting

  • Experience designing stable, long-lasting APIs

  • Deep understanding of TCP/IP networking and the OSI model

  • Experience designing and automating repeatable processes in a client/server modeled environment

  • Ability to build and maintain highly sophisticated, available, performant and scalable systems that are critically important

  • Experience building monitors and alarms for system performance, status and stability

  • Experience with CI/CD systems and writing robust unit and system tests

We'd love to see:

  • Basic knowledge of the Rapid framework

  • Experience analyzing existing systems and identifying shortcomings with proven methods for improvement

  • Experience with Chaos Engineering

  • Experience with Splunk/Humio and Grafana or other metric-based reporting tools

  • Experience with GitHub and JIRA

  • Passion for product ownership

Salary Range: 160,000 - 240,000 USD annually + benefits + bonus
The referenced salary range is based on the Company's good faith belief at the time of posting. Actual compensation may vary based on factors such as geographic location, work experience, market conditions, education/training and skill level.


We offer one of the most comprehensive and generous benefits plans available, along with a range of total rewards that may include merit increases, incentive compensation (exempt roles only), paid holidays, paid time off, medical, dental, vision, short- and long-term disability benefits, 401(k) + match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.

August 20, 2025
