Senior Site Reliability Engineer (SRE)

Alternative Payments.com

72k - 90k USD/year

Hybrid

Remote (Brazil)

Full Time

At Alternative Payments, we are transforming the way service-based companies handle payments. Our innovative platform automates the entire accounts receivable process, helping businesses save time, reduce costs, and scale with confidence.

We are building a global team that values innovation, impact, and collaboration. As part of a scaling FinTech company, every person on our team has the opportunity to shape our future, influence our products, and make a real difference for our customers.

What We’re Looking For

We’re seeking a Site Reliability Engineer to join our EPD Team and help us drive operational excellence, enhance infrastructure reliability and establish cutting-edge observability across our critical systems.

This role is ideal for someone who thrives on complex technical problem-solving, deep operational ownership, and cross-functional collaboration, and wants to take a hands-on role in shaping our reliability strategy, driving innovation in our infrastructure, and implementing robust SRE practices. You will be joining our team as part of a cross-team DevOps structure.

You will be instrumental in tackling significant challenges currently faced by our platform, such as limited visibility into system capacity and performance metrics, and in proactively preventing incidents through enhanced monitoring.

A critical part of your mission will be to close the gaps in end-to-end request traceability and improve our load testing and capacity planning. You'll design and implement automated alerting systems that our Firefighters can confidently rely on, effectively bridging the gap between deep infrastructure knowledge and rapid incident response capabilities.

This role is available to candidates who are eligible to work in Brazil, in a remote setting.

What You’ll Do

As a Senior SRE, you will play a crucial role in building and maintaining the resilience and performance of our platform. This is what we have mapped out for you:

Leading and executing on key reliability initiatives from planning to delivery, particularly focusing on monitoring, alerting, and incident response for the Firefighters team. Your immediate projects (first 1-3 months) will include:
- Monitoring and Alerting Setup: Configuring comprehensive alerting systems, including queue monitoring and service health checks.
- Metrics and Dashboards: Building performance dashboards, implementing load testing, and creating capacity metrics for presentations.
- Observability Enhancement: Implementing end-to-end traceability with distributed tracing and service profiling.
- Infrastructure Automation: Working on pipeline improvements, moving to trunk-based pipelines.
- Datadog Integration: Continuing the migration back to Datadog and optimizing our monitoring stack.
Collaborating with cross-functional teams (including our support engineers and DevOps members) to deliver scalable solutions, optimize processes, and implement highly reliable systems.
Taking ownership of complex SRE tasks, including configuring monitoring systems and defining and enforcing SLIs, SLOs, and SLAs.
Proposing improvements and helping establish best practices, workflows, and standards for incident response, blameless post-mortems, and continuous improvement.

What You’ll Bring

7-10+ years of experience in Site Reliability Engineering, DevOps, or a similar role focused on large-scale distributed systems.
Strong skills in Kubernetes for container orchestration and cluster management.
Extensive experience with AWS as a core cloud platform for infrastructure management.
Strong proficiency with Datadog for monitoring, logging, tracing, and alerting; this is critical for the role.
Proven experience in designing, implementing, and optimizing CI/CD pipelines, ideally with GitHub Actions.
Strong understanding and practical application of SRE principles: SLI/SLO/SLA definition, error budget management, incident response, post-mortem analysis, and toil reduction.
A proactive mindset with the ability to solve complex problems, drive projects independently, and continuously innovate our reliability practices.
Strong communication skills, especially in English, to collaborate effectively across technical teams and stakeholders.

Nice to Haves

Experience in FinTech, payments, startup, or scale-up environments.
AWS certification, demonstrating commitment to learning and mastering cloud technologies.
Experience with SOC2 compliance, CI security validations, and other infrastructure security aspects.
Demonstrated knowledge or experience with Infrastructure as Code tools, particularly Terraform.
Familiarity with other monitoring tools like Grafana.
Comfort working in fast-paced, dynamic, and high-impact environments.

What We Offer

Competitive salary tailored to your experience, skills, and expertise.
- The total compensation range for this role is $72,000 - $90,000 USD/year, plus equity. The range displayed on each job posting reflects the approximate total target compensation for the position. Within the range, individual pay is determined by factors including relevant skills, experience, education/training.
Equity opportunities so you can share in our growth and success.
Unlimited PTO and flexibility when you need it the most.
Referral bonus. We truly believe we hire fantastic people, and great talent recognizes great talent. We offer a significant bonus for your hired referral.
Yearly learning & development stipend to help you grow and do your best work.

Why Choose Alternative Payments?

At Alternative Payments, you’ll do work that truly matters.

Own your impact: Lead meaningful, high-impact projects and collaborative initiatives that are shaping the future of FinTech and redefining how businesses get paid.
Collaborate: Work with a diverse, innovative team where every voice is heard and great ideas come from anywhere.
Grow with us: Your career journey is top of mind. We prioritize internal growth and give you the space to shape your path based on your goals — whether that’s deepening your expertise in your domain or exploring something new.
Thrive in a supportive culture: As a scaling start-up, there’s a lot to be done and initiative is key. We believe in shared learning, open communication, and building each other up. When one of us grows, we all do.

Our Values

Transparency & Honesty: We communicate openly and truthfully with partners, investors, and each other so everyone understands where we stand and where we’re headed.
Resourcefulness: We stay scrappy, find creative solutions, and make progress even when the path isn’t obvious. We have a bias for action and seek out the information and resources necessary to make decisions and move quickly.
Partnership: We win and lose together. We collaborate with our partners, investors, and teammates to tackle big challenges and reach shared goals.
Revolutionary & Boldness: We challenge conventions, take calculated risks, and build better, stronger solutions that move our business and the industry forward.
Accountability: We take ownership of our decisions and results. We follow through on our commitments knowing our work directly impacts our partners, our team and our business.

Applying to Alternative Payments

We’re looking for candidates who are ready to step in and make an impact from day one. We know that sometimes people hold back unless they meet every requirement, but if you’re excited about the role, bring relevant experience, and are ready to contribute, we want to hear from you!

All resumes are reviewed by our small but mighty talent team. While we may use AI tools to help prioritize applications, real people are behind every resume review and hiring decision. We’re also committed to an inclusive and accessible process. If you require reasonable accommodation during the hiring process, please let us know upon being selected to interview.