AdTechTalent
Engineering · 4 days ago · Remote

Raptive

Sr. Site Reliability Engineer

site reliability engineering, SRE, platform engineering, DevOps, Kubernetes, Terraform, CI/CD, GitHub Actions, Flux, Argo, Grafana, PagerDuty, Prometheus, distributed systems, cloud cost management, Agile, Scrum

Key details

Salary

$120K – $180K

Employment type

Full-time

Seniority

Senior

Years experience

5-10

Location

New York, New York, United States

Full job description

Raptive is seeking a Senior Site Reliability Engineer with 8+ years of experience in SRE, platform engineering, or DevOps. The role involves designing, building, and evolving scalable distributed systems and Kubernetes-based infrastructure. Responsibilities include leading initiatives from concept to production; ensuring system resilience, observability, and performance; improving CI/CD pipelines; and mentoring team members. Required skills include Kubernetes (Helm charts, cluster operations, autoscaling), Terraform, CI/CD tools (GitHub Actions, Flux, Argo), cloud cost management, observability tools (Grafana, PagerDuty, Prometheus), and Agile/Scrum methodologies. The position is full-time and hybrid (remote U.S. candidates are encouraged to apply, with the option to work in-office in New York City). The salary range is $120,000-$180,000 plus additional incentives.

What you'll do

  • Design, build, and evolve the platform and infrastructure that power products
  • Lead initiatives from concept through production, ensuring systems are resilient, observable, and performant
  • Partner closely with product and engineering to enable fast, reliable delivery
  • Raise engineering standards across CI/CD, infrastructure, and operations
  • Mentor others and help shape the technical direction of the team through thought leadership and hands-on engineering

Requirements

  • 8+ years of professional site reliability, platform engineering, or DevOps-related work
  • Proven track record designing distributed, scalable architectures
  • Hands‑on Kubernetes experience: authoring Helm charts, operating clusters, tuning autoscaling, and debugging production issues
  • Infrastructure‑as‑Code with Terraform (module authoring, CI validation, environment promotion)
  • Experience improving CI/CD pipelines and release processes (GitHub Actions, Flux/Argo, or similar)
  • Familiarity with cloud cost management; able to spot and fix cost anti‑patterns in code or infrastructure
  • Hands-on experience with observability tools such as Grafana and PagerDuty, and familiarity with Prometheus metrics, distributed tracing, and RUM tooling
  • Experience developing secure software using Agile/Scrum methodologies
  • Expertise in debugging and troubleshooting production workloads

Tech stack

Kubernetes, Helm, Terraform, GitHub Actions, Flux, Argo, Grafana, PagerDuty, Prometheus, RUM tooling, Agile, Scrum

Benefits

  • Eligible for additional incentive compensation
  • Recognized as a Fortune 100 Best Places To Work for 2025-2026
