Passionate music fans. Innovative tech pros. Perfect harmony. Join our band.
Senior Site Reliability Engineer
Location
New York
Posted
1 day ago
Salary
$164.4K - $234.9K / year
Job Description
Job Requirements
- You have 5+ years of hands-on experience operating cloud infrastructure (GCP and/or AWS), using Terraform and Kubernetes to run production systems at scale.
- You have practical experience, or a strong demonstrated interest, in operating LLM-based systems, RAG pipelines, or agentic workloads, and you understand the reliability challenges of non-deterministic systems.
- You think in distributed-systems first principles (consistency, availability, partition tolerance) and translate that thinking into pragmatic infrastructure decisions.
- You are proficient in at least one modern language (TypeScript, Java, Go, or Python) and comfortable navigating large, heterogeneous codebases, including environments where AI-generated PRs are common.
- You build automation and improve systems so that whole categories of operational issues disappear over time.
- You communicate complex infrastructure trade-offs clearly to both technical and non-technical stakeholders, and you write postmortems that lead to meaningful change.
Benefits
- health insurance
- six-month paid parental leave
- 401(k) retirement plan
- monthly meal allowance
- 23 paid days off
- paid flexible holidays
- paid sick leave