Penn Mutual
Helping people get stronger is a pretty good business to be in.
Senior Site Reliability Engineer
DevOps Engineer | Full Time | Remote | Team 1,001-5,000 | Since 1847 | H1B Sponsor
Location
United States
Posted
19 days ago
Salary
$128K - $165K / year
Bachelor Degree | 6 yrs exp | English | AWS | Distributed Systems | ITSM
Job Description
• Lead reliability, availability, scalability, and recovery design for critical systems.
• Define and evolve SLOs, SLIs, and error budget practices across services.
• Identify systemic reliability risks and drive cross-team remediation efforts.
• Influence application and platform architecture to improve operational outcomes.
• Act as a technical lead during major incidents and complex outages.
• Drive high-quality root cause analysis and recommend corrective actions.
• Improve incident response processes, tooling, and runbooks.
• Design and implement advanced automation to eliminate operational toil at scale.
• Build and maintain shared SRE tooling and platforms.
• Set engineering standards for reliability-focused code and operational practices.
• Review and improve CI/CD, deployment, and rollback strategies.
• Partner with Release and Change Management to automate release practices.
• Lead risk assessments for high-impact changes and releases.
• Ensure compliance requirements are met without sacrificing engineering velocity.
• Serve as a reliability authority for release readiness decisions.
• Mentor junior SREs and junior engineers through technical guidance and review.
• Lead by example in operational excellence and engineering rigor.
• Influence reliability culture across engineering and product teams.
Job Requirements
- Bachelor’s degree in Computer Science, Engineering, or related field.
- 6–10+ years of experience in SRE, software engineering, platform, or DevOps roles.
- Professional experience performing root cause analysis on incidents and documenting SRE systems and their usage.
- Strong programming skills with professional experience in multiple languages.
- Deep experience with AWS and distributed systems.
- Advanced knowledge of observability, ITSM, and reliability engineering principles.
- Proven ability to operate effectively in complex, regulated environments.
- Experience using and implementing observability tools (metrics, logs, tracing).
- Experience with CI/CD pipelines and deployment automation.
- Experience investigating and documenting root cause analyses.
- Familiarity with containerization and orchestration technologies.
- Strong troubleshooting and analytical skills.
More DevOps Engineer Jobs
DevOps Engineer | 19 days ago
Full Time | Remote | Team 11-50 | Since 1968 | H1B No Sponsor
DevOps Engineer enhancing software development lifecycle at Summit Racing Equipment
Ansible | Chef | Docker | JavaScript | Jenkins | Kubernetes | Linux | Node.js | Puppet | Python | SDLC | Selenium | TCP/IP | .NET
All locations: Florida, Nevada, Ohio, Tennessee, Texas
DevOps Engineer | 19 days ago
Internship | Remote | Team 1,001-5,000 | Since 1972 | H1B Sponsor
Site Reliability Engineer Intern at Credit Acceptance developing software and testing standards
JUnit | Kubernetes | SOAP
Site Reliability Engineer – Team Lead
Meal Ticket
Deliver Data-Driven Results with our suite of solutions for distributors, suppliers, and operators.
DevOps Engineer | 20 days ago
Full Time | Remote | Team 51-200 | Since 2010
Engineering leader for SaaS cloud company in food & beverage hospitality industry.
Azure | Cloud | Kubernetes | MySQL | Terraform
DevOps Engineer | 20 days ago
Full Time | Remote | Team 10,001+ | Since 1966 | H1B Sponsor
Site Reliability Lead Engineer for Jabil's Intelligent Infrastructure division
Ansible | Cloud | DNS | Firewalls | Python | VMware