We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team. We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Lead Reliability Engineering Manager
Location
United States
Posted
11 days ago
Salary
Not specified
Job Description
Role Description
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a VP of Engineering, Reliability.
In this pivotal role, you'll define and execute the reliability engineering roadmap while managing a team responsible for ensuring system stability across cutting-edge infrastructure and AI-native architectures. Your impact will bridge the gap between engineering efficiency and operational excellence, paving the way for scalable growth and enhanced service delivery.
This position demands a visionary leader with a track record of transforming reliability within innovative technology environments. You will leverage your extensive experience to create a forward-looking vision that meets organizational goals while ensuring compliance and security.
- Define and execute the reliability engineering roadmap, aligning with enterprise growth.
- Balance centralized platform capabilities with distributed ownership for scalability.
- Establish SLO/SLI/error budget frameworks for feature velocity and system stability.
- Lead infrastructure cost management and capacity planning to meet enterprise commitments.
- Develop and scale a multi-disciplinary team while fostering a culture of ownership.
- Drive continuous improvement through DORA metrics and incident trend analysis.
- Empower developers with self-service tooling and clear documentation.
- Act as the primary engineering interface for compliance and security requirements.
- Collaborate with executives to position reliability as a key enabler for success.
Qualifications
- 15+ years of engineering experience, including 7+ years leading reliability or infrastructure teams.
- Proven track record managing organizations of 40+ engineers across multiple teams.
- Demonstrated experience evolving reliability operating models for scalable businesses.
- Expertise in regulated sectors where compliance and data sensitivity are critical.
- Strong understanding of SRE principles, including SLOs and incident management.
- Technical command of AWS, Terraform (IaC), and modern observability stacks.
- Experience owning cloud infrastructure budgets and cost management.
- Familiarity with AI/ML workloads and their reliability requirements.
- Executive presence for engaging with the C-suite on risk management.
Benefits
- A dynamic, rapidly growing organization focused on helping businesses thrive.
- Comprehensive Medical, Dental, & Vision Insurance for full-time employees.
- Competitive and fair pay commensurate with experience.
- Maternity and paternity leave policies for full-time employees.
- Short- and long-term disability coverage.
- Opportunities to learn from a dedicated leadership team.
- Top-of-the-line company swag for team members.