Cutover
The Collaborative Automation platform
Site Reliability Engineer
Location
United States
Posted
141 days ago
Salary
$120K - $130K / year
English, AWS, DNS, Docker, Grafana, JavaScript, Prometheus, Python, Ruby, Terraform
Job Description
• Incident Response: Respond to incidents and alerts, triaging urgency and investigating root cause
• Documentation: Contribute regularly to our documentation on system design, troubleshooting, best practices, and engineering processes
• Root Cause Analysis: Contribute to post-mortems and help identify long-term improvements under guidance
• Collaboration: Support cross-functional teams during investigations and post-incident reviews
• Observability: Support and enhance observability tools and techniques by identifying metrics, logging, and alerting improvements
• Automation: Write and execute simple automation scripts (e.g. Python, Ruby, Bash) to improve reliability and reduce toil
• Development: Work on internal tools, pipelines, and IaC solutions to help improve the speed of software delivery and recovery
• System Reliability: Work on efforts to enhance the reliability and performance of our application and systems, ensuring optimal uptime and minimal disruptions.
• Infrastructure Optimization: Work closely with the development and platform engineering teams to optimize the infrastructure on AWS, ensuring scalability and efficiency.
Job Requirements
- A genuine excitement for complex problem solving within our tech stack, applying what you know to our unique problems.
- Familiarity with at least one scripting language such as Ruby, JavaScript, Python, Bash
- Experience with containerization (e.g. Docker) or IaC (e.g. Terraform, Helm, CloudFormation)
- An eagerness to follow modern engineering practices and learn from others
- Familiarity with observability tools such as Datadog, New Relic, Grafana, Prometheus, ELK, or OpenTelemetry
- Understanding of core networking concepts (DNS, HTTP/S, Load Balancing, etc.)
- A collaborative mindset with clear communication skills
- A willingness to ask questions to gain a better understanding of new or complex concepts
Benefits
- Share Options
- 20 days of PTO per year + public holidays
- 3 volunteer days to use for any charitable/voluntary cause you would like.
- A top-tier private health insurance package.
- 401k contribution plan
- Work from home stipend
- A personal learning and development budget through Learnerbly. You’ll be supported in your quest for knowledge, whatever that looks like to you.
- Fully subsidised therapy sessions and subscriptions to leading wellbeing platforms.