Senior Site Reliability Engineer – SRE
Location
Illinois
Posted
70 days ago
Salary
$165K - $225K / year
Seniority
Senior
Job Description
Job Requirements
- 5+ years in SRE, DevOps, or infrastructure engineering roles with proven experience operating production infrastructure at scale.
- Deep hands-on experience building and operating production Kubernetes clusters on bare-metal infrastructure.
- Strong understanding of Kubernetes internals including custom resource definitions (CRDs), operators, controllers, admission webhooks, and scheduling.
- Strong fundamentals in Linux systems administration, performance tuning, troubleshooting, and automation in production environments.
- Proficiency with infrastructure-as-code tools (Terraform, Ansible, Helm) and building automation to reduce operational overhead.
- Solid understanding of networking concepts, including IPAM, DNS, DHCP, VLAN/VXLAN, routing, and load balancing, plus experience troubleshooting network issues in production.
- Experience building and maintaining comprehensive monitoring solutions using tools like Prometheus, Grafana, and centralized logging systems.
- Understanding of SRE principles including SLIs/SLOs/SLAs, error budgets, incident management, and blameless postmortems.
- Strong scripting skills in Go, Python, or Bash for automation, tooling development, and operational efficiency.
- Demonstrated ability to troubleshoot complex issues under pressure, manage incidents effectively, and communicate clearly during outages.
- Excellent communication skills and the ability to work across teams, including systems engineers, network engineers, and software developers.
Benefits
- 6% 401(k) match
- Fully covered health insurance premiums
- Other comprehensive offerings to support your well-being and success as we grow together.




