The market intelligence and search platform trusted by over 3,500 leading organizations

Staff Site Reliability Engineer

Full Time · Remote · Team 1,001-5,000 · Since 2011 · H1B Sponsor

Location

United States

Posted

20 hours ago

Salary

$150,000 to $225,000 USD

Job Description

Role Description

Our Site Reliability Engineering team is growing, and we are looking for a highly experienced Staff Site Reliability Engineer to help shape the future of reliability, scalability, and performance at AlphaSense. This is a hands-on, high-impact role where you will:

Architect core reliability platforms
Lead by example in incident response
Drive cultural adoption of SRE best practices across our global engineering organization

Our mission is to engineer our platform to the reliability standards of mission-critical systems, targeting 99.99% uptime, while continuously enhancing our systems and processes. This role is key to that mission and goes beyond traditional system maintenance; it’s about pioneering the platforms, practices, and culture that enable engineering to scale effectively. You will act as a force multiplier, mentoring fellow engineers, influencing architectural decisions, and setting the technical bar for reliability across the company.

Qualifications

8+ years of experience in Site Reliability Engineering, DevOps, or a similar role, with at least 3 of those years in a Senior+ SRE position
Strong background in running production SaaS systems at scale
Proficiency in at least one programming/scripting language (Python, Go, or similar)
Hands-on expertise with cloud platforms (AWS, GCP, or Azure) and Kubernetes
Deep understanding of networking fundamentals (TCP/IP, DNS, HTTP/S, load balancing)
Experience with monitoring & alerting (Prometheus, Grafana, Datadog, ELK)
Familiarity with advanced observability (OTEL, continuous profiling)
Proven incident management experience, including leading high-severity incidents and postmortems
Strong troubleshooting skills across the full stack
Excellent communication and collaboration skills

Responsibilities

Architect Reliability Paved Paths: Build frameworks and self-service tooling that let teams own the reliability of their services in a “You Build It, You Run It” culture
Lead AI-Driven Reliability: Drive our AIOps strategy — automating diagnostics, remediation, and proactive failure prevention
Champion Reliability Culture: Embed SRE practices across engineering via design reviews, production readiness, and operational standards
Incident Leadership: Act as Incident Commander during critical events, modeling operational excellence and ensuring that blameless postmortems lead to lasting improvements
Advance Observability: Deliver end-to-end monitoring, tracing, and profiling (Prometheus, Grafana, OTEL, Continuous Profiling) to optimize performance proactively
Mentor & Multiply: Elevate engineers across SRE and product teams through mentorship, technical guidance, and knowledge sharing

Benefits

Compensation Range: $150,000 to $225,000 USD
You may also be offered equity and a generous benefits program
