ClickHouse is an open-source, column-oriented OLAP database management system.

Senior Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Team 51-200 | Since 2016 | H1B Sponsor

Location

United States

Posted

2 days ago

Salary

$141K - $230K / year

Job Description

This description is a summary of our understanding of the job description.

Role Description

As one of the first members of our Reliability Engineering Team at ClickHouse, you will be responsible for building and leading processes to ensure the reliability, availability, scalability, and performance of the cloud infrastructure that runs ClickHouse databases. You will collaborate with teams such as Control Plane, Dataplane, Core, Security, Support, and Operations, and guide them in designing and implementing scalable, secure, highly available, and fault-tolerant distributed systems. You will also own incident management and response, post-mortem analysis (including running blameless postmortems), and continuous improvement of our ClickHouse services. This role is a unique opportunity to make a significant impact on ClickHouse Cloud, our elastic, high-performance, serverless service built for limitless scale.

Collaborate with various engineering teams in ClickHouse to design and implement scalable, secure, and highly available systems for ClickHouse.
Establish and manage service level objectives (SLOs) and service level agreements (SLAs) for ClickHouse Cloud.
Ensure all infrastructure components in ClickHouse Cloud (including Dataplane, Control Plane, and ClickHouse Core) have monitoring and alerting in place so that incidents are detected and resolved promptly.
Enhance and refine incident response processes and post-mortem analysis for any outages in ClickHouse Cloud, including working with the support team to communicate with affected customers.
Continuously improve the reliability and performance of our ClickHouse services.
Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities.
Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize downtime.

Qualifications

Bachelor’s or Master’s degree in Computer Science or a related field.
At least 8 years of experience in Site Reliability Engineering or a related field.
Previous experience using ClickHouse in production.
Hands-on experience with Go and/or Python.
Strong knowledge of cloud computing platforms such as AWS, Azure, or Google Cloud Platform.
Excellent understanding of distributed databases and SQL; knowledge of ClickHouse in particular is a major plus.
Hands-on experience with container orchestration tools such as Kubernetes or Docker Swarm.
Strong experience with automation and configuration management tools such as Ansible, Terraform, or Puppet.
You are a strong problem solver and have solid production debugging skills.
You are passionate about efficiency, availability, scalability, and data governance.
You thrive in a fast-paced environment and see yourself as a partner to the business, sharing the goal of moving it forward.
You have a high level of responsibility, ownership, and accountability.
Excellent communication and interpersonal skills.

Compensation

The typical starting salary for this role in the US is $141,000 — $208,000 USD.
The typical starting salary for this role in US Premium Markets is $157,000 — $230,000 USD.
Compensation may vary based on various factors including education, qualifications, certifications, experience, skills, location, performance, and the needs of the business or organization.
If you have any questions or comments about compensation as a candidate, please get in touch with us at paytransparency@clickhouse.com.

Benefits

Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries.
Healthcare - Employer contributions towards your healthcare.
Equity in the company - Every new team member who joins our company receives stock options.
Time off - Flexible time off in the US, generous entitlement in other countries.
Home office - A $500 home office setup if you’re a remote employee.
Global Gatherings - We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites.
Culture - As part of our first 500 employees, you will be instrumental in shaping our culture.
