Akamai

Akamai powers and protects life online. Leading companies worldwide choose Akamai to build, deliver, and secure their digital experiences, helping billions of people live, work, and play every day. With the world's most distributed compute platform, from cloud to edge, we make it easy for customers to develop and run applications, while we keep experiences closer to users and threats farther away.

Join us

Are you seeking an opportunity to make a real difference in a company with a global reach and exciting services and clients? Come join us and grow with a team of people who will energize and inspire you!

Senior Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Team 5,001-10,000

Location

United States

Posted

2 days ago

Salary

Not specified

Job Description

Role Description

Do you enjoy collaborating with teams to solve complex challenges? Do you enjoy solving large-scale distributed content delivery challenges? Join our critical Platform and Reliability Engineering Team!

The Platform and Reliability Engineering team is responsible for defining, measuring, and optimizing the key performance indicators of delivery customers. Your expertise in software engineering and systems administration will be instrumental in building robust and resilient infrastructure.

In this position, you'll play a pivotal role in shaping the future of our products. You'll collaborate with product teams from the earliest stages of development to ensure the reliability, scalability, and performance of our systems. You'll define key performance indicators (KPIs), advance the state of monitoring, alerting, and operational responses, and investigate complex performance issues.

As a Senior Site Reliability Engineer, you will be responsible for:

  • Working on Internet technologies to improve the performance, availability, and scalability of large distributed content delivery systems.
  • Collaborating with cross-functional teams, including Product and Engineering, to define and establish measurable SLIs and SLOs.
  • Providing technical expertise and feedback to ensure system designs and implementations meet reliability and performance requirements.
  • Monitoring platform availability and performance, debugging issues by leveraging data analysis skills, and implementing corrective actions to avoid recurrence.
  • Developing and implementing automation solutions to improve operational efficiency and reduce toil.
  • Participating in design reviews and providing technical guidance to ensure designs meet requirements for scalability, performance, and robustness.

Qualifications

  • 5 years of relevant experience and a Bachelor's degree in Computer Science or its equivalent.
  • Familiarity with Internet protocols (DNS, HTTP, TLS, TCP, etc.).
  • Experience utilizing Oracle SQL for data integrity checks, root cause analysis of data anomalies, and the development of data reports.
  • Proficiency in scripting languages (Python, bash, JavaScript, etc.).
  • Experience with monitoring and alerting systems (e.g., Prometheus, Grafana, ADBMS, Datadog), including metric collection, alerting, dashboarding, and troubleshooting.
  • Fluency working in a UNIX/Linux computing environment.

Benefits

  • Flexible working options through FlexBase, allowing 95% of employees to choose to work from home, the office, or both.
  • Opportunities to grow, flourish, and achieve great things.
  • Benefits surrounding all aspects of your life, including health, finances, family, work-life balance, and personal endeavors.
  • Industry-leading benefits including healthcare, 401K savings plan, company holidays, vacation (PTO), sick time, parental leave, and an employee assistance program.
