Ariel Partners

At Ariel Partners, we solve the most difficult problems that inhibit technology from enabling our customers to achieve their goals. Our vision is to be recognized by our stakeholders as an elite provider of IT solutions, so when they have their biggest challenges, we are on their short list. We are looking for team members who share our values of: Integrity - to do the right thing even when it hurts; Commitment - to the long-term success and happiness of our customers, our people, and our partners; Courage - to take on difficult challenges, accept new ideas, and accept incremental failure; Excellence - the constant pursuit of excellence. Ariel Partners is an Equal Opportunity Employer in accordance with federal, state, and local laws.

Senior Staff Site Reliability Engineer / DevOps Engineer

DevOps Engineer, Full Time, Remote

Location

United States

Posted

5 days ago

Salary

Not specified

Datadog, APM, Monitoring, Alerting, Dashboards, Distributed Systems, Incident Response, CI/CD, Automation, DevOps, Site Reliability Engineering, Platform Engineering

Job Description

This description is a summary of our understanding of the job description.

Role Description

We are seeking a Staff Site Reliability Engineer (SRE)/DevOps Engineer to improve the reliability, observability, and operational health of our production platform. This role goes beyond basic monitoring: the ideal candidate must understand application architecture and service dependencies well enough to design meaningful alerts and actionable observability rather than monitoring noise. The position combines SRE, DevOps, and observability engineering, with a strong focus on improving alert quality, reducing operational fatigue, and strengthening platform reliability.

  • Optimize and clean up Datadog APM instrumentation, monitors, and dashboards to improve signal quality and reduce telemetry costs
  • Design intelligent alerting strategies to reduce PagerDuty alert fatigue
  • Develop monitoring that reflects real user impact and system health, not infrastructure noise
  • Gain deep understanding of application architecture and service dependencies to diagnose failures and cascading impacts
  • Support DevOps and platform engineering efforts, including automation and CI/CD improvements
  • Participate in on-call support during business hours (Mon–Fri) and lead incident response improvements

Qualifications

  • Must be a US citizen
  • 8+ years of experience in Site Reliability Engineering, DevOps, or platform engineering
  • Strong hands-on experience with Datadog (APM, monitoring, dashboards, alerting)
  • Experience designing actionable monitoring and intelligent alerting
  • Strong understanding of distributed systems and application architecture
  • Experience supporting production systems and incident response
  • Solid DevOps automation and infrastructure skills

Ideal Candidate

This role is best suited for an engineer who:

  • Understands applications deeply enough to create meaningful alerts
  • Can reduce monitoring noise and operational fatigue
  • Combines SRE reliability practices with strong DevOps engineering skills

Contact Information

If you are interested in getting more information about this opportunity, please contact Irina Rozenberg at Recruiting@arielpartners.com at your earliest convenience.
