HHAeXchange

Better Homecare, Better Health

Site Reliability Architect

Full Time | Remote | Team 501-1,000 | Since 2008 | H1B Sponsor

Location

United States

Posted

29 days ago

Salary

$170K - $185K / year

Bachelor Degree | 10 yrs exp | English | AWS | Cloud | DNS | Google Cloud Platform | Java | Kubernetes | Python | TCP/IP | Terraform | Go

Job Description

  • Architect with a resiliency-by-design intent, for self-healing, fault-tolerant systems, focusing on proactive readiness rather than reactive correction.
  • Operate within a secure, high-volume, high-volatility application environment, utilizing advanced networking and compute structures in cloud-hosted environments (AWS/GCP).
  • Move the organization from "firefighting" to a proactive culture through habits and systems supporting feature flagging, production readiness reviews, architectural decision records, and chaos engineering.
  • Support the incident management practice, mentoring SREs and software engineers alike in utilizing our monitoring and observability toolsets for effective troubleshooting.
  • Define SLIs, SLOs, and error budgets that balance feature velocity with platform stability, supporting a shift to service ownership.
  • Underscore an automation-first perspective using Terraform, CDK, and other cloud-formation infrastructure-as-code toolsets to ensure repeatable, audit-ready environments.

Job Requirements

  • Bachelor's or Master's degree in Computer Science, Information Systems, or related field and applicable experience.
  • 10+ years in SRE/DevOps, including 4 in an enterprise SaaS environment.
  • 4+ years in software development contributing to a SaaS-based, cloud-hosted product line.
  • Proven track record in a distributed SaaS environment managing multi-cloud or multi-region workloads.
  • Proficiency in modern cloud networking, including DNS, TCP/IP, Load Balancing, and Zero Trust security models.
  • Strong coding skills in Go, Python, Java, C#, or others, to build internal reliability tools and automate complex operational workflows.
  • Expert-level knowledge of Kubernetes (EKS/GKE) architecture, including multi-cluster management and stateful workloads.
  • Ability to optimize cloud spend while maintaining high performance and reliability.
  • Experience operating in a DevSecOps context with compliance guardrails (e.g., GDPR, HIPAA, HITRUST) across varied infrastructures.
  • Willingness to explore and adopt AI tools responsibly to enhance productivity and innovation in your role.

Benefits

  • Competitive health plans
  • Paid time off
  • Company-paid holidays
  • 401(k) retirement program with a company-elected match
  • Other company-sponsored programs
