Veza
The data security platform built on the power of authorization.
Staff/Principal Site Reliability Engineer
Location
United States
Posted
137 days ago
Salary
$184K - $240K / year
Bachelor Degree, 7 yrs exp, English, AWS, Cloud, Distributed Systems, EC2, Grafana, Kubernetes, Linux, Microservices, Prometheus, Python, Terraform, Go
Job Description
• Lead enterprise-wide reliability and infrastructure projects across multiple teams with high autonomy
• Navigate ambiguous problem spaces and deliver innovative solutions under tight deadlines
• Architect and deploy solutions for Cloud Prem and SaaS customers at scale
• Drive technical innovation and establish SRE best practices across the organization
• Respond to critical incidents, lead root cause analysis, and implement long-term resolutions
• Develop automation solutions to streamline operations and reduce manual workload
• Participate in on-call rotation and ensure effective incident handoff and documentation
• Partner with Engineering, Product, and Customer Success teams to align reliability goals with business objectives
• Communicate complex technical concepts effectively to technical and non-technical audiences, including executives
• Influence technical decisions across teams through thought leadership and demonstrated expertise
• Build consensus and drive adoption of new tools, processes, and architectural patterns
• Provide tier 2/3 technical support to enterprise customers for complex troubleshooting
• Work directly with customer technical teams to resolve deployment, configuration, and integration challenges
• Conduct technical onboarding and provide expert guidance on platform architecture and best practices
• Create customer-facing documentation, troubleshooting guides, and runbooks
• Lead customer calls and technical discussions as a trusted advisor
• Mentor SRE and engineering team members, elevating technical capabilities
• Foster a culture of reliability, operational excellence, and continuous improvement
Job Requirements
- BS degree in Computer Science or related field (or equivalent practical experience)
- 7+ years in Site Reliability Engineering, DevOps, or Infrastructure Engineering
- Proven track record leading large-scale, cross-team infrastructure projects from conception to production
- Demonstrated ability to work autonomously on ambiguous projects with tight deadlines
- 5+ years with AWS (VPC, EC2, RDS, EKS, CloudFormation) and cloud automation
- Expert-level experience with Kubernetes, Helm, Linux, and Terraform
- Strong experience with the GitOps model, distributed version control, and CI/CD pipelines
- Proficiency with monitoring tools (Prometheus, Grafana, Datadog)
- Strong programming/scripting skills (Python, Go, Bash) for automation
- Deep understanding of distributed systems, microservices, and reliability patterns
- Experience with Bazel and CueLang is a plus
Benefits
- Competitive salary
- Equity and a competitive benefits package
More DevOps Engineer Jobs
DevOps Engineer, 140 days ago
Full Time, Remote, Team 501-1,000, Since 1999, H1B Sponsor
Senior Site Reliability Engineer managing cloud services and infrastructure
Ansible, AWS, Azure, Chef, Cloud, ElasticSearch, Java, Linux, Logstash, Puppet, Python, Ruby, Unix, Go
Senior Site Reliability Engineer
Circle
The all-in-one community platform for creators and brands. https://circle.so/
DevOps Engineer, 141 days ago
Full Time, Remote, Team 51-200, Since 2019, H1B Sponsor
Senior Site Reliability Engineer ensuring fast, reliable, and secure systems for Circle’s platform
AWS, Kubernetes, MySQL, Postgres, Redis
Intermediate DevOps Engineer
AbacusNext
Cloud-based tech provider for legal and accounting firms. AbacusLaw, Amicus Attorney, Amicus Cloud, OfficeTools, HotDocs
DevOps Engineer, 142 days ago
Full Time, Remote, Team 201-500, Since 1983, H1B No Sponsor
DevOps Engineer designing and implementing processes at CARET
Ansible, AWS, Azure, Cloud, DNS, Docker, Kubernetes, MongoDB, Python, Redis, SQL, Terraform
Senior DevOps Engineer – Application Deployment
SkillTude Talent Solutions
Your Personalised Talent Acquisition Partner
DevOps Engineer, 143 days ago
Full Time, Remote, Team 1-10, Since 2017, H1B No Sponsor
Senior DevOps Engineer managing application deployments in cloud environments
AWS, Azure, Cloud, Docker, Grafana, Kubernetes, Prometheus, Python, Terraform, Vault