Temporal Technologies
Build invincible apps.
Staff Software Engineer – Reliability
Full-stack Engineer | Software Engineer | Full Time | Remote | Lead | Team 51-200 | Since 2018 | H1B Sponsor
Location: United States
Posted: 2 days ago
Salary: $212K - $286.2K / year
Seniority: Lead
Bachelor Degree | English | Cloud | Distributed Systems
Job Description
• Own reliability outcomes for operating Temporal Cloud end to end, partnering across engineering, infrastructure, and product to drive measurable improvements.
• Define, implement, and evolve reliability targets and associated practices, including alerting thresholds, operational readiness criteria, and escalation paths.
• Plan and run gamedays to validate incident response, operational procedures, and cross-team coordination under realistic failure scenarios.
• Build and scale a chaos testing program that exercises failure modes safely and drives remediation work that reduces real risk.
• Define and maintain a reliability scorecard across services and key operational processes, and use it to prioritize reliability investments.
• Lead load testing and performance testing efforts, including test design, tooling, and analysis of bottlenecks and capacity constraints.
• Improve observability standards (metrics, logs, traces, dashboards) so reliability signals are consistent, actionable, and easy to audit.
• Drive post-incident learning and corrective actions, ensuring fixes are durable and reduce recurrence risk over time.
• Make system-level tradeoffs across reliability, performance, cost, and velocity, and document decisions clearly for long-term maintainability.
• Mentor other engineers and raise the bar on reliability engineering practices across teams.
Job Requirements
- Strong computer science fundamentals, especially in distributed systems, concurrency, and performance.
- Demonstrated ability to design and build complex systems that operate reliably under high load and partial failure.
- Experience driving reliability improvements across multiple services, not just within a single codebase.
- Hands-on experience with at least one of: gamedays, chaos testing, load testing, or building reliability scorecards.
- Strong judgment in ambiguous situations, including the ability to prioritize reliability work based on risk and impact.
- Excellent communication skills, including the ability to align multiple stakeholders on reliability goals, plans, and tradeoffs.
- A collaborative mindset and a track record of mentoring and leveling up engineering practices.
Benefits
- Unlimited PTO, 12 Holidays + 2 Floating Holidays
- 100% Premiums Coverage for Medical, Dental, and Vision
- AD&D, LT & ST Disability, and Life Insurance (Standard & Supplemental Available)
- Empower 401K Plan
- Additional Perks for Learning & Development, Lifestyle Spending, In-Home Office Setup, Professional Memberships, WFH Meals, Internet Stipend and more!