Upbound

Upbound delivers a single point of control to manage all your applications and infrastructure across teams and clouds.

Senior Software Engineer

Full-stack Engineer, Software Engineer, Full Time, Remote, Team 11-50, Since 2017, H1B No Sponsor

Location

Texas

Posted

133 days ago

Salary

Not specified

English, Cloud, Distributed Systems, Grafana, Kubernetes, Prometheus, Go

Job Description

  • Actively build and operate Upbound Spaces in production, troubleshooting and resolving issues across multi-tenant SaaS environments, as well as contributing to Upbound's open-source projects, including Crossplane.
  • Take ownership of building features in high demand by Upbound's customers and deliver new functionality that will delight and amaze our users.
  • Investigate and debug complex issues in customer environments, including multi-control plane scenarios, resource reconciliation problems, and performance bottlenecks.
  • Communicate through thoughtful and thorough design documents for new initiatives and detailed post-incident reviews that drive system improvements.
  • Support the full project lifecycle for highly scalable and reliable services running in a cloud environment – discovery, analysis, architecture, design, review, documentation, building, migration, automation, deployment, production-readiness, and ongoing operational support.
  • Write and maintain Go code that interfaces with the Kubernetes API, such as operators, controllers, add-ons, etc., with a focus on observability, debuggability, and operational excellence.
  • Deploy, manage, and troubleshoot our Kubernetes services in production, using metrics, logs, and traces to identify and resolve issues quickly.
  • Build and maintain operational tooling for debugging customer environments, analyzing control plane health, and automating incident response.
  • Author documentation, user guides, runbooks, and blog posts to support and promote new features that you release.
  • Support the software release cycle for Spaces self-hosted distributions, including diagnosing issues in customer-managed deployments.
  • Participate in on-call rotation to support Upbound Cloud, responding to incidents and driving them to resolution.

Job Requirements

  • Have experience operating production cloud services at scale: monitoring, alerting, incident response, post-mortems, and continuous improvement of service reliability.
  • Have strong debugging skills across distributed systems, including experience with observability tools (Prometheus, Grafana, OpenTelemetry, distributed tracing) and techniques for diagnosing issues in production environments.
  • Have experience building and operating controllers that interact with the Kubernetes API server, including troubleshooting reconciliation loops, managing API rate limits, and optimizing controller performance.
  • Are comfortable working directly with customers to understand, reproduce, and resolve complex technical issues in their environments.
  • Take responsibility and ownership for solving problems even if they are outside your lane, especially during incidents affecting customer workloads.
  • Demonstrate excellence in your work, constantly trying to improve your skills and the operational posture of the systems you build.
  • Have empathy for customers and keep them in mind as you build solutions, understanding that reliability and debuggability are features.
  • Realize the importance of clear communication and effective collaboration to work as a team, deliver great results, and support customers through technical challenges.
  • Help create a safe environment where everyone can contribute, learn from failures, share on-call knowledge, and help each other grow as operators and engineers.

Benefits

  • Health insurance
  • Flexible working arrangements
  • Professional development opportunities
