DigitalOcean

The cloud ☁️ of choice for developers, startups, and growing digital businesses around the world.

Staff Software Engineer

Full Time | Remote | Team 1,001-5,000 | Since 2011 | H1B Sponsor

Location

Massachusetts

Posted

2 days ago

Salary

$191K - $239K / year

15 yrs exp, English, Ansible, Cloud, Distributed Systems, Grafana, gRPC, Kafka, Microservices, NoSQL, Prometheus, Redis, SQL, Terraform, Go

Job Description

  • Architect, design, develop, and maintain scalable backend services and systems.
  • Drive technical initiatives and large cross-team projects from concept to production.
  • Collaborate with product managers, UX designers, and engineers across distributed teams to deliver end-to-end solutions.
  • Develop deep expertise in observability tools and technologies such as Prometheus, Grafana, time-series databases, and distributed tracing.
  • Build and maintain high-performance APIs and microservices using Go (Golang) and gRPC, integrating with systems like Kafka, Redis, and NoSQL databases.
  • Work with Terraform and Ansible to automate infrastructure deployment and configuration management.
  • Utilize knowledge of SQL for data analysis, service integration, and operational insights.
  • Lead efforts in debugging, troubleshooting, and performance tuning of complex distributed systems.
  • Champion operational excellence by improving reliability, monitoring, and alerting practices.
  • Provide technical leadership, mentorship, and guidance to other engineers.

Job Requirements

  • 15+ years of relevant industry experience building and operating large-scale cloud services or distributed systems in a fast-paced, high-growth environment.
  • Strong programming experience in Go (Golang) and deep understanding of distributed systems fundamentals.
  • Solid understanding of observability, monitoring, and alerting systems (e.g., Prometheus, Grafana).
  • Experience working with OTEL (OpenTelemetry) Collector, including instrumentation, data pipelines, and telemetry ingestion for metrics, logs, and traces.
  • Proven experience designing and implementing scalable event-driven architectures using Kafka or similar technologies.
  • Experience with gRPC, Terraform, and Ansible for service communication and infrastructure automation.
  • Working knowledge of SQL, Redis, and NoSQL databases.
  • Demonstrated ability to drive operational excellence and improve system reliability.
  • Experience making pragmatic technical trade-offs while balancing short-term needs and long-term goals.
  • Excellent communication and collaboration skills, especially with geographically distributed teams.
  • Strong ownership mindset and the ability to independently deliver high-impact projects.

Benefits

  • Competitive salary
  • Paid time off
  • Professional development opportunities
  • Flexible work hours
  • Employee Assistance Program
  • Local Employee Meetups
  • Reimbursement for relevant conferences and training
  • Access to LinkedIn Learning courses
  • Bonus eligibility based on performance
  • Equity compensation for eligible employees
