Sift

We’re the leader in Digital Trust & Safety, empowering companies of all sizes to unlock revenue without risk.

Software Engineer

Full Time | Remote | Team 201-500 | Since 2011 | H1B Sponsor

Location

United States

Posted

1 day ago

Salary

Not specified

Job Description

This description is a summary of our understanding of the job description.

Role Description

The Core Platform team is responsible for maintaining and optimizing the data, infrastructure, messaging, and services platform that powers Sift’s online systems. We ensure these systems are always available, reliable, and performing at their best to meet customer needs. In the event of an outage or failure, we follow well-practiced recovery plans to restore services swiftly. Managing such complex, large-scale systems requires continuous monitoring and proactive maintenance to uphold these standards.

  • Design and build immutable infrastructure and fault-tolerant, multi-AZ/multi-region systems that are resilient and self-healing.
  • Implement multi-region deployments, such as BigTable clusters spanning multiple regions, with strategies to ensure specific customers are routed to designated regions (e.g., sticky sessions at the regional level).
  • Optimize local development and testing workflows to be fast, efficient, and seamless.
  • Create dynamic environments that enable specific services to interact with other environments in real time.
  • Develop automated bot solutions for deployment and monitoring, integrating with Slack for streamlined updates.
  • Participate in on-call support and incident response activities, providing 12/7 coverage for one calendar week approximately once every 3-4 weeks.

Qualifications

  • 2+ years of experience as a Software Engineer focused on infrastructure/platform services or in a Site Reliability Engineering (SRE) role.
  • Strong programming skills in languages such as Java, Scala, or Python.
  • Extensive experience building and managing cloud infrastructure on AWS or GCP.
  • Expertise in building infrastructure as code and automating provisioning processes using tools like CloudFormation or Terraform.
  • Proficiency in setting up and managing monitoring and alerting systems, both open-source and commercial.
  • Familiarity with Docker and container orchestration technologies like Kubernetes, GKE, or AWS ECS.
  • Experience troubleshooting and resolving production system issues, with a focus on building automated solutions to prevent future occurrences.
  • Proven expertise in automation and a solid understanding of configuration management tools.

Requirements

  • Deep understanding of large-scale computing and an infrastructure-as-code approach.
  • Passion for building immutable infrastructure and resilient, multi-AZ/multi-region systems that can withstand failures.
  • Recognition of the importance of monitoring and alerting, with a goal of designing self-healing systems.
  • Drive to act as a force multiplier by making thoughtful trade-offs.

Benefits

  • Competitive total compensation package
  • 401k plan
  • Medical, dental and vision coverage
  • Wellness reimbursement
  • Education reimbursement
  • Flexible time off

Company Description

Sift is the AI-powered fraud platform securing digital trust for leading global businesses. Our deep investments in machine learning and user identity, a data network scoring 1 trillion events per year, and a commitment to long-term customer success empower more than 700 customers to grow fearlessly. Global brands rely on Sift to unlock growth and deliver seamless consumer experiences.
