CertifID

CertifID is the most secure way to send and receive wiring information.

Senior Site Reliability Engineer

DevOps Engineer | Full Time | Remote | Team 11-50 | Since 2017 | H1B No Sponsor

Location

Texas, Michigan

Posted

8 days ago

Salary

Not specified

Bachelor Degree | 9 yrs exp | English

.NET, AKS, AWS, Azure, Bash, C#, Datadog, EKS, GCP, Go, Grafana, Kubernetes, Linux, OpenTelemetry, Prometheus, Python, Terraform

Job Description

Cybercrime is rising, reaching record highs in 2024. According to the FBI's IC3 report, total losses exceeded $16 billion. With investment fraud and BEC scams at the forefront, the message is clear: the real estate sector remains a lucrative target for cybercriminals. At CertifID, we take this threat seriously and provide a secure platform that verifies the identities of parties involved in transactions, authenticates wire transfer instructions, and detects potential fraud attempts. Our technology is designed to mitigate risks and ensure that every transaction is conducted with confidence and peace of mind.


We know we couldn’t take on this challenge without our incredible team. We have been recognized as one of the Best Startups to Work for in Austin, made the Inc. 5000 list, and won Best Culture by Purpose Jobs three years in a row. We are guided by our core values and our vision of a world without wire fraud. We offer a dynamic work environment where you can contribute to meaningful impact and be part of a team dedicated to enhancing security and fighting fraud.


We are seeking a Senior Site Reliability Engineer (Senior SRE) to drive reliability improvements across our production SaaS environment. You’ll play a critical role in building scalable infrastructure patterns, advancing observability, improving incident response, and partnering with engineering teams to embed reliability into system design and delivery.


This role is ideal for an experienced Senior SRE who enjoys solving complex operational problems, building automation, and mentoring others.

What You’ll Do

  • Reliability & Platform Operations: Own and improve the reliability, availability, and performance of production systems while defining and operationalizing SLIs/SLOs and error budgets.
  • AI Agent Enablement: Design and implement autonomous and semi-autonomous AI agents for monitoring distributed systems and applications. Build agents capable of consuming multi-source observability data (metrics, logs, traces, etc.).
  • Incident Response: Participate in and help lead an on-call rotation, serving as an escalation point for major incidents and facilitating blameless postmortems.
  • Automation & Infrastructure: Build automated workflows to eliminate manual work and design/maintain Infrastructure-as-Code with Terraform.
  • Observability: Improve metrics, logs, traces, and alerting using tools like Datadog or Prometheus to reduce noise and increase signal.
  • Collaboration & Mentorship: Partner with application teams to implement reliability best practices and mentor junior engineers to foster a culture of knowledge sharing.

Who You Are

  • Strategic Architect: You look beyond the "what" to understand the "why," providing insights that influence our GTM and technical direction.
  • Startup Veteran: You are comfortable moving fast and staying proactive in an environment where the playbook is still being written.
  • Relatable & Adaptable: You can navigate different personalities across the organization, from high-energy sales teams to analytical engineering partners.
  • Lifelong Learner: You have a thirst for learning, keeping up with emerging technologies and industry trends.

What We're Looking For

  • Experience: 5+ years in SRE, DevOps, Platform Engineering, or Infrastructure Engineering.
  • Cloud Expertise: Proven experience supporting production SaaS systems in Azure (preferred), AWS, or GCP.
  • Technical Stack: Strong Linux, networking, and distributed systems troubleshooting skills.
  • Containers: Strong experience with containers and orchestration (Kubernetes/EKS/AKS).
  • IaC & Tooling: Expertise with Infrastructure-as-Code (Terraform strongly preferred).
  • Programming: Strong scripting/programming skills in Python, Go, Bash, or C#/.NET.
  • Observability: Hands-on experience with Datadog, Prometheus/Grafana, or OpenTelemetry.

What We Offer

  • Flexible vacation
  • 12 company-paid holidays
  • 10 paid sick days
  • No work on your birthday
  • Health, dental, and vision insurance (including a $0 option)
  • 401(k) with matching, and no waiting period
  • Equity
  • Life insurance
  • Generous parental paid leave
  • Wellness reimbursement of $300/year
  • Remote worker reimbursement of $300/year
  • Professional development reimbursement
  • Competitive pay
  • An award-winning culture

Not sure if you check all the boxes? Apply anyway! 


We know that great talent comes in many forms, and we value potential just as much as experience. If you're excited about this role and believe you can grow into it, we’d love to hear from you. We’re looking for people who are eager to learn, adapt, and solve challenges—so if that sounds like you, don’t let a checklist hold you back!


Change doesn't happen overnight, and the same goes for us here at CertifID. As we grow, we evolve collectively and individually by leaning into the core values that define us, which together spell GRIT: Guard the Customer, Raise the Bar, Influence Outcomes, Teamwork Wins.

Benefits

  • 401(K), 401(K) matching, Company equity, Company-sponsored outings, Continuing education stipend, Dental insurance, Disability insurance, Documented equal pay policy, Volunteer in local community, Family medical leave, Fitness stipend, Flexible Spending Account (FSA), Flexible work schedule, Generous parental leave, Company-sponsored happy hours, Health insurance, Highly diverse management team, Job training & conferences, Open door policy, Life insurance, Paid volunteer time, Open office floor plan, Paid holidays, Paid industry certifications, Paid sick days, Onsite office parking, Performance bonus, Promote from within, Lunch and learns, Remote work program, Free snacks and drinks, Team based strategic planning, OKR operational model, Continuing education available during work hours, Unlimited vacation policy, Vision insurance, Wellness programs, Some meals provided, Mental health benefits, Diversity employee resource groups, Hiring practices that promote diversity, Fertility benefits, Employee resource groups, Employee-led culture committees, Day off for your birthday, Quarterly engagement surveys, Hybrid work model, In-person all-hands meetings, Employee awards, Pay transparency, Wellness days, Meditation space, Mother's room, Personal development training, Flexible time off, Bereavement leave benefits
