MLabs

We are a Haskell, Rust, Blockchain and AI consultancy.

Senior DevOps/SRE Engineer

DevOps Engineer, Full Time, Remote, Senior, Team 51-200, H1B No Sponsor

Location

United States

Posted

1 day ago

Salary

$120K - $150K / year

Seniority

Senior

Bachelor Degree, English, Ansible, AWS, Docker, Grafana, JavaScript, Kafka, Kubernetes, Node.js, PostgreSQL, Prometheus, Python, Redis, Terraform, Go

Job Description

  • Build and maintain the infrastructure for concurrent AI trading agents, managing complex cron schedules, state files, and trailing stop processes.
  • Deploy and manage agent environments, including workspace persistence, isolated session management, and Model Context Protocol (MCP) server connectivity.
  • Design and operate pipelines for shipping trading skills and plugins to production without interrupting live trading activity.
  • Execute deployment strategies (blue/green, canary) ensuring active financial positions remain protected during every infrastructure change.
  • Build comprehensive alerting across the full stack using metrics, logs, and traces to detect agent failures, state file corruption, or infrastructure regressions before financial loss occurs.
  • Operate and scale core platform infrastructure, including Kubernetes (EKS) clusters, Redis, Postgres, ClickHouse, and Kafka.
  • Maintain blockchain node infrastructure and ensure stable connectivity to exchange APIs and on-chain transaction systems.
  • Lead incident response and on-call practices, including debugging, mitigation, and post-mortems to improve long-term platform reliability.

Job Requirements

  • Extensive experience in DevOps, SRE, or Infrastructure Engineering, preferably within a startup environment where systems were built from the ground up.
  • Proven track record of deploying, scaling, and debugging production workloads, specifically within AWS EKS.
  • Proficiency with infrastructure-as-code and configuration management tools such as Terraform, Ansible, or equivalent frameworks.
  • Hands-on experience with Docker and Helm for packaging production services.
  • Experience operating production-grade data and messaging systems (Redis, Postgres/RDS, ClickHouse, Kafka).
  • Strong experience with Prometheus, Grafana, Datadog, Loki, or OpenTelemetry to build proactive operational visibility.
  • Ability to debug across multiple languages, including Python, Node.js, and Go.
  • Understanding of systems where latency and reliability have direct financial consequences.
  • Familiarity with node infrastructure, exchange APIs, wallet operations, and on-chain monitoring.
  • Experience managing secrets, access controls, and production hardening for sensitive financial environments.
  • Experience defining SLOs and building mature on-call practices.

Benefits

  • Opportunity to build infrastructure for a new category of software (Autonomous AI Agents).
  • High-autonomy environment with a focus on engineering excellence and technical ownership.
  • Competitive compensation package commensurate with senior-level experience.
  • Remote-first or flexible working arrangements (as specified by the client).
