Albert Invent

Invent the future, faster.

Staff ML Ops Engineer

Machine Learning Engineer, Full Time, Remote, Team 51-200, Since 2022, H1B No Sponsor

Location

California

Posted

35 days ago

Salary

Not specified

Bachelor Degree, 7 yrs exp, English, AWS, Azure, Cloud, Distributed Systems, Flask, Google Cloud Platform, Kubernetes, Microservices, Python

Job Description

  • Design, deploy, and maintain Kubernetes infrastructure supporting AI/ML workloads
  • Manage containerized services, autoscaling, networking, and resource optimization
  • Design and build high-performance Python APIs and services using FastAPI or similar frameworks
  • Architect backend systems for scalability, reliability, and low latency
  • Build integrations between AI/ML systems and the broader Albert platform
  • Build and operate distributed systems that handle compute-intensive and high-throughput workloads
  • Design for fault tolerance, graceful degradation, and horizontal scalability
  • Implement async workflows, job queues, and task orchestration as needed
  • Architect and maintain data pipelines and storage systems supporting AI/ML workflows
  • Implement observability including logging, metrics, tracing, and alerting
  • Own system reliability: troubleshoot issues, conduct post-mortems, and continuously improve
  • Design CI/CD pipelines and promote automation best practices
  • Partner closely with ML engineers to understand requirements and deliver production-ready infrastructure
  • Translate ML prototypes and research code into scalable, maintainable systems

Job Requirements

  • A Bachelor's degree in Computer Science or a related field with 7+ years of industry experience in software engineering, or a Master's or PhD with 5+ years
  • Experience supporting AI/ML teams or deploying ML systems in production
  • Experience with GPU workloads and scheduling
  • Advanced proficiency in Python including async programming and performance optimization
  • Deep experience with Kubernetes—cluster management, networking, autoscaling, and troubleshooting
  • Strong background in distributed systems and microservices architecture
  • Experience with cloud platforms (AWS, GCP, or Azure) and infrastructure-as-code
  • Proficiency in REST API development using FastAPI, Flask, or similar
  • Experience with containerization and CI/CD pipelines
  • Track record of operating production systems at scale

Benefits

  • Health insurance
  • Flexible working hours
  • Professional development opportunities
