Albert Invent
Invent the future, faster.
Staff ML Ops Engineer
Machine Learning Engineer | Full Time | Remote | Team 51-200 | Since 2022 | H1B No Sponsor
Location
California
Posted
35 days ago
Salary
Not specified
Bachelor Degree, 7 yrs exp, English, AWS, Azure, Cloud, Distributed Systems, Flask, Google Cloud Platform, Kubernetes, Microservices, Python
Job Description
• Design, deploy, and maintain Kubernetes infrastructure supporting AI/ML workloads
• Manage containerized services, autoscaling, networking, and resource optimization
• Design and build high-performance Python APIs and services using FastAPI or similar frameworks
• Architect backend systems for scalability, reliability, and low latency
• Build integrations between AI/ML systems and the broader Albert platform
• Build and operate distributed systems that handle compute-intensive and high-throughput workloads
• Design for fault tolerance, graceful degradation, and horizontal scalability
• Implement async workflows, job queues, and task orchestration as needed
• Architect and maintain data pipelines and storage systems supporting AI/ML workflows
• Implement observability including logging, metrics, tracing, and alerting
• Own system reliability—troubleshoot issues, conduct post-mortems, and continuously improve
• Design CI/CD pipelines and promote automation best practices
• Partner closely with ML engineers to understand requirements and deliver production-ready infrastructure
• Translate ML prototypes and research code into scalable, maintainable systems
Job Requirements
- A Bachelor's degree in Computer Science or a related field with 7+ years of industry software engineering experience, or a Master's or PhD with 5+ years
- Experience supporting AI/ML teams or deploying ML systems in production
- Experience with GPU workloads and scheduling
- Advanced proficiency in Python including async programming and performance optimization
- Deep experience with Kubernetes—cluster management, networking, autoscaling, and troubleshooting
- Strong background in distributed systems and microservices architecture
- Experience with cloud platforms (AWS, GCP, or Azure) and infrastructure-as-code
- Proficiency in REST API development using FastAPI, Flask, or similar
- Experience with containerization and CI/CD pipelines
- Track record of operating production systems at scale
Benefits
- Health insurance
- Flexible working hours
- Professional development opportunities