Senior Infrastructure & DevOps Engineer
Location
United States
Posted
11 days ago
Salary
Not specified
Job Description
Role Description
You will build and maintain the automated production line for PHIN's Physical Superintelligence. You will own the plumbing that lets our simulation engine scale seamlessly, so that our team can deploy updates multiple times a day and ingest massive amounts of simulation data without friction.
- Greenfield Observability: Architect and implement a comprehensive logging, monitoring, and alerting stack across our platform from the ground up.
- Compute Architecture, Scaling & FinOps: Provision, manage, and optimize highly concurrent scaling clusters. Take a cloud-agnostic view when directing future architecture, and implement rigorous FinOps practices to minimize the cost of running thousands of simultaneous jobs.
- Infrastructure as Code (IaC): Own, maintain, and expand our Terraform footprint.
- Continuous Deployment (CD): Design and maintain high-velocity CI/CD pipelines supporting multiple deployments per day. Ensure "code to production" is a seamless, automated journey.
- Backend Robustness: Manage the API layer that sits between the infrastructure and the application layer. Read and refactor services to optimize data movement, squash bottlenecks, and maintain security.
- Data Pipeline Architecture: Build the underlying pipelines to move, store, and process the massive datasets generated by atomic-scale simulations.
- Platform DevEx & MLOps: Build self-serve tooling and event-driven pipelines that empower the entire organization. Create seamless abstractions so our developers can focus on what they do best.
- DevOps & Intelligence Automation: Ruthlessly automate manual toil. Use and build AI-driven tools to manage logs, infrastructure provisioning, and business workflows.
- Standard Enterprise Security: Implement and maintain security best practices (SOC2/ISO focus) required for enterprise-grade contracts.
Qualifications
- 5–8 years as a high-output Individual Contributor in Infrastructure or Backend roles.
- Comfortable touching any part of the system—from networking and security to API design and data engineering.
- Familiarity with Python and TypeScript/Node.js.
- Deep experience with major cloud providers.
- Familiarity with high-performance computing (HPC) schedulers like Slurm is a major plus.
- Not married to one framework; you choose the best tool for the job (K8s, Serverless, HPC Schedulers, etc.).
- Expert user of intelligence tools (Claude, Cursor, Codex, Copilot, Agents, etc.) to 10x your own productivity and automate business tasks.
- Previous experience working closely with machine learning teams, supporting ML workflows, or building MLOps pipelines is highly desirable.