Altarum

Solutions to Advance Health

Principal Data Engineer – ML Platforms

Full Time | Remote | Team 201-500 | Altarum was founded in 1997 | H1B: No Sponsor

Location

Virginia

Posted

86 days ago

Salary

$144.8K - $188.0K / year

Bachelor Degree, 7 yrs exp, English, Airflow, Amazon Redshift, AWS, Azure, Cloud, Google Cloud Platform, Grafana, Kafka, Prometheus, Python, SQL, Terraform

Job Description

  • Design and operate modern, cloud-agnostic lakehouse architecture using object storage, SQL/ELT engines, and dbt
  • Build CI/CD pipelines for data, dbt, and model delivery (GitHub Actions, GitLab, Azure DevOps)
  • Implement MLOps systems: MLflow (or equivalent), feature stores, model registry, drift detection, automated testing
  • Engineer solutions in AWS and AWS GovCloud today, with portability to Azure Gov or GCP
  • Use Infrastructure-as-Code (Terraform, CloudFormation, Bicep) to automate secure deployments
  • Build scalable ingestion and normalization pipelines for healthcare and public health datasets
  • Create reusable connectors, dbt packages, and data contracts for cross-division use
  • Publish clean, conformed, metrics-ready tables for Analytics Engineering and BI teams
  • Support Population Health in turning evaluation and statistical models into pipelines
  • Define SLOs and alerting; instrument lineage and metadata; ensure ≥95% of data tests pass
  • Perform performance and cost tuning (partitioning, storage tiers, autoscaling) with guardrails and dashboards
  • Build production-grade pipelines for risk prediction, forecasting, cost/utilization models, and burden estimation
  • Develop ML-ready feature engineering workflows and support time-series/outbreak detection models
  • Translate R/Stata/SAS evaluation code into reusable pipelines
  • Implement Model Card Protocol (MCP) and fairness/explainability tooling (SHAP, LIME)
  • Ensure compliance with HIPAA, 42 CFR Part 2, IRB/DUA constraints, and NIST AI RMF standards
  • Develop runbooks, architecture diagrams, repo templates, and accelerator code
  • Provide technical guidance for proposals and client engagements

Job Requirements

  • 7–10+ years in data engineering, ML platform engineering, or cloud data architecture
  • Expert in Python, SQL, dbt, and orchestration tools (Airflow, Glue, Step Functions)
  • Deep experience with AWS + AWS GovCloud
  • CI/CD and IaC experience (Terraform, CloudFormation)
  • Familiarity with MLOps tools (MLflow, Sagemaker, Azure ML, Vertex AI)
  • Ability to operate in regulated environments (HIPAA, 42 CFR Part 2, IRB)
  • Preferred: Experience with FHIR, HL7, Medicaid/Medicare claims, and/or SDOH datasets
  • Databricks, Snowflake, Redshift, Synapse
  • Event streaming (Kafka, Kinesis, Event Hubs)
  • Feature store experience
  • Observability tooling (Grafana, Prometheus, OpenTelemetry)
  • Experience optimizing BI datasets for Power BI.

Benefits

  • Competitive Medical, Dental and Optical plans
  • Generous Paid Time Off, 8 company-observed holidays plus 3 floating holidays
  • Tuition Assistance
  • 401K Plan (3% employer contribution plus opportunity for gainsharing)
  • Life, AD&D & Disability coverage
  • A flexible work environment
