Apella

Technology for better surgery

Senior Software Engineer, Data Platform

Full Time · Remote · Team 11-50 · No H1B Sponsorship · Company Site · LinkedIn

Location

United States

Posted

15 days ago

Salary

$175K - $225K / year

Bachelor Degree · English · Airflow · Apache · AWS · Azure · BigQuery · Cloud · Google Cloud Platform · Kafka · Python · SDLC · SQL · Terraform

Job Description

  • Build and extend batch pipelines using dbt for transformations and Dagster for orchestration, scheduling, and asset-driven lineage (see the batch sketch after this list).
  • Develop and optimize BigQuery data models (dimensional, wide-table, or domain-oriented) to support analytics, experimentation, and reporting use cases.
  • Advance real-time streaming capabilities by implementing and maintaining Kafka/PubSub + Flink pipelines, primarily using FlinkSQL, to deliver low-latency datasets and event-derived metrics (see the streaming sketch after this list).
  • Design data platform standards: SDLC, naming conventions, modeling patterns, incremental strategies, schema evolution approaches, and best practices for batch + streaming, including CI/CD and testing.
  • Improve reliability and observability by implementing monitoring, alerting, and SLAs/SLOs for pipelines and data quality.
  • Partner with analytics, product, and engineering teams to onboard new data sources, define contracts, and deliver trusted datasets.
  • Own platform operations, including performance tuning, data quality, cost optimization, and scaling across both warehouse and streaming systems.
  • Design a unified serving layer architecture that cleanly exposes consistent, trusted datasets across both batch and streaming systems.
  • Establish strong data governance, reliability standards, and observability practices.
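The batch responsibilities above center on the dbt-plus-Dagster pattern. Below is a minimal sketch of that pattern, assuming a hypothetical dbt project directory named "analytics" with a pre-built manifest; the asset, job, and schedule names are illustrative and do not come from the posting:

    from pathlib import Path

    from dagster import (
        AssetExecutionContext,
        Definitions,
        ScheduleDefinition,
        define_asset_job,
    )
    from dagster_dbt import DbtCliResource, dbt_assets

    # Hypothetical dbt project location; `dbt parse` must have produced
    # target/manifest.json before Dagster loads these assets.
    DBT_PROJECT_DIR = Path("analytics")

    @dbt_assets(manifest=DBT_PROJECT_DIR / "target" / "manifest.json")
    def analytics_dbt_assets(context: AssetExecutionContext, dbt: DbtCliResource):
        # Each dbt model surfaces as a Dagster asset, so model-level
        # lineage and run status show up in the asset graph automatically.
        yield from dbt.cli(["build"], context=context).stream()

    defs = Definitions(
        assets=[analytics_dbt_assets],
        resources={"dbt": DbtCliResource(project_dir=str(DBT_PROJECT_DIR))},
        schedules=[
            # Refresh all assets daily at 06:00.
            ScheduleDefinition(
                job=define_asset_job("daily_dbt_refresh", selection="*"),
                cron_schedule="0 6 * * *",
            )
        ],
    )

Running `dbt build` through the Dagster CLI resource (rather than cron-invoking dbt directly) is what gives the asset-driven lineage the first bullet describes: failures, retries, and freshness are tracked per model.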
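The streaming bullet pairs Kafka with FlinkSQL. A minimal PyFlink sketch of that shape follows; the topic, broker address, and event schema are assumptions for illustration, and the print sink stands in for a real low-latency serving table:

    from pyflink.table import EnvironmentSettings, TableEnvironment

    # Streaming Table API environment; the Kafka connector and JSON
    # format jars must be on the Flink classpath.
    t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

    # Hypothetical Kafka source with a watermark for event-time windows.
    t_env.execute_sql("""
        CREATE TABLE case_events (
            event_id STRING,
            room_id  STRING,
            event_ts TIMESTAMP(3),
            WATERMARK FOR event_ts AS event_ts - INTERVAL '5' SECOND
        ) WITH (
            'connector' = 'kafka',
            'topic' = 'case-events',
            'properties.bootstrap.servers' = 'broker:9092',
            'format' = 'json',
            'scan.startup.mode' = 'latest-offset'
        )
    """)

    # Stand-in sink; a real pipeline would write to a serving store.
    t_env.execute_sql("""
        CREATE TABLE room_event_counts (
            window_start TIMESTAMP(3),
            window_end   TIMESTAMP(3),
            room_id      STRING,
            n            BIGINT
        ) WITH ('connector' = 'print')
    """)

    # FlinkSQL tumbling-window aggregation: one row per room per minute,
    # the kind of low-latency event-derived metric the bullet describes.
    t_env.execute_sql("""
        INSERT INTO room_event_counts
        SELECT window_start, window_end, room_id, COUNT(*) AS n
        FROM TABLE(TUMBLE(TABLE case_events, DESCRIPTOR(event_ts), INTERVAL '1' MINUTE))
        GROUP BY window_start, window_end, room_id
    """).wait()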

Job Requirements

  • Strong proficiency in SQL (advanced querying, performance considerations, data modeling).
  • Hands-on experience with dbt (models, tests, sources, macros, snapshots, incremental strategies).
  • Experience with batch orchestration tooling such as Dagster/Airflow (assets/jobs, schedules/sensors, partitioning, backfills, observability); a partition-and-backfill sketch follows this list.
  • Proficiency in Python for data engineering tasks (pipeline glue code, libraries, tooling, testing).
  • Deep familiarity with BigQuery or an equivalent cloud-native data warehouse (partitioning/clustering, cost/performance optimization, best practices).
  • Solid experience with GCP (or AWS/Azure) infrastructure (core services, IAM, security practices, deployments/automation).
  • Strong engineering fundamentals: version control, testing, code review, documentation, and operational ownership.
  • Nice to have: Experience with data quality tooling and patterns (e.g., anomaly detection, expectation-based testing, lineage).
  • Nice to have: Experience designing semantic layers or metrics layers for analytics.
  • Nice to have: Familiarity with event-driven architectures, schema registries, CDC patterns, and schema evolution strategies.
  • Nice to have: Experience building or maintaining streaming data pipelines with Kafka and Apache Flink, including FlinkSQL.
  • Nice to have: Experience with IaC (e.g., Terraform) and CI/CD for data platforms.
  • Nice to have: Understanding of privacy/security controls (PII handling, access controls, auditability).
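Several requirements above (orchestrator partitioning and backfills, BigQuery partitioning, Python glue code) intersect in one common pattern: a daily-partitioned asset that idempotently rewrites a single warehouse partition per run, so a backfill is just a re-run across a date range. A minimal Dagster sketch, with hypothetical dataset and table names:

    from dagster import AssetExecutionContext, DailyPartitionsDefinition, asset
    from google.cloud import bigquery

    daily = DailyPartitionsDefinition(start_date="2024-01-01")

    @asset(partitions_def=daily)
    def events_daily(context: AssetExecutionContext) -> None:
        """Rebuild one day's slice of a date-partitioned BigQuery table."""
        day = context.partition_key  # e.g. "2024-06-01"
        client = bigquery.Client()
        # Delete-then-insert keeps each run idempotent: a Dagster backfill
        # over a date range simply re-executes this asset per partition.
        # analytics.events_daily and raw.events are assumed names.
        client.query(
            f"""
            DELETE FROM analytics.events_daily WHERE event_date = '{day}';
            INSERT INTO analytics.events_daily (event_date, event_id, payload)
            SELECT DATE(event_ts), event_id, payload
            FROM raw.events
            WHERE DATE(event_ts) = '{day}'
            """
        ).result()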

Benefits

  • Competitive salary and stock options
  • Flexible vacation policy and a culture that values time for rest and recharging
  • Remote-first work environment with unique virtual and in-person events to foster team connection
  • Comprehensive health, dental, and vision insurance—we're a healthcare company that prioritizes your health
  • 16 weeks of parental leave for all parents
