Rackspace Technology
Realize the full value of the cloud.
Senior Big Data Engineer – Oozie, Pig, GCP
Data Engineer | Full Time | Remote | Team 5,001-10,000 | Since 1998 | H1B No Sponsor
Location
United States
Posted
115 days ago
Salary
$116.1K - $198.4K / year
Bachelor Degree, 5 yrs exp, English, Airflow, Apache, Cloud, Distributed Systems, Google Cloud Platform, Hadoop, HBase, Java, MapReduce, Python, Redis, Spark, SQL, Terraform
Job Description
• Design and develop scalable batch processing systems using technologies like Hadoop, Oozie, Pig, Hive, MapReduce, and HBase, with hands-on coding in Java or Python (Java is a must).
• Must be able to lead Jira Epics.
• Write clean, efficient, and production-ready code with a strong focus on data structures and algorithmic problem-solving applied to real-world data engineering tasks.
• Develop, manage, and optimize complex data workflows within the Apache Hadoop ecosystem, with a strong focus on Oozie orchestration and job scheduling.
• Leverage Google Cloud Platform (GCP) tools such as Dataproc, GCS, and Composer to build scalable and cloud-native big data solutions.
• Implement DevOps and automation best practices, including CI/CD pipelines, infrastructure as code (IaC), and performance tuning across distributed systems.
• Collaborate with cross-functional teams to ensure data pipeline reliability, code quality, and operational excellence in a remote-first environment.
Job Requirements
- Bachelor's degree in Computer Science, Software Engineering, or a related field of study.
- Experience with managed cloud services and an understanding of cloud-based batch processing systems are critical.
- Ability to lead Jira Epics is a MUST.
- Proficiency in Oozie, Airflow, MapReduce, and Java is a MUST.
- Strong programming skills with Java (specifically Spark), Python, Pig, and SQL.
- Expertise in public cloud services, particularly in GCP.
- Proficiency in the Apache Hadoop ecosystem with Oozie, Pig, Hive, and MapReduce.
- Familiarity with Bigtable and Redis.
- Experience applying infrastructure and DevOps principles in daily work, using continuous integration and continuous deployment (CI/CD) tools and Infrastructure as Code (IaC) tools such as Terraform to automate and improve development and release processes.
- Proven experience in engineering batch processing systems at scale.
- 5+ years of experience in customer-facing software/technology or consulting.
- 5+ years of experience with “on-premises to cloud” migrations or IT transformations.
- 5+ years of experience building and operating solutions on GCP.
- Proficiency in Oozie and Pig.
- Proficiency in Java or Python.
Benefits
- The role may include variable compensation in the form of bonus, commissions, or other discretionary payments. These discretionary payments are based on company and/or individual performance and may change at any time.
More Data Engineer Jobs
Staff Data Engineer
Amplify
A pioneer in K–12 education, Amplify partners with educators to make learning rigorous and riveting for every student.
Data Engineer | 116 days ago
Full Time | Remote | Team 1,001-5,000 | Since 2000 | H1B Sponsor
Data Engineer collaborating with analytics teams at Amplify
Airflow, ETL, Python, SQL, Tableau
Senior Data Engineer, Postgres DBA
OppFi
Tech-enabled, mission-driven specialty finance platform broadening the reach of community banks to extend credit access.
Data Engineer | 116 days ago
Full Time | Remote | Team 501-1,000 | Since 2013 | H1B Sponsor
Senior Data Engineer I managing PostgreSQL databases for OppFi
Airflow, Apache, Postgres, Python, SQL
Senior Data Engineer
Vanna Health
Empowering people with serious mental illness to overcome any barrier to a meaningful and healthy life.
Data Engineer | 116 days ago
Full Time | Remote | Team 1-10 | Since 2021 | H1B No Sponsor
Senior Data Engineer building data platform capabilities to support mental health care.
BigQuery, Cloud, MySQL, Postgres, Python, Spark, SQL
Data Engineer | 116 days ago
Full Time | Remote | Team 11-50 | H1B Sponsor
Director of Data Engineering leading data practice at Livefront
AWS, Azure, Cloud, ETL, Google Cloud Platform, Python, Spark, SQL