Sr Data Engineer
eSimplicity
132k - 133k USD/year
Office
Columbia, MD, US
Full Time
Description
We are seeking a highly skilled Senior Data Engineer to help evaluate and design robust data integration solutions for large-scale, disparate datasets spanning multiple platforms and infrastructure types, including cloud-based and potentially undefined or evolving environments. This role is critical in identifying optimal data ingestion, normalization, and transformation strategies while collaborating with cross-functional teams to ensure data accessibility, reliability, and security across systems.
This position is contingent upon award.
- Develop, expand, and optimize our data and data pipeline architecture, as well as data flow and collection for cross-functional teams.
- Support software developers, database architects, data analysts, and data scientists on data initiatives and ensure optimal data delivery architecture is consistent throughout ongoing projects.
- Create new pipelines and maintain existing ones; update Extract, Transform, Load (ETL) processes; create new ETL features; and build proofs of concept (PoCs) with Redshift Spectrum, Databricks, AWS EMR, SageMaker, etc.
- Implement, with support from project data specialists, large-dataset engineering: data augmentation, data quality analysis, data analytics (anomalies and trends), data profiling, data algorithms, and data maturity models; develop data strategy recommendations.
- Operate large-scale data processing pipelines and resolve business and technical issues pertaining to processing and data quality.
- Assemble large, complex datasets that meet functional and non-functional business requirements.
- Identify, design, and implement internal process improvements, including re-designing data infrastructure for greater scalability, optimizing data delivery, and automating manual processes.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from various sources using AWS and SQL technologies.
- Build analytical tools that utilize the data pipeline, providing actionable insight into key business performance metrics, including operational efficiency and customer acquisition.
- Work with stakeholders, including data, design, product, and government stakeholders, and assist them with data-related technical issues.
- Write unit and integration tests for all data processing code.
- Work with DevOps engineers on continuous integration (CI), continuous delivery (CD), and infrastructure as code (IaC).
- Read specs and translate them into code and design documents.
- Perform code reviews and develop processes for improving code quality.
- Perform other duties as assigned.
Requirements
All candidates must pass public trust clearance through the U.S. Federal Government. This requires candidates either to be U.S. citizens or to pass clearance through the Foreign National Government System, which requires that candidates have lived within the United States for at least 3 of the previous 5 years, hold a valid, non-expired passport from their country of birth, and have appropriate visa/work permit documentation.
- Bachelor's degree in Computer Science, Software Engineering, Data Science, Statistics, or related technical field.
- 10+ years of experience in software/data engineering, including data pipelines, data modeling, data integration, and data management.
- Expertise in data lakes, data warehouses, data meshes, data modeling, and data schemas (e.g., star, snowflake).
- Strong expertise in SQL, Python, and/or R, with applied experience in Apache Spark and large-scale processing using PySpark or sparklyr.
- Experience with Databricks in a production environment.
- Strong experience with AWS cloud-native data services, including S3, Glue, Athena, and Lambda.
- Strong proficiency with GitHub and GitHub Actions, including test-driven development.
- Proven ability to work with incomplete or ambiguous data infrastructure and design integration strategies.
- Excellent analytical, organizational, and problem-solving skills.
- Strong communication skills, with the ability to translate complex concepts across technical and business teams.
- Proven experience working with petabyte-level data systems.
Preferred Qualifications:
- Experience working with healthcare data, especially CMS (Centers for Medicare & Medicaid Services) datasets.
- CMS and Healthcare Expertise: In-depth knowledge of CMS regulations and experience with complex healthcare projects; in particular, data infrastructure related projects or similar.
- Demonstrated success providing support within the CMS OIT environment, ensuring alignment with organizational goals and technical standards.
- Demonstrated experience and familiarity with CMS OIT data systems (e.g., IDR-C, CCW, EDM).
- Experience with cloud platform services: AWS and Azure.
- Experience with streaming data (Kafka, Kinesis, Pub/Sub).
- Familiarity with data governance, metadata management, and data quality practices.
Working Environment:
eSimplicity supports a hybrid work environment operating within the Eastern time zone so we can work with and respond to our government clients. Expected hours are 9:00 AM to 5:00 PM Eastern unless otherwise directed by your manager.
Occasional travel for training and project meetings, estimated at less than 25% per year.
Benefits:
We offer highly competitive salaries and full healthcare benefits.
Equal Employment Opportunity:
eSimplicity is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, religion, color, national origin, gender, age, status as a protected veteran, sexual orientation, gender identity, or status as a qualified individual with a disability.
August 20, 2025