Job Summary

Synechron is seeking a skilled Data Engineer experienced in Google Cloud Platform (GCP), Databricks, PySpark, and SQL. In this role, you will design, develop, and maintain scalable data pipelines and workflows to enable advanced analytics and business intelligence solutions. You will work within a collaborative environment to integrate diverse data sources, optimize data processing workflows, and ensure data quality and availability. Your contributions will support strategic decision-making and enhance the organization’s data-driven initiatives.

Software Requirements

Required Skills:

Hands-on experience with GCP services, specifically BigQuery, Cloud Storage, and Composer for data pipeline orchestration
Proficiency in Databricks platform with PySpark for building and optimizing large-scale ETL/ELT processes
Expertise in writing and tuning complex SQL queries for data transformation, aggregation, and reporting on large datasets
Experience integrating data from multiple sources such as APIs, cloud storage, and databases into a central data warehouse
Familiarity with workflow orchestration tools like Apache Airflow or Cloud Composer for scheduling, monitoring, and managing data jobs
Knowledge of version control systems (Git), CI/CD practices, and Agile development methodologies

Preferred Skills:

Experience with other cloud platforms (AWS, Azure) or additional GCP services (Dataflow, Pub/Sub)
Knowledge of data modeling and data governance best practices
Familiarity with containerization tools like Docker or Kubernetes

Overall Responsibilities

Design, develop, and maintain scalable data pipelines using GCP, Databricks, and associated tools
Write efficient, well-documented SQL queries to support data transformation, data quality, and reporting needs
Integrate data from diverse sources, including APIs, cloud storage, and databases, to create a reliable central data repository
Develop automated workflows and schedules for data processing tasks utilizing Composer or Airflow
Collaborate with data analysts, data scientists, and business stakeholders to understand data requirements and deliver solutions
Monitor, troubleshoot, and optimize data pipelines for performance, scalability, and reliability
Uphold data security and privacy standards, and keep documentation compliant with them
Stay informed about emerging data engineering technologies and apply them effectively to improve workflows

Technical Skills (By Category)

Programming Languages:
- Required: PySpark (Python in Databricks), SQL
- Preferred: Python, Java, or Scala for custom data processing

Databases/Data Management:
- Required: BigQuery, relational databases, large-scale data transformation and querying
- Preferred: Data cataloging and governance tools

Cloud Technologies:
- Required: GCP services including BigQuery, Cloud Storage, Composer
- Preferred: Experience with other cloud services (AWS, Azure)

Frameworks and Libraries:
- Required: Databricks with PySpark, Airflow or Cloud Composer
- Preferred: Data processing frameworks such as Apache Beam, Dataflow

Development Tools and Methodologies:
- Version control using Git
- CI/CD pipelines for automated deployment and testing
- Agile development practices

Security & Compliance:
- Knowledge of data security best practices, access controls, and data privacy regulations

Experience Requirements

Minimum of 3 years of professional experience in data engineering or a related role
Proven expertise in designing and implementing large-scale data pipelines using GCP and Databricks
Hands-on experience with complex SQL query development and optimization
Working knowledge of workflow orchestration tools such as Airflow or Cloud Composer
Experience processing data from multiple sources, including APIs and cloud storage solutions
Experience in an Agile environment preferred

Alternative pathways:
Candidates with strong data pipeline experience on other cloud platforms who are willing to adapt and learn GCP services may be considered.

Day-to-Day Activities

Develop, test, and deploy data pipelines that facilitate analytics, reporting, and data science initiatives
Collaborate with cross-functional teams during sprint planning, stand-ups, and code reviews
Monitor scheduled jobs for successful execution, troubleshoot failures, and optimize performance
Document processes, workflows, and data sources in compliance with organizational standards
Continuously review pipeline performance, implement improvements, and ensure robustness
Participate in design discussions for scalable architecture and recommend best practices

Qualifications

Bachelor’s degree in Computer Science, Data Science, Information Technology, or a related field
At least 3 years of experience in data engineering, data architecture, or related roles
Demonstrated expertise with GCP, Databricks, SQL, and workflow orchestration tools

Certifications (preferred):

GCP certifications such as Professional Data Engineer or equivalent
Databricks Data Engineer certification

Professional Competencies

Critical thinking and effective problem-solving skills related to large-scale data processing
Strong collaboration abilities across multidisciplinary teams and stakeholders
Excellent communication skills with the ability to translate technical details into clear insights
Adaptability to evolving technologies and project requirements
Ability to prioritize tasks, manage time efficiently, and deliver on deadlines
Innovative mindset with a focus on continuous learning and process improvement

SYNECHRON’S DIVERSITY & INCLUSION STATEMENT

Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative ‘Same Difference’ is committed to fostering an inclusive culture that promotes equality, diversity, and an environment that is respectful to all. As a global company, we strongly believe that a diverse workforce helps build stronger, more successful businesses. We encourage applicants of every background, race, ethnicity, religion, age, marital status, gender, and sexual orientation, as well as applicants with disabilities, to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more.

All employment decisions at Synechron are based on business needs, job requirements and individual qualifications, without regard to the applicant’s gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

Data Engineer with GCP, Databricks & PySpark Expertise

Synechron
