
About the Role
We are looking for a Full Stack Data Engineer to design, build, and maintain scalable data platforms and pipelines. The ideal candidate has strong hands-on experience across data ingestion, transformation, orchestration, and cloud-based analytics, with a focus on modern lakehouse architectures.
Key Responsibilities
- Design, develop, and maintain end-to-end data pipelines using Python, PySpark, and SQL
- Build and optimize data transformation workflows using dbt on Snowflake
- Develop scalable lakehouse architectures for structured and semi-structured data
- Implement reliable data ingestion frameworks using Kafka, AWS Glue, and custom connectors
- Orchestrate workflows and manage dependencies using Apache Airflow
- Manage cloud infrastructure on AWS (S3, Glue, EMR, Redshift/Snowflake integrations)
- Implement Infrastructure as Code (IaC) using Terraform
- Collaborate with cross-functional teams to deliver analytics-ready datasets
- Ensure data quality, optimize pipeline performance, and manage cloud cost efficiency
- Use GitLab for version control, CI/CD, and collaborative development
- Monitor, troubleshoot, and resolve data pipeline issues in production environments
Required Skills & Qualifications
- 4+ years of experience with AWS data services and cloud-based data engineering
- Strong programming skills in Python and PySpark
- Hands-on experience with Snowflake and dbt for data modeling and transformations
- Solid understanding of SQL for complex analytical queries
- Experience with Apache Airflow for workflow orchestration
- Proficiency in Kafka for real-time/streaming data ingestion
- Experience with AWS Glue and Airflow for ETL development
- Experience with Terraform for infrastructure automation
- Strong experience with GitLab and CI/CD pipelines