
Responsibilities
- Design, develop, and maintain scalable ETL/ELT pipelines using Databricks.
- Build and optimize data workflows using Apache Spark (PySpark/Scala).
- Implement data ingestion from multiple sources (APIs, databases, streaming platforms).
- Develop and manage data lakes and lakehouse architectures.
- Work with cloud platforms such as Amazon Web Services, Microsoft Azure, or Google Cloud Platform.
- Optimize performance of queries and large-scale data processing jobs.
- Ensure adherence to data quality, governance, and security best practices.
- Collaborate with data scientists, analysts, and business stakeholders to deliver data solutions.
- Implement CI/CD pipelines and version control for data engineering workflows.
Requirements
- 5+ years of experience in data engineering or big data development.
- Strong hands-on experience with Databricks and Apache Spark (PySpark preferred).
- Proficiency in Python and SQL; Scala is a plus.
- Experience with data modeling, data warehousing, and ETL design.
- Hands-on experience with cloud platforms (AWS/Azure/GCP).
- Familiarity with tools such as Apache Airflow, Apache Kafka, and Delta Lake.
- Strong understanding of distributed computing and big data architecture.