
About this role
Job Requirements
Mandatory Skills:
- Bachelor’s or higher degree in Computer Science, Information Technology, or a related discipline, or equivalent experience, with a minimum of 12 years of overall work experience.
- Extensive hands-on experience with Databricks and Apache Spark, including PySpark, SQL, and Scala for data processing and analytics.
- Proven expertise in designing and implementing efficient data ingestion pipelines using Databricks, managing large-scale datasets from various sources.
- Experience delivering proofs of concept to demonstrate the capabilities of Databricks to stakeholders.
- Skilled in developing scalable and reusable frameworks for data ingestion and transformation using Databricks features such as Delta Lake (see the illustrative sketch after this list).
- Strong understanding of cloud architectures, particularly Azure and its integration with Databricks, including Azure Databricks, Azure Data Lake Storage, and Azure Blob Storage.
- Experience in implementing CI/CD pipelines and DevSecOps practices in Databricks environments.
- Proficiency in cloud migration methodologies and processes pertinent to data engineering.
- Strong communication skills, both written and verbal, with the ability to present complex technical information in a clear and concise manner.
- Self-motivated, proactive working style with a strong sense of ownership and problem-solving capability.
- Collaborative team player with experience in mentoring and guiding other engineers on best practices in data engineering.
- Experience in using Big Data file formats (e.g., Parquet, Avro) and compression techniques.
- Familiarity with data governance and metadata management tools within Databricks and Azure environments.
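For illustration only, the following is a minimal sketch of the kind of reusable Delta Lake ingestion routine described above. It assumes a Databricks runtime where `spark` is already defined; the storage path, schema, and table names are hypothetical placeholders, not part of this role's actual codebase.

```python
# Minimal sketch of a reusable ingestion step: read raw Parquet from a landing
# zone and append it to a Delta Lake table. Assumes a Databricks runtime where
# `spark` is already available; paths and names below are hypothetical.
from pyspark.sql import functions as F

def ingest_to_delta(source_path: str, target_table: str, load_date: str) -> None:
    """Append one day's partition of raw Parquet data into a Delta table."""
    df = (
        spark.read.format("parquet")
        .load(source_path)
        .withColumn("ingest_ts", F.current_timestamp())  # audit column
        .withColumn("load_date", F.lit(load_date))        # partition column
    )
    (
        df.write.format("delta")
        .mode("append")
        .partitionBy("load_date")
        .saveAsTable(target_table)
    )

# Example call with placeholder values:
# ingest_to_delta("abfss://raw@examplelake.dfs.core.windows.net/sales/", "bronze.sales", "2024-01-01")
```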
Primary Roles and Responsibilities:
- Design, build, and maintain the data infrastructure utilizing Databricks and its integration within Azure ecosystems.
- Develop and optimize data pipelines to extract, transform, and load data from diverse sources into Databricks-supported data lakes (a sketch of a typical transformation step follows this list).
- Implement and manage data security and privacy measures, ensuring compliance with industry standards and regulations.
- Monitor and troubleshoot data workflows and optimize data storage and processing for performance and cost efficiency using Databricks.
- Collaborate with cross-functional teams to ensure data quality and availability for business stakeholders and analytics teams.
- Provide strategic technical guidance and support for Databricks initiatives, ensuring alignment with business goals.
- Innovate and implement solutions using Databricks and Spark technologies for processing and analyzing big data.
- Participate in projects focused on cloud architecture improvements, including planning and executing migrations to Databricks and Azure.
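As a rough illustration of the pipeline work above, the sketch below shows a typical bronze-to-silver transformation in PySpark: deduplicating raw records on a business key before publishing a curated Delta table. Table and column names are hypothetical, and a Databricks runtime with `spark` available is assumed.

```python
# Illustrative bronze-to-silver step: keep only the latest record per business
# key and overwrite the curated (silver) Delta table. Names are placeholders.
from pyspark.sql import functions as F
from pyspark.sql.window import Window

def bronze_to_silver(bronze_table: str, silver_table: str) -> None:
    """Deduplicate bronze records on order_id and write a silver Delta table."""
    latest = Window.partitionBy("order_id").orderBy(F.col("ingest_ts").desc())
    df = (
        spark.read.table(bronze_table)
        .withColumn("rn", F.row_number().over(latest))
        .filter("rn = 1")   # retain the most recently ingested version
        .drop("rn")
    )
    df.write.format("delta").mode("overwrite").saveAsTable(silver_table)

# bronze_to_silver("bronze.sales", "silver.sales")
```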
Preferred Skills:
- Experience working in Azure DevOps environments with tools such as Terraform and Microsoft Visual Studio Team Services.
- Familiarity with Microsoft Power BI for creating business insights through dashboards and reports.
- Direct experience in building data pipelines using Databricks and Azure Data Factory.
- Understanding of Azure Role-Based Access Control (RBAC) and Identity Access Management (IAM) for managing data security.
- Willingness to explore emerging tools and technologies and align with Microsoft’s vision and roadmap for them.
- Relevant certifications, such as Databricks Certified Data Engineer Associate or DP-203: Data Engineering on Microsoft Azure.
- Experience in utilizing Databricks notebooks and workflows for orchestrating and scheduling data processing tasks (an illustrative scheduling sketch follows this list).
- Knowledge of the latest innovations in artificial intelligence and machine learning and their integration with Databricks.
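For context on the orchestration item above, here is a rough sketch of creating a scheduled notebook job through the Databricks Jobs REST API (2.1). The host, token, cluster ID, and notebook path are placeholders, and the payload fields follow the Jobs 2.1 API as commonly documented but should be verified against the current Databricks documentation.

```python
# Rough sketch: scheduling a nightly notebook task via the Databricks Jobs API.
# All identifiers below are placeholders; verify field names against current docs.
import requests

DATABRICKS_HOST = "https://adb-0000000000000000.0.azuredatabricks.net"  # placeholder workspace URL
TOKEN = "dapiXXXXXXXXXXXX"  # placeholder; in practice, retrieve from a secret scope or Azure Key Vault

payload = {
    "name": "nightly-sales-ingest",
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/Repos/data-eng/ingest_sales"},
            "existing_cluster_id": "0000-000000-example",
        }
    ],
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",  # run daily at 02:00
        "timezone_id": "UTC",
    },
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # expected to contain the new job_id
```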