
Databricks Data Architect - Associate Director

KPMG India

Posted 13 days ago

About this role

Job Requirements

Mandatory Skills:

  • Bachelor’s or higher degree in Computer Science, Information Technology, or a related discipline (or equivalent experience), with a minimum of 12 years of overall work experience.
  • Extensive hands-on experience with Databricks and Apache Spark, including PySpark, SQL, and Scala for data processing and analytics.
  • Proven expertise in designing and implementing efficient data ingestion pipelines using Databricks, managing large-scale datasets from various sources.
  • Experience delivering proofs of concept to demonstrate the capabilities of Databricks to stakeholders.
  • Skilled in developing scalable and reusable frameworks for data ingestion and transformation using Databricks features such as Delta Lake (an illustrative sketch follows this list).
  • Strong understanding of cloud architectures, particularly Azure and its integration with Databricks, including Azure Databricks, Azure Data Lake Storage, and Azure Blob Storage.
  • Experience in implementing CI/CD pipelines and DevSecOps practices in Databricks environments.
  • Proficiency in cloud migration methodologies and processes pertinent to data engineering.
  • Strong communication skills, both written and verbal, with the ability to present complex technical information in a clear and concise manner.
  • Self-motivated, proactive working style with a strong sense of ownership and problem-solving capability.
  • Collaborative team player with experience in mentoring and guiding other engineers on best practices in data engineering.
  • Experience in using Big Data file formats (e.g., Parquet, Avro) and compression techniques.
  • Familiarity with data governance and metadata management tools within Databricks and Azure environments.
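
For context, the ingestion and Delta Lake bullets above describe pipeline work of roughly the following shape. This is a minimal, illustrative PySpark sketch, assuming a Databricks runtime with Auto Loader; the storage account, paths, column names, and target table are hypothetical placeholders, not details from this posting.

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks, `spark` is already provided; getOrCreate() keeps the script
# runnable locally as well (with a recent PySpark and delta-spark installed).
spark = SparkSession.builder.getOrCreate()

# Hypothetical Azure Data Lake Storage container used as the landing zone.
base = "abfss://landing@examplestorage.dfs.core.windows.net"

# Incremental, schema-tracking ingestion with Databricks Auto Loader (cloudFiles).
raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", f"{base}/_schemas/sales")
    .load(f"{base}/sales")
)

# Light cleanup before landing the data in a Delta table.
cleaned = (
    raw.dropDuplicates(["order_id"])               # hypothetical business key
       .withColumn("ingested_at", F.current_timestamp())
)

# Write to a managed Delta table; the checkpoint makes the stream restartable,
# and availableNow runs it as an incremental batch suitable for scheduling.
(
    cleaned.writeStream.format("delta")
    .option("checkpointLocation", f"{base}/_checkpoints/sales")
    .trigger(availableNow=True)
    .toTable("bronze.sales_orders")                # hypothetical target table
)
```

In practice, logic like this is typically wrapped in a parameterised, reusable ingestion framework, which is what the "scalable and reusable frameworks" bullet refers to.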

Primary Roles and Responsibilities:

  • Design, build, and maintain the data infrastructure utilizing Databricks and its integration within Azure ecosystems.
  • Develop and optimize data pipelines to extract, transform, and load data from diverse sources into Databricks-supported data lakes.
  • Implement and manage data security and privacy measures, ensuring compliance with industry standards and regulations.
  • Monitor and troubleshoot data workflows and optimize data storage and processing for performance and cost efficiency using Databricks.
  • Collaborate with cross-functional teams to ensure data quality and availability for business stakeholders and analytics teams.
  • Provide strategic technical guidance and support for Databricks initiatives, ensuring alignment with business goals.
  • Innovate and implement solutions using Databricks and Spark technologies for processing and analyzing big data.
  • Participate in projects focused on cloud architecture improvements, including planning and executing migrations to Databricks and Azure.

Preferred Skills:

  • Experience working in Azure DevOps environments with tools such as Terraform and Microsoft Visual Studio Team Services.
  • Familiarity with Microsoft Power BI for creating business insights through dashboards and reports.
  • Direct experience in building data pipelines using Databricks and Azure Data Factory.
  • Understanding of Azure Role-Based Access Control (RBAC) and Identity Access Management (IAM) for managing data security.
  • Willingness to explore and align with Microsoft’s vision and roadmap for emerging tools and technologies.
  • Relevant certifications, such as Databricks Certified Data Engineer Associate or DP-203: Data Engineering on Microsoft Azure.
  • Experience in using Databricks notebooks and workflows for orchestrating and scheduling data processing tasks (a brief sketch follows this list).
  • Knowledge of the latest innovations in artificial intelligence and machine learning and their integration with Databricks.
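
For the notebook and workflow orchestration bullet above, here is a minimal sketch of scheduling a notebook run with the Databricks SDK for Python. The job name, notebook path, cluster ID, and cron expression are illustrative assumptions, not details from this posting.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

# Authenticates via environment variables (e.g. DATABRICKS_HOST / DATABRICKS_TOKEN)
# or any other method supported by the SDK's unified authentication.
w = WorkspaceClient()

# Create a scheduled job that runs an existing notebook on an existing cluster.
created = w.jobs.create(
    name="nightly-sales-ingest",                           # hypothetical job name
    tasks=[
        jobs.Task(
            task_key="ingest_sales",
            notebook_task=jobs.NotebookTask(
                notebook_path="/Repos/data-eng/pipelines/ingest_sales"  # hypothetical path
            ),
            existing_cluster_id="0101-123456-abcdefgh",    # hypothetical cluster ID
        )
    ],
    schedule=jobs.CronSchedule(
        quartz_cron_expression="0 0 2 * * ?",              # 02:00 daily
        timezone_id="Asia/Kolkata",
    ),
)
print(f"Created job {created.job_id}")
```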

Job details

Workplace: Office
Location: Bangalore, Karnataka, India
Job type: Full Time
