
About this role
Full-time Senior Data Engineer (AI) at Milestone Technologies, Inc. in Hyderabad, India. Apply directly through the link below.
At a glance
- Work mode: Office
- Employment: Full Time
- Location: Hyderabad, India
- Experience: Senior · 5+ years
Core stack
- Languages and processing: Python, SQL, Apache Spark
- AWS: S3, EMR, Lambda
- Streaming and orchestration: Kafka, Airflow, dbt
- Warehousing: Snowflake
- Practices: ETL, CI/CD, observability, LLM-assisted development
Milestone Technologies is seeking a skilled Data Engineer to support a client’s data engineering initiatives on AWS. This role focuses on building scalable data pipelines, improving data quality, and ensuring reliable data processing across modern lakehouse architectures.
You will work closely with data engineering teams, business analysts, and reporting teams to design, build, and optimize data pipelines while ensuring high data quality, observability, and performance.
Responsibilities
- Build scalable data ingestion pipelines for relational, semi-structured, and unstructured data sources
- Design, implement, and optimize lakehouse architectures using Apache Iceberg
- Optimize table design, including partitioning, compaction, schema evolution, and performance tuning for Iceberg datasets
- Implement best practices for versioning, time travel, incremental processing, and ACID compliance
- Develop and optimize Apache Spark (batch and streaming) jobs for large-scale data processing
- Work extensively with AWS services such as Glue, EMR, Lambda, Step Functions, and S3 with a focus on cost and performance optimization
- Build and manage real-time data pipelines using Kafka and Kafka Streams
- Design transformations and orchestrate workflows using dbt and Airflow
- Implement automated data quality checks, validation frameworks, and error monitoring mechanisms
- Establish observability frameworks including monitoring, logging, and alerting for data pipelines
- Collaborate with analytics/reporting teams to enable data quality dashboards and reporting
- Analyze existing pipelines to identify improvements and enhance reliability and scalability
- Leverage AI/LLM-based tools to accelerate ETL/ELT development, validation, and debugging
- Participate in code reviews and contribute to best practices and engineering standards
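As a small illustration of the automated data-quality checks described above, here is a minimal sketch in plain Python; the field names, sample rows, and the zero-null threshold are hypothetical examples, not part of the role description.

```python
# Minimal sketch of an automated data-quality (validation) check.
# Field names and the max_null_ratio threshold are hypothetical.

def validate_rows(rows, required_fields, max_null_ratio=0.0):
    """Count nulls per required field and flag whether the batch passes."""
    null_counts = {field: 0 for field in required_fields}
    for row in rows:
        for field in required_fields:
            if row.get(field) is None:
                null_counts[field] += 1
    total = len(rows) or 1  # avoid division by zero on empty batches
    passed = all(count / total <= max_null_ratio
                 for count in null_counts.values())
    return {"null_counts": null_counts, "passed": passed}

# Example: one record is missing its order_id, so the strict check fails.
rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": None, "amount": 5.5},
]
report = validate_rows(rows, ["order_id", "amount"])
print(report["passed"])       # False
print(report["null_counts"])  # {'order_id': 1, 'amount': 0}
```

In a real pipeline a check like this would typically run as a task after each load step, with failures routed to the monitoring and alerting layer rather than printed.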
Skills
- Bachelor’s degree (or higher) in Computer Science, Engineering, or a related technical field
- 5+ years of experience designing, building, and maintaining data pipelines
- Strong programming skills in SQL, Python, and Apache Spark
- Hands-on experience with AWS data services (Glue, EMR, S3, Lambda, Step Functions)
- Deep understanding of lakehouse architectures and Apache Iceberg
- Experience with dbt and Airflow for data transformation and orchestration
- Strong experience with Kafka and real-time streaming pipelines
- Experience working with Snowflake as a cloud data warehouse
- Strong understanding of data quality frameworks, validation, and monitoring
- Experience handling structured, semi-structured, and unstructured data at scale
- Solid understanding of distributed systems and data engineering best practices
- Experience with CI/CD pipelines and automation (preferred)
- Strong problem-solving skills and ability to work in a fast-paced environment
- Excellent communication skills and ability to collaborate with cross-functional teams