Sr. Data Engineer
EXL.com
Office
Gurugram, Haryana, India
Full Time
We are seeking a skilled Data Engineer to join our team. The successful candidate will maintain and optimize data pipelines, implement robust data checks, and ensure the accuracy and integrity of data flows. This role is critical to data-driven decision-making, particularly in our insurance-focused business operations.
Key Responsibilities:
- Data Collection and Acquisition: Source Identification, Data Licensing and Compliance, Data Crawling/Collection
- Data Preprocessing and Cleaning: Data Cleaning, Text Tokenization, Normalization, Noise Filtering
- Data Transformation and Feature Engineering: Text Embedding, Text Augmentation, Handling Multilingual Data
- Data Pipeline Development: Scalable Pipelines, ETL Processes, Automation
- Data Storage and Management: Data Warehousing, Database Optimization, Version Control
- Collaboration with Data Scientists and ML Engineers: Data Accessibility, Support for Model Development, Data Quality Assurance
- Performance Optimization and Scaling: Efficient Data Handling, Distributed Computing
- Data Security and Privacy: Data Anonymization, Compliance with Regulations
- Documentation and Reporting: Data Pipeline Documentation, Reporting
Candidate Profile:
- 6-10 years of relevant experience with data engineering tools
- Tools:
  - Data Processing & Storage: Apache Spark, Apache Hadoop, Apache Kafka, Google BigQuery, AWS S3, Databricks
  - Machine Learning Frameworks: TensorFlow, PyTorch, Hugging Face Transformers, scikit-learn
  - Data Pipelines & Automation: Apache Airflow, Kubeflow, Luigi
  - Version Control & Collaboration: Git, DVC (Data Version Control)
  - Data Extraction: BeautifulSoup, Scrapy, APIs (RESTful, GraphQL)
