Lead Data Scientist
Foundation AI
Office
Telangana, Hyderabad
Full Time
JOB DESCRIPTION
Designation: Lead Data Scientist
Location: Hyderabad, India
Work Mode: Office
Reporting to: Director/Head of Data Science
About Us:
Foundation AI automatically ingests incoming documents, emails, and attachments from across your firm. It profiles, matches, classifies, and saves each item to your document management system (DMS), then automates document-dependent workflows according to your rules. Read more about us at www.foundationai.com
Job Overview:
As a Lead Data Scientist, you will serve as a senior technical expert, driving the design and delivery of advanced AI/ML and LLM-based solutions from concept to production. This is an individual contributor role with a strong alignment and coordination component — ensuring technical excellence, stakeholder engagement, and seamless integration across product, engineering, and delivery teams.
The ideal candidate combines deep hands-on expertise in AI/ML/LLM/RAG systems with practical problem-solving acumen, delivering scalable solutions to complex business challenges.
Responsibilities:
1. AI/ML/LLM/RAG Solution Development
- Architect and implement end-to-end AI/ML solutions, leveraging structured and unstructured data (text, images, metadata, multimodal sources).
- Lead development and deployment of LLM-powered systems for classification, retrieval-augmented generation (RAG), information extraction, and summarization.
- Apply advanced techniques such as LoRA, PEFT, quantization, and model distillation to optimize model performance, latency, and cost.
- Design and implement robust retrieval pipelines with vector search, hybrid retrieval, and reranking strategies.
2. Technical Excellence & Best Practices
- Ensure code quality, reproducibility, and model governance across the AI/ML lifecycle — from experimentation to production release.
- Establish and promote best practices for data processing, feature engineering, model evaluation, and monitoring.
- Conduct rigorous performance benchmarking and failure analysis, ensuring models meet accuracy, throughput, and reliability targets.
3. Product Lifecycle & Stakeholder Alignment
- Partner closely with product managers, engineers, and customer-facing teams to translate business needs into AI/ML requirements.
- Align model development milestones with product release schedules, ensuring timely and high-quality delivery.
- Serve as a technical advisor during project scoping, prioritization, and release readiness reviews.
4. Research, Innovation & Thought Leadership
- Stay current with cutting-edge research in AI, ML, NLP, LLMs, multimodal models, retrieval systems, and document intelligence.
- Proactively identify opportunities to enhance existing algorithms and develop novel approaches.
- Share learnings through internal tech talks, documentation, and mentorship to foster a culture of innovation.
Skills and Tools:
- LLM & GenAI Expertise: Minimum 2 years of hands-on experience fine-tuning, prompting, and deploying LLMs (commercial and open-source) such as GPT, Claude, Gemini, Mistral, LLaMA, Falcon, Vicuna, MPT, T5.
- NLP & Information Retrieval: At least 4 years working with NLP tasks — classification, NER, summarization, QA, RAG architectures — using Transformer-based models (BERT, RoBERTa, T5, etc.).
- Deep Learning Frameworks: Strong experience with PyTorch and/or TensorFlow for model training and deployment.
- Coding & Engineering: Expert-level Python; strong SQL; experience with FastAPI/Flask for serving models; Git proficiency.
- Data & Infra: Proficiency with PostgreSQL and vector databases (Pinecone, Qdrant, Weaviate, etc.); familiarity with Docker/Kubernetes.
- MLOps & Scaling: Experience with MLFlow/KubeFlow/SageMaker or equivalent for training pipelines, deployment, and monitoring at scale.
- Prompt Engineering: Skilled in CoT, self-consistency, ToT, and advanced prompting for LLM optimization.
- Ability to simplify complex technical concepts for diverse stakeholders.
- Strong problem-solving skills with a bias toward scalable, maintainable solutions.
- Excellent communication and documentation skills.
- Track record of aligning technical execution with business priorities and delivery timelines.
- Experience with multimodal models (text + vision).
- Knowledge of knowledge graph construction and integration.
- Familiarity with cloud services (AWS preferred).
- Exposure to compliance and governance requirements in AI systems.
Education:
Bachelor’s or Master’s degree in Computer Science, Data Science, Electrical Engineering, Statistics, or a related discipline from a recognized Tier-1 or Tier-2 institution.
Our Commitment:
At Foundation AI, we're committed to creating an inclusive and diverse workplace. We value equal opportunity and affirmative action principles, giving everyone an equal chance to succeed. We're dedicated to offering equal employment opportunities regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status. Upholding these values and adhering to applicable laws is paramount to us.
Competencies
Areas | Tools/Skills | Must / Good to have | Experience
LLM / GenAI Development | GPT, Claude, Gemini, Mistral, LLaMA, T5, Falcon, Vicuna, MPT, OpenAI APIs | Must Have | Min 2 Years
NLP Expertise | NLTK, spaCy, HuggingFace Transformers, SentenceTransformers | Must Have | Min 4 Years
RAG & Information Retrieval | LangChain, LlamaIndex, Pinecone, Qdrant, Weaviate | Must Have | Min 1 Year
Production-Scale Model Deployment | MLFlow, KubeFlow, Ray Serve, SageMaker, CI/CD for ML | Must Have | Min 2 Years
Transformer & Advanced Architectures | BERT, RoBERTa, T5, LLaMA-based fine-tuning | Must Have | Min 3 Years
Python & Software Engineering | NumPy, Pandas, Scikit-learn, FastAPI, Flask, REST APIs | Must Have | Min 4 Years
Prompt Engineering | CoT, Self-Consistency, ToT, Retrieval-Enhanced Prompting | Must Have | Min 1 Year
Vector & Relational Databases | PostgreSQL, Pinecone, Qdrant | Must Have | Min 1 Year
Deep Learning Frameworks | PyTorch, TensorFlow | Must Have | Min 3 Years
Containerization & Orchestration | Docker, Kubernetes | Must Have | -
Cloud & MLOps | AWS (S3, EC2, Lambda, SageMaker), GCP, Azure | Good to Have | -
Computer Vision / Multimodal AI | ResNet, YOLO, CLIP, BLIP | Good to Have | -
Model Optimization Techniques | LoRA, PEFT, Quantization, Pruning, Distillation | Must Have | Min 1 Year
Stakeholder Alignment & Product Integration | Agile, Scrum, cross-functional collaboration | Must Have | -
For any feedback or inquiries, please contact us at careers@foundationai.com
Learn more about us at www.foundationai.com
August 14, 2025