JOB DESCRIPTION

Designation: Lead Data Scientist

Location: Hyderabad, India

Work Mode: Office

Reporting to: Director/Head of Data Science

About Us:

Foundation AI automatically ingests incoming documents, emails, and attachments from across your firm. It profiles, matches, classifies, and saves each to your DMS and then automates document-dependent workflows according to your rules. Read more about us at www.foundationai.com

Job Overview:
As a Lead Data Scientist, you will serve as a senior technical expert, driving the design and delivery of advanced AI/ML and LLM-based solutions from concept to production. This is an individual contributor role with a strong alignment and coordination component: you will be responsible for technical excellence, stakeholder engagement, and seamless integration across product, engineering, and delivery teams.

The ideal candidate combines deep hands-on expertise in AI/ML/LLM/RAG systems with practical problem-solving acumen, delivering scalable solutions to complex business challenges.

Responsibilities:

1. AI/ML/LLM/RAG Solution Development

Architect and implement end-to-end AI/ML solutions, leveraging structured and unstructured data (text, images, metadata, multimodal sources).
Lead development and deployment of LLM-powered systems for classification, retrieval-augmented generation (RAG), information extraction, and summarization.
Apply advanced techniques such as LoRA, PEFT, quantization, and model distillation to optimize model performance, latency, and cost.
Design and implement robust retrieval pipelines with vector search, hybrid retrieval, and reranking strategies.

2. Technical Excellence & Best Practices

Ensure code quality, reproducibility, and model governance across the AI/ML lifecycle — from experimentation to production release.
Establish and promote best practices for data processing, feature engineering, model evaluation, and monitoring.
Conduct rigorous performance benchmarking and failure analysis, ensuring models meet accuracy, throughput, and reliability targets.

3. Product Lifecycle & Stakeholder Alignment

Partner closely with product managers, engineers, and customer-facing teams to translate business needs into AI/ML requirements.
Align model development milestones with product release schedules, ensuring timely and high-quality delivery.
Serve as a technical advisor during project scoping, prioritization, and release readiness reviews.

4. Research, Innovation & Thought Leadership

Stay current with cutting-edge research in AI, ML, NLP, LLMs, multimodal models, retrieval systems, and document intelligence.
Proactively identify opportunities to enhance existing algorithms and develop novel approaches.
Share learnings through internal tech talks, documentation, and mentorship to foster a culture of innovation.

Skills and Tools:

LLM & GenAI Expertise: Minimum 2 years of hands-on experience fine-tuning, prompting, and deploying LLMs (commercial and open-source) such as GPT, Claude, Gemini, Mistral, LLaMA, Falcon, Vicuna, MPT, T5.
NLP & Information Retrieval: At least 4 years working with NLP tasks — classification, NER, summarization, QA, RAG architectures — using Transformer-based models (BERT, RoBERTa, T5, etc.).
Deep Learning Frameworks: Strong experience with PyTorch and/or TensorFlow for model training and deployment.
Coding & Engineering: Expert-level Python; strong SQL; experience with FastAPI/Flask for serving models; Git proficiency.
Data & Infra: Proficiency with PostgreSQL and vector databases (Pinecone, Qdrant, Weaviate, etc.); familiarity with Docker/Kubernetes.
MLOps & Scaling: Experience with MLFlow/KubeFlow/SageMaker or equivalent for training pipelines, deployment, and monitoring at scale.
Prompt Engineering: Skilled in CoT, self-consistency, ToT, and advanced prompting for LLM optimization.
Ability to simplify complex technical concepts for diverse stakeholders.
Strong problem-solving skills with a bias toward scalable, maintainable solutions.
Excellent communication and documentation skills.
Track record of aligning technical execution with business priorities and delivery timelines.
Experience with multimodal models (text + vision).
Knowledge of knowledge graph construction and integration.
Familiarity with cloud services (AWS preferred).
Exposure to compliance and governance requirements in AI systems.

Education:

Bachelor’s or Master’s degree in Computer Science, Data Science, Electrical Engineering, Statistics, or a related discipline from a recognized Tier-1 or Tier-2 institution.

Our Commitment:

At Foundation AI, we're committed to creating an inclusive and diverse workplace. We value equal opportunity and affirmative action principles, giving everyone an equal chance to succeed. We're dedicated to offering equal employment opportunities regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status. Upholding these values and adhering to applicable laws is paramount to us.

Competencies

| Areas | Tools/Skills | Must / Good to have | Experience |
| LLM / GenAI Development | GPT, Claude, Gemini, Mistral, LLaMA, T5, Falcon, Vicuna, MPT, OpenAI APIs | Must Have | Min 2 Years |
| NLP Expertise | NLTK, spaCy, HuggingFace Transformers, SentenceTransformers | Must Have | Min 4 Years |
| RAG & Information Retrieval | LangChain, LlamaIndex, Pinecone, Qdrant, Weaviate | Must Have | Min 1 Year |
| Production-Scale Model Deployment | MLFlow, KubeFlow, Ray Serve, SageMaker, CI/CD for ML | Must Have | Min 2 Years |
| Transformer & Advanced Architectures | BERT, RoBERTa, T5, LLaMA-based fine-tuning | Must Have | Min 3 Years |
| Python & Software Engineering | NumPy, Pandas, Scikit-learn, FastAPI, Flask, REST APIs | Must Have | Min 4 Years |
| Prompt Engineering | CoT, Self-Consistency, ToT, Retrieval-Enhanced Prompting | Must Have | Min 1 Year |
| Vector & Relational Databases | PostgreSQL, Pinecone, Qdrant | Must Have | Min 1 Year |
| Deep Learning Frameworks | PyTorch, TensorFlow | Must Have | Min 3 Years |
| Containerization & Orchestration | Docker, Kubernetes | Must Have | - |
| Cloud & MLOps | AWS (S3, EC2, Lambda, SageMaker), GCP, Azure | Good to Have | - |
| Computer Vision / Multimodal AI | ResNet, YOLO, CLIP, BLIP | Good to Have | - |
| Model Optimization Techniques | LoRA, PEFT, Quantization, Pruning, Distillation | Must Have | Min 1 Year |
| Stakeholder Alignment & Product Integration | Agile, Scrum, cross-functional collaboration | Must Have | - |

For any feedback or inquiries, please contact us at careers@foundationai.com
Learn more about us at www.foundationai.com