This role is for one of Weekday's clients.

Salary range: Rs 20,00,000 - Rs 25,00,000 (i.e., INR 20-25 LPA)

Min Experience: 6 years

Location: Bengaluru

Job Type: Full-time

We are seeking a highly skilled Tech Lead to define and drive the architecture of multi-agent AI workflows. This role will lead the development of Agent-to-Agent handoff systems, ensuring seamless context and content transfer across AI agents (e.g., Content → Design → Video). The ideal candidate will have hands-on expertise in LLM hosting, orchestration, and optimization, along with strong infrastructure knowledge to deliver secure, scalable, and cost-effective AI solutions.

Requirements

Key Responsibilities

Architect and own the full AI stack, including LLM hosting, backend services, orchestration, and infrastructure.
Design and optimize multi-agent orchestration pipelines with contextual memory and intelligent handoff mechanisms.
Deploy, fine-tune, and optimize LLMs using self-hosted and open-source frameworks (e.g., Ollama, HuggingFace, LangChain).
Implement secure, private AI pipelines with role-based access, encryption, and monitoring, ensuring sensitive data never leaves the environment.
Make strategic infrastructure choices (GPU clusters, vector databases, orchestration tools) to balance performance, cost, and security.
Lead DevOps practices such as CI/CD, observability, monitoring, and auto-scaling for AI workloads.
Mentor and guide engineering teams on best practices for model deployment, scaling, and optimization.
Collaborate with Product and Design teams to accelerate feature delivery and improve cross-functional outcomes.
Ensure compliance with data protection regulations such as GDPR, and apply privacy-first AI design principles.
Continuously evaluate emerging OSS models and frameworks to improve cost-efficiency and system performance.

Qualifications

7+ years of experience in software engineering/AI systems, with at least 3 years in ML/LLM deployment.
Strong background in LLM hosting and fine-tuning (Ollama, HuggingFace, LangChain, vLLM, LoRA).
Proven experience deploying generative AI across multiple modalities (text, image, video).
Expertise in GPU infrastructure across cloud (AWS/GCP/Azure) and hybrid/on-prem setups, with Kubernetes/Docker.
Solid backend engineering skills (Python, FastAPI/Node.js, microservices, event-driven systems).
Track record of leading engineering teams and delivering production-grade AI products.
Strong knowledge of vector databases (Pinecone, Weaviate, Milvus) and retrieval pipelines (RAG).
Excellent communication skills to align product, design, and technical teams.

Nice to Have

Experience with multi-agent or agentic AI systems.
Background in marketing-tech or SaaS product platforms.
Knowledge of GPU optimization techniques (quantization, batching, caching).
Hands-on experience with privacy-first AI architectures.

Core Skills

LLM Deployment
AI Systems Architecture
Multi-Agent Orchestration
Secure AI Infrastructure