AI Engineering Lead
Vichara Technologies.com
Office
Ridgewood, NJ, United States
Full Time
Company Description
Vichara is a products and services firm focused on financial services. Headquartered in NY, it builds systems for some of the largest investment banks and hedge funds in the world.
Job Description
Key Responsibilities
🔹 Architecture & System Design
- Architect, design, and lead multi-agent LLM systems using LangGraph, LangChain, and Promptfoo for prompt lifecycle management and benchmarking.
- Build Retrieval-Augmented Generation (RAG) pipelines leveraging hybrid vector search (dense + keyword) using LanceDB, Pinecone, or Elasticsearch.
- Define system workflows for summarization, query routing, retrieval, and response generation, ensuring minimal latency and high precision.
- Develop RAG evaluation frameworks combining retrieval precision/recall, hallucination detection, and latency metrics — aligned with analyst and business use cases.
🔹 AI Model Integration & Fine-Tuning
- Integrate GPT-4o, PaLM 2, and open-weight models (LLaMA, Mistral) for task-specific contextual Q&A.
- Fine-tune transformer models (BERT, SentenceTransformers) for document classification, summarization, and sentiment analysis.
- Manage prompt routing and variant testing using Promptfoo or equivalent tools.
🔹 Agentic AI & Orchestration
- Implement multi-agent architectures with modular flows — enabling task-specific agents for summarization, retrieval, classification, and reasoning.
- Design fallback and recovery behaviors to ensure robustness in production.
- Employ LangGraph for parallel and stateful agent orchestration, error recovery, and deterministic flow control.
🔹 Data Engineering & RAG Infrastructure
- Architect ingestion pipelines for structured and unstructured data — including financial statements, filings, and PDF documents.
- Leverage MongoDB for metadata storage and Redis Streams for async task execution and caching.
- Implement vector-based search and retrieval layers for high-throughput and low-latency AI systems.
🔹 Observability & Production Deployment
- Deploy end-to-end AI systems on AWS EKS / Azure Kubernetes Service, integrated with CI/CD pipelines (Azure DevOps).
- Build comprehensive monitoring dashboards using OpenTelemetry and SigNoz, tracking latency, retrieval precision, and application health.
- Enforce testing and regression validation using golden datasets and structured assertion checks for all LLM responses.
🔹 Cross-Functional Collaboration
- Collaborate with DevOps, MLOps, and application development teams to integrate AI APIs with React front ends and FastAPI back ends.
- Work with business analysts to translate credit, compliance, and customer-support requirements into actionable AI agent workflows.
- Mentor a small team of GenAI developers and data engineers in RAG, embeddings, and orchestration techniques.
Qualifications
Experience
- 5+ years as an AI or ML Engineer
Required Skills & Experience
- LLMs & GenAI: GPT-4o, PaLM 2, LangGraph, LangChain, Promptfoo, SentenceTransformers
- RAG Frameworks: LanceDB, Pinecone, Elasticsearch, FAISS, MongoDB
- Agentic AI: LangGraph multi-agent orchestration, routing logic, task decomposition
- Fine-Tuning: BERT / domain-specific transformer tuning, evaluation framework design
- Infra & MLOps: FastAPI, Docker, Kubernetes (EKS/AKS), Redis Streams, Azure DevOps CI/CD
- Monitoring: OpenTelemetry, SigNoz, Prometheus
- Languages & Tools: Python, SQL, REST APIs, Git, Pandas, NumPy
🧠 Nice-To-Have Skills
- Knowledge of reranker-based retrieval (MiniLM / CrossEncoder)
- Familiarity with prompt evaluation and scoring (BLEU, ROUGE, faithfulness)
- Domain exposure to Credit Risk, Banking, and Investment Analytics
- Experience with RAG benchmark automation and model evaluation dashboards
Additional Information
October 10, 2025