AI Engineering Lead

Vichara Technologies

Office

Ridgewood, NJ, United States

Full Time

Company Description

Vichara is a products and services firm focused on financial services. Headquartered in New York, it builds systems for some of the world's largest investment banks and hedge funds.

Job Description

Key Responsibilities

🔹 Architecture & System Design

  • Architect, design, and lead multi-agent LLM systems using LangGraph, LangChain, and Promptfoo for prompt lifecycle management and benchmarking.
  • Build Retrieval-Augmented Generation (RAG) pipelines leveraging hybrid vector search (dense + keyword) using LanceDB, Pinecone, or Elasticsearch.
  • Define system workflows for summarization, query routing, retrieval, and response generation, ensuring minimal latency and high precision.
  • Develop RAG evaluation frameworks combining retrieval precision/recall, hallucination detection, and latency metrics — aligned with analyst and business use cases.

🔹 AI Model Integration & Fine-Tuning

  • Integrate GPT-4o, PaLM 2, and open-weight models (LLaMA, Mistral) for task-specific contextual Q&A.
  • Fine-tune transformer models (BERT, SentenceTransformers) for document classification, summarization, and sentiment analysis.
  • Manage prompt routing and variant testing using Promptfoo or equivalent tools.

🔹 Agentic AI & Orchestration

  • Implement multi-agent architectures with modular flows — enabling task-specific agents for summarization, retrieval, classification, and reasoning.
  • Design fallback and recovery behaviors to ensure robustness in production.
  • Employ LangGraph for parallel and stateful agent orchestration, error recovery, and deterministic flow control.

🔹 Data Engineering & RAG Infrastructure

  • Architect ingestion pipelines for structured and unstructured data — including financial statements, filings, and PDF documents.
  • Leverage MongoDB for metadata storage and Redis Streams for async task execution and caching.
  • Implement vector-based search and retrieval layers for high-throughput and low-latency AI systems.

🔹 Observability & Production Deployment

  • Deploy end-to-end AI systems on AWS EKS / Azure Kubernetes Service, integrated with CI/CD pipelines (Azure DevOps).
  • Build comprehensive monitoring dashboards using OpenTelemetry and SigNoz, tracking latency, retrieval precision, and application health.
  • Enforce testing and regression validation using golden datasets and structured assertion checks for all LLM responses.

🔹 Cross-Functional Collaboration

  • Collaborate with DevOps, MLOps, and application development teams to integrate AI APIs with React / FastAPI-based user interfaces.
  • Work with business analysts to translate credit, compliance, and customer-support requirements into actionable AI agent workflows.
  • Mentor a small team of GenAI developers and data engineers in RAG, embeddings, and orchestration techniques.

Qualifications

  • Experience: 5+ years as an AI or ML Engineer

Required Skills & Experience

  • LLMs & GenAI: GPT-4o, PaLM 2, LangGraph, LangChain, Promptfoo, SentenceTransformers
  • RAG Frameworks: LanceDB, Pinecone, Elasticsearch, FAISS, MongoDB
  • Agentic AI: LangGraph multi-agent orchestration, routing logic, task decomposition
  • Fine-Tuning: BERT / domain-specific transformer tuning, evaluation framework design
  • Infra & MLOps: FastAPI, Docker, Kubernetes (EKS/AKS), Redis Streams, Azure DevOps CI/CD
  • Monitoring: OpenTelemetry, SigNoz, Prometheus
  • Languages & Tools: Python, SQL, REST APIs, Git, Pandas, NumPy

🧠 Nice-To-Have Skills

  • Knowledge of reranker-based retrieval (MiniLM / CrossEncoder)
  • Familiarity with prompt evaluation and scoring (BLEU, ROUGE, faithfulness)
  • Domain exposure to Credit Risk, Banking, and Investment Analytics
  • Experience with RAG benchmark automation and model evaluation dashboards

Additional Information

October 10, 2025
