• Deployed an LLM-powered research assistant that queries structured and unstructured data, reducing research friction by 90% and cutting sales research turnaround from one day to under two hours.
• Built real-time multi-agent workflows (LangGraph + WebSockets) that run reasoning steps in parallel, delivering 3x faster LLM responses on multi-step tasks.
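The parallel-reasoning pattern behind this bullet can be sketched with plain asyncio; the agent names and the simulated LLM call below are illustrative stand-ins, not the production LangGraph graph.

```python
import asyncio

async def run_agent(name: str, query: str) -> str:
    # Stands in for an LLM or tool call; in the real workflow each
    # node of the graph would stream results back over a WebSocket.
    await asyncio.sleep(0.01)
    return f"{name}: analyzed '{query}'"

async def parallel_reasoning(query: str) -> list[str]:
    agents = ["retriever", "planner", "critic"]  # hypothetical agent roles
    # asyncio.gather runs all agent coroutines concurrently, which is
    # where the latency win over sequential agent calls comes from.
    return await asyncio.gather(*(run_agent(a, query) for a in agents))

results = asyncio.run(parallel_reasoning("Q3 pipeline risks"))
```

The concurrency is the whole point: three sequential model calls cost the sum of their latencies, while a fan-out costs roughly the slowest one.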
• Architected a hybrid retrieval system (MongoDB vector search + Neo4j graph queries) with semantic routing, reducing average RAG latency from 12–15s to 4–6s across 1,000+ test queries.
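A minimal sketch of the semantic-routing idea: decide per query whether to hit the vector store or the graph database. The real system would embed the query; here a keyword heuristic stands in for similarity scoring, and the hint set and route labels are assumptions.

```python
# Relationship-style questions route to the graph backend (Neo4j);
# everything else falls back to semantic vector search (MongoDB).
GRAPH_HINTS = {"related", "connected", "path", "relationship", "between"}

def route(query: str) -> str:
    tokens = set(query.lower().split())
    # A production router would compare query embeddings against
    # per-backend exemplars instead of raw keyword overlap.
    return "graph" if tokens & GRAPH_HINTS else "vector"

route("how are these two accounts related")  # routes to the graph backend
```

Routing up front means only relationship-heavy queries pay the cost of graph traversal, which is one way a hybrid system brings average latency down.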
• Developed a Graph-RAG pipeline with Cypher query generation, rewriting, and prompt tuning, achieving 95% factual accuracy and cutting hallucinations by 70% on 300+ benchmark queries.
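The Cypher-generation step can be illustrated with a parameterized template: an entity extracted from the user question is slotted into a fixed query shape. The node labels, relationship type, and naive entity linking below are illustrative assumptions, not the pipeline's actual schema.

```python
# Hypothetical Cypher template for a supplier-lookup question.
TEMPLATE = (
    "MATCH (c:Company {name: $name})-[:SUPPLIES]->(p:Product) "
    "RETURN p.name"
)

def to_cypher(question: str, known_entities: set[str]) -> tuple[str, dict]:
    # Naive entity linking: first known entity mentioned in the question.
    for entity in known_entities:
        if entity.lower() in question.lower():
            # Parameterized queries keep raw user text out of the Cypher
            # string, which also constrains what the LLM can hallucinate.
            return TEMPLATE, {"name": entity}
    raise ValueError("no known entity found in question")

query, params = to_cypher("What does Acme supply?", {"Acme", "Globex"})
```

Constraining generation to vetted templates plus parameters is one common way Graph-RAG pipelines keep factual accuracy high and hallucinations down.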
• Designed an asynchronous FastAPI backend with persistent session memory and modular RAG orchestration layers, scaling to 500+ concurrent sessions without performance degradation.
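The per-session memory can be sketched framework-free: each session id maps to its own message history, with an asyncio lock guarding concurrent writers. The in-memory dict is a simplification; a persistent store would sit behind the same interface.

```python
import asyncio
from collections import defaultdict

class SessionMemory:
    """Per-session message history safe for concurrent async access."""

    def __init__(self) -> None:
        self._history: dict[str, list[str]] = defaultdict(list)
        self._lock = asyncio.Lock()

    async def append(self, session_id: str, message: str) -> None:
        async with self._lock:
            self._history[session_id].append(message)

    async def get(self, session_id: str) -> list[str]:
        async with self._lock:
            return list(self._history[session_id])  # copy, not a live view

async def demo() -> list[str]:
    memory = SessionMemory()
    # Concurrent appends, as would happen under real session load.
    await asyncio.gather(*(memory.append("s1", f"msg{i}") for i in range(5)))
    return await memory.get("s1")

history = asyncio.run(demo())
```

In a FastAPI app, a single shared instance of this store would be injected into request handlers, keeping session state off the endpoint code itself.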