Gayatri Patil

Engineer by training. Explorer by nature.

About Me

I'm Gayatri — a Software Engineer and AI Practitioner with 4+ years of experience building enterprise-scale data pipelines, ML systems, and multi-agent AI architectures.

I believe the most elegant solutions often mirror patterns found in nature: adaptive, resilient, and beautifully interconnected. Whether I'm designing a federated learning system or a self-improving AI agent, I'm always drawn to architectures that feel alive.

When I'm not training models or debugging pipelines, you'll find me on a hiking trail with my camera, lost in a book, experimenting in the kitchen, or chasing the next flight somewhere new. I travel to collect perspectives — and I bring every one of them back to my work.

Currently pursuing my MS in Applied Data Science at SJSU, I'm focused on the frontiers of Explainable AI, multi-agent systems, and GenAI. I'm open to collaborations that push boundaries and make everyday systems genuinely smarter.

4+ Years Experience
9+ Projects Built
3 Hackathon Awards
3 Companies

Beyond the Code

Hiking & Nature
Reading
Traveling
Cooking

Professional Experience

Jun 2025 – Aug 2025

Kushmanda.ai

Data Science Intern — San Diego, CA
GoLang Apache Druid React-Native OLAP
  • Designed GoLang APIs for data source onboarding, enabling ingestion of 3 structured/semi-structured sources (CSV/Parquet, JSON, SQL) and reducing manual onboarding effort.
  • Integrated Apache Druid for real-time OLAP and built cross-source joins & React-Native visualizations, improving query latency by 25% and accelerating analytics insights.

Feb 2022 – Dec 2023

Michelin

Developer Analyst — Pune, India
Azure Data Lake ETL/ELT SQL Workday HCM
  • Implemented 20 production ETL/ELT pipelines automating data flow across 4 cross-functional teams, eliminating manual handoffs and improving reporting consistency.
  • Architected Workday HCM SaaS ingestion into Azure Data Lake & Blob Storage with role-based access controls, maintaining 100% governance compliance across enterprise HR data.
  • Drove a 19% increase in report adoption by automating 20+ analytical reports, enabling self-serve access to business metrics.
  • Reduced SQL query execution time by 14 seconds by implementing materialized views and CTEs.
  • Partnered with Michelin HQ (France) to define 11+ business KPIs and deliver the 2023 analytics roadmap.

Sep 2019 – Feb 2022

LTI — Larsen & Toubro Infotech

Software Engineer — Pune, India
Java Spring Boot REST APIs AML/KYC
  • Engineered batch processing pipelines and microservices for Citi's global KYC compliance platform spanning 200+ countries, automating AML data feed ingestion and validation.
  • Built a rule-based AML risk trigger identification engine in Java/Spring Boot, eliminating ~15% of manual data checks and improving regulatory accuracy.
  • Increased KYC data accessibility by 25% by designing and deploying REST APIs enabling seamless integration across 3+ downstream compliance applications.
  • Reduced production bugs by 9% via SonarLint-driven code reviews and developed GWT UI components for KYC monitoring dashboards.
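
To give a feel for what a rule-based risk trigger engine looks like, here is a minimal sketch. It is written in Python for brevity (the production engine was Java/Spring Boot), and every rule name, field name, threshold, and country code in it is a hypothetical placeholder, not an actual compliance rule.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Rule:
    """A named predicate evaluated against one customer record."""
    name: str
    predicate: Callable[[dict], bool]

# Hypothetical rules for illustration only; real AML rules are far richer.
RULES = [
    Rule("missing_id_document", lambda r: not r.get("id_document")),
    Rule("high_risk_country", lambda r: r.get("country") in {"XX", "YY"}),
    Rule("large_cash_volume", lambda r: r.get("monthly_cash", 0) > 10_000),
]

def triggered_rules(record, rules=RULES):
    """Return the names of all rules that fire for this record, in rule order."""
    return [rule.name for rule in rules if rule.predicate(record)]

# Placeholder record: only the country rule should fire.
flags = triggered_rules({"id_document": "P123", "country": "XX", "monthly_cash": 500})
```

Keeping each rule as data rather than as a branch in one large function makes it easy to add, remove, or audit rules independently.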

Featured Projects

Advanced Explainable AI Toolkit

LLM Interpretability & Reasoning
PyTorch HuggingFace QLoRA FastAPI Runpod
  • Built an LLM explainability suite with 6 interpretability methods, including logit lens, gradient attribution, attention analysis, and token confidence, to shorten model debugging.
  • Developed a process reward model (PRM) based reasoning pipeline for step-level error detection in chain-of-thought outputs.
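
As a rough illustration of the token-confidence idea, the sketch below turns the raw logits for one decoding step into a softmax distribution and reports the probability of the most likely token. The function names and example logits are hypothetical and are not taken from the toolkit itself.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def token_confidence(logits):
    """Return (index, probability) of the most likely token at one step."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

# Hypothetical logits over a 4-token vocabulary for a single decoding step.
index, confidence = token_confidence([2.0, 0.5, 0.1, -1.0])
```

A low confidence value at a given step is one signal that the step deserves a closer look with the heavier methods, such as gradient attribution.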

HydraSwarm: Self-Improving AI Agents

Multi-Agent Orchestration System
Next.js TypeScript Claude Node.js HydraDB
  • Simulates a 7-agent software engineering company (PM → Architect → Dev → Reviewer → QA → SRE → CTO) where every agent queries HydraDB before acting and stores lessons back after each task.
  • Agents recall past mistakes: decision quality starts at 7/10 on the first run and climbs on repeated similar tasks through lesson-based retrieval and shared org memory.
  • Used 7 distinct HydraDB capabilities, including knowledge ingestion, sub-tenants per agent, shared org memory, hybrid recall, graph relations, and inference. Covered by 325 unit tests across 21 suites.
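
The recall-then-act-then-store loop can be sketched as follows. This is written in Python for brevity (the project itself uses TypeScript), and `LessonStore` is a hypothetical in-memory stand-in for the memory layer, not the HydraDB API.

```python
from collections import defaultdict

class LessonStore:
    """In-memory stand-in for a shared memory service (not the HydraDB API)."""

    def __init__(self):
        self._lessons = defaultdict(list)

    def recall(self, task_type):
        """Return a copy of all lessons recorded for this kind of task."""
        return list(self._lessons[task_type])

    def store(self, task_type, lesson):
        self._lessons[task_type].append(lesson)

ROLES = ["PM", "Architect", "Dev", "Reviewer", "QA", "SRE", "CTO"]

def run_pipeline(task_type, store):
    """Each role recalls prior lessons before acting, then writes one back."""
    trace = []
    for role in ROLES:
        known = store.recall(task_type)      # query memory before acting
        trace.append((role, len(known)))     # how much context the role had
        store.store(task_type, f"{role}: lesson from {task_type}")
    return trace

store = LessonStore()
first = run_pipeline("api-design", store)
second = run_pipeline("api-design", store)
```

On the second run every role starts with the lessons left behind by the first, which is the mechanism the scores above rely on.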

Adaptive Browser Agent

LLM-Powered Browser Automation
Playwright React ChromaDB Redis Groq
  • Designed a multi-strategy LLM browser agent with intelligent routing, persistent workflow memory, and semantic retrieval — reducing per-task inference cost by 30%.
  • Routes simpler tasks to lightweight models and serves repeated workflows from cache, minimizing redundant LLM calls.
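
A minimal sketch of the routing and caching idea, under simplifying assumptions: the cache matches on normalized text rather than semantic similarity, and the word-count threshold is a hypothetical stand-in for the real routing heuristic. `light_model` and `heavy_model` are placeholder callables.

```python
def make_router(light_model, heavy_model, max_light_words=12):
    """Build a router that checks a cache first, then picks a model by task size."""
    cache = {}

    def route(task):
        # Normalize case and whitespace so trivially repeated tasks hit the cache.
        key = " ".join(task.lower().split())
        if key in cache:
            return cache[key], "cache"
        if len(key.split()) <= max_light_words:
            answer, source = light_model(task), "light"
        else:
            answer, source = heavy_model(task), "heavy"
        cache[key] = answer
        return answer, source

    return route

# Stub models so the sketch runs without any LLM backend.
route = make_router(lambda t: "L:" + t, lambda t: "H:" + t)
first = route("Open the login page")
repeat = route("open the  login page")
```

The second call returns the cached answer without invoking either model, which is where the savings on repeated workflows come from.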

HelioSync

Federated Learning MLOps Platform
Docker Flask TensorFlow PyTorch
  • Architected a modular federated learning platform with Dockerized services for client orchestration and secure FedAvg aggregation.
  • Implemented CI/CD, visualization, and fault-tolerant training pipeline across distributed clients.
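
For readers unfamiliar with FedAvg, the aggregation step is a sample-weighted average of client model weights. The sketch below shows that step on flat lists of floats. It is an illustration only: it does not include the secure-aggregation or orchestration layers, and the function name is hypothetical.

```python
def fed_avg(client_updates):
    """Average client weights, weighting each client by its sample count.

    client_updates: list of (weights, num_samples), where weights is a flat
    list of floats of identical length for every client.
    """
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("need at least one client with a positive sample count")
    size = len(client_updates[0][0])
    averaged = [0.0] * size
    for weights, n in client_updates:
        if len(weights) != size:
            raise ValueError("all clients must send weights of the same length")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged

# Two hypothetical clients: the second has three times as much data.
global_weights = fed_avg([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

Here the second client contributes three quarters of the average, so the result sits closer to its weights.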

GitSight

GitHub BigData Analytics Pipeline
PySpark Snowflake Azure Power BI
  • Engineered a distributed pipeline to process 54GB+ of GitHub metadata, analyzing developer influence and open-source trends.
  • Structured raw activity data into Snowflake, creating Power BI dashboards to aid recruiters and product teams.

Pitch-Perfect

AI Cricket Shot Analysis
TensorFlow FastAPI React Computer Vision
  • Built a full-stack app for real-time cricket shot classification and form quality assessment using deep learning.
  • Multi-task model classifies shot types (drive, pull) and assesses form quality with confidence scoring, supporting live camera and video upload.

Technical Arsenal

Languages

Python SQL Java C++ R GoLang TypeScript

AI / ML

PyTorch TensorFlow Scikit-learn GenAI / LLM RAG Computer Vision NLP CUDA HuggingFace

Data Engineering

Apache Airflow Spark dbt Apache Druid Databricks Kafka Hadoop ETL/ELT

Cloud & Databases

AWS GCP Azure Snowflake BigQuery MySQL ChromaDB Redis

DevOps & Tools

Docker Kubernetes CI/CD Git Jenkins Spring Boot FastAPI REST APIs

Visualization

Tableau Power BI Looker Superset React

Education

San José State University

M.S. Applied Data Science

Aug 2024 – May 2026  |  California, USA

Machine Learning GenAI Big Data Data Visualization

Savitribai Phule Pune University

B.E. Computer Engineering

Jul 2015 – Jun 2019  |  Pune, India

Algorithms DBMS Software Engineering

Awards & Recognitions

HackWithBay 2.0

First Prize Winner

PrismGraph AI — Researchers waste hours drowning in disconnected PDFs. Standard AI chatbots hallucinate and lose context across papers. PrismGraph entirely deconstructs research papers and builds them into a living, interactive knowledge graph — automatically extracting authors, methodologies, claims, datasets, and citation relationships. Ask a question, get an evidence-backed, fully verifiable answer drawn across a web of interconnected knowledge.

Voice AI × Healthcare Hackathon

First Prize — Synthio Labs × Cekura

Setu — A voice agent for chemotherapy patients that only answers from a verified oncology/pharma dataset, refuses to guess outside it, speaks empathetically, and shows the exact source used for every answer.

HydraDB Hackathon

Hackathon Winner

HydraSwarm: Simulates a 7-agent software engineering company where every agent queries HydraDB before acting and stores lessons back after. Run a task once and it scores 7/10. Run a similar task again and the agents recall what went wrong, so the score goes up. Uses 7 distinct HydraDB capabilities, including knowledge ingestion, sub-tenants per agent, shared org memory, hybrid recall, graph relations, and inference. Covered by 325 unit tests across 21 suites.

Go-Getter Award

Michelin, 2023

Recognized internally at Michelin for exceptional initiative, consistent high-quality delivery, and cross-team impact in analytics.

Blog & Writings

Thoughts on AI, data, and the world we're building.

AI & Sustainability Nov 2025  ·  3 min read

AI-Driven Energy Crisis: An Unknown Threat

Data centers consumed 1.65 billion gigajoules of electricity in 2022. With AI demand projected to grow 35–128% by 2026, the energy cost of intelligence is a problem we can no longer ignore.
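
To put those figures in more familiar units, here is a quick conversion sketch. It assumes the 35–128% growth range applies to the full 2022 baseline, which is a simplification made for this example.

```python
GJ_PER_TWH = 3.6e6  # 1 TWh = 3.6e15 J = 3.6e6 GJ

def gj_to_twh(gigajoules):
    """Convert gigajoules to terawatt-hours."""
    return gigajoules / GJ_PER_TWH

baseline_twh = gj_to_twh(1.65e9)   # 2022 figure quoted above, about 458 TWh
low_2026 = baseline_twh * 1.35     # +35% scenario (assumed applied to baseline)
high_2026 = baseline_twh * 2.28    # +128% scenario (assumed applied to baseline)
```

Under that assumption the 2026 range works out to roughly 619 to 1,045 TWh.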

Read on Medium

Multi-Agent AI Coming Soon

Building Self-Improving AI Agent Pipelines

How I designed a 7-stage multi-agent system that learns from its own mistakes and improves decision quality over time using persistent organizational memory.

Coming Soon

Explainable AI Coming Soon

Making LLMs Explain Themselves: A Deep Dive into XAI

From logit lens to gradient attribution — a practical guide to understanding what large language models are actually doing when they generate text.

Coming Soon

Get In Touch

Whether it's a collaboration, an opportunity, or just a hello — my inbox is always open.