Gopinath Reddy Thumati
SUMMARY
AI/ML Engineer with 5+ years building and deploying production-grade AI systems at Goldman Sachs and on enterprise platforms, partnering
closely with stakeholders to deliver reliable solutions. Expertise in LLMs, RAG pipelines, agentic AI, embeddings, and vector search to develop
enterprise copilots, knowledge assistants, and intelligent automation solutions. Proven track record of integrating AI with observability and
incident platforms to reduce MTTR by 38% and improve operational efficiency.
Strong hands-on experience in LLM orchestration (LangChain, LlamaIndex), prompt engineering, tool/function calling, semantic retrieval, and
multi-agent workflows, along with end-to-end ML pipelines for NLP, computer vision, anomaly detection, and predictive modeling using
PyTorch and TensorFlow. Skilled in designing and developing FastAPI-based AI services and RESTful APIs, and deploying scalable workloads
on AWS using Docker, Kubernetes, and CI/CD pipelines with MLOps practices such as model monitoring, validation, governance, and A/B
testing. Focused on building reliable, cost-efficient, cloud-native AI solutions through prompt optimization, model routing, and performance
tuning while meeting strict production SLOs.
TECHNICAL SKILLS
Programming Languages: Python, SQL, Java, C++, Go
LLM / GenAI: OpenAI, Azure OpenAI, GPT-4, Prompt Engineering, Function Calling, Tool Calling, Context Management, LLM Evaluation, Guardrails
Agentic AI: Multi-Agent Workflows, Memory, Routing, Reasoning Agents, Autonomous Execution
RAG & Retrieval: Retrieval-Augmented Generation (RAG), Document Chunking, Embeddings, Semantic Search, Vector Search
Frameworks: LangChain, LlamaIndex, Hugging Face Transformers, PyTorch, TensorFlow, FastAPI
Machine Learning: Scikit-learn, XGBoost, Random Forest, spaCy, NLP, Computer Vision, Time-Series Analysis, Anomaly Detection,
Predictive Modeling, Feature Engineering, Model Explainability (SHAP, LIME)
MLOps: MLflow, Model Monitoring, Drift Detection, CI/CD for ML, Model Validation, Governance, A/B Testing
Cloud & DevOps: AWS (EC2, S3, Lambda, Bedrock, SageMaker, DynamoDB), Google Cloud Platform (GCS, Cloud Run, Vertex AI, BigQuery),
Docker, Kubernetes, Databricks, Linux, Cloud-native Deployments
Vector Databases: FAISS, Pinecone, ChromaDB, Elasticsearch, Weaviate
Data & Pipelines: Airflow, Prefect, ETL Pipelines, Spark, PySpark, Pandas, NumPy, Data Validation
APIs & Services: RESTful APIs, FastAPI-based Microservices, Async Services
Databases & Caching: PostgreSQL, MySQL, MongoDB, Redis
Observability: Production Monitoring, Logging and Alerting, Incident Diagnostics, Automated Remediation, Prometheus, Grafana
PROFESSIONAL EXPERIENCE
Goldman Sachs
Jan 2025 - Present
AI/ML Engineer
• Built and deployed agentic AI systems integrated with observability platforms to automate production diagnostics for dozens of enterprise services,
reducing MTTR by 38% across distributed environments.
• Productionized LLM + RAG pipelines with tool/function calling, adaptive context management, and automated validation loops for
scalable workflow optimization, reducing manual operational effort by 45%.
• Implemented LLM governance frameworks, safety guardrails with content moderation, and multi-metric evaluation pipelines to monitor
hallucinations and unsafe outputs under live traffic, reducing operational risk by 52%.
• Designed and optimized semantic retrieval systems using dense embeddings and vector databases (FAISS/Pinecone) to power scalable AI
copilots for real-time troubleshooting and enterprise knowledge search.
• Optimized inference cost and performance using prompt caching, model routing, and token optimization, lowering cost by 36% while
meeting sub-200ms SLOs under peak traffic.
• Designed and delivered an AI-driven regulatory and research intelligence platform using Python, LangChain, AWS Bedrock, SageMaker,
EC2, S3, and EKS, cutting analyst research time by 25% and saving an estimated $1.2M annually.
Neon IT Systems
Feb 2019 - Aug 2023
Associate AI/ML Engineer
• Architected scalable enterprise AI copilots using production-grade LLM orchestration and multi-agent workflows with contextual
memory/routing logic, driving 38% higher autonomous task completion rates.
• Productionized end-to-end ML pipelines for NLP, computer vision, and patient readmission risk prediction using XGBoost and Random
Forest models, improving production model reliability by 34%.
• Engineered high-performance embedding pipelines and hybrid semantic similarity search using transformer models for real-time document
retrieval and EHR data extraction, boosting retrieval relevance by 31%.
• Designed scalable PySpark ETL workflows and advanced NLP pipelines (spaCy) to process 50M+ disparate patient records across sources,
slashing data prep time from days to hours.
• Developed production FastAPI-based asynchronous AI backend services and model endpoints on AWS EC2/AKS Kubernetes, reducing
conversational AI latency by 45% to sub-200ms inference.
• Implemented enterprise explainable AI (SHAP/LIME), SQL data validation, and early LLM prompt-testing frameworks, cutting model
inconsistencies and readmission prediction errors by 37% in production.
EDUCATION
Master of Science in Data Analytics
Indiana Wesleyan University, Marion, Indiana, USA
April 2025
Bachelor of Science in Computer Science
Sathyabama University, Chennai, Tamil Nadu
April 2021
CERTIFICATIONS
• AWS Certified Machine Learning – Specialty (MLS-C01)
• AWS Certified Solutions Architect – Associate (SAA-C03)
PROJECTS
Customer Churn Prediction Model
• Built a churn prediction model using logistic regression and random forest (Python, Pandas, Scikit-learn), achieving 87% accuracy and
increasing campaign response rates by 22% through SQL-based behavior analysis and feature engineering.
Inventory Optimization Analysis
• Developed demand forecasting models (Python, Pandas, Scikit-learn) and performed ABC/Pareto analysis (SQL, Excel), improving
inventory planning accuracy by 31% and reducing excess inventory by 18%.