TARUN SHARMA
Professional Summary
AI/ML, Data Science Expert
● I have 5+ years of experience working with AI/ML stacks.
● Worked as Lead Data Scientist/AI/ML expert with expertise in driving business success
through advanced data analytics, machine learning, and AI solutions.
● Passionate about using high-dimensional datasets and innovative modelling techniques
to deliver actionable insights and foster product improvement.
● Seeking an assistant leadership role to spearhead cutting-edge solutions in a challenging,
innovation-driven environment.
Technical Proficiency
Languages
Python, R, SQL, C++
Machine Learning
Regression, Classification, Clustering, Time Series Analysis,
NLP, Deep Learning, Statistics, Hypothesis
testing, A/B testing, Pattern recognition, LLMs
Big Data
Hadoop, Spark, Flink, Spark Streaming
Tools
TensorFlow, PyTorch, Scikit-learn, Pandas, NumPy,
Matplotlib, Seaborn
Cloud
AWS, Azure, Google Cloud Platform
MLOps
Docker, Git, Jenkins, Model Monitoring, Drift Detection
Data Visualization
Tableau, Seaborn, Matplotlib
Databases
MySQL, NoSQL, MongoDB, PostgreSQL
Qualifications
● Master's Degree in Data Science and Analytics - IIITM-K | July 2019– April 2021
● Bachelor's Degree in Computer Science - Delhi University | May 2016– August 2019
Final grade: 7.824/10.0
Professional Experience
1. Newgen Software [ Lead Data Scientist ] - Feb 2024 - Jul-.
2. Number Theory [ Senior Data Scientist ] - Feb 2023 - March 2024
3. Number Theory [ Data Scientist ] - July 2021 - Feb 2023
4. Number Theory [ Data Science Intern ] - Jan 2021 - July 2021
5. Indian Institute of Information Tech & Mgt-Kerala [ Data Science Intern ] - Dec 2020 - May 2021
6. REFLECTIONS INFO SYSTEMS [ Data Science Intern ] - Jun 2020 - Nov 2020
Project 1: LYFE
Role: Lead Data Scientist
Live Link: https://dev.lyfeco.ai/dashboard/documents/
Description: Intelligent document assistant that supports domains
such as healthcare, personal, and financial documents.
Features Developed:
1. Document Processing
2. Document Query
3. Summarization
4. Image Cleaning and Enhancement
5. Date Capturing
6. Text2SQL
7. AutoDoc
Tech-Stack Used:
● Python, FastAPI, Celery, Redis, HuggingFace
● PostgreSQL, AWS (S3, ECS), Linux
● Jenkins, Docker, GitLab
● Google Vision, OpenAI, Mistral, Phi-3.5
Classification
Project 2: Health Jeanie
Live Link: https://www.jeanie.health/
Role: Lead Data Scientist
Description: Complete healthcare care coordination platform with risk
assessment, nutrition analysis, and medical workflows for nurses and
other medical practitioners.
Features Developed:
1. Ask Jeanie - ChatGPT-style intelligent assistant built for the
healthcare domain.
2. Intelligent Chat - Socket-based chat implementation supporting images,
documents, and other features.
3. Agentic Workflows - Complex agentic workflows implemented for
healthcare.
4. Risk Assessment - Risk assessment for diseases such as type 1 and
type 2 diabetes.
5. Data Pipelines - BigQuery data pipelines, HL7, and FHIR.
6. Advanced RAG
Tech-Stack:
● Python, FastAPI, LangChain, Raptor
● GCP DataStore, BigQuery, Redis, PubSub
Project 3: Face Swap
Live Link: https://faceswapper.ai/
Role: Senior AI/ML Developer
Description: Worked with various image and video enhancement models
to deliver APIs for video face-swapping and video enhancements such as
deblur, denoise, superresolution, sharpen, and watermarking.
Role and Responsibilities:
● Created face swap APIs to change faces in videos and dump
modified videos into S3 buckets.
● Implemented frame-wise transformations with storage in S3
buckets.
● Built APIs for video sharpening, deblurring, superresolution, and
denoising tasks.
● Added watermarking and trimming features for videos.
● Delivered a complete solution with all features integrated and
final video dumps into S3 buckets.
Tech-Stack: Python, FastAPI, NLP Models, LLM Models,
HuggingFace
Project 4: Application Risk Scorecard
Description: A system predicting the probability of loan defaults for retail
banking, improving underwriting efficiency.
Role: Lead Data Scientist
Role and Responsibilities:
● Built and deployed over 20 models.
● Created risk buckets and final score calculations.
● Handled data migration and engineering.
Tech-Stack: Spark, Python, FastAPI, React, SHAP, Oracle, Redis
Project 5: Cross Sell Model
Description: Developed a model recommending additional banking
products to existing customers.
Role: Lead Data Scientist
Role and Responsibilities:
● Data engineering and model building.
● Aligned model usage with marketing requirements.
Tech-Stack: Spark, Python, FastAPI, React, Redis
Project 6: Collection Risk Scorecard
Role: Lead Data Scientist
Description: Built a system evaluating post-loan disbursement risk.
Role and Responsibilities:
● Model retraining and deployment.
● Improved prediction accuracy over time.
Tech-Stack: Spark, Python, FastAPI, React, Redis
Project 7: Customer Lifetime Value
Description: Calculated lifetime value for customers using survival
analysis and business model metrics.
Role: Lead Data Scientist
Role and Responsibilities:
● Developed resubscription identification modules.
● Designed models for subscription and ads-based businesses.
Tech-Stack: Spark, Python, Next.js, SHAP, Docker
Project 8: Document Processing
Description: Processed various document types with features like
classification, data extraction, and export.
Role: Lead Data Scientist
Role and Responsibilities:
● Built an end-to-end document processing system.
● Conducted image processing and model serving.
Tech-Stack: PyTorch, FastAPI, MongoDB
Project 9: Customer Segmentation
Description: Designed a system for customer segmentation based on
behavioral patterns.
Role: Lead Data Scientist
Role and Responsibilities:
● Developed clustering algorithms for customer segmentation.
● Delivered insights to improve marketing strategies.
Tech-Stack: Python, R, Tableau, PostgreSQL
Project 10: Fraud Detection System
Description: Created a real-time fraud detection solution for financial
transactions.
Role: Lead Data Scientist
Role and Responsibilities:
● Built anomaly detection algorithms.
● Integrated real-time alerts using Kafka streams.
Tech-Stack: Python, Spark, Kafka, PostgreSQL
Project 11: Image Recognition System
Role: AI/ML Developer
Description: Developed a system for image classification and recognition.
Role and Responsibilities:
● Created CNN models for object detection.
● Deployed models via Flask APIs.
Tech-Stack: TensorFlow, Keras, Flask, MongoDB
Project 12: Predictive Maintenance
Role: AI/ML Developer
Description: Built predictive models to forecast equipment failures.
Role and Responsibilities:
● Analyzed time-series data for predictive insights.
● Delivered dashboards to visualize maintenance schedules.
Tech-Stack: Python, Spark, SQL, Tableau
Project 13: Recommendation System
Role: AI/ML Developer
Description: Designed a recommendation engine for e-commerce
platforms.
Role and Responsibilities:
● Built collaborative filtering algorithms.
● Improved user engagement through personalized
recommendations.
Tech-Stack: Python, FastAPI, Redis, PostgreSQL
Project 14: Sentiment Analysis Tool
Description: Developed a tool for sentiment analysis of customer
reviews.
Role: AI/ML Developer
Role and Responsibilities:
● Processed text data for sentiment classification.
● Delivered API endpoints for real-time analysis.
Tech-Stack: Python, NLTK, spaCy, Flask
Personal Profile
Expertise
● AI/ML and Data Science Development
● Data Engineering
● Back-end Architecture, Architect Engineering
● Code Review, Code Coverage, Project Management
● AWS, Azure, GCP
● DevOps, MLOps
● Document Processing – Trade Finance, Insurance, Healthcare,
KYC, etc.
● Healthcare data and standards: HL7, FHIR
● Compliance: GDPR, HIPAA, PIPA, PIPEDA, EU AI Act
● LLMs and their fine-tuning
Language(s)
● English
● Hindi