VIVEK HOSAHALLI NARAYANAREDDY
Data Scientist
2975 Washington Street, Apt 2, 02118 Boston, United States
@hnvivek
@vivekhn
@viveknarayanareddy
Tableau Public profile
EDUCATION
Master of Professional Studies in Analytics, Northeastern University
Sep 2019 – Apr 2021 | Boston, MA
A cross-disciplinary master’s degree with a curriculum exploring data science, machine learning, and business informatics.
Coursework: Probability Theory and Introduction to Statistics, Data Mining in Python, Predictive Analytics, Programming Languages,
Data Structures, Big Data (Hadoop, Hive, MapReduce, Spark), Deep Learning, Data Visualization (QlikView, Tableau, Advanced Excel
(pivot tables, VLOOKUP), Google Data Studio), Text Analytics, AWS Cloud Architecting
Bachelor of Engineering in Telecommunication, Visvesvaraya Technological University
Aug 2012 – Jul 2016 | Bangalore, India
Coursework: Mathematics, Programming in Core Java, C#, SQL, HTML
PROFESSIONAL EXPERIENCE
Tata Consultancy Services
Business Intelligence Developer
May 2018 – Aug 2019 | Bangalore, India
Improved the efficiency of data analysis and decision making by constructing a Python-based pipeline for an email alert system
that sends daily reports
Used advanced PL/SQL concepts to implement complex ETL logic and develop production-ready code
Cooperated with teams across the globe to build global reports and presented data-driven insights to clients
Mentored a team of 5 new trainees on the project in best practices for agile development and QA
Associate Systems Engineer
Mar 2017 – Apr 2018 | Bangalore, India
Created stored procedures, user-defined functions, and views to handle business logic, functionality, and data validation
Analyzed and optimized existing SQL queries, improving efficiency by 20% in a production environment
Automated Python scripts to extract, transform, clean (incl. anomaly detection), and load data from multiple sources into SQL,
allowing easy merging with an existing schema
ACADEMIC PROJECTS
Predicting Employee Attrition | (TensorFlow, Keras, Hyperparameter tuning, Jupyter Notebook)
Developed ML models using ensemble and boosting methods with TensorFlow and scikit-learn in Python to predict the likelihood
of an active employee leaving the company
Implemented feature engineering techniques including feature scaling, outlier handling, missing-value imputation, and one-hot
encoding of categorical variables for model training
Created data pipelines and trained interpretable classification models (Logistic Regression, Random Forest, Artificial Neural
Network), achieving a maximum prediction accuracy of 89.40% and an ROC AUC of 0.77
Twitter Sentiment Analysis | (Python, Supervised Machine Learning algorithms, Natural Language Toolkit - NLTK)
Dynamically retrieved and analyzed data (tweets) from Twitter with continuous integration using the Twitter API and classified them as
positive, negative, or neutral in sentiment
Examined gigabytes of structured and unstructured data and conducted exploratory data analysis to answer
key questions, synthesize insights, make inferences, and construct narratives from the data
Achieved 97% accuracy with an SVM, identified the 10 most frequent unigrams and bigrams, and visualized them in a word cloud for analysis
New York Tree Census Analysis | (Statistical analysis, Data Mining, Tableau, R Shiny)
Teamed with classmates to perform in-depth quantitative analysis identifying factors that affect tree health status
Designed Tableau dashboard remotely and overcame collaboration challenges using Slack, Docs, Slides, and GitHub
Handwritten Digit Classification on the MNIST Dataset Using CNNs | (ANN, Keras, Normalization, Data Augmentation)
Trained a CNN model to classify images of handwritten digits (0-9); the data was split into a training set of 50,000 images
and a test set of 10,000 images, each 28 x 28 pixels
Applied data augmentation to enlarge the training set; the CNN model achieved 98% accuracy with a single convolutional layer
Built a user-friendly Streamlit web app in Python to interact with the model and deployed it on an EC2 instance on Amazon Web Services
CERTIFICATES AND AWARDS
Neural Network from Scratch in TensorFlow (Jun 2020)
Advanced SQL for Data Scientists (May 2020)
Star Team Award - Tata Consultancy Services (2018)
Machine Learning (Stanford Online) Coursera (Jul 2020)
Spot Award - Tata Consultancy Services (2019)
GE Digital Technology Data Analytics Program (Forage, Jun 2020)
SQL Course (SoloLearn, May 2020)
TECHNICAL SKILLS
Cloud Computing: EMR, S3, CloudWatch, Lambda functions, CloudFormation templates, Data Pipeline, API Gateway |
Python: NumPy, Pandas, scikit-learn, Seaborn, Keras, TensorFlow, PyTorch | Data Visualization: Tableau, Matplotlib, Google Data Studio |
Version Control: GitLab and SVN | SQL: RDBMS (Microsoft SQL Server and PostgreSQL), HiveQL | Shell Scripting and AWS CLI | Microsoft
Office Suite | Soft Skills: Interpersonal, Leadership, Problem-Solving, Time Management