Saikrishna Vanamala
Chicago, IL | 312-684-4575 | linkedin.com/in/vsaikrishna93 | github.com/vsaikrishna93
Education
Master of Science in Computer Science
Expected May 2020
University of Illinois at Chicago (UIC), Chicago, IL
• Relevant Coursework: Cloud Computing, Data Science and Machine Learning, OO Programming and Environments, Database
Systems, Computer Algorithms, Computer Systems Security, Innovation and Entrepreneurship
Bachelor of Engineering - Electronics and Telecommunication
Savitribai Phule Pune University (formerly University of Pune)
• Honors: Ranked 7th in the university in the sophomore year.
• Coursework: Data Structures, Algorithms, Systems Programming, Operating Systems, Computer Networks, AI
Professional Work Experience
Associate Data Engineer Intern
June 2019 – August 2019
Publicis Spine – Publicis Groupe
• As part of the PeopleCloud solutions team, designed and developed a data infrastructure platform spanning
public and private clouds (AWS, GCP, Azure and on-premise).
• The platform gave data scientists a one-stop environment for data analysis across multiple Publicis clients.
Data Engineer - Graduate Assistant
January 2019 – Present
UIC – School of Public Health (SPH)
• Data integration using MS SSIS, reporting with SSRS and Power BI, web development and database administration.
Systems Engineer – Data and Analytics
August 2015 – July 2018 (3 years)
Infosys Ltd.
• Served a financial services client, Charles Schwab Co., as part of its Global Data Technology (GDT) team.
• Working Domain: Design and maintenance of an optimal data pipeline architecture for a batch scheduling automation platform:
ETL/ELT, Map/Reduce using Hadoop, Spark with HDFS, data warehouse modelling, data quality, data governance, master
data management, disaster recovery, and visualization (interactive dashboards and reports).
Technical Skills and Abilities
• Proficiency in Programming: Industry level - Java, Python, Scala, SQL, Golang, C++, C.
Academic - C#, MATLAB
• Relational DBs: Teradata, MS SQL Server, MySQL, Oracle
Distributed DBs/Filesystem: HDFS, HBase, MongoDB, S3
• Software Frameworks: Hadoop - Map/Reduce, Apache Spark, REST API - Spring Boot MVC and Jersey, Java RMI, gRPC
• Tools/IDEs: AWS (EMR, EC2, S3, Athena, Cloud9, Redshift, Kinesis), GCP (BigQuery, Data Studio, ML Engine), Informatica
PowerCenter, MS - SSIS & SSRS, IntelliJ IDEA, Jupyter, Eclipse, Docker, Visual Studio, sbt, Maven, Git
• ML Algorithms: Classification (Decision Trees, SVM), Regression (Linear, Ridge), Ensemble Learning, Clustering (K-Means)
Major Projects Undertaken
• Design and Implementation of Disaster Recovery Plan for Enterprise Data Warehouse - [at Infosys Ltd.]
▪ The disaster recovery mechanism ensured data backup in case of system failure or damage. Designed backup storage
for a Teradata warehouse on a Hadoop cluster using the Teradata Connector for Hadoop (TDCH).
▪ Direct access to data on the Hadoop cluster was provided through Hive SQL and custom Map/Reduce jobs.
• End-to-End Financial Data Pipeline Design and Infrastructure Maintenance - [at Infosys Ltd.]
▪ Daily transactional data was stored in multiple forms as per the individual application design. These sources were
fed downstream into a central Enterprise Data Warehouse using a complex batch scheduling automation platform.
▪ Designed and developed Informatica ETL workflows that use pushdown optimization to decrease latency.
▪ Automated data integration using Teradata BTEQ scripts by combining the functionality of FastLoad, MultiLoad and TPump.
• Implementation of PageRank Algorithm for a Large XML Dataset on Spark Cluster – [Academic at UIC]
▪ Designed and implemented the PageRank algorithm to rank UIC CS faculty authors based on their co-authorships on
publications and on the publication venues, using the Spark framework.
▪ Deployed the Spark application (assembled using sbt) on a standalone local cluster and on an AWS EMR Spark cluster
using the YARN resource manager, with HDFS as the distributed storage system.
• Design of a Model to Predict the Effects of Unhealthy Behaviors on Chronic Diseases – [Academic at UIC]
▪ Designed, developed and visualized a model to predict the effects of unhealthy behaviors such as low
physical activity, smoking and lack of sleep on two chronic diseases, obesity and cancer. Built linear models using
linear regression and ridge regression, and improved prediction accuracy to 0.919 using a random forest regressor.