Srinivasa Dileep Patsamatla
Hyderabad, India
Mobile: -
Email: -
Professional Summary:
Seasoned data professional with 16+ years of experience in data engineering, ETL development, business intelligence,
data quality, and data warehousing using industry-standard enterprise tools.
4+ years of extensive experience leading and developing data pipelines using PySpark, dbt, Python, Snowflake,
Airflow, Matillion, etc.
Technical Skills:
• Astronomer Airflow
• Snowflake
• AWS
• PySpark
• Jira
• Presto
• Matillion
• Redshift
• Azure
• Jenkins
• SQLDBM
• Teradata
• Python
• dbt
• Databricks
• GitHub
• Hive
Education:
• Master of Computer Applications (MCA) - 2007.
Certifications:
• Teradata Certified Associate (Teradata)
• Azure Fundamentals (Microsoft)
• Introduction to Gen AI (Internal)
Professional Experience:
Autodesk, India
Role: Principal Data Engineer
Project: EDH
July 2022 – Till Date
Description:
Autodesk EDH (Enterprise Data Hub) is a centralized data lake that brings together data from internal applications across
multiple vertical businesses within Autodesk. Data is ingested into the data lake and processed for analytics purposes.
Responsibilities:
• Performed architectural tasks such as understanding data attributes for each use case and creating data models and data mappings.
• Reviewed data models and mappings with distinguished architects.
• Developed and deployed data pipelines using Matillion and Redshift to load and transform data.
• Performed data quality checks on each of the pipelines developed.
• Created framework code based on the source so that developers can reuse it for future requirements.
• Performed several POCs on new technologies to evaluate their capabilities against internal tools.
• Performed code reviews regularly and provided feedback on code quality.
• Collaborated with several vertical and horizontal teams on understanding requirements, getting test data, setting up environments, etc.
• Performed data governance assessments to remain process compliant.
• Performed release management activities and provided support to the production environment.
• Deployed several utilities to speed up development.
• Conducted/attended daily scrum calls to assign tasks and get updates.
• Provided timely updates to management on risks encountered during development.
Environment: Matillion, Airflow, dbt, Snowflake, Redshift, Jira, Atlassian Wiki, Atlan, SQLDBM, Docker, Nexus, GitHub.
T-Mobile, Bothell
Role: Sr. Project Lead
Project: IDW
July 2015 – Jun 2022
Description:
T-Mobile IDW is an integrated data warehouse solution that provides large-scale integration between legacy data
warehouse systems such as FADS and EDW. The project implements a unified data architecture with Hadoop and Teradata,
bringing the best of both worlds: low-cost storage and parallel processing of voluminous data in Hadoop, and
high-performance analytics on Teradata.
Responsibilities:
• Involved in source system analysis along with SMEs and data architects.
• Extracted data from various source systems using DMF and BEAM to ingest it into the data lake; as part of the solution architecture team, implemented optimization techniques including partitioning and bucketing.
• Developed a strategy for full and incremental loads using Sqoop.
• Worked closely with architects to provide solutions to multiple teams.
• Implemented the DevOps model.
• Single point of contact for code deployment to QA and production.
• Developed Spark SQL code for faster testing and processing of data.
• Extracted and processed data from Hive tables and HDFS files using Spark.
• Configured and used CI/CD tools (GitHub, Nexus, Gerrit, and Jenkins) for code deployment.
• Monitored the Hadoop cluster and set up alerts using Ambari.
• Extensively used Pig for transformations.
• Worked with the Hortonworks team on configuration issues.
• Implemented LAD (Late Arriving Dimension) functionality in Hadoop.
• Implemented optimization and performance tuning in Hive and Pig.
• Created Oozie workflows to run Hadoop jobs.
• Extensively worked on history loads to bring data into Hadoop.
• Created views in Hive, including Hive views on top of HBase tables.
• Created HBase tables for loading the SCD2 model.
• Created Control-M jobs for daily and weekly schedules.
• Configured and maintained source code in GitHub.
• Created Hive tables for SCD1 insert/upsert models.
• Involved in creating Hive tables, loading data, and writing Hive queries to run test cases.
• Exported test data into Teradata using BTEQ and Sqoop.
• Led a team of 4 offshore developers and kept track of allocation regularly.
• Performed code reviews regularly and provided feedback on code quality.
Environment: Hortonworks HDP 2.3 stack tools, HP ALM, Rally, Jenkins, Nexus, GitHub.
Tele2/Cognizant, IN
Role: Sr. Developer / Associate
Project: Tele2 EDW
May 2013 – Jun 2015
Description:
Tele2 AB is one of Europe's biggest telecommunications operators, providing a range of services including fixed-line and
mobile phone, cable TV, data transactions, and internet access. Tele2 EDW is an enterprise data warehouse solution
that enables business users to understand customer churn, usage, revenue, etc. Multiple source systems have been
identified from various countries and feed data into the landing area, from which country-specific transformations
are applied and data is loaded into the staging area. Downstream reporting systems use Cognos to visualize the data.
Responsibilities:
• Participated in requirement analysis and creation of a data solution using Hadoop.
• Involved in analyzing system specifications, designing and developing jobs.
• Worked on ingestion process of the Call log data into Hadoop platform.
• Responsible for deploying and configuring Hadoop jobs in QA/Prod environments.
• Created processes for web log data enrichment, page fixing, sessionization, and session flagging.
• Responsible for creating Hive tables and partitions, loading data, and writing Hive queries.
• Migrated existing inbound processes from legacy system to Hadoop.
• Transferred historical data from Oracle into Hive using Sqoop jobs during off peak hours.
• Worked with business teams and created Hive queries for ad hoc analysis.
• Participated in the coding of application programs for the outbound feeds.
• Built a common purge process to remove old tables.
• Strong problem solving, reasoning, leadership, and analytical skills.
• Ability to work constructively with developers, the QA team, project managers, and management towards a common goal.
• Participated in assigned user conferences, user group meetings, internal meetings, prioritization, production work list calls, etc.
• Well conversant with software testing methodologies, including developing design documents, test plans, test scenarios, test cases, and documentation.
• Prepared documents for troubleshooting.
• Created and executed SQL queries on an Oracle database to validate and test data.
Environment: Apache Hadoop, Oracle, Hive, Unix, SharePoint.
Autodesk, San Rafael, CA
Role: Sr. BI Engineer
Project: Project Dasher
Jul 2012 – Apr 2013
Description:
The Autodesk building performance project (Project Dasher) is a product from Autodesk that delivers green energy
solutions for building performance management. The application receives large volumes of sensor data from BACnet
sensor networks and exports it to an HBase database using REST APIs, with buildings located across the world. The Dasher
admin application is used to manage company, site, and building information, and the Dasher client is a downstream
system that forecasts real-time and historic sensor data in a 3D building model. A time-series roll-up algorithm inserts
rolled-up sensor information into HBase tables.
Responsibilities:
• Installed a Hadoop cluster on Amazon EC2 instances using the Cloudera distribution (CDH4).
• Validated source files column by column and made sure the file structure conformed to requirements.
• Developed a framework to perform create/delete/get/update operations using REST APIs.
• Set up Dev/QA Hadoop environments with the Cloudera CDH4 version.
• Deployed and configured code across Dev/QA/Pre-Prod/Prod environments.
• Troubleshot issues and provided RCA for critical production defects.
• Validated ETL transformations in the HBase database for rolled-up readings.
• Prepared Pig scripts to validate the time-series rollup algorithm.
• Designed automation scripts using the REST Assured framework to validate REST API data.
• Worked with the internals of MapReduce and HDFS and was part of building the Hadoop architecture.
• Installed Oozie and scheduled Oozie flows for MapReduce jobs.
• Responsible for support and troubleshooting of MapReduce and Pig jobs, and for maintaining incremental loads on a daily, weekly, and monthly basis.
• Created tasks in Jira and tracked the burndown chart regularly to report to the manager.
• Constant coordination with the development team to make sure open bugs were fixed.
• Attended grooming calls on how to improve upcoming sprints.
Environment: Amazon EC2, Apache Hadoop, HBase, Pig, Oozie, Java, REST Assured framework, Perforce, Jira.
Infosys, Hyderabad, IN
Role: QA Developer (ETL)
Project: Fin ODS
Jul 2010 - Jun 2012
Description:
Fin ODS (Operational Data Store) is a backend store for all modules under the Finacle web applications. The ODS is a data
warehouse that maintains operational data for all Finacle modules, i.e., CASA, term deposits, loans, payment systems, and
trade finance. We also did a POC on Hadoop for analyzing customer data for further analytics.
Responsibilities:
• Prepared data models to satisfy the requirements and attended discussions with business teams to analyze the requirements.
• Analyzed the mapping document indicating source tables, columns, data types, transformations required, and business rules to be applied.
• Prepared ETL jobs.
• Deployed ETL jobs to Test/Prod environments.
• Prepared UNIX shell scripts to automate file copies from the source server to the landing server.
• Data quality validation: checked for missing data, mismatches, negative values, and consistency.
• Prepared and ran ETL automation scripts.
• Unit testing of DataStage jobs.
• Developed SQL queries for backend/database testing.
• Validated the source system data against the staging data using SQL.
• Prepared a traceability matrix mapping requirements to test cases.
• Conducted defect tracking meetings.
• Provided daily and weekly status reports.
• Coordinated activities between onsite and offshore testing teams.
• Mentored the team and attended daily scrum calls.
Environment: IBM DataStage 8.5, Apache Hadoop, Hive, Oracle 11g, Java, WebLogic.
Amdocs, Atlanta/IN
Role: Infra Engineer
Project: ATT DW
Feb 2008 – Apr 2013
Description:
ATT DW is an enterprise data warehouse solution that provides historical customer data, such as customer call records
(CDRs) and usage, for reporting. Aggregate facts and dimensions are used to meet the reporting needs of business teams.
Responsibilities:
• Involved in all phases of the SDLC, including requirement collection and analysis of customer specifications.
• Responsible for installing and configuring Informatica workflows.
• Prepared automation scripts to speed up day-to-day activities such as cleaning up log files and freeing space.
• Coordinated with Development/QA/Prod teams in resolving issues.
• Advanced knowledge of performance troubleshooting.
• Resolved issues in a timely manner and reported to team members while maintaining SLAs.
• Set up cron jobs to run cleanup scripts periodically.
• Resolved tickets submitted by users in QC; troubleshot, documented, and resolved errors.
• Sent service ticket reports to the manager.
Environment: Informatica, Teradata, Amdocs Cramer 8, Unix.