Awais Masood
London, England, United Kingdom - in/awaismsd
SUMMARY
4+ years of experience in Data Engineering and Data Architecture, providing both on-premise and cloud-based solutions across multiple business domains, including healthcare, financial services, manufacturing, and telecommunications. Hands-on experience across multiple data platforms, including Azure Cloud, AWS Cloud, and open-source Apache technologies (HDFS, Spark, Hive, Kafka). Proven track record of leading daily production operations and delivering results through strong analytical and problem-solving skills.
EXPERIENCE
Data Engineer
ADDO.AI • September 2021 - December 2023
• Collaborated closely with client stakeholders to pinpoint challenges, build essential data and reporting frameworks, and consistently improve performance by leveraging system and job data.
• Designed a transactional data lake architecture and developed custom Change Data Capture (CDC) processes that efficiently process incoming delta streams and the associated data changes.
• Supervised a services team, focusing on data quality analysis and formulating a robust strategy for successful disaster recovery implementation.
Data Engineer
Afiniti • January 2021 - September 2021
• Developed a reusable data-ingestion framework, optimized SQL queries, and modified ETL scripts, stored procedures (SPs), and reports.
• Managed end-to-end daily production operations, including the client's weekly sales report, audit logs, and a trigger-based early-warning framework, to achieve operational excellence.
• Created and delivered comprehensive technical documentation for the entire project, ensuring clear understanding and knowledge transfer.
ETL / BI Developer
TenX.ai • July 2019 - January 2021
• Partnered with financial institutions on complex EDW architecture, applying strong competency in the data warehouse development lifecycle to explore business problems and craft solutions.
• Designed table schemas for data sinks to support time-efficient retrieval, and selected partitioning columns and projections.
• Implemented FSLDM and the architectural principles of data warehousing, MDM, CDC, and SCD.
• Performed unit and integration testing, and implemented exception handling and logging for all ETLs and transformation functionality added.
SKILLS
AWS Data Platform: Redshift, RDS, S3, Athena, Glue, Lambda, EC2, QuickSight, Connect
Azure Data Platform: Databricks, Data Factory, Data Lake, Synapse
Big Data Stack: Spark, Hadoop, HDFS, Hive, YARN, Kafka
Visualization Tools: PowerBI, Tableau, QuickSight, Qlik
Databases: Vertica, IBM DB2, Greenplum, PostgreSQL, Oracle
ETL & Other Tools: Talend, IBM DataStage, Alteryx, Apache Airflow, Git, Cloudera, Salesforce, Jira
Languages & Libs: SQL, Python, Java, JS, JSON, PySpark, Pandas, NumPy, Matplotlib
Hard Skills: MDM, CDC, SCD, BI, CI/CD, Agile, Data Quality, Data Governance, Database Management, Data Modeling, Client Relations
EDUCATION
BACHELOR OF SCIENCE IN COMPUTER SCIENCE
National University of Computer and Emerging Sciences • 2019
PROJECTS
Crisis24 - GardaWorld (UK)
ADDO.AI • March 2023 - December 2023
• Led a team of 5 data engineers as Team Lead, providing guidance, coordination, and supervision to deliver data engineering projects and initiatives.
• Collaborated with multi-disciplinary teams at Crisis24, participating in requirement-gathering sessions to ensure a comprehensive understanding of project needs.
• Designed a scalable infrastructure and automated resource creation, Glue jobs, and data ingestion using Infrastructure as Code (boto3); also created ETL templates for dynamic data transfer to Salesforce with UPSERT logic.
• TECH STACK: AWS (IAM, Glue, Athena, S3, RDS, AppFlow), Python, Spark, Salesforce
MTM Healthcare Solutions (USA)
ADDO.AI • August 2022 - February 2023
• Built a Delta Lake, maintained replicas of source tables in the S3 raw layer, and developed a metadata-driven data ingestion pipeline.
• Migrated custom-built MS SQL Server stored procedures to Amazon Redshift, refined data for ad hoc reporting, and integrated it with PowerBI.
• TECH STACK: AWS Redshift, RDS, S3, Athena, Glue, Crawlers, Catalog, PowerBI, Atlassian
Telenor | Pakistan's Largest Telecom Provider
ADDO.AI • October 2021 - September 2022
• Developed Spark jobs, implemented big data solutions, wrote complex SQL queries, and performed gap analysis and strategic planning for deployments.
• Audited key subject areas according to business needs, optimized databases, and resolved data quality issues.
• TECH STACK: Cloudera, Hadoop, Spark, Hive, Talend, Vertica, Qlik
Virgin Media (UK)
Afiniti • February 2021 - September 2021
• Improved AI model performance by enhancing data integration, and built automated QA processes to ensure data quality.
• Performed the SQL-to-Greenplum migration and streamlined the rollout of architectural changes to production.
• Improved data accuracy, helping clients achieve a 25% increase in retention and upsell.
• TECH STACK: Talend, Python, Greenplum, MSSQL, T-SQL, Parquet, Confluence
Interloop | Top Global Hosiery Manufacturer
ADDO.AI • August 2020 - January 2021
• Led the development team, implemented the end-to-end ETL process and precedence, and designed and developed SCD and CDC.
• Performed integration testing to ensure data integrity, implemented best practices, and successfully delivered the project.
• TECH STACK: Talend, Oracle, MySQL, PowerBI, Erwin Data Modeler
Faysal Bank
TenX.ai • January 2020 - August 2020
• Developed ETL jobs and data pipelines, and performed unit and integration testing of the DWH.
• Managed secure data transfer and deadlock recovery, ensured data integrity, and designed a strategy to deploy builds within tight deadlines.
• Analyzed the system and, through modifications, reduced data quality issues to 25%.
• TECH STACK: Talend, Vertica, Tableau
CERTIFICATIONS
IBM Spark
IBM • 2023
IBM Big Data Foundations
IBM • 2023
IBM Data Science for Business
IBM • 2023
IBM Hadoop Data Access
IBM • 2023
IBM Data Visualization Using Python
IBM • 2023