I am a Data Engineer with 11+ years of experience specializing in ETL, Big Data, and cloud-based data solutions. My recent work at RBC focuses on building scalable PySpark pipelines to process and transform large datasets in Cloudera environments. I’ve integrated Spark workflows with Helios, enabling data lineage, governance, and compliance for regulatory reporting and audit needs.
With a strong foundation in DevOps, I’ve contributed to CI/CD processes using UCD, implemented job orchestration via Autosys, and maintained collaborative workflows using Git, Jira, and Confluence. My experience spans both batch and real-time data systems, with technologies such as Hive, HDFS, SQL, Oracle, and Snowflake, as well as legacy ETL tools like Informatica.
I’m known for a problem-solving mindset, attention to data quality, and the ability to align technical solutions with business goals. My approach emphasizes reusability, modular design, and performance optimization to deliver end-to-end data solutions in enterprise settings.