Satya Swarup Satpathy | Odisha, India | linkedin.com/in/sss7722 | github.com/sss7722
Profile
Data Engineer with expertise in building large-scale data pipelines, ETL processes, and data warehouse solutions.
Proficient in Python, SQL, Spark, and Databricks, delivering scalable big data solutions.
Technical Skills
Languages: Python, SQL, Scala
Big Data Technologies: Spark, PySpark, Spark SQL, Hadoop, Hive, Databricks
Streaming Technology: Spark Structured Streaming
Cloud Computing: AWS, Azure, Databricks
Data Engineering Tools: ETL/ELT data pipeline, Git, Jenkins
Orchestration: CA Autosys
Familiar: Agile Methodology, JIRA
Experience
Associate Engineer
October 2022 - Present
IRIS Software
Noida, India
• Managed customer data, KYC information, transactions, and geographical data for a banking project.
• Implemented PySpark jobs to handle large data volumes and migrated Hive scripts to PySpark, optimizing ETL processes
and reducing processing time by 40%.
• Resolved performance bottlenecks in Spark scripts, reducing processing time by 15%.
• Worked on an Anti-Money Laundering (AML) system, generating alerts for suspicious customers and transactions and
evaluating geo-risk data.
• Ensured smooth data flow across layers (L1 to L5) and participated in UAT/SIT testing and production deployments.
• Demonstrated expertise in automation, unit testing, big data management, and data warehousing.
Programmer Analyst
October 2019 - June 2022
Cognizant
Kolkata, India
• Managed the global flow of orders and apparel for a leading sports brand based on season and region.
• Developed and executed test cases for automation use cases, covering data cleaning, bad-data management, and
data handling protocols.
• Utilized BIS MAPPER for coding, coordinated with the development team to address defects, and drafted test plans.
• Expanded automation scope, participated in Agile sprint ceremonies, and conducted intensive manual QA testing.
• Debugged COBOL jobs and provided 24-hour job monitoring support for them, facilitating global order transfers.
Education
Bachelor of Technology
Aug. 2015 – May 2019
Vellore Institute of Technology
Vellore, Tamil Nadu
Intermediate: 12th
Aug. 2012 – May 2014
Sri Chaitanya Techno
Visakhapatnam, India
Certificates
Databricks Certified Associate Developer for Apache Spark 3.0
Databricks Certified Data Engineer Associate
Academy Accreditation - Databricks Lakehouse Fundamentals
AGILE Methodology: Certified SAFe 5 Practitioner
Projects
Banking Data Processing
• Collected and prepared data from various upstream systems, ensuring thorough cleaning and transformation.
• Sent final data as JSON files to Kafka topics, detailing active accounts and associated parties.
• Leveraged Master Data Management (MDM) system to manage and integrate data from diverse sources.
• Delivered prepared data to business analysts and data scientists through Kafka topics.
COVID-19 Data Pipeline
• Developed a pipeline in Azure Data Factory to fetch and process COVID-19 data from the European Centre for Disease
Prevention and Control (ECDC) website.
• Used an HTTP linked service to extract the data and store it in Azure Blob Storage.
• Applied Azure Data Factory data flows and Spark scripts to analyze the data and generate insights.
• Transferred processed data to a different database or Hive for further use.