BALAKUMAR BALASUNDARAM
Associate Consultant
Mobile: -
Email: -
Location: Stockholm, Sweden
PROFILE
Working as an Associate Consultant at Cognizant Technology Solutions Pvt. Ltd., with 7.5 years of extensive experience in IT. Worked extensively on projects using Kafka, Spark Streaming, ETL tools, Big Data and DevOps. Engaged in all phases of the software development lifecycle, including gathering and analyzing user/business system requirements, responding to outages and creating application system models. Participate in design meetings, consult with clients to refine, test and debug programs to meet business needs, and interact with, and sometimes direct, third-party partners in the achievement of business and technology initiatives.
EDUCATION
Bachelor's in Electronics & Instrumentation Engineering
August 2008 – June 2012
Graduated with 8.17 CGPA from Anna University, Chennai, Tamil Nadu, India.
SKILLS
Programming - Java 7/8, Scala, Python, R, Confluent Kafka, Kafka Streaming, PySpark, Flink, Spark, Big Data, REST API, Microservices, Web Services, Oracle 12c, SQL, JUnit, Hibernate 3.x/4, JPA, Shell scripts, Docker, Jenkins Pipeline
Tools - Kubernetes, Docker, Puppet, Ansible, Ant, Maven, Gradle, Git, SVN, CI/CD Jenkins, Jira, VMware, JBoss, Apache Tomcat, WebLogic, Ambari, HDP, Eclipse, Vagrant
Database - HBase, Hive, MongoDB, Cassandra, Oracle
Cloud - AWS, Azure, GCP
WORK EXPERIENCE
Associate Consultant, Cognizant, Stockholm, Sweden
• August 2019 – Present
• Project: Swedbank, Banking
Associate Consultant, Cognizant, Munich, Germany
• January 2019 – July 2019
• Project: Telefonica GmbH, Telecommunication
Associate Consultant, Cognizant, Kuala Lumpur, Malaysia
• September 2016 – December 2018
• Project: Standard Chartered, Banking
Senior Project Engineer, Wipro Technologies, Bangalore, India
• July 2015 – August 2016
• Project: Capital One, Banking
Project Engineer, Wipro Technologies, Bangalore, India
• March 2013 – June 2015
• Project: Cardinal Health, Healthcare
CERTIFICATION
• Scaled Agile Framework (SAFe) 4.6 Practitioner
• Udemy course completion certificates in Kubernetes, Spark, Kafka, AWS, Azure, GCP and DevOps
EMPLOYMENT HISTORY
August 2019 – Present, Client: Swedbank, Domain: Banking
Project: Operational Data Lake (ODL)
Location: Stockholm, Sweden
Build data lake operational frameworks for the Sourcing Team to maintain business rules and privacy constraints for source files, provision Hadoop lookup data for Ab Initio graphs, handle data purging for Hadoop data lake regions, persist metastore information for sources subscribed to the Hadoop data lake in a NoSQL database, and develop a framework to govern data integrity within the Hadoop data lake.
Responsibilities:
▪ Developed a metamodel framework to maintain business rules and privacy constraints at the level of each source file. The framework is built in Scala and uses Apache Spark and HBase APIs to process the data and persist metastore information in HBase.
▪ Developed a technical reconciliation framework in Scala to automate the validation of data integrity based on business rules predefined and maintained in the metamodel.
▪ Developed a REST-based subscription API framework to maintain information about consumers and their subscriptions to data within the data lake.
▪ Utilized Kubernetes and Docker as the CI/CD runtime environment to build, test and deploy.
▪ Used Jenkins pipelines to drive all Kafka/PySpark application builds out to the Docker registry and deploy them to Kubernetes; created and managed pods using Kubernetes.
▪ Maintained Docker container clusters managed by Kubernetes (Linux, Bash, Git, Docker) on an on-premise HDP cluster. Used Ansible for configuration management and executed deployments in both active and passive clusters.
▪ Built Jenkins pipelines to manage Confluent Kafka Docker container services and custom Kafka applications: creating Kafka topics, registering schemas, deploying Kafka connectors and plugins, and dockerizing custom Kafka streaming applications.
▪ Configured logspout and syslog-ng to stream Docker container logs and integrated the logs with Kibana.
▪ Created Jenkins jobs for CI/CD, generated unit test reports for Java/Scala applications, integrated code quality tools such as SonarQube and managed Sonar rules.
▪ Developed a CI/CD pipeline to package and deploy Ab Initio graphs, PySpark modules, Kafka frameworks, microservices, REST applications and web applications in the active and passive (disaster recovery) HDP clusters.
Skills: PySpark, Confluent Kafka, Spark Streaming, REST API, Microservices, HBase, Hive, Hadoop, Jenkins Pipeline, Ansible, Docker, Java, Scala and Maven.
January 2019 – July 2019, Client: Telefonica, Domain: Telecommunication
Project: Event Processing Engine (EPE)
Location: Munich, Germany
Build a streaming ingestion application to stream, categorize and aggregate CDR events. Process the CDR events into purposeful data by generating business reports of wholesale bills and customer usage summaries. Developed a CI/CD Jenkins pipeline to build, unit test, generate Sonar reports and deploy applications to HDP clusters.
Responsibilities:
▪ Developed EPE application modules to build the Flink streaming data ingestion pipeline.
▪ Developed the Importer module to repeatedly check for new CDR files, perform basic validation and convert the data to a specific format.
▪ Developed the Processor module to aggregate CDR events by repeatedly checking for CDR events in HDFS, generate aggregated results for usage summaries, partner billing and dealer commissioning, and persist this information in the Oracle database.
▪ Developed a utility module for the Processor component to validate CDR events and map subscriber information to CDR events so they can be identified by subscriber ID.
▪ Used the Oracle database to configure the job parameters for each module and to re-run a module if required.
▪ Worked on Vagrant, Puppet, Chef and Ambari Blueprints to provision a standalone HDP cluster and executed application builds and tests.
Skills: Flink, Java, Scala, Git, Maven, Gradle, Jenkins Pipeline, Vagrant, Puppet, Chef, Hadoop, Hive, HBase and
Oracle.
September 2016 – December 2018, Client: Standard Chartered, Domain: Banking
Project: CnC BAM
Location: Kuala Lumpur, Malaysia
Build a Business Activity Monitoring dashboard for the Collective Intelligence & Command Centre (CnC) business team to help their operations teams across Asia markets monitor real-time transactions. The dashboard gives stakeholders a clear view of business processing through near real-time tracking of transaction flow and enables users to proactively prevent or resolve impact to client transaction execution through early detection of stuck transactions, transactions approaching cut-off and/or system issues which could result in payment failures or potential SLA breaches.
Responsibilities:
▪ Prepared the wireframe, data model and functional specifications document.
▪ Developed a data streaming application in Apache Spark to process data from message queues and populate the Hadoop data lake.
▪ Developed a data integration and data processing framework in Apache Spark to apply business transformations across multi-level operations and load the transformed data into NoSQL databases, so that dashboards and analytics are refreshed with real-time updates from TP systems at every batch interval.
▪ Worked on requirement analysis to visualize a unified view of payment transactions, trades, sanctions, private banking, unit trust banking, settlements and corporate actions. Worked on HBase and Cassandra data models for each application to facilitate data storage across multi-level operations.
▪ Developed a utility framework to guard against data loss across Kafka message queues, handle job failures, govern memory utilization and shuffling errors across all Spark jobs, govern alert and report generation failures, and report HDFS usage. Developed a module for each application to periodically archive older transactions and create Hive tables on top of them for further enquiries and BAU support.
▪ Developed a data quality validation framework using Apache Drools to apply DQ rules on streaming data before further processing for business validation.
▪ Developed an ETL framework to reuse generic aggregation and transformation functionality across multi-level operations, drive the ETL flow from an XML-based configuration file and define application-specific configuration in properties files.
▪ At various stages of data processing, created Hive stage and target tables to validate results based on the load ID sourced from TP systems for every batch interval. Based on the defined transaction hierarchy, developed a framework to validate transactional history and run checks on HBase tables to verify the latest transactions loaded.
Skills: Java, Scala, R, SVN, Maven, Kafka, Spark Streaming, Drools, REST API, SparkR, Time Series Forecasting,
HBase, Hive, Hadoop, Cassandra and Tableau.
July 2015 – August 2016, Client: Capital One, Domain: Banking
Project: Hadoop Data Lake and Fund Forecast
Location: Bangalore, India
Build a data ingestion pipeline to fill the Hadoop data lake across various regions, such as the raw, integration key, split and validated regions, and utilize the historical data for funds forecast estimation. Build a module to estimate funds forecasts by processing transactional data and projecting the lowest available balance over the next 31 days, so users have a clear view of how much they can actually spend or save. Use predictive analysis of transaction history to identify recurring bills and deposits and provide a forward view of the available balance.
Responsibilities:
▪ Worked on requirement analysis, design, coding and testing.
▪ Worked on requirement analysis to develop a file watcher application that pulls data from Linux servers to an HDFS location on the edge node.
▪ Prepared UNIX scheduling forms to schedule workflows in the Control-M scheduler.
▪ Developed Apache Spark based ETL frameworks to support running ETL analyses, optimize multiple read/write destinations of different types and perform data quality validation on HDFS data, covering schema validation, Drools rules and record count checks.
▪ Developed split and integration key frameworks in Apache Spark to move data between HDFS regions: raw, split, integration key and validated.
▪ Handled unstructured, EBCDIC, nested XML and JSON data, converted it into pipe-delimited files and performed data quality checks.
▪ Developed a customized framework in Spark to process the data, perform business transformations, execute the data model and load the predictive analytics data into MongoDB.
Skills: Java, Apache Spark, MapReduce, Drools, Hive, Python, Shell Scripting, Apache Pig, Cloudera and MongoDB.
March 2013 – June 2015, Client: Cardinal Health, Domain: Healthcare
Project: NLC Lean Replenishment & Supply Chain Optimization
Location: Bangalore, India
Develop an application to process MDM data and distribution tracking data from the receiving and shipping divisions and the NLC hub, project metrics for optimized safety stock, price variation, availability alerts, order projections, demand forecasts and advance shipment notices, and generate auto-balance reports of SKUs across the various divisions.
Responsibilities:
▪ Worked on requirement analysis to design, develop and implement MapReduce data quality frameworks in Java, delivering a data ingestion pipeline with data cleansing, validation and automatic profiling.
▪ Worked on MapReduce frameworks to apply grouping operations on historical data, providing insights on inventory price changes and order projections for multi-echelon divisions, understanding weekly/monthly inventory demand across multi-echelon divisions and visualizing lead time changes for stock keeping units.
▪ Developed Apache Pig scripts to calculate monthly average demand based on historical data of weekly sales, lost sales and promotional sales, calculate standard deviations for SKUs and map out the optimized safety stock.
▪ Developed a MapReduce job to load the optimized safety stock into Hive tables for user queries, and developed Pig scripts to calculate item rankings based on monthly average demand and store the data in HDFS.
▪ Handled production issues; managed and reviewed Hadoop files in the raw zone; created partitioning and bucketing for data loaded into Hive stage and target tables.
▪ Managed and reviewed Hadoop log files and data transfer between Hadoop and other data sources.
▪ Highlighted the key issues/risks involved in each stage of the project and communicated them to the onsite counterpart for resolution.
▪ Analyzed, designed and developed a lean replenishment application for NLC operations.
▪ Developed AS400/iSeries based programming tools for buyers to create holdouts, load planned demand, load customer usage, create purchase orders, resend purchase orders and auto-balance inventories across multi-echelon divisions.
▪ Provided exception and production support to the business operations team, developed tools to automate business operations for adjusting supply chain system factor settings, and tracked and reported the operational changes to those system factors. Worked on Tier 2 application support, enhancement and analysis of problems/issues.
Skills: Java, MapReduce, AS400, DB2, RPGLE, SQLRPGLE, CLLE, Hive, Pig, Shell Scripting and Cloudera.
ACHIEVEMENTS
• Received best performance awards from the Collective Intelligence and Command Centre operations team of Standard Chartered for successfully implementing a real-time streaming data ingestion pipeline to visualize integrated business monitoring dashboards of payment transactions, trades, sanctions, private banking, unit trust banking, settlements and corporate actions.
• Appreciated for successfully implementing disaster recovery plans to provision Hadoop, Hive and HBase objects, package Docker images of Kafka Streaming, REST service and microservice applications, and deploy the images and run the containers in the disaster recovery cluster with Kubernetes orchestration.
Declaration
I hereby declare that the information provided in this resume is true to the best of my knowledge.
Balakumar Balasundaram