Ashish Kumar
Software Engineer
GCP Certified Professional Data Engineer with over 7 years of experience architecting and managing scalable ETL pipelines, building interactive dashboards, and deploying data solutions for large-scale integration projects in fintech and enterprise IT. Proficient in cloud platforms (AWS, GCP, Azure), data tooling (Snowflake, dbt, Pub/Sub), and programming languages including Python and SQL. Optimized real-time data pipelines at Koodoo, improving processing efficiency by 25%, and led ETL processes at TCS, managing Snowflake data warehouses with billions of rows. Skilled in Metabase, advanced SQL querying, and data visualization, with a strong interest in using data to drive business decisions and in adopting emerging data technologies.
Parvati Mansion, Hill Colony Kulti,
Near Durga Mandir, College More,
Dist. Burdwan, West Bengal 713343
GitHub: https://github.com/ashish2085
LinkedIn: https://www.linkedin.com/in/ashishk1994
TECHNICAL SKILLS
Python, Advanced SQL, MySQL, Snowflake, BigQuery, Databricks, AWS, GCP, Airbyte, Fivetran, ETL, dbt,
Apache Spark, PySpark, Tableau, Power BI, Generative AI, LLMs, LangChain, Hugging Face, RAG, Pinecone,
Flask, Cloud Functions, Cloud Run, Dataflow, Docker, Kubernetes, Jenkins, GitHub Actions, MLOps,
OpenAI, Postman, SAP HANA, Qlik Sense, Streamlit, Gradle, Talend Open Studio, AWS Glue, EMR, Redshift,
MongoDB, NoSQL, Metabase, Matillion, SQLFluff, Matplotlib, Plotly, Selenium, Java, C++, C, Git, SVN, Azure
Data Factory, ADLS, Cloud Composer, Airflow, DAGs, Terraform.
EXPERIENCE
Meraki Ventures, Pune IN, Senior Data Engineer
Jan 2025 - Present
● Designed and implemented a Turbo Data Pipeline using Cloud Functions to process CSV data
from GCS with a data owner approval workflow. The pipeline streams large files in chunks and
processes them in memory-efficient batches to dynamically insert, update, or queue delete
operations in Cloud SQL, with on-the-fly table creation.
● Optimized Cloud SQL database flags to reduce operational costs for the Fluence Project.
● Enhanced performance of the Airflow DAG by streamlining data flow and eliminating redundant
processing steps.
● Leveraged BigQuery for ad-hoc analytics on processed datasets, enabling near-real-time
validation and business reporting.
● Developed a CI/CD pipeline using GitHub Actions for automated deployment of the codebase from
GitHub.
● Integrated Datadog for centralized cloud logging, monitoring, and GCP FinOps cost optimization
insights.
● Built and deployed a dbt-based data transformation pipeline on PostgreSQL views, and
incorporated unit tests directly into the GitHub Actions workflow for automated test validation.
Blenheim Chalcot, Mumbai IN, Senior GCP DevOps Data Engineer
Feb 2022 - Jan 2025
● Developed automation tools including a Call Checker for auditing broker calls and a Document
Checker to streamline mortgage submission validation.
● Built a Decision Engine to support dynamic rule creation for mortgage application evaluations.
● Implemented data ingestion pipelines using Airbyte, Kafka, Hadoop, and Spark to process and
stream large datasets from various sources into target systems.
● Architected a BigQuery-based warehouse layer and transformed data using dbt, leveraging Jinja
templating, modular SQL models, and custom tests to ensure quality, consistency, and
maintainability.
● Created dashboards and APIs using Python Flask, and performed data visualizations using Qlik
and Python to deliver business insights.
● Set up CI/CD pipelines with Docker and GitHub Actions; deployed services to GCP Cloud Run,
orchestrated jobs via Pub/Sub, and scheduled processes with Cloud Scheduler.
● Managed orchestration using Apache Airflow, including ad hoc execution of Airbyte jobs through
GCP integration.
● Monitored and optimized pipelines using GCP logs, improving system performance and reducing
operational costs.
● Designed scalable data warehouses and applied advanced data cleansing and ML pipeline
automation techniques to support analytics and model training.
E-Zest Solutions, Pune IN, Data Engineer
June 2021 - Feb 2022
● Designed a Snowflake-based data warehouse to support scalable data ingestion and
transformation workflows.
● Configured Apache Kafka producers and consumers to manage real-time data streams from
multiple sources.
● Developed a Dataflow API using Apache Spark and DataOps tools to clean and process data and to
automate the machine learning stack for efficient data handling and model training.
Tata Consultancy Services, Kolkata IN, Software Engineer
July 2018 - May 2021
● Applied machine learning algorithms for anomaly detection in financial data, leveraging
data mining techniques to uncover risk patterns.
● Preprocessed and visualized data from multiple sources using Qlik and Python to
support exploratory analysis and reporting.
● Built predictive models to classify invoice types, incorporating feature extraction and
importance analysis to enhance model accuracy.
● Provided audit-focused insights by identifying potential risk scenarios in financial
datasets.
● Created a cross-database automated testing framework in Java to validate column counts and
ensure data accuracy, reducing the effort required for unit and functional testing.
● Worked collaboratively across teams to enhance organizational efficiency.
EDUCATION
Bachelor of Technology, Computer Science and Engineering
Asansol Engineering College, August 2014 - July 2018 - 7.91 CGPA
Intermediate - 72.6%
Loyola School Taldanga, July 2013
Matriculation - 82.6%
Loyola School Taldanga, July 2011
CERTIFICATIONS
● Google Certified Professional Data Engineer
PROJECTS
● PDF-QL
● ETL-Snowflake