Ali Ahmed Shaikh
Position: Senior Data Specialist
M.Sc Data Science
MarketLytics
Experience
Senior Data Specialist
Oct 2023 - Present
•
MarketLytics
Hybrid
– Leveraged a custom LLM to enhance the functionality and value of a retail company’s products, resulting
in a 15% increase in customer satisfaction.
– Generated customized reports for clients to evaluate marketing and sales performance.
– Worked with Looker (part of Google Cloud), Azure, and MySQL.
– Implemented reports per business requirements.
– Presented information using data visualization techniques.
– Proposed solutions and strategies for business challenges.
– Built data engineering pipelines and performed load testing on them.
– Collaborated with product development teams.
– Designed a data warehousing solution for the project with future requirements in view.
– Built custom marketing and sales conversion reports for clients with GA4 and UA as sources.
Data Warehouse Engineer
•
Apr 2023 - Oct 2023
360Factors
Hybrid
– Streamlined and automated complex workflows with Apache Airflow, resulting in 20% savings of time and resources.
– Stored data updates in MongoDB for compliance purposes, enabling over 30 employees to complete their tasks
in less than 60% of the previous time.
– Constructed a data lake to store diverse data types effectively and facilitate additional product features that
boosted customer retention by 36% and acquisition by 25%.
Associate Data Engineer
•
Sept 2021 - Mar 2023
Blutech Consulting
Onsite
– Streamlined ETL processes on billions of data rows, saving 62 hours per month that were previously devoted to
manual operations.
– Achieved a 54% efficiency gain in ETL spec completion time and performed fine-tuning to lower CPU usage by 45%.
– Implemented data quality checking with Python, which cut the timelines of Quality Assurance teams by 16%.
– Constructed a data pipeline to process semi-structured data by incorporating 100 million raw records from 23 data
sources.
– Rewrote queries to reduce target-marketing customer identification query times by 80%, leading to yearly savings
of $1.2M.
Trainee Software Engineer - Data
•
May 2021 - Sept 2021
NCCPL
Onsite
– Completed Hadoop training courses and additional training on other platforms.
– Assisted team in preparing training material and researching Hadoop case studies to build team knowledge.
– Conducted a successful Big Data case study project showcasing the use of Teradata as a data warehouse for the
stock exchange industry.
– Presented the Big Data case study to the team, highlighting the technical benefits of Teradata’s data warehousing
capabilities.
– Conducted research on various private clouds such as PTCL, QCloud and SHDT to expand knowledge of data
engineering services.
Data Science and Business Analytics Intern
•
Apr 2021 - May 2021
The Sparks Foundation
Remote
– Applied supervised and unsupervised learning techniques to analyze data and generate insights.
– Created a Power BI Dashboard for clear visualization of product sales data.
Freelance Data Consultant
•
Fiverr
Dec 2019 - Sept 2022
Remote
– Tutored students in SQL concepts and helped them implement database solutions.
– Taught coding skills and problem-solving techniques to over 20 students from 5 countries through online
platforms, increasing student engagement and performance by 50% and receiving positive feedback from 95% of
participants.
– Engineered a data schema to integrate various data sources and measures into a consistent structure for simple
access and analysis.
– Built an IoT project that uses sensor data to raise an alert when someone passes within a 5 m range.
Education
MS in Data Science
•
Jul 2022 - Jul 2024
FAST-NU
B.Sc in Computer Science
•
Jul 2017 - Jul 2021
FAST-NU
Certifications
Neo4j: Neo4j Certified Professional
Microsoft: Azure AI Fundamentals, Azure Fundamentals, Azure Data Fundamentals, Power Platform Fundamentals
AWS: AWS Certified Cloud Practitioner
Oracle: Oracle Cloud Infrastructure Foundations Associate
Personal Projects
Towards semantic-based word embedding for document clustering (Final Year Project)
•
Developed an NLP model to cluster documents based on the semantic meaning of their text
– Aimed to learn a task-specific embedding using generalized language models in neural networks. This embedding
is used to define a document representation model, and document clustering with the proposed
embedding-based representation achieves a high purity of 86
Realtime Responding to Customers
•
Responds to customers based on real-time data.
– Designed to respond to customers using real-time data from an API. The data was read through
RabbitMQ and processed by an ETL pipeline written in Apache Beam.
Analyzing Country Sales Data with AWS Data Lake Architecture
•
An end-to-end solution for the analysis of UK housing data
– Built a data analytics solution using S3, Glue, and Athena that enabled efficient and cost-effective storage, processing, and querying of large-scale structured and semi-structured data. Enhanced data-driven decision-making
and team collaboration by providing actionable insights.
Data Science Web App to visualize GeoSpatial Accidents Data
•
A web app that helps visualize the most accident-prone locations and improve their safety
– Created a web app that transformed data into interactive images that allowed users to zoom in and out. Reduced
accidents and improved safety by providing deeper insights into the data.
Technical Skills and Interests
Languages: Python, C/C++, Java
Data Analytics: Statistics, Probability, Data Visualization
Relational Databases: Oracle, MySQL, PostgreSQL
NoSQL: MongoDB, Neo4j
Big Data: Hive, Impala, Pig, Spark, Flume, Sqoop, Informatica, Kafka
Google Cloud: Cloud Spanner, Datastore, Dataproc, Pub/Sub, BigQuery
AWS: S3, VPC, EC2, SNS, SQS, SES, RDS, DynamoDB
Azure: Synapse Analytics, Stream Analytics, Event Hubs, Azure Data Lake Storage
Data Warehouse: BigQuery, Teradata, Snowflake
Volunteering
Microsoft Student Ambassador
•
Microsoft
Sept 2023 - Present
Online
– Hosted a Microsoft Learning challenge named DataFeast to broaden participants' view of the data field.
Achievements
Speed Programming Finalist
•
In top 20 out of 300 participants
– Advanced to the finals of the Dev Day speed programming competition hosted by the ACM FAST chapter, ranking among
the top 20 out of 300+ participants.
Speed Programming Finalist
•
In top 10 out of 100 participants
– Achieved a finalist position in the Coders Cup contest hosted by FAST-NU, competing among 100 participants.