Data Analysis
DATA CLEANING &
VISUALIZATION
A SHOWCASE OF TASKS AND OUTCOMES FROM
EXCEL DATA PROCESSING
INEB RAHIM
PROJECT OVERVIEW
• Objective: Clean and visualize raw daily Excel files
provided by the client
• Tools: Python, Pandas, Matplotlib, Seaborn, Jupyter
Notebooks
• Focus: Standardization, cleaning, and graphical
insights
DATA CLEANING TASKS
• Removed irrelevant rows/columns
• Standardized column headers
• Converted data types (e.g., timestamps, numerics)
• Filtered meaningful entries and handled missing
values
• Built reusable functions for batch sheet cleaning
• Note: The screenshots show only 1st Feb as a
sample; the same cleaning was applied individually
to each day from 1st Feb to 28th Feb.
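The cleaning steps above could be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the column names (`appointment_time`, `age`, `status`), the helper names `clean_sheet` and `clean_all`, and the sample rows are all hypothetical.

```python
import pandas as pd


def clean_sheet(raw: pd.DataFrame) -> pd.DataFrame:
    """Clean one daily sheet. Column names used here are assumptions."""
    # Drop rows and columns that are entirely empty.
    df = raw.dropna(axis=0, how="all").dropna(axis=1, how="all").copy()
    # Standardize headers: trim, lowercase, whitespace to underscores.
    df.columns = (
        df.columns.astype(str)
        .str.strip()
        .str.lower()
        .str.replace(r"\s+", "_", regex=True)
    )
    # Convert types; values that cannot be parsed become NaT/NaN.
    df["appointment_time"] = pd.to_datetime(df["appointment_time"], errors="coerce")
    df["age"] = pd.to_numeric(df["age"], errors="coerce")
    # Keep only rows that carry a non-empty status.
    df = df.dropna(subset=["status"]).copy()
    df["status"] = df["status"].astype(str).str.strip()
    df = df[df["status"] != ""]
    return df.reset_index(drop=True)


def clean_all(sheets: dict) -> dict:
    """Apply clean_sheet to every sheet in a {name: DataFrame} mapping."""
    return {name: clean_sheet(frame) for name, frame in sheets.items()}


# Made-up sample standing in for one messy daily sheet.
raw = pd.DataFrame({
    " Physician Name ": ["Dr. A", "Dr. B", None, "Dr. C"],
    "Status": ["Attended", " No Show ", None, ""],
    "Age": ["34", "n/a", None, "51"],
    "Appointment Time": ["2025-02-01 09:30", "2025-02-01 10:00", None, "2025-02-01 11:15"],
    "Notes": [None, None, None, None],
})
cleaned = clean_all({"feb_01": raw})["feb_01"]
```

A mapping like the one passed to `clean_all` is what `pd.read_excel(path, sheet_name=None)` returns, so the same function could be run over every sheet of a workbook.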
BEFORE VS. AFTER CLEANING
• Screenshots from the messy Excel file (columns hidden
for data privacy)
BEFORE VS. AFTER CLEANING
• Screenshots from the cleaned dataset, stored as an Excel
file.
VISUALIZATION HIGHLIGHTS
1) Analysis of Physician Name Distribution Across Area and
Status (Heatmap).
2) Analysis of Appointment Status by Time Slots (Bar Chart).
3) Analysis of Status Distribution by Nationality Group (Bar
Chart).
4) Analysis of Age Distribution by Appointment Status (Box
Plot).
5) Analysis of Status Distribution Across Different Areas and
Departments (Pie Charts).
Libraries used: Matplotlib, Seaborn
1) ANALYSIS OF PHYSICIAN NAME DISTRIBUTION
ACROSS AREA AND STATUS (HEATMAP)
FEB 1ST
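A heatmap of this kind could be built along these lines. This is only a sketch with made-up rows: the column names `physician_name`, `area` and `status` are assumptions, and plain Matplotlib is used here for the drawing (Seaborn's `heatmap` function would be an alternative for the same table).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows; column names are assumptions.
df = pd.DataFrame({
    "physician_name": ["Dr. A", "Dr. A", "Dr. B", "Dr. B", "Dr. B"],
    "area": ["North", "North", "South", "South", "North"],
    "status": ["Attended", "No Show", "Attended", "Attended", "Attended"],
})

# Rows: physicians; columns: (area, status) pairs; cells: appointment counts.
counts = pd.crosstab(df["physician_name"], [df["area"], df["status"]])

fig, ax = plt.subplots(figsize=(6, 3))
image = ax.imshow(counts.to_numpy(), cmap="Blues", aspect="auto")
ax.set_xticks(range(counts.shape[1]))
ax.set_xticklabels(
    [f"{area} / {status}" for area, status in counts.columns],
    rotation=45,
    ha="right",
)
ax.set_yticks(range(counts.shape[0]))
ax.set_yticklabels(list(counts.index))
fig.colorbar(image, ax=ax, label="Appointments")
fig.tight_layout()
```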
2) ANALYSIS OF APPOINTMENT STATUS
BY TIME SLOTS (BAR CHART)
FEB 1ST
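One way to produce a status-by-time-slot bar chart is sketched below. The hourly slot definition, the column names `appointment_time` and `status`, and the sample rows are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Made-up rows; column names are assumptions.
df = pd.DataFrame({
    "appointment_time": pd.to_datetime([
        "2025-02-01 09:10", "2025-02-01 09:40", "2025-02-01 10:05",
        "2025-02-01 10:30", "2025-02-01 10:55",
    ]),
    "status": ["Attended", "No Show", "Attended", "Attended", "Cancelled"],
})

# Bucket each appointment into its hour, e.g. 09:40 falls in slot "09:00".
df["time_slot"] = df["appointment_time"].dt.strftime("%H:00")

# Count appointments per (slot, status); missing combinations become 0.
slot_counts = df.groupby(["time_slot", "status"]).size().unstack(fill_value=0)

# Grouped bars: one cluster per time slot, one bar per status.
ax = slot_counts.plot(kind="bar", figsize=(6, 3))
ax.set_xlabel("Time slot")
ax.set_ylabel("Appointments")
ax.figure.tight_layout()
```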
3) ANALYSIS OF STATUS DISTRIBUTION
BY NATIONALITY GROUP (BAR CHART)
FEB 1ST
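When groups differ in size, comparing shares rather than raw counts can be easier to read. The sketch below shows that variant as a stacked bar chart; the column names `nationality_group` and `status`, the group labels, and the rows are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Made-up rows; column names and group labels are assumptions.
df = pd.DataFrame({
    "nationality_group": ["Local", "Local", "Local", "Local", "Expat", "Expat"],
    "status": ["Attended", "Attended", "Attended", "No Show", "Attended", "No Show"],
})

# Percentage of each status within a nationality group (each row sums to 100).
shares = pd.crosstab(df["nationality_group"], df["status"], normalize="index") * 100

# Stacked bars: one bar per group, segments sized by status share.
ax = shares.plot(kind="bar", stacked=True, figsize=(5, 3))
ax.set_xlabel("Nationality group")
ax.set_ylabel("Share of appointments (%)")
ax.figure.tight_layout()
```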
4) ANALYSIS OF AGE DISTRIBUTION BY
APPOINTMENT STATUS (BOX PLOT)
FEB 1ST
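A box plot of age per appointment status might be drawn as in this sketch. The column names `age` and `status` and the sample values are assumptions; missing ages are dropped before plotting.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows; column names are assumptions. One age is missing on purpose.
df = pd.DataFrame({
    "status": ["Attended", "Attended", "Attended", "Attended", "No Show", "No Show"],
    "age": [25, 40, 61, None, 30, 52],
})

# Collect the non-missing ages for each status.
groups = {
    status: group["age"].dropna().to_numpy()
    for status, group in df.groupby("status")
}

# Median age per status (NaN values are skipped by default).
medians = df.groupby("status")["age"].median()

fig, ax = plt.subplots(figsize=(5, 3))
ax.boxplot(list(groups.values()))
# Boxes are drawn at positions 1..N, so label those positions by status.
ax.set_xticks(range(1, len(groups) + 1))
ax.set_xticklabels(list(groups.keys()))
ax.set_ylabel("Age")
fig.tight_layout()
```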
5) ANALYSIS OF STATUS DISTRIBUTION
ACROSS DIFFERENT AREAS AND
DEPARTMENTS (PIE CHARTS)
FEB 1ST
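One pie chart per area could be generated as sketched here; the same pattern would apply per department. The column names `area` and `status` and the rows are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows; column names are assumptions.
df = pd.DataFrame({
    "area": ["North", "North", "North", "North", "South", "South"],
    "status": ["Attended", "Attended", "Attended", "No Show", "Attended", "Cancelled"],
})

# Rows: areas; columns: statuses; cells: appointment counts.
status_by_area = pd.crosstab(df["area"], df["status"])

# One subplot per area; squeeze=False keeps axes two-dimensional even for one area.
fig, axes = plt.subplots(
    1, len(status_by_area), figsize=(4 * len(status_by_area), 4), squeeze=False
)
for ax, (area, row) in zip(axes[0], status_by_area.iterrows()):
    row = row[row > 0]  # skip statuses that do not occur in this area
    ax.pie(row.to_numpy(), labels=list(row.index), autopct="%1.0f%%")
    ax.set_title(str(area))
fig.tight_layout()
```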
PROJECT OUTCOME
• Delivered clean, insightful visuals
• Built scalable workflow for daily data
• Reported and explained the results to the client in a
PowerPoint presentation
• Enabled better insights and decisions for the client
TOOLS & SKILLS USED
• Python (Pandas, NumPy, Matplotlib, Seaborn)
• Jupyter Notebook
• Data wrangling, cleaning, visualization
• Client-ready reporting