Business Report - 8
PG Program in Data Science and
Business Analytics
submitted by
Sangram Keshari Patro
BATCH:PGPDSBA.O.AUG24.B
Contents
1 Objective
4
2 Data Description
4
3 Data Overview
3.1 Importing necessary libraries and the dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2 Structure and type of data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3 Statistical summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
4
4
4
4 Exploratory Data Analysis
4.1 Univariate Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.1 ’Rose’ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.2 ’Sparkling’ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2 Bivariate Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.1 Rose sales vs Year . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.2 Sparkling sales vs Year . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.3 Rose sales vs Month . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.4 Sparkling sales vs Month . . . . . . . . . . . . . . . . . . . . . . . . . .
4.3 Plotting the Empirical Cumulative Distribution . . . . . . . . . . . . . . . . . .
4.3.1 Rose sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.3.2 Sparkling sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4 Plotting the average RetailSales per month and the month on month percentage
4.4.1 Rose sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4.2 Sparkling sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
. . . . . . . . . . . .
change of RetailSales.
. . . . . . . . . . . .
. . . . . . . . . . . .
-
5 Data Pre-processing
5.1 Additive Model . . . . . . . .
5.1.1 Rose sales . . . . . . .
5.1.2 Sparkling sales . . . .
5.2 Multiplicative Model . . . . .
5.2.1 Rose sales . . . . . . .
5.2.2 Sparkling sales . . . .
5.3 Model Comparison Summary
6 Build forecasting models
6.1 Linear Regression . . . . . .
6.1.1 Model Building . . .
6.1.2 Model Evaluation . .
6.2 Simple Average . . . . . .
6.2.1 Model Building . . .
6.2.2 Model Evaluation . .
6.3 Moving Average . . . . . .
6.3.1 Model Building . . .
6.3.2 Model Evaluation . .
6.4 Exponential Models (Single,
6.4.1 Model Building . . .
6.4.2 Model Evaluation . .
6.4.3 Model Building . . .
6.4.4 Model Evaluation . .
6.4.5 Model Building . . .
6.4.6 Model Evaluation . .
6.4.7 Model Building . . .
6.4.8 Model Evaluation . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
-
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
Double, Triple)
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
-
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
7 Check for Stationarity
28
7.1 Rose Sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
7.2 Sparkling Sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
8 Model building of stationary data
8.1 Plotting the Autocorrelation function plots . . . . . . . . . . .
8.1.1 Plotting PACF . . . . . . . . . . . . . . . . . . . . . . .
8.1.2 Plotting ACF . . . . . . . . . . . . . . . . . . . . . . . .
8.1.3 Plotting PACF . . . . . . . . . . . . . . . . . . . . . . .
8.1.4 Plotting ACF . . . . . . . . . . . . . . . . . . . . . . . .
8.2 Manual ARIMA Model . . . . . . . . . . . . . . . . . . . . . . .
8.2.1 (Rose Sales)ARIMA model with p=0,d=1 and q=2 . .
8.2.2 (Rose Sales)ARIMA model with p=1,d=1 and q=2 . . .
8.2.3 (Sparkling Sales)ARIMA model with p=2,d=1 and q=2
8.2.4 (Sparkling Sales)ARIMA model with p=2,d=1 and q=1
8.3 Auto ARIMA Model . . . . . . . . . . . . . . . . . . . . . . . .
8.4 Manual SARIMA Model . . . . . . . . . . . . . . . . . . . . . .
8.4.1 (Rose Sales)SARIMA model ((0,1,2)(2, 1, 2, 12)) . . .
8.4.2 (Sparkling Sales)SARIMA model ((1,1,2)(0, 1, 2, 12)) .
8.5 Auto SARIMA Model . . . . . . . . . . . . . . . . . . . . . . .
8.5.1 (Sparkling Sales)SARIMA(0,0,1)(0,1,1)[12] . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
-
9 Comparison of all different models
43
10 Final Forecast
44
11 Actionable Insights and Business Recommendations
45
List of Figures-
Table depicting the datatype for both types of wine. . . . . . . . . . . . . . . . . . . .
Statistical summary of the data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Timeseries plot of ’Rose’ wine sales . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Timeseries plot of ’Sparkling’ wine sales . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales vs Year . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales vs Year . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales vs Month . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales vs Month . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
ECDF plot of Rose sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
ECDF plot of Sparkling sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales(Additive Model) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales(Additive Model) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales(Multiplicative Model) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales(Multiplicative Model) . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales(Linear Regression) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales(Linear Regression) . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales(Simple Average) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales(Simple Average) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales(Moving Average) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales(Moving Average ) . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales(Single Exponential Model and its parameters) . . . . . . . . . . . .
Sparkling sales(Single Exponential Model and its parameters) . . . . . . . . .
Rose sales(Double Exponential Model and its parameters) . . . . . . . . . . .
Sparkling sales(Double Exponential Model and its parameters) . . . . . . . .
Rose sales(Triple Exponential Model and its parameters) . . . . . . . . . . . .
Sparkling sales(Triple Exponential Model and its parameters) . . . . . . . . .
Rose sales(Triple Exponential Model) and its parameters . . . . . . . . . . . .
Sparkling sales(Triple Exponential Model and its parameters) . . . . . . . . .
Dickey-Fuller test results for Rose dataset with and without differencing the data . . .
Dickey-Fuller test results for Sparkling dataset with and without differencing the data
Rose sales(PACF plot) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales(ACF plot) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales(PACF plot) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Sparkling sales(ACF plot) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
(Rose Sales)ARIMA model with p=0,d=1 and q=2 . . . . . . . . . . . . . . . . . . .
(Rose Sales)ARIMA model with p=1,d=1 and q=2 . . . . . . . . . . . . . . . . . . . .
(Sparkling Sales)ARIMA model with p=2,d=1 and q=2 . . . . . . . . . . . . . . . . .
(Sparkling Sales)ARIMA model with p=2,d=1 and q=1 . . . . . . . . . . . . . . . . .
Comparison of all the ARIMA models . . . . . . . . . . . . . . . . . . . . . . . . . . .
Rose sales(ARIMA model with p=1,d=1 and q=2) . . . . . . . . . . . . . . . . . . .
Sparkling sales(ARIMA model with p=2,d=1 and q=2 ) . . . . . . . . . . . . . . .
(Rose Sales)SARIMA model ((0,1,2)(2, 1, 2, 12)) . . . . . . . . . . . . . . . . . . . . .
(Sparkling Sales)SARIMA model ((1,1,2)(0, 1, 2, 12)) . . . . . . . . . . . . . . . . . .
Comparison of all the SARIMA models . . . . . . . . . . . . . . . . . . . . . . . . . .
(Rose Sales)SARIMA model ((0,1,2)(2, 1, 2, 12)) and residuals plot . . . . . . . . . .
(Sparkling Sales)SARIMA model ((1,1,2)(0, 1, 2, 12)) and residuals plot . . . . . . .
(Rose Sales)SARIMA(5,1,1)(1,0,1)[12] . . . . . . . . . . . . . . . . . . . . . . . . . . .
(Sparkling Sales)SARIMA(0,0,1)(0,1,1)[12] . . . . . . . . . . . . . . . . . . . . . . . . .
Comparison of all the AUTO SARIMA models . . . . . . . . . . . . . . . . . . . . . .
Comparison of all different models . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Final forecast plot for Rose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Final forecast plot for Sparkling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
-
1
Objective
The primary objective of this project is to analyze and forecast wine sales trends for the 20th century based on
historical data provided by ABC Estate Wines. We aim to equip ABC Estate Wines with the necessary insights and
foresight to enhance sales performance, capitalize on emerging market opportunities, and maintain a competitive
edge in the wine industry.
2
Data Description
The data provided is of sales of two types of wine Rose and Sparkling from the period of 1980 to 1995.
3
3.1
Data Overview
Importing necessary libraries and the dataset
Datasets for both types of wine are printed. Both have 187 rows & 2 columns.
3.2
Structure and type of data
Data is explored further. The dataset is free from duplicate rows and contains some null values.
Figure 1: Table depicting the datatype for both types of wine.
3.3
Statistical summary
Figure 2: Statistical summary of the data
4
4
Exploratory Data Analysis
4.1
4.1.1
Univariate Analysis
’Rose’
Figure 3: Timeseries plot of ’Rose’ wine sales
Observations
Sales Trend: There is a noticeable overall downward trend in the sales of rosé wine from 1980 to 1995.
Seasonality: Spikes in sales occur regularly, indicating a possible seasonal pattern.
Volatility: Early years -) show high fluctuation in sales, which stabilizes over time.
Peak Period: The highest sales were recorded around-, after which a gradual decline is observed.
Business Recommendations
Seasonal Marketing: Utilize peak seasons to promote rosé wine through targeted campaigns and discounts.
Revitalization Strategy: Investigate causes of declining demand and consider rebranding or new packaging
to rejuvenate consumer interest.
Diversify Portfolio: Consider expanding into other wine categories or beverages to offset the decline in rosé
sales.
Customer Analysis: Conduct market research to understand changing preferences and tailor offerings
accordingly.
5
4.1.2
’Sparkling’
Figure 4: Timeseries plot of ’Sparkling’ wine sales
Observations
Sales Trend: Overall, there is a clear upward trend in sparkling wine sales over the years.
Seasonality: Sales show consistent yearly spikes, likely during festive seasons.
Volatility: Peaks are sharp and regular, indicating predictable consumer behavior patterns.
Growth: From 1980 to 1995, both the baseline and peak sales values have increased steadily.
Business Recommendations
Seasonal Promotions: Intensify marketing during known high-demand months (e.g., holidays) to maximize
sales.
Capacity Planning: Scale production and distribution in advance of seasonal peaks.
Product Positioning: Leverage the premium and celebratory image of sparkling wine to appeal to aspirational consumers.
International Markets: Given strong domestic growth, explore export opportunities during peak global
festive seasons.
4.2
4.2.1
Bivariate Analysis
Rose sales vs Year
Figure 5: Rose sales vs Year
Observations
Box Plot: Sales of rosé wine declined over the years. Variability and median decreased, with more outliers
in early years indicating occasional high sales.
Monthly Trends: November and December show consistently higher sales, suggesting seasonality. All
months show a general downward trend.
7
Business Recommendations
Seasonal Targeting: Focus promotions around November–December due to peak sales in these months.
Revival Strategy: Revamp marketing for rosé wine to counter steady decline over time.
Outlier Utilization: Analyze high-sales months in early years to replicate successful strategies.
4.2.2
Sparkling sales vs Year
Figure 6: Sparkling sales vs Year
Observations
Box Plot (Sparkling Wine): Sales increased slightly from 1980 to 1995. Distribution remained steady, but
outliers rose after 1985, showing sporadic sales spikes.
8
Monthly Trends (Sparkling Wine): Clear seasonality with high sales in November and December. Other
months showed stable but lower sales over time.
Business Recommendations
Seasonal Focus: Intensify campaigns in November–December to leverage strong seasonal demand.
Peak Analysis: Investigate reasons behind outlier months with high sales to replicate success.
Stable Base: Maintain consistent supply across months while optimizing for year-end spikes.
4.2.3
Rose sales vs Month
9
Figure 7: Rose sales vs Month
Observations
Trend Plot: Clear upward trend from January to December; highest spikes in November–December.
Yearly Comparison: Sales peaked in the 1980s; gradual decline post-1990. Seasonality remains strong
across years.
Business Recommendations
Seasonal Focus: Boost marketing in late Q4 to leverage peak sales.
Sales Recovery: Study 1980s peak years to extract replicable growth strategies.
Product Strategy: Consider bundling or festive packaging for holiday months.
4.2.4
Sparkling sales vs Month
10
Figure 8: Sparkling sales vs Month
11
Insights
Trend Plot: Sales remain flat in the first half, then rise sharply from July to December, with steep peaks in
November–December.
Yearly Comparison: Consistent year-end spikes across all years; 1995 shows strongest finish. Overall
upward trend post-June.
Business Recommendations
Holiday Strategy: Prioritize inventory and campaigns for Q4 to align with demand surges.
Growth Momentum: Scale up operations mid-year to prepare for strong second-half sales.
Year-End Promotions: Launch targeted festive offers and high-value bundles to boost conversions.
4.3
Plotting the Empirical Cumulative Distribution
This particular graph tells us what percentage of data points refer to what number of Sales.
4.3.1
Rose sales
Figure 9: ECDF plot of Rose sales
4.3.2
Sparkling sales
Figure 10: ECDF plot of Sparkling sales
12
4.4
4.4.1
Plotting the average RetailSales per month and the month on month percentage
change of RetailSales.
Rose sales
Figure 11: Rose sales
4.4.2
Sparkling sales
Figure 12: Sparkling sales
13
5
Data Pre-processing
I have forward filled the 2 null values encountered in Rose dataset and no null value is found is Sparkling dataset.
After removing null values data is decomposed into trend,seasonality and residuals and then visualized by
plotting.
Perform Decomposition
5.1
5.1.1
Additive Model
Rose sales
Figure 13: Rose sales(Additive Model)
Insights
Trend Plot: Long-term decline observed in overall values post-1981, indicating a steady downward trend.
Seasonality: Clear recurring peaks annually, with strong cyclical behavior suggesting seasonal effects.
Residuals: Randomly scattered, indicating minimal pattern left; model captures most structure.
14
Business Recommendations
Seasonal Planning: Leverage predictable seasonal peaks for campaign timing and resource allocation.
Decline Strategy: Investigate causes for long-term decline and strategize for revival.
Model Reliability: Proceed with forecasting, as residuals show no significant patterns.
5.1.2
Sparkling sales
Figure 14: Sparkling sales(Additive Model)
Insights
Trend Plot: Mild dip till 1983, followed by a steady rise peaking around-, then slight decline.
Seasonality: Strong, consistent seasonal spikes each year, showing high demand fluctuations.
Residuals: Randomly scattered around zero with minor variance; model fits the data well.
Business Recommendations
Peak Timing: Focus marketing and logistics around predictable annual spikes.
Capacity Planning: Align operations with the rising trend observed mid-80s to early-90s.
Monitoring Needed: Investigate post-1990 plateau or decline to inform future strategy.
15
5.2
5.2.1
Multiplicative Model
Rose sales
Figure 15: Rose sales(Multiplicative Model)
Insights
Trend Plot: Noticeable decline throughout the period, indicating weakening growth over time.
Seasonality: Clear yearly patterns with relative seasonal effects remaining proportional over time.
Residuals: Fluctuations appear stable and centered, suggesting a good model fit.
Business Recommendations
Stabilize Decline: Identify and mitigate drivers of downward trend.
Seasonal Focus: Seasonal proportionality suggests timing remains key – optimize around peak months.
Model Validity: Continue using multiplicative models for accurate forecasting and planning.
16
5.2.2
Sparkling sales
Figure 16: Sparkling sales(Multiplicative Model)
Insights
Trend Plot: Gradual decline in long-term trend, indicating reduced popularity or demand.
Seasonality: Strong multiplicative seasonality – peaks maintain proportional relationship to trend.
Residuals: Stable and random around 1, supporting model appropriateness.
Business Recommendations
Address Downtrend: Explore marketing or innovation strategies to reverse trend.
Leverage Seasonality: Plan production and promotions around predictable seasonal peaks.
Model Reliability: Multiplicative model effectively captures structure – use for forecasting.
5.3
Model Comparison Summary
Rose Data: Multiplicative model shows clearer seasonal variation with stable residuals. Best fit.
Sparkling Data: Multiplicative model better captures proportional seasonality with a declining trend. Best
fit.
Conclusion
Multiplicative models are more suitable for both Rose and Sparkling datasets due to their ability to model varying
seasonal effects relative to trend levels. Then the data is split into train data and test data.
17
6
6.1
6.1.1
Build forecasting models
Linear Regression
Model Building
Linear regression model is built on train data and fitted on test data to get the following plot.
Rose Sales
Figure 17: Rose sales(Linear Regression)
Sparking Sales
Figure 18: Sparkling sales(Linear Regression)
18
6.1.2
Model Evaluation
The RMSE for the forecast is given below.
6.2
6.2.1
Simple Average
Model Building
Simple Average model is built on train data and fitted on test data to get the following plot.
Rose Sales
Figure 19: Rose sales(Simple Average)
19
Sparking Sales
Figure 20: Sparkling sales(Simple Average)
6.2.2
Model Evaluation
The RMSE for the forecast is given below.
6.3
6.3.1
Moving Average
Model Building
Moving Average model is built on the whole data to get the following plot. The model is built with different window
sizes i.e. different point moving averages and best fit is found to be 2-point moving average by looking at the least
RMSE.
20
Rose Sales
Figure 21: Rose sales(Moving Average)
Sparking Sales
Figure 22: Sparkling sales(Moving Average )
6.3.2
Model Evaluation
The RMSE for the forecast is given below.
21
The best models for both dataset are 2-point averages.
6.4
Exponential Models (Single, Double, Triple)
Exponential smoothing methods consist of flattening time series data. Exponential smoothing averages or exponentially weighted moving averages consist of forecast based on previous periods data with exponentially declining
influence on the older observations. Exponential smoothing methods consist of special case exponential moving
with notation ETS (Error, Trend, Seasonality) where each can be none(N), additive (N), additive damped (Ad),
Multiplicative (M) or multiplicative damped (Md). One or more parameters control how fast the weights decay.
These parameters have values between 0 and 1.
Single Exponential Model
The simplest of the exponentially smoothing methods is naturally called simple exponential smoothing (SES). This
method is suitable for forecasting data with no clear trend or seasonal pattern. In Single ES, the forecast at time
(t + 1) is given by Winters,1960
Ft+1 = αYt + (1 − α)Ft
Parameter is called the smoothing constant and its value lies between 0 and 1. Since the model uses only one
smoothing constant, it is called Single Exponential Smoothing.
Note: Here, there is both trend and seasonality in the data. So, we should have directly gone for the Triple
Exponential Smoothing but Simple Exponential Smoothing and the Double Exponential Smoothing models are
built over here to get an idea of how the three types of models compare in this case.
6.4.1
Model Building
Single Exponential Model model is built on train data and fitted on test data to get the following plot.
22
Rose Sales
Figure 23: Rose sales(Single Exponential Model and its parameters)
Sparking Sales
Figure 24: Sparkling sales(Single Exponential Model and its parameters)
6.4.2
Model Evaluation
The RMSE for the forecast is given below.
23
Double Exponential Model
One of the drawbacks of the simple exponential smoothing is that the model does not do well in the presence of
the trend. This model is an extension of SES known as Double Exponential model which estimates two smoothing
parameters. Applicable when data has Trend but no seasonality. Two separate components are considered: Level
and Trend. Level is the local mean. One smoothing parameter α corresponds to the level series A second smoothing
parameter β corresponds to the trend series.
6.4.3
Model Building
Double exponential model is built on train data and fitted on test data to get the following plot.
Rose Sales
Figure 25: Rose sales(Double Exponential Model and its parameters)
Sparking Sales
Figure 26: Sparkling sales(Double Exponential Model and its parameters)
24
6.4.4
Model Evaluation
The RMSE for the forecast is given below.
Here, we see that the Double Exponential Smoothing has actually done well when compared to the Simple
Exponential Smoothing. This is because of the fact that the Double Exponential Smoothing model has picked up
the trend component as well.
Triple Exponential Model
6.4.5
Model Building
Holt-Winters ETS(A,A,A) Model (Additive Error, Additive Trend, Additive Seasonality) model is
built on train data and fitted on test data to get the following plot.
Rose Sales
Figure 27: Rose sales(Triple Exponential Model and its parameters)
25
Sparking Sales
Figure 28: Sparkling sales(Triple Exponential Model and its parameters)
6.4.6
Model Evaluation
The RMSE for the forecast is given below.
Triple Exponential Smoothing has performed the best on the test as expected since the data had both trend
and seasonality.
But we see that our triple exponential smoothing is under forecasting. Let us try to tweak some of the parameters
in order to get a better forecast on the test set.
Now I have built another model i.e. Holt-Winters ETS(A,A,M) Model (Additive Error, Additive
Trend, Multiplicative Seasonality)
6.4.7
Model Building
Holt-Winters ETS(A,A,M) model is built on train data and fitted on test data to get the following plot.
26
Rose Sales
Figure 29: Rose sales(Triple Exponential Model) and its parameters
27
Sparking Sales
Figure 30: Sparkling sales(Triple Exponential Model and its parameters)
6.4.8
Model Evaluation
The RMSE for the forecast is given below.
Triple Exponential Smoothing has performed the best on the test as expected since the data had both trend
and seasonality.
7
Check for Stationarity
Stationarity in Time Series
A time series is considered to be stationary when its statistical properties such as the mean, variance, and
autocorrelation remain constant over time. Stationarity allows us to model and forecast time series using historical
data because the behavior of the series does not change over time. In a stationary series, the autocorrelation at
lag k depends only on k, and not on the specific time t. Let Xt denote the value of the time series at time t. The
autocorrelation at lag k is the correlation between Xt and Xt−k .
28
Checking for Stationarity
To test for stationarity, we use the Dickey-Fuller test (specifically, the Augmented Dickey-Fuller or ADF test).
Null Hypothesis (H0 ): The time series is non-stationary.
Alternative Hypothesis (H1 ): The time series is stationary.
Interpretation:
If the p-value < 0.05: Reject the null hypothesis ⇒ The time series is stationary.
If the p-value ≥ 0.05: Fail to reject the null hypothesis ⇒ The time series is non-stationary.
The test is depicted below. The data is also differentiated once and then checked for stationarity and we find that
on single order differencing we are able to get a stationary data.
7.1
Rose Sales
Figure 31: Dickey-Fuller test results for Rose dataset with and without differencing the data
29
7.2
Sparkling Sales
Figure 32: Dickey-Fuller test results for Sparkling dataset with and without differencing the data
8
Model building of stationary data
8.1
Plotting the Autocorrelation function plots
Using ACF and PACF in Time Series Forecasting
ACF (Autocorrelation Function) and PACF (Partial Autocorrelation Function) plots help identify the
order of ARIMA models.
ACF Plot: Shows correlation between the time series and its lags. Useful for identifying the MA (Moving
Average) order q.
PACF Plot: Shows correlation between the series and its lags after removing intermediate effects. Helps
determine the AR (AutoRegressive) order p.
General Guidelines:
If ACF cuts off after lag q and PACF tails off ⇒ MA(q) model.
If PACF cuts off after lag p and ACF tails off ⇒ AR(p) model.
30
If both tail off slowly ⇒ Consider ARMA or ARIMA model with differencing.
Note: Bars that lie outside the blue confidence region (typically at 95%) are considered statistically significant.
Rose Sales
8.1.1
Plotting PACF
Figure 33: Rose sales(PACF plot)
8.1.2
Plotting ACF
Figure 34: Rose sales(ACF plot)
31
Sparkling Sales
8.1.3
Plotting PACF
Figure 35: Sparkling sales(PACF plot)
32
8.1.4
Plotting ACF
Figure 36: Sparkling sales(ACF plot)
We can clearly observe the seasonality from the ACF plot.
Note: The data has some seasonality so ideally we should build a SARIMA model. But for demonstration
purposes we are building an ARIMA model both by looking at the minimum AIC criterion and by looking at the
ACF and the PACF plots.
8.2
Manual ARIMA Model
p,d,q values are changed and all permutations are tried to get the least AIC value. I have built the best 2 ARIMA
models to compare.
8.2.1
(Rose Sales)ARIMA model with p=0,d=1 and q=2
The model result is as follows.
33
Figure 37: (Rose Sales)ARIMA model with p=0,d=1 and q=2
All the coefficients are statistically significant.
8.2.2
(Rose Sales)ARIMA model with p=1,d=1 and q=2
The model result is as follows.
Figure 38: (Rose Sales)ARIMA model with p=1,d=1 and q=2
Two of the coefficients are not statistically significant considering 95% confidence.
34
8.2.3
(Sparkling Sales)ARIMA model with p=2,d=1 and q=2
The model result is as follows.
Figure 39: (Sparkling Sales)ARIMA model with p=2,d=1 and q=2
All of the coefficients are statistically significant considering 95% confidence.
8.2.4
(Sparkling Sales)ARIMA model with p=2,d=1 and q=1
The model result is as follows.
35
Figure 40: (Sparkling Sales)ARIMA model with p=2,d=1 and q=1
one of the coefficients is not statistically significant considering 95% confidence.
Figure 41: Comparison of all the ARIMA models
The best ARIMA model for Rose and Sparkling data are shown below.
The plots are obtained by forecasting the test data using the best models are attached below.
36
Rose Sales
Figure 42: Rose sales(ARIMA model with p=1,d=1 and q=2)
Sparking Sales
Figure 43: Sparkling sales(ARIMA model with p=2,d=1 and q=2 )
8.3
Auto ARIMA Model
Auto ARIMA Model is built by using pmdarima library. For Rose data we get the best model to be ARIMA model
with p=0,d=1 and q=2 which I have already built. And for Sparkling data the best model comes out to be ARIMA
model with p=0,d=0 and q=1.
RMSE is more than the best model that I built using ARIMA model with p=2,d=1 and q=2. Hence, I can
safely ignore this model.
37
8.4
Manual SARIMA Model
p,d,q parameters and Seasonal parameters (P,D,Q)S values are changed and all permutations are tried to get the
least AIC value. I have built the best SARIMA model to compare.
8.4.1
(Rose Sales)SARIMA model ((0,1,2)(2, 1, 2, 12))
The model result is as follows.
Figure 44: (Rose Sales)SARIMA model ((0,1,2)(2, 1, 2, 12))
Some of the coefficients are statistically insignificant.
38
8.4.2
(Sparkling Sales)SARIMA model ((1,1,2)(0, 1, 2, 12))
The model result is as follows.
Figure 45: (Sparkling Sales)SARIMA model ((1,1,2)(0, 1, 2, 12))
All of the coefficients are statistically significant considering 95% confidence.
Figure 46: Comparison of all the SARIMA models
39
The plots are obtained by forecasting the test data using the best models are attached below.
Rose Sales
Figure 47: (Rose Sales)SARIMA model ((0,1,2)(2, 1, 2, 12)) and residuals plot
40
Sparking Sales
Figure 48: (Sparkling Sales)SARIMA model ((1,1,2)(0, 1, 2, 12)) and residuals plot
8.5
Auto SARIMA Model
Auto SARIMA Model is built by using pmdarima library. For Rose data we get the best model to be SARIMA(5,1,1)(1,0,1)[12]
which I have built. (Rose Sales)SARIMA(5,1,1)(1,0,1)[12]
The model result is as follows.
41
Figure 49: (Rose Sales)SARIMA(5,1,1)(1,0,1)[12]
Some of the coefficients are statistically insignificant.
8.5.1
(Sparkling Sales)SARIMA(0,0,1)(0,1,1)[12]
The model result is as follows.
Figure 50: (Sparkling Sales)SARIMA(0,0,1)(0,1,1)[12]
All of the coefficients are statistically significant considering 94% confidence.
42
Figure 51: Comparison of all the AUTO SARIMA models
9
Comparison of all different models
Figure 52: Comparison of all different models
Out of all the models the best model is found to be
Best model for Rose - Triple Exponential Smoothing (with α=0.0995, β=1.3e-09 and γ=1.2e-07)
Best model for Sparkling - Triple Exponential Smoothing (with α=0.07569, β=0.0324 and γ=0.479)
Now, we will take our best model and forecast 12 months into the future with appropriate confidence intervals
to see how the predictions look. We have to build our model on the full data for this.
10
Final Forecast
Figure 53: Final forecast plot for Rose
Figure 54: Final forecast plot for Sparkling
44
11
Actionable Insights and Business Recommendations
Rosé Wine Analysis
Trend: A noticeable decline in overall sales is observed after 1982. Initial years show higher volatility, followed
by more stabilized patterns in later years.
Seasonality: Sales display consistent seasonal peaks, with significant surges around November and December,
aligning with festive demand.
Model Suitability: The multiplicative model provides a better fit due to the seasonal peaks maintaining a
proportional relationship with the overall trend.
Business Recommendations:
– Launch well-timed seasonal campaigns and festive bundles, especially targeting Q4.
– Address the long-term downward trend through innovative marketing, rebranding, or introducing limitededition variants.
– Investigate patterns during peak early years to extract strategies that can be revived or adapted.
– Consider diversifying into related beverage segments to counteract declining core demand.
Sparkling Wine Analysis
Trend: The sales trend shows steady growth up to 1995, with some slowdown visible after 1990, though
year-end performance remains strong.
Seasonality: Strong and predictable seasonal effects occur yearly, with the highest sales typically in November
and December, reflecting strong holiday influence.
Model Suitability: Both additive and multiplicative models fit reasonably well, but the multiplicative model
better captures the proportional nature of seasonal fluctuations.
Business Recommendations:
– Increase inventory and promotional efforts in the second half of the year, especially from July onwards.
– Capitalize on holiday momentum by introducing special sparkling wine editions or high-margin bundles.
– Examine strong-performing periods to replicate success factors across markets.
– Explore export opportunities and upscale positioning in premium product lines.