Apurv Jain | Freelancer Power Point Deck For Demand Forecasting


Demand forecasting
By Apurv Jain

1. Introduction
Demand forecasting is the art of predicting demand for a product or service at some future date on the basis of the present and past behaviour of related events.

Benefits of Demand Forecasting
● Planning and decision making: Demand forecasting helps businesses maintain the right level of inventory to meet customer demand without overstocking or understocking.
● Business forecasting: Accurate demand forecasts help optimize resource allocation across the organization.
● Production analysis: Production analysis driven by demand forecasts can lead to cost reduction.

Types of Demand Forecasting
● Short term: Concerned with a short time period, usually less than a year.
● Medium term: Needed when a company is considering expanding or modifying its production facilities.
● Long term: Needed for capacity expansion, such as growth of the firm, recruitment and diversification policies; usually more than 3-5 years.

Problem Statement
Create a forecast of what the visits will look like over the next year, based on the historical data points.

Solution
● Read the data, containing 2 columns ('Date' and 'selected period') and 119 rows, into a Python Jupyter notebook.
● Performed data manipulation, exploratory data analysis and feature engineering.
● Built a demand forecasting model using XGBoost and conducted hyperparameter tuning to improve accuracy.

Understanding the Time Series plot
The data is weekly, spanning Dec 26, 2011 to Mar 31, 2014, and contains random spikes and random dips across the whole time series.
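As a quick sanity check on the span described above, a weekly series running from Dec 26, 2011 to Mar 31, 2014 should contain exactly 119 points, which matches the row count given in the Solution. The sketch below uses only the standard library; the variable names are illustrative and this is not the author's notebook code.

```python
from datetime import date, timedelta

# Start and end dates as stated in the deck.
start = date(2011, 12, 26)
end = date(2014, 3, 31)

# Step forward one week at a time and collect every date in the span.
weeks = []
current = start
while current <= end:
    weeks.append(current)
    current += timedelta(weeks=1)

print(len(weeks))                   # 119 weekly observations
print(weeks[0].strftime("%A"))      # weekday of the first observation
print(weeks[-1] == end)             # True: the end date falls on the weekly grid
```

Both endpoints fall on the same weekday, so the stated date range and the 119-row count are consistent with one observation per week and no missing weeks.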
It can be assessed that there is no cyclic pattern, i.e. there is no seasonality in the data, which is further supported by the ACF plot generated with statsmodels.

Understanding the Time Series plot
The figures show the distribution of the data through a histogram and a box plot. It can be clearly seen that there are some outliers, which have to be dealt with.

Analysis of Trend and Seasonality
● The trend of the data has a zigzag pattern, i.e. it first increases, then decreases, and then increases again.
● The ACF plot of the data shows that there is no seasonality, as there are no repetitive spikes.

Feature Engineering
● Carried out feature engineering to create features such as quarter, month and year from the date, to get more accurate predictions:
    df["day_of_week"] = df["Date"].dt.dayofweek
    df["day_of_year"] = df["Date"].dt.dayofyear
    df["month_year"] = df["Date"].dt.to_period("M")
    df["quarter"] = df["Date"].dt.quarter
    df["year"] = df["Date"].dt.year

Analysis of distribution
● The figures show the trend across the different months and years. No general trend can be seen; the graphs show a zigzag pattern.

Model building using XGBoost
● Divided the data into X and Y variables and split it into train and test sets, with train size = 90 rows and test size = 24 rows.
    X variables = "day_of_year", "month_year", "quarter", "year"
    Y variable = "Selected Period"
● Used XGBoost to build the demand forecasting model, as it is known for its high predictive accuracy. It can capture complex relationships between demand and various factors, making it suitable for forecasting demand even where the patterns are intricate.
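The feature and split steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the author's notebook: the data is random placeholder data with 114 weekly rows (standing in for the series after outlier removal), the split is assumed to be chronological (the deck does not say how the rows were divided), and month_year is encoded as an integer because Period values are not numeric.

```python
import numpy as np
import pandas as pd

# Placeholder data: 114 weekly rows standing in for the series after outlier
# removal. The values are random and are NOT the author's data.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Date": pd.date_range("2011-12-26", periods=114, freq="W-MON"),
    "Selected Period": rng.integers(100, 1000, size=114),
})

# Calendar features named in the deck.
df["day_of_year"] = df["Date"].dt.dayofyear
df["quarter"] = df["Date"].dt.quarter
df["year"] = df["Date"].dt.year

# Assumption: encode month_year as an integer such as 201112 instead of a
# Period, so that every feature column is numeric.
df["month_year"] = df["Date"].dt.year * 100 + df["Date"].dt.month

features = ["day_of_year", "month_year", "quarter", "year"]
target = "Selected Period"

# Assumption: chronological split, first 90 rows for training and the
# remaining 24 for testing, so the test period lies entirely after training.
train, test = df.iloc[:90], df.iloc[90:]
X_train, y_train = train[features], train[target]
X_test, y_test = test[features], test[target]

print(X_train.shape, X_test.shape)  # (90, 4) (24, 4)
```

Splitting by position rather than at random keeps future weeks out of the training set, which is the usual precaution when the goal is to forecast ahead in time.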
Model building using XGBoost
● Conducted hyperparameter tuning to get more accurate predictions:
    from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
    from xgboost import XGBRegressor

    # Expanding-window folds that keep each validation block after its training data
    cv_split = TimeSeriesSplit(n_splits=4, test_size=10)
    model = XGBRegressor()
    # Candidate values searched exhaustively by the grid search
    parameters = {
        "max_depth": [3, 4, 6, 5, 10],
        "learning_rate": [0.01, 0.05, 0.1, 0.2, 0.3],
        "n_estimators": [100, 300, 500, 700, 900, 1000],
        "colsample_bytree": [0.3, 0.5, 0.7],
    }
    grid_search = GridSearchCV(estimator=model, cv=cv_split, param_grid=parameters)
    grid_search.fit(X_train, y_train)

Model Evaluation
MAE:-
MSE:-
MAPE:-
Because the dataset is very small (119 rows, with only 90 rows of training data left after removing outliers), the algorithm has too little data to learn the relationship between the variables and predict the output accurately. As a result, the actual and predicted values differ by a large amount, which explains the high error values.
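For reference, the three error measures listed above can be computed as in the sketch below. The function names and the three-point example are illustrative only; the numbers are made up to show the arithmetic and are not the model's results, which the deck does not report.

```python
def mae(actual, predicted):
    """Mean absolute error: average size of the errors, in the target's units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error: average squared error, which penalises large misses."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, because of the division.
    """
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up values purely to illustrate the calculation.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(f"MAE={mae(actual, predicted):.2f}")    # MAE=10.00
print(f"MSE={mse(actual, predicted):.2f}")    # MSE=166.67
print(f"MAPE={mape(actual, predicted):.2f}%") # MAPE=6.67%
```

MAE and MSE depend on the scale of the visit counts, while MAPE is relative, so MAPE is often the easiest of the three to compare across series of different sizes.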