
Do you know how to measure forecast accuracy using business criteria? #ForecastChallengeHowdazz

The main goal of the Forecast Challenge was to identify the company that produced the best sales forecasts for our client over an 8-week period in 2019, using more than 2 years of historical data. To evaluate the forecasts we received, we classified the item-store combinations according to their sales in units. Then we defined acceptance ranges in which a forecast was very good, acceptable, or unacceptable. Later, as a control, we added a very bad forecast range.

Different tolerances per acceptance range were defined for each sales group. For example, for an item that sells 520 units per year, equivalent to 10 units per week, a deviation of 10% in one period would be very good, because it would imply that a safety stock of roughly 10% is enough to cover sales. We considered an error of up to 30% acceptable, and an error greater than 30% unacceptable. Finally, we defined a forecast as very bad if the error exceeded 300%. The table below shows the acceptance ranges used in this case:
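The classification above can be sketched as a small function. This is an illustrative sketch only, assuming the 10% / 30% / 300% thresholds quoted for this sales group; in the actual challenge the thresholds varied per sales group.

```python
def acceptance_range(error_pct: float) -> str:
    """Classify an absolute forecast error (in %) into an acceptance range.

    Thresholds (10 / 30 / 300) are illustrative, taken from the example
    sales group above; each sales group had its own tolerances.
    """
    error_pct = abs(error_pct)
    if error_pct <= 10:
        return "very good"
    elif error_pct <= 30:
        return "acceptable"
    elif error_pct <= 300:
        return "unacceptable"
    else:
        return "very bad"

print(acceptance_range(8))    # very good
print(acceptance_range(25))   # acceptable
print(acceptance_range(120))  # unacceptable
print(acceptance_range(450))  # very bad
```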

We defined the forecast evaluation horizon based on our customer's lead times at different levels: some stores receive orders weekly and others every two weeks, and the suppliers' lead times also vary widely: some deliver the same day, others in one or two weeks, and others take two months. For this reason, and to guarantee good stock management in the stores and in the distribution center, we measured the quality of the participants' forecasts over 1 week, 3 weeks and 8 weeks.
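One way to evaluate a forecast over the three horizons mentioned above is to compare cumulative forecast against cumulative sales for the first 1, 3 and 8 weeks. The sketch below is hypothetical: the weekly numbers and the function name are illustrative, not the actual challenge data or pipeline.

```python
# Illustrative weekly data for one item-store combination (8 weeks).
forecast = [10, 12, 9, 11, 10, 10, 13, 9]
sales    = [11, 10, 10, 12, 9, 11, 10, 10]

def horizon_bias_pct(forecast, sales, weeks):
    """Cumulative bias in % over the first `weeks` periods."""
    f, s = sum(forecast[:weeks]), sum(sales[:weeks])
    return 100 * (f - s) / s

for weeks in (1, 3, 8):
    print(f"{weeks}-week bias: {horizon_bias_pct(forecast, sales, weeks):.1f}%")
```

Evaluating cumulative totals rather than averaging weekly errors matches the stock-management motivation: over a supplier's full lead time, under- and over-forecasts can offset each other.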

Following our previous example of the item with weekly sales of 10 units, we would find the situations shown in the table below:

The metric we used to evaluate forecast accuracy was the bias in percentage at item-store level, calculated with the following formula:

Bias % = (Forecast − Sales) / Sales × 100

When there are 0 sales, the formula shown above is undefined. Products with no sales in the period were therefore assessed in a separate group, using the bias formula in value instead of in percentage: Forecast − Sales.
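Both cases can be combined in one small helper. This is a sketch of the metric as described above (percentage bias normally, bias in units when sales are zero); the function name and return shape are ours, not the challenge's code.

```python
def bias(forecast: float, sales: float):
    """Return (value, unit): bias in % normally, bias in units when sales == 0."""
    if sales == 0:
        # Zero-sales group: bias in value, Forecast - Sales.
        return forecast - sales, "units"
    # Regular group: bias in percentage at item-store level.
    return 100 * (forecast - sales) / sales, "%"

print(bias(11, 10))  # 10% over-forecast
print(bias(5, 0))    # zero-sales group: 5 units over
```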

We counted the number of item-store combinations in each forecast acceptance range and evaluated multiple scenarios, weighting the 1-week forecast, the 3-week forecast, the 8-week global forecast, the degree of accuracy, and each item with different values according to its sales.

We developed multiple scenarios, giving different weights to the forecast evaluation criteria for each sales group. Finally, we chose the scenario whose criteria the client considered the most suitable for their business. The table below shows the ranking of the 15 companies participating in the forecast challenge for each assessment scenario:
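A weighted scenario of the kind described above can be sketched as follows. All numbers here are hypothetical: the counts, horizon weights and per-range scores are illustrative placeholders, since the actual weights were agreed with the client and are not published.

```python
# Counts of item-store combinations per acceptance range, per horizon
# (hypothetical data for one participant).
counts = {
    "1w": {"very good": 600, "acceptable": 250, "unacceptable": 120, "very bad": 30},
    "3w": {"very good": 550, "acceptable": 280, "unacceptable": 140, "very bad": 30},
    "8w": {"very good": 500, "acceptable": 300, "unacceptable": 160, "very bad": 40},
}

horizon_weights = {"1w": 0.5, "3w": 0.3, "8w": 0.2}  # hypothetical weights
range_scores = {"very good": 3, "acceptable": 1, "unacceptable": 0, "very bad": -2}

def scenario_score(counts, horizon_weights, range_scores):
    """Weighted score for one participant under one scenario; higher is better."""
    return sum(
        weight * sum(range_scores[rng] * n for rng, n in counts[horizon].items())
        for horizon, weight in horizon_weights.items()
    )

print(f"score = {scenario_score(counts, horizon_weights, range_scores):.1f}")
```

Ranking the participants is then just sorting them by this score; changing the weights or the per-range scores produces a different scenario column in the ranking table.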

The heatmap shows the ranking by scenario: the companies were consistently good, mediocre or bad across most of the evaluated scenarios. After presenting and reviewing the results with the client, an offer was requested from the 4 companies that obtained the best results according to the scenario in the first column of the previous table, which we consider the most appropriate one. The customer evaluated these offers and chose the company for their new replenishment solution.

Author: Casandra Cabrera
Publication date: 6 May 2021