How Much Data Is Good For Forecasting?
Data for forecasting
When a demand planner thinks about forecasting, the first thing that comes to mind is data. Planners commonly ask how much data is required for forecasting. In this article, we discuss the amount of data required for reliable forecasting. First, let us burst the bubble: there is no golden rule for the amount of data required. It mostly depends on the type of product and the noise in the data.
Product as a data driver
Let us discuss what we mean by type of product. It could mean the broad category or industry in which the product falls. If the product is a fashion product or a technology product such as a smartphone, laptop, or television, then the frequency of revisions is quite high because of competition, continuous R&D, and changing consumer fashion and lifestyle needs.
This means that the sales pattern of such a product changes quickly, so it is not a good idea to use a long history in time series forecasting. If you use a longer history, you unnecessarily incorporate sales behavior that is not truly representative of the current situation. It also makes the planner’s job tedious, because the forecast then has to be corrected to emphasize the recent trend. For the majority of such products, it is a good idea to use models that give more emphasis to recent history than to the distant past.
Statistical models such as exponential smoothing can easily do this.
Now let’s talk about products that have stable sales behavior and do not change frequently. This is typically the case where the product life cycle is long, so you don’t see much change in sales behavior: for example, machinery, tools, automotive products, or core apparel products.
For these products, it is good to use a longer history, three years or more, in which the demand planner can easily see stable demand behavior. Statistical models like ARIMA work well for these products, as the noise is low and the seasonal sales behavior is stable.
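To make the contrast between fast-changing and stable products concrete, here is a minimal sketch of simple exponential smoothing in plain Python. The sales figures, the function names, and the two smoothing factors (0.6 and 0.1) are illustrative assumptions, not values recommended by any particular planning system. A higher smoothing factor puts more weight on recent periods, which suits fast-changing products; a lower one averages over a longer history.

```python
def smoothing_weights(alpha, n_periods):
    """Weights simple exponential smoothing places on the last n_periods
    observations, most recent first: alpha * (1 - alpha) ** k."""
    return [alpha * (1 - alpha) ** k for k in range(n_periods)]


def exponential_smoothing(history, alpha):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = history[0]  # initialise the level with the first observation
    for value in history[1:]:
        # Blend the newest observation with the previous level.
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical monthly sales with a recent step change in demand.
sales = [100, 102, 98, 101, 99, 100, 140, 142, 139, 141]

fast = exponential_smoothing(sales, alpha=0.6)  # leans on recent months
slow = exponential_smoothing(sales, alpha=0.1)  # leans on the long-run history

print(f"alpha=0.6 forecast: {fast:.1f}")
print(f"alpha=0.1 forecast: {slow:.1f}")
print("weights, alpha=0.6:", [round(w, 3) for w in smoothing_weights(0.6, 4)])
```

With the higher factor, the forecast sits close to the new sales level, while the lower factor is still pulled toward the older history.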
Demand as a data driver
If the product being forecast is expected to show seasonal demand behavior, then we must use at least 2 years of data history for the forecast to be reliable using methods like exponential smoothing. This has a very logical explanation, and yet most organizations miss the point and expect a good forecast to be computed from much less history.
For your planning system to determine the seasonality of demand, it must have two data points for each season. For instance, the system does not understand what the Christmas season is, but when it sees sales going up at the same time every 12 months for two cycles, it detects seasonality in the monthly data and incorporates it in the model.
If it has only one Christmas as a data point, it cannot confirm this seasonality. Hence, to the extent possible, one should have at least 2 years of data history for good forecasts.
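The two-cycle requirement can be illustrated with a small check. The sketch below is a simplified stand-in for whatever a real planning system does internally: it correlates each month with the same month one year earlier, which is only possible once two full cycles exist. The function name and the sales numbers are hypothetical.

```python
from math import sqrt


def seasonal_correlation(series, period=12):
    """Pearson correlation between the series and itself `period` steps earlier.

    Raises ValueError when fewer than two full seasonal cycles are available,
    because a repeat of the pattern cannot be confirmed from one cycle.
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasonal cycles of history")
    earlier = series[:-period]
    later = series[period:]
    mean_a = sum(earlier) / len(earlier)
    mean_b = sum(later) / len(later)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(earlier, later))
    var_a = sum((a - mean_a) ** 2 for a in earlier)
    var_b = sum((b - mean_b) ** 2 for b in later)
    if var_a == 0 or var_b == 0:
        return 0.0  # flat series: no seasonal pattern to confirm
    return cov / sqrt(var_a * var_b)


# Hypothetical monthly sales with a December peak.
one_year = [90, 85, 95, 100, 105, 100, 95, 100, 110, 120, 150, 220]
two_years = one_year + [95, 88, 99, 104, 108, 103, 99, 105, 114, 126, 158, 231]

print(f"year-over-year correlation: {seasonal_correlation(two_years):.2f}")

try:
    seasonal_correlation(one_year)
except ValueError as err:
    print("one year only:", err)
```

A correlation close to 1 suggests the second year repeats the shape of the first; with a single year there is nothing to compare against.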
New products and locations
For new product introductions, initial sales can be very high due to launch promotions, and after a few months sales settle to a normal level. The same can happen with a new store launch, where the store may offer launch promotions. In such cases, it is important to discard the first few months or days, which are not a true representation of the product or location demand. New product forecasting methods such as forecasting by analogy are usually helpful here. If the analogy method is not used, then sound knowledge of the business becomes extremely important for the forecast.
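As a rough illustration of why the launch period should be discarded, the sketch below compares the average demand with and without the first few periods. The weekly figures and the choice of three launch weeks are made-up assumptions; in practice the cut-off comes from business knowledge of how long the promotion ran.

```python
from statistics import mean


def drop_launch_periods(history, launch_periods):
    """Return the history without its first `launch_periods` observations."""
    if launch_periods < 0:
        raise ValueError("launch_periods must be non-negative")
    return history[launch_periods:]


# Hypothetical weekly sales: three promoted launch weeks, then normal demand.
weekly_sales = [500, 420, 300, 120, 115, 125, 118, 122]

baseline_all = mean(weekly_sales)
baseline_trimmed = mean(drop_launch_periods(weekly_sales, 3))

print(f"average including launch weeks: {baseline_all}")
print(f"average excluding launch weeks: {baseline_trimmed}")
```

Including the launch weeks nearly doubles the apparent baseline in this example, which would bias any forecast built on it.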
A short available history may not be sufficient for your system’s expert selection to pick the right model for the product. However, you may choose the right model yourself by matching business knowledge with the available models. For instance, if we know that the product is seasonal but there is not enough historical data for the system to determine this, we may manually select a seasonal model such as Winters (Holt-Winters exponential smoothing). Note also that, as a rule of thumb, you should not use more than 7 years of history with time series methods, as it would not improve your forecast. Earlier data can be discarded in these cases to get a more reliable forecast.
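The 7-year cap mentioned above amounts to keeping only the most recent window of history before fitting a model. A minimal sketch, assuming monthly data and a hypothetical helper name:

```python
def recent_history(series, periods_per_year=12, max_years=7):
    """Keep at most `max_years` of the most recent observations."""
    if periods_per_year <= 0 or max_years <= 0:
        raise ValueError("periods_per_year and max_years must be positive")
    max_periods = periods_per_year * max_years
    return series[-max_periods:]


# Hypothetical: ten years of monthly observations, labelled 0..119.
ten_years = list(range(120))
window = recent_history(ten_years)

print(len(window))  # number of months kept
print(window[0])    # label of the oldest month kept
```

A series shorter than the cap is returned unchanged, so the helper is safe to apply to every product regardless of how much history it has.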