When demand planners think about forecasting, the first thing that comes to mind is data. A question they often ask is how much data is required for forecasting. This article discusses how much history is needed for a reliable forecast. First, let us burst the bubble: there is no golden rule for the amount of data required. It depends mostly on the type of product and the noise in the data.

Product as a data driver

Let us discuss what we mean by type of product. It could mean the broad category or industry into which the product falls. If the product is a fashion item or a technology product such as a smartphone, laptop or television, revisions are frequent because of competition, continuous R&D and changing consumer fashion and lifestyle needs. The sales pattern of such a product changes quickly, so it is not a good idea to use a long history in time series forecasting. A long history would bring in sales behavior that is not representative of the current situation, and the planner would then have the tedious job of correcting the forecast to put more emphasis on the recent trend. For the majority of such products it is better to use models that give more weight to recent history than to the distant past. Statistical models such as exponential smoothing do this easily.
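To make the idea of weighting recent history more concrete, here is a minimal sketch of simple exponential smoothing in plain Python. The function names, the smoothing factor values and the sales figures are made up for illustration; a real planning system would normally estimate the smoothing factor from the data rather than take it as an input.

```python
def exponential_smoothing(history, alpha):
    """Return the smoothed level after each observation (simple exponential smoothing)."""
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # assumption: start the level at the first observation
    levels = [level]
    for actual in history[1:]:
        # A higher alpha moves the level faster toward the latest actual.
        level = alpha * actual + (1 - alpha) * level
        levels.append(level)
    return levels


def observation_weights(n, alpha):
    """Weight each of n observations carries in the final level, oldest first."""
    weights = [alpha * (1 - alpha) ** age for age in range(n - 1)]  # age 0 = latest
    weights.append((1 - alpha) ** (n - 1))  # remaining weight sits on the first point
    return list(reversed(weights))


# Illustrative monthly sales with a recent step up (made-up numbers).
sales = [100, 100, 100, 160, 160, 160]
slow = exponential_smoothing(sales, alpha=0.1)[-1]  # leans on the distant past
fast = exponential_smoothing(sales, alpha=0.5)[-1]  # leans on recent months
print(slow, fast)
print(observation_weights(len(sales), alpha=0.5))
```

With the larger smoothing factor the final level sits much closer to the recent sales of 160, which is the behavior wanted for fast-changing products. The weights show why: each step back in time multiplies an observation's influence by (1 - alpha).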

Now let’s talk about products that have stable sales behavior and do not change frequently. These are typically products with a long life cycle, for example machinery, tools, automotive products or core apparel. For these products it is good to use a longer history, three years or more, in which the demand planner can easily see stable demand behavior. Statistical models like ARIMA work well for these products because the noise is low and the seasonal sales behavior is stable.

Demand as a data driver

If the product being forecast is expected to show seasonal demand, we must use at least 2 years of history for a forecast from methods like exponential smoothing to be reliable. The explanation is quite logical, yet most organizations miss the point and expect a good forecast to be computed from much less history.

For your planning system to determine the seasonality of demand, it must have at least two data points for each season. The system does not understand what the Christmas season is, but when it sees sales going up at the same time every 12 months for two cycles, it detects seasonality in the monthly data and incorporates it into the model. If it has only one Christmas as a data point, it cannot confirm this seasonality. Hence it is recommended to have, as far as possible, at least 2 years of history for a good forecast.
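The two-cycle requirement can be illustrated with a small sketch. The check below compares the latest 12 months with the 12 months before them and reports how closely the two cycles move together. This is only one simple way to look for a repeating pattern, not how any particular planning system detects seasonality, and the function name and the sales figures are hypothetical.

```python
from statistics import mean


def seasonal_correlation(monthly_sales, period=12):
    """Pearson correlation between the last two full cycles.

    Returns None when fewer than two full cycles are available, because a
    repeating pattern cannot be confirmed from a single cycle.
    """
    if len(monthly_sales) < 2 * period:
        return None
    previous = monthly_sales[-2 * period:-period]
    latest = monthly_sales[-period:]
    mean_prev, mean_last = mean(previous), mean(latest)
    cov = sum((p - mean_prev) * (l - mean_last) for p, l in zip(previous, latest))
    var_prev = sum((p - mean_prev) ** 2 for p in previous)
    var_last = sum((l - mean_last) ** 2 for l in latest)
    if var_prev == 0 or var_last == 0:
        return 0.0  # a flat cycle shows no pattern to repeat
    return cov / (var_prev * var_last) ** 0.5


# Made-up monthly sales with a peak in November and December of each year.
year_one = [100] * 10 + [150, 300]
year_two = [110] * 10 + [160, 320]

print(seasonal_correlation(year_one))             # one cycle only: returns None
print(seasonal_correlation(year_one + year_two))  # two cycles: close to 1
```

With only one year the function has nothing to compare the Christmas peak against, so it returns None. With two years the peaks line up and the correlation is high.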

New products and locations

For new product introductions, initial sales can be very high because of launch promotions, and after a few months sales settle to a normal level. The same can happen with a new store launch, where the store may offer launch promotions. In such cases it is important to discard the first few months or days, which are not a true representation of the product or location demand. New product forecasting methods such as analogy are usually helpful here.
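As a rough illustration of the analogy idea, the sketch below borrows the life-cycle curve of a similar, already established product and scales it to the level the new product has sold at so far. This is one simple variant under stated assumptions: the analog product is chosen by the planner, both histories start at launch, and the function name and figures are hypothetical.

```python
def analogy_forecast(analog_history, new_history, horizon):
    """Scale an analogous product's post-launch curve to the new product's level.

    analog_history: sales of the similar product, starting at its launch.
    new_history:    sales of the new product observed so far, from its launch.
    horizon:        number of future periods to forecast.
    """
    observed = len(new_history)
    if observed == 0:
        raise ValueError("need at least one observed period for the new product")
    if len(analog_history) < observed + horizon:
        raise ValueError("analog history is too short for the requested horizon")
    analog_total = sum(analog_history[:observed])
    if analog_total == 0:
        raise ValueError("analog product has no sales in the comparison window")
    # Compare the same life-cycle stage of both products, launch spike included.
    scale = sum(new_history) / analog_total
    return [scale * value for value in analog_history[observed:observed + horizon]]


# Made-up figures: the analog sold 500 and 300 during launch, then settled near 200.
analog = [500, 300, 200, 200, 210, 190]
new_product = [250, 150]  # two launch months observed so far

print(analogy_forecast(analog, new_product, horizon=3))
```

Because the scale factor is computed over the same launch months for both products, the launch spike is compared like with like, and the forecast follows the analog's normal level rather than the new product's inflated launch sales.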

If the analogy method is not used, sound knowledge of the business becomes extremely important for the forecast. A short history may not be sufficient for your system’s expert selection to pick the right model for the product. However, you may choose the right model yourself by matching business knowledge with the available models. For instance, if we know that the product is seasonal but there is not enough history for the system to determine this, we may manually select a seasonal model such as Winters.

It is important to note that you should not use more than 7 years of history with time series methods, as it would not improve your forecast. Earlier data can be discarded in these cases to get a much more reliable forecast.