The Problem with Negative Values in Time-Series Forecasting
There is an old expression that applies wherever input data is turned into actionable information: ‘Garbage In, Garbage Out’. The same axiom applies to time-series forecasting. Time-series forecasting rests on the tenet that a mathematical model which fits the pattern of the historical time-series (customer demand or shipment values for past time periods) can be extended beyond the current period and used as a forecast of demand that has yet to be realized. If the historical time-series used to create a forecast is faulty, then the resultant forecast is likely to be corrupted (Garbage In, Garbage Out).
Demand planners today benefit from a significant body of clever work in the form of pattern-fitting models that use rather sophisticated mathematics to fit historical time-series for the purpose of ‘forecasting’ the future. These models, coupled with lightning-fast computing power and enabling tools that can quickly find the model that best fits any given time-series, give the demand planner the power to create more accurate forecasts than ever before.
However, nothing will subvert the veracity of a forecast model like embedded negative time-series values. Indeed, when negative time-series values encroach upon a time-series, they truly become the bane of most time-series forecasting techniques.
In last week’s article, we discussed the notion of Intermittent Forecasting techniques. Upon closer inspection, you will notice that the negative values in the time-series that was the focus of that article were subtly absorbed into the gross average calculated by Croston’s method, and that the time periods in which the negative values appeared were then ignored when the final forecast was calculated (by spreading the total series demand over the non-negative observations). This approach pushes the problem downstream and sub-optimizes the forecast. If you are not convinced, take a simple model, level exponential smoothing for example, and manually calculate the fitted forecast on a time-series containing negative values. See what happens to the model when the negative values are encountered.
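For readers who would rather not do that calculation by hand, here is a minimal sketch in Python. The demand values and the smoothing constant (alpha = 0.3) are made up for illustration and are not taken from the series discussed in this article; the function name ses_fitted is likewise hypothetical, and the level is simply initialized to the first observation.

```python
def ses_fitted(series, alpha=0.3):
    """Level (simple) exponential smoothing.

    Returns a list one element longer than `series`: element t is the
    forecast made for period t, and the last element is the forecast for
    the first period beyond the history. The level is initialized to the
    first observation (an assumption; other initializations exist).
    """
    forecasts = [float(series[0])]
    for actual in series:
        # New forecast = alpha * latest actual + (1 - alpha) * prior forecast
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


# Illustrative demand history with one negative period (a large return).
demand = [10, 12, -40, 11, 9, 10]
fitted = ses_fitted(demand, alpha=0.3)

for period, forecast in enumerate(fitted):
    actual = demand[period] if period < len(demand) else "n/a"
    print(f"period {period}: actual={actual}, forecast={forecast:.2f}")
```

In this made-up series, the forecast for the period immediately after the negative observation is itself negative, and the forecasts for the following periods stay well below the typical demand of about 10 units while the model slowly recovers.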
Negative Time-Series Forecasting Values
For our discussion, we are going to use the following time-series. This is a screen print from the PLANAMIND demand planning statistical model optimizing tool. PLANAMIND is one such enabling tool that is affordable and can quickly evaluate a multitude of models and associated parameters to arrive at a mathematical model that best fits the historical time-series observations.
Figure 1. Image source: Planamind
If you remember from last week’s article, I split the Intermittent Demand time-series into two obvious components. The first was the sub-series consisting of the non-positive historical observations (see the red circle in the graph in Figure 1 above). The second sub-series consisted of the observations with positive demand values (see the blue circle in the graph in Figure 1 above).
This week I would like to focus on those negative observations, indicated by the black arrows in the figure above. If the time-series consists of customer order demand whose timing is based upon the ‘customer’s desired delivery date’, negative values are usually caused by returns in a given time period that exceed the regular orders for the SKU in that period. On the other hand, if the time-series consists of customer shipments, negative values can be caused by, for example, a return of product on May 1st against an order that was shipped on April 30th. My contention is that in this case the demand collection logic should be modified to associate the return with the appropriate order in its original time frame. This association would then cause the original demand to be decremented. It is only common sense that if one purchases 100 units of a product in February and returns 40 units from that order in May, the ‘real’ demand for that order (customer, product and location) should be adjusted to 60 in February (i.e. 100 - 40). It is easy to let the demand collection system simply post the return in the period in which it arrives, but that does not make it right. Being easy does not mean something is the correct thing to do for your organization. This misattribution of demand or shipments affects not only the level of demand but also its timing (see last week’s article entitled ‘Forecasting Intermittent Demand’).
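The February/May example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (an order id mapped to a period and quantity, and returns that carry the id of the order they belong to), the order ids, the second order of 30 units in May, and the function names are all hypothetical. A real demand collection system would also key on customer, product and location.

```python
from collections import defaultdict


def post_returns_when_received(orders, returns):
    """The 'easy' approach: a return reduces demand in the period it arrives."""
    demand = defaultdict(int)
    for period, quantity in orders.values():
        demand[period] += quantity
    for _order_id, return_period, quantity in returns:
        demand[return_period] -= quantity
    return dict(demand)


def recast_to_original_order(orders, returns):
    """Recast approach: a return decrements the order it came from,
    in that order's original period."""
    net_quantity = {oid: qty for oid, (_period, qty) in orders.items()}
    for order_id, _return_period, quantity in returns:
        net_quantity[order_id] -= quantity
    demand = defaultdict(int)
    for order_id, (period, _qty) in orders.items():
        demand[period] += net_quantity[order_id]
    return dict(demand)


# Hypothetical data: 100 units ordered in February, 40 of them returned in
# May, plus an unrelated order of 30 units in May.
orders = {"A1": ("Feb", 100), "A2": ("May", 30)}
returns = [("A1", "May", 40)]

as_received = post_returns_when_received(orders, returns)
recast = recast_to_original_order(orders, returns)

print("posted when received:", as_received)
print("recast to original order:", recast)
```

With these made-up numbers, posting the return when it is received leaves February at 100 and pushes May to -10, while the recast series shows 60 in February and 30 in May. Both series sum to the same total; only the level and timing per period differ.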
Conclusion and Further Thoughts
This article has focused primarily on negative time-series values. What we are suggesting is that, instead of accepting this anomaly, it is worth the effort to recast historical demand so that the appropriate time-series value is decremented to account for a return or cancelation. The cost of making this association and recasting the FORECASTABLE time-series will be recovered through increased forecast accuracy, which leads to better customer service with less working capital invested in safety stock. This is not merely conceptual: I have been involved in such improvement projects, and they added value to the organization far beyond the inconvenience cost of recasting the time-series.
Please note that I have specifically said it is the FORECASTABLE time-series that should be recast; I am not suggesting modifying more permanent organizational variables.
Finally, negative values are not the only problem that can produce erroneous time-series values; they just happen to be the most obviously incorrect. In subsequent articles, we can discuss the other ways in which a time-series can be corrupted. One example that comes to mind is that of a customer who orders multiple times to ensure supply and then cancels any remaining balances upon receipt of adequate supply. This practice further degrades the accuracy of the time-series by inflating the time-series values in one period and potentially decrementing the time-series value in a different (incorrect) period. While the result, in this case, may not manifest itself in negative time-series values, both the level and the timing of the customer’s true requirements have been corrupted. Consider the disruption caused to the customer when forecasts are based on data that does NOT reflect the correct level of requirements and that mistimes those requirements.
As always, we at Anamind seek your thoughts and opinions on the foregoing. Have you ever encountered this situation? If so, how have you mitigated this problem and how did it improve customer service? Did you find the destination (an accurate time-series) worth the journey? Thank you for your consideration.