Forecasting Intermittent Demand – A Proposal

Arguably, SKUs with Intermittent Demand are some of the most challenging to forecast with any modicum of accuracy.  We generally define Intermittent Demand as follows:

An Intermittent Demand series has demand appearing at random, with many time periods showing no demand at all. The prominent characteristics of such a series are:

  1. The time-series contains embedded zeroes
  2. The time-series does NOT exhibit any seasonal behavior

I have struggled for many years to find an alternative to the conventional approaches.  As far back as the 1990s, I began reading about machine learning, particularly neural networks.  The field was in its infancy relative to today, largely because the computing power such techniques require was not yet available.  Even so, I felt that we might eventually be able to leverage machine learning to forecast difficult time-series.  The thoughts that I lay out in this article were not well received at the time.  I do not know whether that was because of the novelty of machine learning and the lack of computing power to produce adequate results, or because it was simply a poor idea.

We at the Anamind Consulting Company are constantly seeking to leverage any and all technologies that will yield superior results in demand planning for our clients.  To that end, I am documenting this proposal, not as a dogmatic approach, but for the purposes of starting a discussion in the demand planning community about this potential approach.  I will begin with a review of the current status of Intermittent Demand time-series forecasting and then summarize my thoughts in the second half of the article.  Again, these ideas are only summarized and there is room for suggested improvements.  Of that, I am very certain.

Current State of Forecasting SKUs with Intermittent Demand

Croston’s Intermittent Demand model is a favorite of today’s automated demand planning tools for those SKUs whose history contains apparently randomly distributed non-positive demand values.  Croston’s model smooths the sizes of the positive demands and the intervals between them separately, then divides the former by the latter.  The effect is much like spreading the positive demand over all past time periods, taking that average and extending it into the future.  This model has stood the test of time, and it does a good job of smoothing highly variable demand for the upstream supply chain processes of production scheduling and distribution.
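
To make the mechanics concrete, here is a minimal Python sketch of the textbook form of Croston's method.  It is not the Planamind implementation; the function name, the smoothing constant of 0.1 and the sample history are all assumptions chosen for illustration.

```python
def croston(demand, alpha=0.1):
    """Return a Croston-style level forecast (demand per period).

    demand: list of per-period demand; values <= 0 count as "no demand".
    alpha:  smoothing constant (an assumed default, not from the article).
    """
    size = None        # smoothed size of the positive demands
    interval = None    # smoothed number of periods between positive demands
    periods_since = 1  # periods elapsed since the last positive demand
    for d in demand:
        if d > 0:
            if size is None:
                # Initialise both estimates from the first positive demand.
                size = float(d)
                interval = float(periods_since)
            else:
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0  # no positive demand has been observed yet
    return size / interval


history = [0, 0, 6, 0, 0, 6, 0, 0, 6]  # hypothetical SKU: 6 units every third period
print(croston(history))  # 2.0
```

Note that the estimates are only updated in periods with positive demand, so the result is a single level value that is repeated for every future period.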

However, at the end of the day, we have a constant forecast that, by construction, will show high period-to-period error when compared to the actual demand values.  This drives inventory up in order to maintain an adequate level of customer service.  This is usually not a large problem for companies, as these SKUs tend to be in the minority and their average demands are small.  That said, I have often wondered whether there might be a better way to forecast these SKUs.

Forecasting the Intermittent Demand Series – Another Thought

For our discussion, we are going to use the following time-series.  This is a screen print from the PLANAMIND Demand Planning Statistical Model Optimizing Tool.

Figure 1: Planamind Tool Sample Forecast for Intermittent Demand (Source: Planamind)

Intermittent Demand Time-Series Forecasting Proposal

This proposal splits the Intermittent Demand time-series into two obvious components.  The first one is the sub-series consisting of the non-positive historical observations (see the red circle in the graph in Figure 1 above).  Conversely, the second sub-series would be those observations consisting of positive demand values (see the blue circle in the graph in Figure 1 above).
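
The split described above can be sketched in a few lines of Python.  The function name and the sample history are hypothetical; the only rule taken from the proposal is that non-positive observations form one component and positive observations form the other.

```python
def split_intermittent(demand):
    """Split a demand history into an occurrence pattern and a positive sub-series.

    Returns (occurrence, positives):
      occurrence: 1 where demand was positive, 0 where it was zero or negative
      positives:  the positive demand values, in their original order
    """
    occurrence = [1 if d > 0 else 0 for d in demand]
    positives = [d for d in demand if d > 0]
    return occurrence, positives


history = [0, 12, 0, 0, -3, 7, 0, 20]  # hypothetical monthly demand
occurrence, positives = split_intermittent(history)
print(occurrence)  # [0, 1, 0, 0, 0, 1, 0, 1]
print(positives)   # [12, 7, 20]
```

The occurrence list keeps the timing information, while the list of positives keeps the quantities, so each can be forecast with a different technique.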

Forecasting the Non-Positive Component

Typically, the forecasting challenge with a time-series such as this is those pesky non-positive values.  To the naked eye they appear random.  However, my 20-year-old thought was to eliminate the negative demand values and treat them as zeroes.  As a parenthetical, negative demand values should make up only a small share of the total demand.  Moreover, these values are usually caused by returns in a given time period exceeding the regular orders for the SKU.  My contention is that in this case, the demand collection logic should be modified to associate the return with the appropriate order in its original time frame.  This association would then cause the original demand to be decremented.  It is only common sense that if one purchases 100 units of a product in February and returns 40 from that order in May, the ‘real’ demand for that order (customer, product and location) should be adjusted to 60 (i.e. 100-40).  Allowing the 100 units to remain in February and a -40 to be ‘booked’ in May is misleading and will cause forecast inaccuracies and supply chain problems.  While it is easy to allow the demand collection system to operate in this way, it is not right.  Being easy does not make something ‘right’ (correct).  I managed the demand system in a multi-billion dollar organization that had this demand collection problem.  Sure, it was difficult to fix, but we fixed it, and forecast accuracy increased measurably.
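
As a rough illustration of that netting logic, the Python sketch below ties each return to its original order and books the net quantity in the order's period.  The record layout (order id, period, quantity) and the function name are assumptions made for the example, not a description of any particular demand collection system.

```python
from collections import defaultdict


def net_demand(orders, returns):
    """Net each return against the period of the order it belongs to.

    orders:  list of (order_id, period, quantity)
    returns: list of (order_id, quantity_returned)
    Returns a dict mapping period -> net demand.
    """
    returned = defaultdict(int)
    for order_id, qty in returns:
        returned[order_id] += qty

    demand = defaultdict(int)
    for order_id, period, qty in orders:
        # Never let returns push an order's net demand below zero.
        demand[period] += max(qty - returned[order_id], 0)
    return dict(demand)


orders = [("A-1", "Feb", 100)]
returns = [("A-1", 40)]  # physically returned in May, but tied to the February order
print(net_demand(orders, returns))  # {'Feb': 60}
```

With this approach May shows no negative demand at all, and February carries the 60 units that the customer actually kept.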

Having said all that, I am suggesting that this portion of the demand series now consists of a sequence of time-sensitive zero values, and that we can apply powerful machine intelligence routines (e.g. a neural net) to ‘tease out’ a pattern of subsequent zero demand values for the future time series.
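
Before investing in a neural net, one could check whether there is any pattern to find.  The Python sketch below is a much simpler stand-in, not a neural net: it counts how often demand occurred given the number of periods since the previous demand.  The function name and the sample pattern are hypothetical.

```python
from collections import defaultdict


def occurrence_rate_by_gap(occurrence):
    """Estimate the share of periods with demand, by periods since the last demand.

    occurrence: list of 0/1 flags, 1 meaning positive demand in that period.
    Returns a dict mapping gap length -> observed share of periods with demand.
    Periods before the first demand are skipped because their gap is unknown.
    """
    hits = defaultdict(int)
    trials = defaultdict(int)
    gap = None  # periods since the last demand; unknown until the first demand
    for flag in occurrence:
        if gap is not None:
            trials[gap] += 1
            hits[gap] += flag
        if flag:
            gap = 1
        elif gap is not None:
            gap += 1
    return {g: hits[g] / trials[g] for g in sorted(trials)}


pattern = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]  # hypothetical: demand every third period
print(occurrence_rate_by_gap(pattern))  # {1: 0.0, 2: 0.0, 3: 1.0}
```

If the rates come out roughly equal for every gap length, the zeroes are probably close to random and a more elaborate pattern-seeking routine is unlikely to help.  Rates that differ sharply by gap, as in this artificial example, suggest there is timing structure worth modeling.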

Forecasting the Positive Component

If we are able to use machine intelligence to lay in the pattern of non-positive demand points, then the demand planner will have a significant number of options at her disposal to calculate the forecast for the positive component.  Remember one of the characteristics of the Intermittent Demand time-series is that: The time-series does NOT exhibit any seasonal behavior.  Therefore, we could apply a non-seasonal exponential model (either NN or LN) to the remaining points.  Another option would be to use one of the simpler models like the Simple Moving Average (SMA) or even the Same as Last Year (SALY) or SALY with growth if growth can indeed be detected.  The point here is that there are several tools in the demand planner’s tool-kit for solving this part of the problem.
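
As a sketch of how that choice could be automated, the Python below fits a few Simple Moving Average windows and a few level-only exponential smoothing constants to the positive values and keeps whichever has the lowest mean absolute deviation (MAD) on one-step-ahead forecasts.  The candidate windows, the smoothing constants and the sample data are assumptions for illustration only.

```python
def sma(window):
    """Simple moving average over the last `window` values."""
    def forecast(history):
        recent = history[-window:]
        return sum(recent) / len(recent)
    return forecast


def ses(alpha):
    """Simple (level-only) exponential smoothing with constant `alpha`."""
    def forecast(history):
        level = history[0]
        for value in history[1:]:
            level += alpha * (value - level)
        return level
    return forecast


def mad(values, forecast):
    """Mean absolute deviation of one-step-ahead forecasts (needs 2+ values)."""
    errors = [abs(values[t] - forecast(values[:t])) for t in range(1, len(values))]
    return sum(errors) / len(errors)


def best_model(values, candidates):
    """Return the (name, forecast function) pair with the lowest MAD."""
    return min(candidates.items(), key=lambda item: mad(values, item[1]))


positives = [12, 15, 11, 14, 13, 16, 12, 15]  # hypothetical positive demands
candidates = {f"SMA({w})": sma(w) for w in (1, 2, 3, 4)}
candidates.update({f"SES({a})": ses(a) for a in (0.1, 0.3, 0.5)})

name, model = best_model(positives, candidates)
next_positive = model(positives)  # forecast for the next positive demand
print(name, round(next_positive, 2))
```

The resulting value would be placed only in those future periods where the pattern step predicts demand; all other future periods would carry a forecast of zero.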


While this article in no way attempts to disparage the brilliant work that has been done by Croston, we at Anamind are always seeking better ways to fit the forecast as closely as possible to actual demand, with the proper lead time, to improve performance in the Supply Chain.  Croston’s will do a fine job of providing a level forecast that builds inventory for the peaks of demand and maintains lower levels of inventory for those periods in which no demand occurs.  However, a more accurate forecast on these time-series will further reduce the working capital invested in safety stock.

I will close with a quote from Charles Chase, a very respected demand planning expert:

‘The objective of demand-driven forecast is to predict unconstrained demand as accurately as possible.  This would include predicting the peaks and valleys that resonate in true demand.  When we smooth the forecasts, we normally overlook the peaks and valleys.’

Charles Chase, “Demand-Driven Forecasting – A Structured Approach to Forecasting”, p. 142

I am excited to hear from you on improvements to these thoughts.  I thank you for reading this article.


  1. James Anderson says:

    It is a really great idea, Sir, but I have a doubt about implementing it. Let’s say I’m working with monthly sales data with embedded zeros or negative sales for 4 months.
    In this case, how do I build a smoothing model with just 8 periods of non-zero values?

    1. Hello Mr. Anderson:

      Yours is a very good scenario to consider. I would suggest that you could do a simple level exponential smooth or maybe even test for trend over those 8 periods. You could also take an easy way out until you gained more history by applying the simple moving average over 1 through 8 periods to determine which best fits the demand.
      Remember that your question is LOGICALLY EQUIVALENT to asking ‘what forecasting model could I use to forecast 8 non-zero periods of demand, albeit that demand has a bit of a gap in it filled by four periods of zero demand’. There are many tools, PLANAMIND(c) being one of them, that can handle the situation of finding an optimum model for eight periods of non-zero demand.

      Again, thank you for the question. I am encouraged by your creative thought process. I am hoping that other people opine to better frame the problem and potentially create a more creative solution.

      Steve Miller

  2. Terry Harris says:

    Interesting. I feel, however, that the improvement you’ll make in better forecasts is marginal and can be overwhelmed by the improvement you could make in setting Safety Stocks–a primary purpose in forecasting.

    To this end you might look at

    Thanks for your thinking on this subject.

    1. Hello Terry:

      Thank you for your comments. One thing that I have feared about my proposal is that the zeroes, for which we are trying to ‘tease out’ a pattern using machine pattern recognition, may simply be random. If that is the case, then we may exacerbate the forecast error, thereby increasing safety stock. However, if I can directly respond to your statement, you made the concession that we may marginally improve the forecast error (MAD). If that is the case, and the organization becomes confident in the ability of the routine to track demand more closely, then we should be able to reduce safety stock. (Please see our article entitled ‘Leveraging Demand Planning for Improved Supply Chain Performance’.)

      While likely not the best solution to such variable demand, the notion of a level forecast automatically builds safety stock into the system that we, as demand planners, could reduce if it would be possible to create an algorithm or composite of algorithms to more closely fit this highly variable future demand.

      Thank you sincerely for your comments,
      Steve Miller

  3. If you don’t discard the residuals from a smoothing operation, you will be able to track the peaks and troughs. The problem is the analyst, not the data.

    How do you know intermittent data are NOT seasonal? The sparseness in the data may hide the pattern, but looking at the root source/causes of the demand should tell you whether the intermittent demand could be seasonal or not.

    Croston may have been the first to develop a forecasting method for intermittent data, but the idea was not brilliant; in fact, it has been shown that there is no model for which the Croston method is an optimal solution.

    1. Dear Mr. Levenbach:

      Thank you greatly for your comments. I am hoping that I capture the essence of your thoughts in my reply. You are absolutely correct vis-a-vis the residuals in a smoothing situation.

      As far as the potential seasonality that may exist in the sparse data, that may be so; however, by its very definition (i.e. intermittent demand), it will be difficult to use any time-series technique to determine that with so few data points. That is why I am proposing a more sophisticated machine learning/pattern seeking algorithm. That said, as my article mentioned, I am not content with the solution because I am a bit dubious of the veracity of such algorithms on such sparse data. Your following statement:

      “The sparseness in the data may hide the pattern, but looking at the root source/causes of the demand should tell you whether the intermittent demand could be seasonal or not.”

      is exactly ON THE TARGET. The assumption in my thought/proposal was that I had no knowledge of any causative variables (I did not state that directly, so I can see why that may have led to some confusion).

      However, causative modeling will always be superior to time-series forecasting if there is a true cause and effect between an independent variable and the demand that is generated. If such a variable was extant (or identifiable) and the relationship quantifiable, then I would definitely advocate for using that over what I have suggested.

      You are correct about the simplicity of Croston’s. It is really a large smoothed average of all non-zero data evenly distributed over all periods in the time-series. That, by definition, builds unnecessary inventory (in my opinion) and why I seek to find a way to narrow the gap between forecast and actual demand in these situations.

      Thank you for your insightful comments,

  4. Don Johnston says:

    Intermittent demand certainly complicates demand forecasting. Croston’s algorithm is the best known one for handling it. However, it is not without problems. One is that it produces biased estimates of the demand rate. If demand were to stop completely and never resume then the demand rate estimate would never decrease after that. The bias can be compensated for but that does not really deal with the problem fully. For fast moving items, Croston’s algorithm is the same as ordinary exponential smoothing. One problem with that technique is that it tends to result in over-ordering (See

    I have developed a modified version of Croston’s algorithm and called it the “Croston Medium algorithm”. I designed it to deal with the main shortcomings of Croston’s algorithm. I described it briefly in I have since improved it and will write a blog post in that regard in

    In your article, I see the log-normal distribution mentioned. I am wondering if you used that distribution for generating the time series. Negative binomial is a much better model for demands, especially if the demands are intermittent. For fast moving items, that distribution can be approximated by a gamma distribution.

    1. Hello Mr. Johnston:

      I wanted to reply quickly to your comments. I have not had a chance to review your referenced web-sites, but I will do that.

      Thank you for pointing out some weaknesses relative to Croston’s. I would like to respond to your following statement:

      “However, it is not without problems. One is that it produces biased estimates of the demand rate. If demand were to stop completely and never resume then the demand rate estimate would never decrease after that.”

      I just wanted to clarify that Planamind, our company’s demand planning enabling engine, recalculates and optimizes (based on minimum MAD) the forecast model to use for each time-series in an organization. In your scenario when the demand would stop, if Croston’s continues to be chosen as the optimal model then the fixed positive demand would be spread over ever increasing forecasting periods. The zero demand would not contribute to the total non-zero demand values in the historical time series, but the number of time periods would continue to increase, thereby reducing the level forecast each month.

      The log-normal distribution was not used to generate the historical time-series; this was actual historical information that we encountered in our business on a given SKU. The product and organization, of course, have been changed to protect our client.

      One final note, we have found that negative binomial distribution is effective at forecasting time-series with discrete demand and small unit demand values.

      Again, thank you for your comments. We appreciate the thoughts and are interested to view your contribution to the intermittent demand debate 🙂

      With sincere gratitude,
      Steve Miller
