Predictive modeling is a process used in the field of predictive analytics to create a statistical model of future behavior. Demand forecasting models are used to forecast the future sales demand of items as a function of past demand data. One challenge in large-scale demand forecasting is to plan a forecasting strategy that minimizes the forecast error. Improving the accuracy and efficiency of demand forecasting processes can improve overall sales and operational planning effectiveness. Further, such improvements may reduce the computational cost and increase the accuracy of a generated forecast model.
Hierarchy information used to generate model forecasts from time series often reflects planning purposes instead of modeling purposes. Focusing on planning aspects may make it easier to understand and manage the data, but might not be adequate for modeling demand in a time series. For example, a planning hierarchy may group together multiple time series having different features (e.g., a time series with a peak in spring and a time series with a peak in fall). Additionally, or alternatively, a planning hierarchy may group a time series with small variance with another time series having more volatility. In either example, such groupings are not ideal for building a forecasting model because the grouped time series exhibit different patterns in the data.
In accordance with the teachings provided herein, systems and methods are provided for improving the accuracy and the efficiency of demand forecasting processes.
For example, a computer-program product tangibly embodied in a non-transitory machine-readable storage medium is provided that includes instructions that can cause a data processing apparatus to receive a plurality of time series included in a forecast hierarchy, each individual time series of the plurality of time series comprising one or more demand characteristics and a demand pattern for an item, the one or more demand characteristics including at least one of a demand lifecycle, an intermittence, or a seasonality, the demand pattern indicating one or more time intervals for which demand for the item is greater than a threshold value. The instructions can further cause the data processing apparatus to, for each time series of the plurality of time series, determine a classification for the individual time series based on the one or more demand characteristics, determine a pattern group for the individual time series by comparing the demand pattern to demand patterns of other time series in the plurality of time series, and determine a level of the forecast hierarchy at which the individual time series comprises an aggregate demand volume greater than a threshold amount. The instructions can further cause the data processing apparatus to generate an additional forecast hierarchy using the first forecast hierarchy, the classification, the pattern group, and the level. The instructions can further cause the data processing apparatus to provide, to a user of the computer-program product, forecast information related to at least one time series of the plurality of time series based on the additional forecast hierarchy.
In another example, a computer-implemented method is provided that includes receiving a plurality of time series included in a forecast hierarchy, each individual time series of the plurality of time series comprising one or more demand characteristics and a demand pattern for an item, the one or more demand characteristics including at least one of a demand lifecycle, an intermittence, or a seasonality, the demand pattern indicating one or more time intervals for which demand for the item is greater than a threshold value. The method further includes, for each time series of the plurality of time series, determining, by a computing device, a classification for the individual time series based on the one or more demand characteristics, determining, by the computing device, a pattern group for the individual time series by comparing the demand pattern to demand patterns of other time series in the plurality of time series, and determining, by the computing device, a level of the forecast hierarchy at which the individual time series comprises an aggregate demand volume greater than a threshold amount. The method further includes generating, by the computing device, an additional forecast hierarchy using the first forecast hierarchy, the classification, the pattern group, and the level, where utilizing the additional forecast hierarchy generates more accurate demand forecasts than demand forecasts generated utilizing the first forecast hierarchy. The method further includes providing, to a user of the computer-program product, forecast information related to at least one time series of the plurality of time series based on the additional forecast hierarchy.
In another example, a system is provided that includes a processor and a non-transitory computer-readable storage medium containing instructions that, when executed on the processor, cause the processor to perform operations. The operations include receiving a plurality of time series included in a forecast hierarchy, each individual time series of the plurality of time series comprising one or more demand characteristics and a demand pattern for an item, the one or more demand characteristics including at least one of a demand lifecycle, an intermittence, or a seasonality, the demand pattern indicating one or more time intervals for which demand for the item is greater than a threshold value. The operations further include, for each time series of the plurality of time series, determining a classification for the individual time series based on the one or more demand characteristics, determining a pattern group for the individual time series by comparing the demand pattern to demand patterns of other time series in the plurality of time series, and determining a level of the forecast hierarchy at which the individual time series comprises an aggregate demand volume greater than a threshold amount. The operations further include generating an additional forecast hierarchy using the first forecast hierarchy, the classification, the pattern group, and the level, where utilizing the additional forecast hierarchy generates more accurate demand forecasts than demand forecasts generated utilizing the first forecast hierarchy. The operations further include providing, to a user of the computer-program product, forecast information related to at least one time series of the plurality of time series based on the additional forecast hierarchy.
Like reference numbers and designations in the various drawings indicate like elements.
Certain aspects of the disclosed subject matter relate to demand classification and segmentation, which may enhance or generate a planning hierarchy so that forecast accuracy can be improved while maintaining ease of data management. Techniques discussed herein can enable users to analyze and classify time series into a set of pre-determined classifications based on certain criteria. “Time series,” as used herein, refers to a sequence of data points, typically consisting of successive measurements made over a time interval. References to “time series” are intended to refer to one or more individual time series unless otherwise specified. For each classification, each time series may be further grouped based on demand patterns and volume characteristics. An aggregation strategy can then be applied to the forecasting process to improve the forecast accuracy.
Demand classification can be accomplished using multiple modules. For example, a demand classification and segmentation engine may include three modules: a classification module, a pattern-clustering module, and a volume-grouping module. A classification module may analyze each time series and classify each time series based on characteristics such as demand lifecycle, intermittence, and seasonality, so that appropriate modeling techniques can be applied to each demand series. A pattern-clustering module may group one or more time series into different clusters based on similar yearly demand patterns as well as the demand characteristics derived from the classification module. Demand at lower levels, such as SKU/store demand, might often be insufficient to generate accurate forecasts due to a low signal-to-noise ratio. Accordingly, a volume-grouping module may be used to automatically identify an appropriate aggregation level based on a user-defined hierarchy to generate robust and reliable forecasts. The generated forecasts may then be used to reconcile lower-level forecasts.
In one example, the environment 100 may include a stand-alone computer architecture where a processing system 110 (e.g., one or more computer processors) includes the system 104 being executed on it. The processing system 110 has access to a computer-readable memory 112.
In one example, the environment 100 may include a client-server architecture. Users 102 may utilize a PC to access servers 106 running a system 104 on a processing system 110 via networks 108. The servers 106 may access a computer-readable memory 112.
DCS engine 209 may include a number of modules (e.g., classification module 211, pattern-clustering module 213, and volume-grouping module 215). These modules may be software modules, hardware modules, or a combination thereof. If the modules are software modules, the modules can be embodied on a computer-readable medium and processed by a processor in any of the computer systems described herein. It should be noted that any module or data store described herein may be, in some embodiments, a service responsible for managing data of the type required to make corresponding calculations. The modules may exist within the DCS engine 209 or may exist as separate modules or services external to the DCS engine 209. These modules may be directed to performing operations of the DCS engine 209 to accelerate the demand forecasting processes, resulting in improved computational performance of CPU 204 during operations of predictive modeling.
A disk controller 210 can interface one or more optional disk drives to the bus 202. These disk drives may be external or internal floppy disk drives such as storage drive 212, external or internal CD-ROM, CD-R, CD-RW, or DVD drives 214, or external or internal hard drive 216. As indicated previously, these various disk drives and disk controllers are optional devices.
A display interface 218 may permit information from the bus 202 to be displayed on a display 220 in audio, graphic, or alphanumeric format. Communication with external devices may optionally occur using various communication ports 222. In addition to the standard computer-type components, the hardware may also include data input devices, such as a keyboard 224, or other input/output devices 226, such as a microphone, remote control, touchpad, keypad, stylus, motion, or gesture sensor, location sensor, still or video camera, pointer, mouse or joystick, which can obtain information from bus 202 via interface 228.
The DCS engine (e.g., the DCS engine 209) can include at least three modules: a classification module (e.g., classification module 211), a pattern-clustering module (e.g., pattern-clustering module 213), and a volume-grouping module (e.g., volume-grouping module 215).
The classification module 211 can classify each demand time series based on characteristics such as demand lifecycle, intermittence, and seasonality. A “demand time series,” as used herein, is intended to refer to a time series in which data points represent a degree of demand of an item offered for sale. The classification results (e.g., demand time series statistics) can be output to users to enable the users to apply appropriate modeling techniques to each demand time series.
For example, regular candy and Valentine's day chocolates are usually stored in the same department in a grocery store since they are all candies, but should be put into different segments when modeled because regular candy is a long time-span product that sells all year round. When modeled, it may be possible to study the trend and seasonality of the candy throughout the whole year. In contrast, Valentine's day chocolates are short time-span products that typically sell only around Valentine's day, so when modeled, the user is likely only interested in focusing on a short period of time and is likely only interested in selecting a forecasting technique that is more suitable for time series having a short demand lifecycle. Classifying such items into different segments ensures that suitable factors are considered when modeling the demand for the item.
The pattern-clustering module (e.g., the pattern-clustering module 213) groups the demand series into different clusters based on similar yearly demand patterns as well as demand characteristics for each demand class derived from the classification module 211. The cluster defines each aggregate series and establishes the forecasting hierarchy so that each aggregated series may be a good representation of its child series.
For example, winter apparel (e.g., jackets) and summer apparel (e.g., swimsuits) are both short time-span products, but may have different demand patterns. A combined forecast approach for apparel might result in summer sales forecasts for winter wear items and winter sales forecasts for swimming gear. Clustering such items separately, however, may ensure that the demand forecasts for the appropriate seasons are considered.
Demand volumes at lower levels in the hierarchy might be insufficient to generate accurate forecasts due to a low signal-to-noise ratio (SNR). In general, volume-grouping enables users to set a threshold level to aggregate sales, establish optimal reconciliation levels, and calibrate forecast models to generate reliable forecasts. The volume-grouping module 215 may reduce noise at lower levels in the hierarchy, so that robust demand signals can be obtained.
At 302, the classification module (e.g., the classification module 211 of the DCS engine 209) may classify each time series at specified level(s) into different classes, generate statistics of each of the demand series, and derive information about the demand characteristics for the time series.
After the demand classes are ascertained for a time series, the pattern-clustering process can be executed for each class of time series at 304. The pattern-clustering module (e.g., the pattern-clustering module 213 of the DCS engine 209) may generate a pattern attribute that is used to cluster the demand series. Demand series with the same, or similar, demand characteristic may be grouped together and clusters may be formed.
Volume group 308 and volume group 310 may be generated at 306 within the scope defined by the classification module 211 and the pattern-clustering module 213. In at least one embodiment, each volume group may be a group of nodes where the volume of an aggregated demand satisfies a minimum threshold. The volume-grouping module groups demand series with the same forecast reconciliation levels.
The classification module (e.g., the classification module 211 of DCS engine 209) may classify each time series at a specified level or levels into different classes as well as generate demand-specific statistics for each time series. The purpose of demand classification is to provide information about each time series that will help in choosing the appropriate forecasting technique.
The classification of a time series may be important because different forecast techniques might be applied to different types of individual time series to improve forecast accuracy. For example, if a time series is known to be an intermittent time series, applying intermittent forecasting techniques (e.g., Croston's method) may produce a more accurate forecast than selecting some other time series model (e.g., ARIMA). In addition, among all the intermittent forecasting techniques, some may be better suited to one time series than to another. Ascertaining information about the time series by the classification module 211 may enable the classification module 211 to utilize the most suitable technique for forecasting the time series.
The demand classification process may have various class types that can be considered. For example, class types may include, but are not limited to, one of a short-history classification (SHORT), a low-volume classification (LOW_VOLUME), a short time-span non-intermittent classification (STS_NON_INTERMIT), a short time-span intermittent classification (STS_INTERMIT), a long time-span seasonal classification (LTS_SEASON), a long time-span non-seasonal classification (LTS_NON_SEASON), a long time-span intermittent classification (LTS_INTERMIT), a long time-span seasonal intermittent classification (LTS_SEASON_INTERMIT), an optional long time-span unclassifiable classification (LTS_UNCLASS), an optional unclassified classification (UNCLASS), or an inactive classification (INACTIVE).
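By way of a non-limiting illustration, the following Python sketch shows one common formulation of Croston's method, the intermittent forecasting technique named above. The function name `croston_forecast`, the smoothing parameter `alpha`, and the initialization from the first non-zero demand are assumptions made for the example and are not taken from the described system.

```python
from typing import Sequence


def croston_forecast(demand: Sequence[float], alpha: float = 0.1) -> float:
    """Per-period demand forecast for an intermittent series (Croston's method).

    Non-zero demand sizes and the intervals between them are smoothed
    separately; the forecast is smoothed size divided by smoothed interval.
    """
    size = None          # smoothed non-zero demand size
    interval = None      # smoothed number of periods between demands
    periods_since = 1    # periods elapsed since the previous non-zero demand
    for value in demand:
        if value > 0:
            if size is None:
                # Assumed initialization: use the first observed demand as-is.
                size, interval = float(value), float(periods_since)
            else:
                size += alpha * (value - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0       # no demand was ever observed
    return size / interval
```

For a series in which six units are demanded every third period, the sketch returns a forecast of two units per period.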
At decision block 708, classification module 211 may determine whether the seller is small and the occurrence is low as compared to a user-specified threshold value. If the seller is small and there is low occurrence, then the individual time series may be classified as a low-volume classification at block 710. If the seller is not small and there is not a low occurrence, then the flow may continue to decision block 712.
At decision block 712, classification module 211 may determine whether or not the time series is inactive based on a user-specified threshold. If the time series is inactive, then classification module 211 may classify the time series as an inactive classification at block 714. If the time series is not inactive, then the process may continue to decision block 716.
At decision block 716, classification module 211 may determine whether or not the individual time series has full demand cycles. A full demand cycle is a period during which the products are in-season/in-stock and may be followed by either a gap period or a long inactive period. If the time series does not have full demand cycles, then the flow may proceed to decision block 718. At decision block 718, classification module 211 may determine the length of the current cycle (e.g., by comparing a current time to a latest demand period start). For example, if the latest demand period starts at week 10, and the current time is week 20, then the length of the current cycle is 10. If the length of the current cycle is greater than or equal to 48 weeks, then the classification module 211 may preliminarily classify the time series as a “Long Time-Span” time series at block 720. If the length of the current cycle is less than 48 weeks, then the time series may be classified as “Unclassifiable” at block 722. Though this example uses forty-eight weeks as a threshold value, such a threshold may be any suitable period of time.
If the data set does have full demand cycles at decision block 716, then the process may proceed to decision block 724. At decision block 724, classification module 211 may determine a maximum demand cycle length. A maximum demand cycle length may be determined by computing the length of each full demand cycle and then selecting the maximum of the computed lengths. If the maximum demand cycle length is greater than or equal to 48 weeks, or another suitable period of time, then the classification module 211 may classify the time series as a “Long Time-Span” time series at block 726. If the maximum demand cycle length is less than 48 weeks, or another suitable period of time, then the time series may be classified as “Short Time-Span” at block 726.
At decision block 808, classification module 211 may determine a length of time over which observations are included in the time series. If the observations span less than, for example, 78 weeks, then classification module 211 may preliminarily classify the time series as a “Long Time-Span Unclassified” time series at block 810. If the observations span at least, for example, 78 weeks, then the flow may proceed to decision block 812. Though 78 weeks is given as an example, it should be noted that any suitable period of time may be similarly utilized.
At decision block 812, classification module 211 may determine whether or not the time series passes a season test (e.g., SAS standard season test). If the time series passes the season test, then the time series may be classified as a “Long Time-Span Seasonal” time series at block 814. If the time series does not pass the season test, then classification module 211 may classify the time series as a “Long Time-Span Non-Seasonal” time series at block 816.
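By way of a non-limiting illustration, the time-span decisions at blocks 716 through 726 can be sketched in Python as follows. The function name `classify_time_span`, its arguments, and the returned labels are hypothetical; the 48-week threshold is the example value given above, and the handling of a series with no recorded cycle start is an assumption.

```python
from typing import List, Optional


def classify_time_span(full_cycle_lengths: List[int],
                       latest_cycle_start: Optional[int],
                       current_week: int,
                       threshold_weeks: int = 48) -> str:
    """Preliminary time-span class from demand-cycle lengths given in weeks."""
    if not full_cycle_lengths:
        # No full demand cycle: use the length of the current cycle instead.
        if latest_cycle_start is None:
            return "UNCLASSIFIABLE"  # assumption: nothing to measure
        current_length = current_week - latest_cycle_start
        if current_length >= threshold_weeks:
            return "LONG_TIME_SPAN"
        return "UNCLASSIFIABLE"
    # At least one full cycle: compare the longest cycle to the threshold.
    if max(full_cycle_lengths) >= threshold_weeks:
        return "LONG_TIME_SPAN"
    return "SHORT_TIME_SPAN"
```

With a latest demand period starting at week 10 and a current time of week 20, the current cycle length is 10 weeks, so a series without full demand cycles would be labeled unclassifiable under the 48-week example threshold.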
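By way of a non-limiting illustration, the decisions at blocks 808 through 816 can be sketched in Python as follows. The season test is passed in as a callable because the actual test is not specified here; `stand_in_season_test`, which checks the autocorrelation at a 52-week lag against an arbitrary cutoff, is only a placeholder assumed for the example and is not the SAS season test. All function names are hypothetical.

```python
from typing import Callable, Sequence


def lag_autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    if denominator == 0:
        return 0.0  # a constant series has no seasonal signal
    numerator = sum((series[t] - mean) * (series[t - lag] - mean)
                    for t in range(lag, n))
    return numerator / denominator


def stand_in_season_test(series: Sequence[float],
                         season_length: int = 52,
                         cutoff: float = 0.3) -> bool:
    """Placeholder season test (an assumption, not the SAS test)."""
    return lag_autocorrelation(series, season_length) > cutoff


def classify_long_time_span(series: Sequence[float],
                            season_test: Callable[[Sequence[float]], bool],
                            min_weeks: int = 78) -> str:
    """Classify a weekly long time-span series per blocks 808 to 816."""
    if len(series) < min_weeks:
        return "LTS_UNCLASS"      # too little history to test seasonality
    if season_test(series):
        return "LTS_SEASON"
    return "LTS_NON_SEASON"
```

Keeping the season test pluggable lets the same flow be reused with whichever statistical test a given deployment relies on.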
For “Long Time-Span” time series with a characteristic of intermittency, it may be difficult to determine whether or not the time series is seasonal because of the sparseness of the observations. This is where Top-down Reclassification may be utilized. The seasonality information from the hierarchy can be used, but instead of analyzing sibling series, which could all be intermittent, the usually less-sparse parent series may be analyzed. If the parent series is seasonal, then the child series may also be considered to be seasonal.
To be more specific, the reclassification may be done solely at CLASS_LOW level based on the intermediate classification results for both the CLASS_LOW and CLASS_HIGH level. If the parent series at the CLASS_HIGH level has been classified as LTS_SEASON, and the child series at the CLASS_LOW level has been classified as LTS_INTERMIT, then the Top-down Reclassification may reclassify the CLASS_LOW level child series as LTS_SEASON_INTERMIT.
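By way of a non-limiting illustration, the Top-down Reclassification rule can be sketched in Python as follows. The function name and the dictionary-based representation of the hierarchy are hypothetical; the class labels are those used above.

```python
from typing import Dict


def top_down_reclassify(child_classes: Dict[str, str],
                        parent_of: Dict[str, str],
                        parent_classes: Dict[str, str]) -> Dict[str, str]:
    """Reclassify CLASS_LOW series using the class of their CLASS_HIGH parent.

    Only children classified as LTS_INTERMIT whose parent is LTS_SEASON
    are changed; every other class is left untouched.
    """
    result = dict(child_classes)
    for child, child_class in child_classes.items():
        parent = parent_of.get(child)
        if (child_class == "LTS_INTERMIT"
                and parent_classes.get(parent) == "LTS_SEASON"):
            result[child] = "LTS_SEASON_INTERMIT"
    return result
```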
The pattern-clustering module (e.g., the pattern-clustering module 213 of DCS engine 209) may group demand series based on demand patterns such as year-over-year monthly demand proportions, a monthly demand average, or parameter estimates based on ARIMA models. Pattern groups can be used in building a forecast hierarchy and can improve forecast accuracy.
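By way of a non-limiting illustration, one of the pattern attributes mentioned above, monthly demand proportions pooled over several years, can be computed as in the following Python sketch. The function name and the input layout (pairs of month number and demand) are assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def monthly_demand_proportions(
        observations: Iterable[Tuple[int, float]]) -> List[float]:
    """Share of total demand falling in each calendar month.

    `observations` holds (month, demand) pairs with month in 1..12 and may
    cover several years; the result is a 12-element vector, which sums to 1
    when any demand is present, usable as a clustering feature.
    """
    totals = [0.0] * 12
    for month, demand in observations:
        totals[month - 1] += demand
    grand_total = sum(totals)
    if grand_total == 0:
        return totals  # no demand at all: return the all-zero profile
    return [t / grand_total for t in totals]
```

Two products with different volumes but the same selling season yield similar proportion vectors, which is what allows them to be grouped by pattern rather than by size.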
For example, winter clothes and summer swimming suits can both be short time-span products, but these products may have different demand patterns. Forecasting these products together may lead to inaccuracies due to the differing demand patterns. Forecasting the products separately, however, can ensure that the correct seasonality is considered.
In at least one example, demand series with similar patterns may be clustered together for each “long time-span seasonal” and “short time-span” time series. Various techniques can be used for clustering. For example, hierarchical clustering, K-means clustering, or a combination of the two may be used to cluster demand series with other time series having the same, or similar, demand patterns.
Hierarchical clustering can automatically determine an optimal number of clusters. However, hierarchical clustering may produce performance issues especially when the number of items to cluster exceeds a certain limit. K-means methods are computationally efficient. However, K-means methods may involve having to pre-specify a number of clusters. Thus, a hybrid process may be considered that combines the two methods to make use of the advantages of each method.
In at least one example, pattern-clustering module 213 may utilize a k-means algorithm to generate an initial set of clusters. A hierarchical clustering algorithm may be used on the cluster centers generated from the k-means algorithm to determine an optimal number of clusters. Pattern-clustering module 213 may then execute the k-means algorithm with the original data as input, using the optimal number of clusters as determined by the hierarchical clustering algorithm.
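By way of a non-limiting illustration, the hybrid process described above can be sketched in plain Python as follows. Several details are assumptions made for the example: the initial centers are simply the first `initial_k` points, the hierarchical step merges centers until no two are closer than a distance cutoff `max_dist` (the criterion actually used to select the number of clusters is not specified here), and the final k-means run is seeded with the merged centers. All function names are hypothetical.

```python
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points: Sequence[Point]) -> List[float]:
    return [sum(column) / len(points) for column in zip(*points)]


def assign(points: Sequence[Point], centers: Sequence[Point]) -> List[int]:
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            for p in points]


def k_means(points, centers, iterations: int = 20) -> List[List[float]]:
    """Plain Lloyd iterations starting from the given centers."""
    centers = [list(c) for c in centers]
    for _ in range(iterations):
        labels = assign(points, centers)
        for i in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = mean_point(members)
    return centers


def merge_close_centers(centers, max_dist: float) -> List[List[float]]:
    """Agglomerative merging of centers closer than max_dist."""
    clusters = [[c] for c in centers]
    while len(clusters) > 1:
        d, i, j = min((sq_dist(mean_point(a), mean_point(b)), i, j)
                      for i, a in enumerate(clusters)
                      for j, b in enumerate(clusters) if i < j)
        if d > max_dist ** 2:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [mean_point(c) for c in clusters]


def hybrid_cluster(points, initial_k: int, max_dist: float) -> List[int]:
    rough = k_means(points, points[:initial_k])    # step 1: initial k-means
    merged = merge_close_centers(rough, max_dist)  # step 2: pick cluster count
    final = k_means(points, merged)                # step 3: k-means again
    return assign(points, final)
```

The design intent is that the inexpensive k-means pass reduces the data to a handful of centers, so the costlier hierarchical step only has to compare those centers rather than every series.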
The pattern-clustering module 213 can separate short time-span products with different selling seasons. Additionally, the pattern-clustering module 213 may identify key features to be considered in the model.
For example, if pattern-clustering results in 14 clusters, 12 of those clusters may reveal demand peaks in 12 different months, from January to December.
Traditional forecasting algorithms use standard calendar/standard time intervals that often do not work well with highly seasonal time series data having many inactive periods. For example, an Easter toy may only sell during a particular time of year, where the precise dates may shift, making predictions difficult to ascertain. Techniques that require a user to define an interval for the event (e.g., the weeks before and after the Easter holiday for which the toy will be in demand) are cumbersome and may produce inaccurate forecasts. Identifying a custom interval of the event or a season (e.g., winter) within the time series from the time series data can produce more accurate forecasts. Additionally, predicting future event intervals or season intervals based on custom intervals can be more efficient and more accurate than requiring user-defined intervals.
Custom intervals may be determined by a separate custom intervals module 217, or by any of the modules discussed herein. A module responsible for determining custom intervals for the demand in a time series may be part of a DCS engine (e.g., the DCS engine 209) or a component separate from the DCS engine.
Custom intervals module 217, or alternatively, classification module 211, may identify demand gaps in the time series. Demand classification, discussed above, can be used to identify demand gaps. For example, consecutive low demands with a length exceeding some threshold (e.g., 1 week) may be identified as a demand gap. The identified demand gaps may be used to determine demand cycles (e.g., periods for which demand is over a threshold amount for a threshold period of time). Once demand cycles are determined, the time series may be classified (e.g., by custom intervals module 217 or classification module 211) as one of the classifications discussed above.
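By way of a non-limiting illustration, demand gaps and the resulting demand cycles can be identified as in the following Python sketch. The function name, the `low_level` cutoff for what counts as low demand, and the (start, end) index representation are assumptions; the default gap threshold of one period mirrors the example above.

```python
from typing import List, Sequence, Tuple


def find_demand_cycles(demand: Sequence[float],
                       low_level: float = 0.0,
                       gap_threshold: int = 1) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of demand cycles.

    A run of consecutive low-demand periods longer than `gap_threshold`
    is treated as a demand gap; shorter low runs stay inside the cycle.
    """
    cycles = []
    start = None        # first active period of the cycle being built
    last_active = None  # most recent period with demand above low_level
    low_run = 0         # length of the current run of low-demand periods
    for t, value in enumerate(demand):
        if value > low_level:
            if start is None:
                start = t
            last_active = t
            low_run = 0
        else:
            low_run += 1
            if start is not None and low_run > gap_threshold:
                cycles.append((start, last_active))  # gap closes the cycle
                start = None
    if start is not None:
        # The series ended mid-cycle; this last cycle may be incomplete.
        cycles.append((start, last_active))
    return cycles
```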
Custom intervals module 217, or alternatively, pattern-clustering module 213, may cluster time series having the same, or similar, demand classifications together. Through clustering similar products with the same, or similar, seasonal pattern together, a stronger seasonal signal may be obtained. A stronger seasonal signal can result in more accurate custom intervals. Any suitable aggregation technique may be utilized, for example, the pattern-clustering algorithm discussed above in connection with the pattern-clustering module 213.
A process utilized by custom intervals module 217 for determining custom intervals may first begin with identifying demand gaps of the time series, or alternatively, of the aggregated time series. Demand classification, as discussed herein, may be utilized to identify such demand gaps. Alternatively, a time series segmentation or representation algorithm may be used to first approximate the time series. A time series can be represented as a sequence of individual segments, each with its own characteristic properties. A time series segmentation algorithm may be utilized by custom intervals module 217 to split the time-series into a sequence of such segments.
In at least one embodiment, the identified demand cycles may be classified as “event” or “seasonal” (e.g., by custom intervals module 217 or classification module 211). For example, if the mean demand cycle is larger than a seasonal threshold (e.g., 4 weeks), then the time series may be classified as “seasonal.” Alternatively, if an event (e.g., a holiday) occurs during the demand cycle, then the time series may be classified as “event.”
In one example, custom intervals module 217 may modify the demand cycle periods so that each demand cycle length is substantially the same. For example, demand cycles 1402, 1404, and 1406 may be analyzed to calculate a custom interval. Various methods for determining a custom interval may be employed. For example, a user may select an interval rule that governs the manner in which the custom interval may be determined. Example interval rules may include, but are not limited to, a minimum interval rule, a maximum interval rule, a mean interval rule, and a mode interval rule. Applying a minimum rule may result in a custom interval length that is less than a custom interval length determined by application of a maximum rule. For example, over the course of several years, an event-type time series may indicate that each time an event occurs, the demand cycle for the event is, for example, at least three weeks long and at most six weeks long. In this case, applying a minimum interval rule may result in future event intervals being customized to three weeks long, while applying a maximum rule may result in future event intervals being customized to six weeks long. Similarly, a mean rule and a mode rule may analyze event occurrences in the event-type time series and determine a custom interval length based on the mean length of event cycles in the time series, or a mode length of event cycles in the time series, respectively. In some embodiments, the interval rule used to calculate the custom interval length may be pre-specified.
Custom intervals module 217 may also apply interval rules in a similar manner to seasonal-type time series to determine a season length. For example, a time series may indicate that a season typically starts in week ten of a year and lasts at least sixteen weeks and at most twenty weeks. Application of a minimum interval rule may result in a custom interval for the seasonal-type time series of sixteen weeks. Application of a maximum interval rule may result in a custom interval for the seasonal-type time series of twenty weeks. The mean interval length over the course of the seasonal time series may be twelve weeks. Application of the mean interval rule may result in a custom interval for the seasonal-type time series of twelve weeks. The length of season occurring most often (e.g., a mode interval length) in the seasonal-type time series may be, for example, thirteen weeks. Thus, application of the mode interval rule may result in a custom interval for the seasonal-type time series of thirteen weeks.
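By way of a non-limiting illustration, the four interval rules can be sketched in Python as follows. The function name and the rule keywords are hypothetical, and rounding the mean to a whole number of weeks is an assumption made for the example.

```python
from statistics import mean, mode
from typing import Sequence


def custom_interval_length(cycle_lengths: Sequence[int],
                           rule: str = "mean") -> int:
    """Custom interval length (in weeks) from observed demand-cycle lengths."""
    if not cycle_lengths:
        raise ValueError("at least one demand cycle length is required")
    if rule == "min":
        return min(cycle_lengths)
    if rule == "max":
        return max(cycle_lengths)
    if rule == "mean":
        return round(mean(cycle_lengths))  # assumption: whole weeks
    if rule == "mode":
        return mode(cycle_lengths)         # most frequently observed length
    raise ValueError(f"unknown interval rule: {rule}")
```

For observed event cycles of 3, 4, 4, 6, and 5 weeks, the minimum rule yields three weeks, the maximum rule six weeks, and the mean and mode rules four weeks each.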
In some cases, a time series may include an incomplete demand cycle. For example, Demand cycle portion 1306 of
Determined custom intervals may be used to predict a future event or season. For example, having determined a custom interval of three weeks for an event (e.g., Easter, Apr. 20, 2014), a future demand cycle for a similar future event may be calculated based on identifying the day on which the event occurs in the future (e.g., Easter, Apr. 5, 2015). Similarly, having determined a custom interval of sixteen weeks for a season (e.g., summer 2014) and a start index for the season (e.g., typically week 26), future demand cycles for the season may be predicted.
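By way of a non-limiting illustration, projecting a custom interval onto a future event date can be sketched in Python as follows. How the interval is positioned relative to the event day is not specified above, so the `weeks_before_event` offset is purely an assumption, as is the function name.

```python
from datetime import date, timedelta
from typing import Tuple


def future_event_interval(event_day: date,
                          interval_weeks: int,
                          weeks_before_event: int) -> Tuple[date, date]:
    """First and last day of the predicted demand cycle around an event.

    The cycle is assumed to begin `weeks_before_event` weeks ahead of the
    event day and to last `interval_weeks` weeks in total.
    """
    start = event_day - timedelta(weeks=weeks_before_event)
    end = start + timedelta(weeks=interval_weeks) - timedelta(days=1)
    return start, end
```

Under these assumptions, a three-week custom interval beginning two weeks before Easter on Apr. 5, 2015 would run from Mar. 22, 2015 through Apr. 11, 2015.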
Demand forecasting for lower levels in the hierarchy might result in poor statistical forecasts due to insufficient demand volume and large random variations. Reliable forecasts can be generated if there is a sufficient volume of data. Volume-grouping can be used to aggregate data and minimize random variation in data. By aggregating data, stronger underlying demand signals can be obtained. This may make demand patterns easier for the models to detect.
The volume-grouping module (e.g., the volume-grouping module 215) enables users to determine the appropriate forecast reconciliation level to ensure that the forecasts are generated at a level with sufficient demand volume while retaining, as much as possible, specific patterns of each demand time series.
Volume-grouping module 215 may generate a number of volume groups. These volume groups may be generated based on the user-specified volume threshold, which can be based on the demand averages. A user can define a level in the hierarchy as the lowest grouping level. Starting from the lowest grouping level, if a series has sufficient volume, then a forecast may be generated at the lowest level to capture any series-specific patterns. Otherwise, the series may be aggregated to one level higher via the input hierarchy with other low volume series until it reaches a level with sufficient volume, or alternatively, it reaches the top level.
The process of volume-grouping can be run stand-alone, or after classification and pattern-clustering. Volume-grouping module 215 may generate forecasts at a volume-group level and disaggregate data down to the lowest level. Two hierarchy-based volume-grouping types utilized by volume-grouping module 215 are dynamic grouping and dynamic grouping with hierarchy restriction.
In the dynamic grouping type, two parameters can define the volume threshold: for example, avg_demand_threshold and min_frequency_threshold. If the average demand of an aggregated time series is greater than or equal to the avg_demand_threshold, and the number of demand occurrences is greater than or equal to the min_frequency_threshold, then the time series may be considered to have sufficient volume.
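The two-part volume test can be expressed compactly. The following is a minimal sketch; it assumes that a "demand occurrence" is a period with demand greater than zero and that the average is taken over all periods, neither of which is stated in the text. The function name and the sample data are hypothetical.

```python
def has_sufficient_volume(series, avg_demand_threshold, min_frequency_threshold):
    """Return True when both volume conditions hold for a demand series."""
    if not series:
        return False
    average_demand = sum(series) / len(series)
    # Assumption: an occurrence is any period with demand above zero.
    demand_occurrences = sum(1 for d in series if d > 0)
    return (average_demand >= avg_demand_threshold
            and demand_occurrences >= min_frequency_threshold)


weekly_demand = [0, 12, 0, 9, 15, 0, 11, 13]   # hypothetical data
print(has_sufficient_volume(weekly_demand, 5, 4))   # True: mean 7.5, 5 occurrences
print(has_sufficient_volume(weekly_demand, 5, 6))   # False: only 5 occurrences
```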
If a series at the lower level has sufficient volume, then the forecast may be generated at this particular level to capture any series-specific patterns. Otherwise, the time series may be aggregated to one level higher with other low volume series until an aggregated time series reaches a level with sufficient volume, or the aggregated time series reaches the top level. Some further details are illustrated through the following example.
In
Volume-grouping module 215 may repeat the process described above until each branch has a top-most node that exceeds the volume thresholds, or until the top of the hierarchy is reached.
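One way to picture the repeated bottom-up aggregation is the recursive sketch below. It is an interpretation, not the disclosed implementation: the Node structure, the function names, the sample hierarchy, and the choice to roll up only the low-volume remainder of each branch are assumptions, as is the definition of a demand occurrence as a period with demand above zero.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    demand: Optional[List[float]] = None       # set on leaf nodes only
    children: List["Node"] = field(default_factory=list)


def has_sufficient_volume(series, avg_demand_threshold, min_frequency_threshold):
    average_demand = sum(series) / len(series)
    demand_occurrences = sum(1 for d in series if d > 0)   # assumed definition
    return (average_demand >= avg_demand_threshold
            and demand_occurrences >= min_frequency_threshold)


def assign_volume_groups(node, avg_t, freq_t, groups, is_top=True):
    """Record groups in `groups` (node name -> leaf names).

    Returns None when everything under `node` has been grouped, otherwise the
    (leaf names, summed series) that still lack volume and must move up a level.
    """
    if node.children:
        names, series = [], None
        for child in node.children:
            pending = assign_volume_groups(child, avg_t, freq_t, groups, is_top=False)
            if pending is None:
                continue
            child_names, child_series = pending
            names.extend(child_names)
            series = (child_series if series is None
                      else [a + b for a, b in zip(series, child_series)])
        if series is None:
            return None
    else:
        names, series = [node.name], list(node.demand)

    # Stop at this node if it has enough volume or the top level is reached.
    if is_top or has_sufficient_volume(series, avg_t, freq_t):
        groups[node.name] = names
        return None
    return names, series


# Hypothetical two-level hierarchy with four leaf series.
root = Node("root", children=[
    Node("A", children=[Node("a1", [20, 15, 25, 20]), Node("a2", [2, 0, 3, 1])]),
    Node("B", children=[Node("b1", [6, 5, 7, 6]), Node("b2", [5, 6, 4, 7])]),
])
groups = {}
assign_volume_groups(root, avg_t=10, freq_t=3, groups=groups)
print(groups)   # {'a1': ['a1'], 'B': ['b1', 'b2'], 'root': ['a2']}
```

In this run, a1 is forecast at its own level, b1 and b2 only reach the threshold once combined at B, and a2 never reaches it and is therefore grouped at the top level.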
Dynamic Grouping with Hierarchy Restriction
In the example shown in
As used herein, “qualified nodes” are nodes that pass the volume threshold, while “unqualified nodes” are nodes that do not pass the volume threshold. In at least one example, if the number of unqualified nodes exceeds a certain percentage of the total number of siblings (min_unqualified_node_count_pct), or the total demand of the unqualified nodes is greater than a certain percentage of the total demand of all siblings (min_unqualified_volume_pct), then the sibling nodes may be aggregated up and the process may continue. Otherwise, all siblings may be assigned a group and the current level may be selected as the level to reconcile. In one example, the same hierarchy may be used from
As depicted in
At block 2304, an individual time series of the multiple time series may be selected. For example, the forecast hierarchy may be traversed to select a time series. Alternatively, time series may be selected at random.
At block 2306, a classification for the individual time series may be determined. The classification may be determined (e.g., by classification module 211 of
At block 2308, a pattern group for the individual time series may be determined (e.g., by pattern-clustering module 213 of
At block 2310, a level of the forecast hierarchy at which the individual time series will have an aggregate demand volume greater than a threshold amount may be determined (e.g., by volume-grouping module 215 of
At block 2312, a determination is made as to whether more time series remain in the forecast hierarchy. If more time series exist, then the flow may proceed back to block 2304, and blocks 2304 through 2312 may be repeated until every time series in the hierarchy has been classified, grouped, and aggregated according to blocks 2306 through 2310.
When no more time series exist in the forecast hierarchy, the flow may proceed to block 2314 where a second forecast hierarchy may be generated (e.g., by DCS engine 209 of
At block 2316, forecast information related to at least one time series of the multiple time series may be provided, for example, to a user. In at least one example, such forecast information (e.g., optional outputs, statistics regarding each time series, clustering measurements, and the like) may be useful for a downstream process.
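The loop over blocks 2304 through 2314 can be outlined as below. This is only a structural sketch: the three per-series steps are passed in as callables, and the toy stand-ins used in the example (a zero-share test for classification, peak position for the pattern group, a lookup table for the level) are hypothetical placeholders rather than the methods of the classification, pattern-clustering, or volume-grouping modules. Keying the second hierarchy by the tuple of classification, pattern group, and level is likewise an assumption.

```python
from collections import defaultdict


def build_second_hierarchy(series_by_name, classify, pattern_group, volume_level):
    """Group series by (classification, pattern group, level)."""
    second = defaultdict(list)
    for name, series in series_by_name.items():   # block 2304: select a series
        label = classify(series)                  # block 2306
        group = pattern_group(series)             # block 2308
        level = volume_level(name)                # block 2310
        second[(label, group, level)].append(name)
    return dict(second)                           # block 2314


# Hypothetical inputs and stand-in steps, for illustration only.
series_by_name = {
    "sku_1": [0, 0, 5, 0],
    "sku_2": [3, 9, 4, 2],
    "sku_3": [1, 8, 2, 3],
}
levels = {"sku_1": "category", "sku_2": "sku", "sku_3": "sku"}

hierarchy = build_second_hierarchy(
    series_by_name,
    classify=lambda s: "intermittent" if 2 * s.count(0) > len(s) else "regular",
    pattern_group=lambda s: s.index(max(s)),      # position of the demand peak
    volume_level=lambda name: levels[name],
)
print(hierarchy)
# {('intermittent', 2, 'category'): ['sku_1'], ('regular', 1, 'sku'): ['sku_2', 'sku_3']}
```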
At block 2404, a number of low-demand periods within the time series may be determined (e.g., by custom intervals module 217 of
At block 2406, custom intervals module 217 may determine a series type for the time series based on the determined low-demand period(s) from block 2404. The series type may be determined by identifying whether the length of a demand period within the time series is at or above, or below, a seasonal threshold length. If the demand period is at or above the seasonal threshold length, then the series type for the time series may be determined to be “seasonal.” If the demand period is below the seasonal threshold length, and a pre-defined event occurs during the demand period, then the series type for the time series may be determined to be “event.”
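The series-type decision reduces to a short rule, sketched here under assumptions: the function name, the eight-week threshold used in the example, and the None result for a short demand period that contains no pre-defined event are hypothetical, since the text does not say how that last case is handled.

```python
def determine_series_type(demand_period_length, seasonal_threshold_length, contains_event):
    """Classify a demand period as 'seasonal', 'event', or None (undetermined)."""
    if demand_period_length >= seasonal_threshold_length:
        return "seasonal"
    if contains_event:
        return "event"
    return None   # assumption: the text does not define this case


# Lengths are in weeks; the threshold of 8 is a hypothetical value.
print(determine_series_type(16, 8, contains_event=False))  # seasonal
print(determine_series_type(3, 8, contains_event=True))    # event
print(determine_series_type(3, 8, contains_event=False))   # None
```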
At block 2408, custom intervals module 217 may determine an in-season interval of the time series based on the number of low-demand periods and the series type. The in-season interval may indicate a time interval for which the item has historically been in demand.
At block 2410, custom intervals module 217 may derive a future in-season interval based on the determined in-season interval. The future in-season interval may be a predicted time interval during which demand for the item is predicted to be greater than a threshold value.
Systems and methods according to some examples may include data transmissions conveyed via networks (e.g., local area network, wide area network, Internet, or combinations thereof, etc.), fiber optic medium, carrier waves, wireless networks, etc. for communication with one or more data processing devices. The data transmissions can carry any or all of the data disclosed herein that is provided to, or from, a device.
Additionally, the methods and systems described herein may be implemented on many different types of processing devices by program code comprising program instructions that are executable by the device processing subsystem. The software program instructions may include source code, object code, machine code, or any other stored data that is operable to cause a processing system to perform the methods and operations described herein. Other implementations may also be used, however, such as firmware or even appropriately designed hardware configured to carry out the methods and systems described herein.
The system and method data (e.g., associations, mappings, data input, data output, intermediate data results, final data results, etc.) may be stored and implemented in one or more different types of computer-implemented data stores, such as different types of storage devices and programming constructs (e.g., RAM, ROM, Flash memory, removable memory, flat files, temporary memory, databases, programming data structures, programming variables, IF-THEN (or similar type) statement constructs, etc.). It is noted that data structures may describe formats for use in organizing and storing data in databases, programs, memory, or other computer-readable media for use by a computer program.
A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network. The processes and logic flows and figures described and shown in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output.
Generally, a computer can also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data (e.g., magnetic, magneto optical disks, or optical disks). However, a computer need not have such devices. Moreover, a computer can be embedded in another device (e.g., a mobile telephone, a personal digital assistant (PDA), a tablet, a mobile viewing device, a mobile audio player, or a Global Positioning System (GPS) receiver), to name just a few. Computer-readable media suitable for storing computer program instructions and data include all forms of nonvolatile memory, media and memory devices, including by way of example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices); magnetic disks (e.g., internal hard disks or removable disks); magneto optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
The computer components, software modules, functions, data stores and data structures described herein may be connected directly or indirectly to each other in order to allow the flow of data needed for their operations. It is also noted that a module or processor includes, but is not limited to, a unit of code that performs a software operation, and can be implemented, for example, as a subroutine unit of code, or as a software function unit of code, or as an object (as in an object-oriented paradigm), or as an applet, or in a computer script language, or as another type of computer code. The software components or functionality may be located on a single computer or distributed across multiple computers depending upon the situation at hand.
The computer may include a programmable machine that performs high-speed processing of numbers, as well as of text, graphics, symbols, and sound. The computer can process, generate, or transform data. The computer includes a central processing unit that interprets and executes instructions; input devices, such as a keyboard, keypad, or a mouse, through which data and commands enter the computer; memory that enables the computer to store programs and data; and output devices, such as printers and display screens, that show the results after the computer has processed, generated, or transformed data.
Implementations of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products (i.e., one or more modules of computer program instructions encoded on a computer-readable medium for execution by, or to control the operation of, data processing apparatus). The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated, processed communication, or a combination of one or more of them. The term “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question (e.g., code that constitutes processor firmware, a protocol stack, a graphical system, a database management system, an operating system, or a combination of one or more of them).
While this disclosure may contain many specifics, these should not be construed as limitations on the scope of what may be claimed, but rather as descriptions of features specific to particular implementations. Certain features that are described in this specification in the context of separate implementations can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be utilized. Moreover, the separation of various system components in the implementations described above should not be understood as requiring such separation in all implementations, and it should be understood that the described program components and systems can generally be integrated together in a single software or hardware product or packaged into multiple software or hardware products.
Some systems may use Hadoop®, an open-source framework for storing and analyzing big data in a distributed computing environment. Some systems may use cloud computing, which can enable ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. Some grid systems may be implemented as a multi-node Hadoop® cluster, as understood by a person of skill in the art. Apache™ Hadoop® is an open-source software framework for distributed computing. Some systems may use the SAS® LASR™ Analytic Server in order to deliver statistical modeling and machine learning capabilities in a highly interactive programming environment, which may enable multiple users to concurrently manage data, transform variables, perform exploratory analysis, build and compare models and score. Some systems may use SAS In-Memory Statistics for Hadoop® to read big data once and analyze it several times by persisting it in-memory for the entire session.
It should be understood that as used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise. Finally, as used in the description herein and throughout the claims that follow, the meanings of “and” and “or” include both the conjunctive and disjunctive and may be used interchangeably unless the context expressly dictates otherwise; the phrase “exclusive or” may be used to indicate situations where only the disjunctive meaning may apply.
The present disclosure claims the benefit of priority under 35 U.S.C. §119(e) to U.S. Provisional Application No. 61/981,174, filed Apr. 17, 2014 and titled “Classifying and Grouping Demand Series,” and U.S. Provisional Application No. 62/011,461, filed Jun. 12, 2014 and titled “Automatic Generation of Custom Intervals,” the entireties of which are incorporated herein by reference. This application is also related to and incorporates by reference for all purposes the full disclosure of co-pending U.S. patent application Ser. No. ______, filed concurrently herewith, entitled “AUTOMATIC GENERATION OF CUSTOM INTERVALS” (Attorney Docket No. 94926-024510US-913636).
Number | Date | Country
---|---|---
62011461 | Jun 2014 | US
61981174 | Apr 2014 | US