The present invention relates to methods and systems for forecasting product demand using a causal methodology based on multiple regression techniques, and in particular to a method for de-seasonalizing causal factors having strong seasonal patterns and using the de-seasonalized variables within the causal demand forecasting methodology.
Accurate demand forecasts are crucial to a retailer's business activities, particularly inventory control and replenishment, and hence significantly contribute to the productivity and profit of retail organizations.
Teradata Corporation has developed a suite of analytical applications for the retail business, referred to as Teradata Demand Chain Management (DCM), which provides retailers with the tools they need for product demand forecasting, planning and replenishment. Teradata Demand Chain Management assists retailers in accurately forecasting product sales at the store/SKU (Stock Keeping Unit) level to ensure high customer service levels are met, and inventory stock at the store level is optimized and automatically replenished. Teradata DCM helps retailers anticipate increased demand for products and plan for customer promotions by providing the tools to do effective product forecasting through a responsive supply chain.
In application Ser. Nos. 11/613,404; 11/938,812; and 11/967,645, referred to above in the CROSS REFERENCE TO RELATED APPLICATIONS, Teradata Corporation has presented improvements to the DCM Application Suite for forecasting and modeling product demand during promotional and non-promotional periods. The forecasting methodologies described in these references seek to establish a cause-effect relationship between product demand and factors influencing product demand in a market environment. Such factors may include current product sales rates, seasonality of demand, product price changes, promotional activities, competitive information, and other factors. A product demand forecast is generated by blending the various influencing causal factors in accordance with corresponding regression coefficients determined through the analysis of historical product demand and factor information.
This process for forecasting product demand using a causal methodology is more complex when dealing with causal factors that have seasonal patterns. It is more difficult to identify a significant relationship between product demand and a seasonal variable. Such variables may appear to have strong relationships with seasonal products; however, the correlation may be due to the seasonality of both demand and the causal variable, and thus may not indicate a true causal relationship.
Even when a significant relationship exists between a seasonal variable and product demand, a change in the value of the seasonal variable does not necessarily translate to a corresponding change in product demand. Research suggests that it is often an unexpected change in a causal factor that triggers significant uplifts in demand. Variable changes due to seasonality are often perceived by consumers as normal and hence do not generate an uplift in demand.
Typical examples of causal variables with seasonal patterns are temperature, precipitation and other weather-related factors. Again, research has shown that these variables can serve as leading indicators of demand change, and hence improve product demand forecast accuracy for relevant categories of products. However, variables with strong seasonality must be modeled properly.
Described below is a technique to de-seasonalize causal factors based on their historical values. It is believed that de-seasonalized variables are better predictors of demand change, and that their use will result in improved product demand forecast accuracy. The technique introduced herein is generally applicable to most causal variables with a seasonal pattern. However, for simplicity, the examples and illustrations are given for modeling temperature.
In the following description, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable one of ordinary skill in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that structural, logical, optical, and electrical changes may be made without departing from the scope of the present invention. The following description is, therefore, not to be taken in a limited sense, and the scope of the present invention is defined by the appended claims.
As stated above, the causal demand forecasting methodology seeks to establish a cause-effect relationship between product demand and factors influencing product demand in a market environment. A product demand forecast is generated by blending the various influencing factors in accordance with corresponding regression coefficients determined through the analysis of historical product demand and factor information. The multivariable regression equation can be expressed as:
$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$ Equation (1);
where $y$ represents demand; $x_1$ through $x_k$ represent causal variables, such as current product sales rate, seasonality of demand, product price, promotional activities, and other factors; and $b_0$ through $b_k$ represent regression coefficients determined through regression analysis using historical sales, price, promotion, and other causal data.
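As an illustration only, Equation (1) can be evaluated for a single forecast period as in the following minimal Python sketch. The function name `forecast_demand` and the coefficient and variable values are hypothetical and are not taken from the DCM application.

```python
def forecast_demand(intercept, coefficients, causal_values):
    """Evaluate y = b0 + b1*x1 + ... + bk*xk for one forecast period."""
    if len(coefficients) != len(causal_values):
        raise ValueError("need exactly one coefficient per causal variable")
    return intercept + sum(b * x for b, x in zip(coefficients, causal_values))


# Hypothetical model with two causal variables: price and a promotion flag.
b0 = 120.0
b = [-4.0, 35.0]
x = [9.5, 1.0]  # price of 9.50, promotion active

print(forecast_demand(b0, b, x))
```

In this sketch each regression coefficient scales its own causal variable, so the contribution of any one factor to the forecast can be read off directly.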
The Teradata Corporation DCM Application Suite may be implemented within a three-tier computer system architecture as illustrated in
Presentation tier 101 includes a PC or workstation 111 and standard graphical user interface enabling user interaction with the DCM application and displaying DCM output results to the user. Application tier 103 includes an application server 113 hosting the DCM software application 114. Database tier 105 includes a database server containing a database 116 of product price and demand data accessed by DCM application 114.
In the causal demand forecasting systems described herein, and illustrated in
The historical values of weather data are readily available. Historical and predicted weather data can be purchased through subscription to a weather service or can be downloaded from established websites. Such data is normally collected at weather stations located at airports. Therefore, the location of a retailer employing a causal demand forecasting system including weather related data as a set of causal factors should be mapped to the closest airport or weather station where weather data is collected.
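The mapping of a retail location to its closest weather station could be sketched as follows. This is a minimal example that assumes each location is described by latitude and longitude and that great-circle (haversine) distance is an acceptable measure of closeness; the station identifiers and coordinates are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_station(store_lat, store_lon, stations):
    """Return the station record closest to the store."""
    return min(
        stations,
        key=lambda s: haversine_km(store_lat, store_lon, s["lat"], s["lon"]),
    )


# Hypothetical stations; identifiers and coordinates are illustrative only.
stations = [
    {"id": "STN_A", "lat": 40.1, "lon": -75.0},
    {"id": "STN_B", "lat": 42.0, "lon": -75.0},
]
print(nearest_station(40.0, -75.0, stations)["id"])
```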
In
In steps 211, 212 and 213, stored historical temperature data 201, precipitation data 202, and accumulated snow data 203 are transformed into a format that can be fed into the DCM causal framework. For instance, historical temperature data is collected as maximum, minimum, and average daily values. These values are transformed into weekly average temperatures based on the fiscal retail calendar. Other mathematical transformations may be required from case to case.
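The transformation of daily temperatures into weekly averages might look like the sketch below. It assumes, purely for illustration, that fiscal week 1 begins on a known fiscal-year start date, that every fiscal week spans seven consecutive days, and that weeks with missing days are dropped; an actual fiscal retail calendar may differ.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_average_temps(daily_avg, fiscal_year_start, days_per_week=7):
    """Map fiscal week number -> mean of the daily average temperatures in that week."""
    buckets = defaultdict(list)
    for day, temp in daily_avg.items():
        offset = (day - fiscal_year_start).days
        if offset >= 0:  # ignore days before the fiscal year begins
            buckets[offset // days_per_week + 1].append(temp)
    # Keep only complete weeks so partial weeks do not skew the average.
    return {
        week: sum(temps) / len(temps)
        for week, temps in sorted(buckets.items())
        if len(temps) == days_per_week
    }


# Hypothetical daily averages for 16 consecutive days.
start = date(2009, 2, 1)
daily = {start + timedelta(days=i): float(i) for i in range(16)}
print(weekly_average_temps(daily, start))
```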
Additional weather-related historical causal factor data, not shown, may also be saved, transformed, and fed into the DCM causal framework. Other, non-weather-related, historical causal factor data, represented by stored data 209, is transformed in step 219, and fed into the DCM causal framework.
Causal factor data is compiled for each product or product category as shown by table 221. Table 221 illustrates the collection of weather-related causal factor data, e.g., temperature, precipitation, accumulated snow data, and extreme conditions, for a portion of a retailer's product line, e.g., umbrellas, snow tires, snow shovels, sunscreen, and bottled water. The information displayed in table 221 comprises just a portion of the retailer's product line and a subset of all weather-related and non-weather-related causal variables.
In step 222, causal factor historical data is examined to identify the set of causal weather factors, and other causal factors, that have statistically significant effects on historical product demand and hence are believed to be of greatest relevance in determining future product demand changes. Additional detail regarding the process for selecting causal variables is illustrated in
In step 223, regression analysis is performed to determine the regression coefficients for the variables selected in step 222, and to build the multivariable regression equation required for demand forecast calculation.
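One way to determine such regression coefficients is ordinary least squares. The sketch below solves the normal equations in plain Python; it is an assumption made for illustration and is not presented as the regression procedure used by the DCM application. The function names and sample data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: causal variables may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_regression(rows, demand):
    """Return [b0, b1, ..., bk] that minimize the squared forecast error."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 carries the intercept b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, demand)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical history: demand generated exactly from 5 + 2*x1 - 3*x2.
rows = [(1, 0), (0, 1), (1, 1), (2, 1), (3, 5)]
demand = [5 + 2 * x1 - 3 * x2 for x1, x2 in rows]
print(fit_regression(rows, demand))
```

For production-scale histories a numerical library would normally be preferred over hand-written elimination; the plain-Python form is used here only to keep the example self-contained.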
In step 226 of
Future weather data is generally predictable with sufficient accuracy up to one week into the future. The accuracy of such weather forecasts directly affects the accuracy of demand forecasts derived from the causal framework. A transformation 225 may be required to feed the future weather values into the DCM causal framework.
As stated earlier, research suggests that it is often an unexpected change in a causal factor that triggers significant changes in demand, and that de-seasonalized variables, particularly temperature and other variables with a seasonal pattern, are better predictors of demand change than the unaltered seasonal variables.
In order to better predict the product demand changes associated with causal variables having seasonal patterns, such as temperature, a technique for removing the seasonal variation of causal variables, i.e., to de-seasonalize the causal factors, based on their historical values is proposed. Referring now to
The history of the causal variable, i.e., temperature, is collected and transformed into a weekly format compatible with DCM's causal framework. The historical temperature data is shown as stored data 405.
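In step 410, the average historical weekly temperature is calculated using the available historical temperature data 405 in accordance with Equation (2) provided below:
$\mathrm{AveHistTemp}_{wk} = \frac{1}{N} \sum_{year} \mathrm{WklyTemp}_{year,wk}$ Equation (2);
where $N$ is the number of years of historical data available for week $wk$.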
De-seasonalized weekly temperature is thereafter calculated in step 415 by subtracting the average historical weekly temperature from the current or forecast average weekly temperature:
$\mathrm{DeseasonTemp}_{year,wk} = \mathrm{WklyTemp}_{year,wk} - \mathrm{AveHistTemp}_{wk}$ Equation (3).
Note that variables are typically de-seasonalized or normalized using multiplicative transformations such as:
$\mathrm{DeseasonTemp}_{year,wk} = \mathrm{WklyTemp}_{year,wk} / \mathrm{AveHistTemp}_{wk}$ Equation (4).
However, for theoretical reasons supported by empirical results, the additive transformation of Equation (3) is recommended for weather variables such as temperature and precipitation.
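The calculations of steps 410 and 415 can be sketched in Python as follows. The sketch assumes the weekly history is held in a dictionary keyed by (year, week) and that the historical average for a week is the simple mean across the available years; the function names and temperature values are hypothetical.

```python
def average_historical_weekly(history):
    """history maps (year, week) -> weekly temperature; returns week -> mean across years."""
    totals, counts = {}, {}
    for (_year, week), temp in history.items():
        totals[week] = totals.get(week, 0.0) + temp
        counts[week] = counts.get(week, 0) + 1
    return {week: totals[week] / counts[week] for week in totals}


def deseasonalize_additive(weekly_temp, week, ave_hist):
    """Additive form of Equation (3): current or forecast value minus the historical average."""
    return weekly_temp - ave_hist[week]


# Hypothetical weekly temperatures for fiscal weeks 1 and 2 over several years.
history = {
    (2006, 1): 30.0, (2007, 1): 34.0, (2008, 1): 32.0,
    (2006, 2): 40.0, (2007, 2): 44.0,
}
ave_hist = average_historical_weekly(history)
print(ave_hist)
print(deseasonalize_additive(37.0, 1, ave_hist))  # week 1 forecast of 37.0
```

A positive result indicates a week warmer than is usual for that point in the year, and a negative result a colder one. As a practical observation only, and not necessarily the reasoning the description has in mind, a ratio such as Equation (4) is undefined whenever the historical average is zero, which the subtraction avoids.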
Incorporating the process for de-seasonalizing select causal variables having seasonal patterns into the process illustrated in
A process for selecting causal variables, including de-seasonalized variables and referred to in step 222 of
The regression variable selection process of
In step 503 data cleansing is performed to remove product demand data corresponding to a stock-out condition, and to remove incomplete weeks, e.g., when the value of one or more variables is missing.
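A minimal sketch of this cleansing step is given below. It assumes each week is a dictionary with a hypothetical `stock_out` flag and that missing values are stored as `None`; real demand histories may signal these conditions differently.

```python
def cleanse(weekly_records, variables):
    """Drop stock-out weeks and weeks where demand or any causal variable is missing."""
    required = ["demand"] + list(variables)
    clean = []
    for record in weekly_records:
        if record.get("stock_out"):
            continue  # observed sales understate true demand during a stock-out
        if any(record.get(name) is None for name in required):
            continue  # incomplete week
        clean.append(record)
    return clean


# Hypothetical weekly records.
records = [
    {"week": 1, "demand": 40, "temp": 3.5, "stock_out": False},
    {"week": 2, "demand": 0, "temp": 1.0, "stock_out": True},
    {"week": 3, "demand": 38, "temp": None, "stock_out": False},
    {"week": 4, "demand": 45, "temp": -2.0, "stock_out": False},
]
print([r["week"] for r in cleanse(records, ["temp"])])
```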
Causal variables having seasonal variation, e.g. temperature or accumulated snow, are de-seasonalized according to the process of
In step 505 the correlation of demand with each of the causal variables is calculated. If the correlation is insignificant, the variable is removed from the regression equation.
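The correlation screen of step 505 might be sketched as shown below, using the Pearson correlation coefficient. The description does not state how significance is judged, so the cutoff `min_abs_corr` is a hypothetical placeholder, as are the variable names and data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_variables(demand, candidates, min_abs_corr=0.3):
    """Keep candidate variables whose absolute correlation with demand meets the cutoff."""
    return [
        name
        for name, values in candidates.items()
        if abs(pearson(values, demand)) >= min_abs_corr
    ]


# Hypothetical weekly data.
demand = [10, 12, 14, 16, 18]
candidates = {
    "deseason_temp": [1, 2, 3, 4, 5],
    "unrelated": [1, -1, 0, -1, 1],
}
print(screen_variables(demand, candidates))
```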
In step 507, a multiple regression model is constructed, with regression coefficients calculated for each of the causal factors that passed step 505. T-ratios are calculated for each coefficient (step 509), and the variables with the smallest absolute t-ratios are removed iteratively until the absolute values of all t-ratios exceed 1 (steps 511 and 513).
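The iterative removal by t-ratio can be sketched as follows. This example assumes an ordinary least squares fit with an intercept, takes the t-ratio to be the coefficient divided by its standard error, and never removes the intercept; these choices, the function names, and the data are assumptions made for illustration.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: causal variables may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def t_ratios(columns, demand):
    """Fit least squares with an intercept; return {name: coefficient / standard error}."""
    names = list(columns)
    n, k = len(demand), len(names) + 1
    design = [[1.0] + [columns[nm][i] for nm in names] for i in range(n)]
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, demand)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y - sum(b * v for b, v in zip(beta, d)) for d, y in zip(design, demand)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    return {nm: beta[i + 1] / math.sqrt(sigma2 * inv[i + 1][i + 1])
            for i, nm in enumerate(names)}


def backward_eliminate(columns, demand, threshold=1.0):
    """Drop the weakest variable until every remaining |t-ratio| exceeds the threshold."""
    kept = dict(columns)
    while kept:
        ratios = t_ratios(kept, demand)
        weakest = min(ratios, key=lambda nm: abs(ratios[nm]))
        if abs(ratios[weakest]) > threshold:
            break
        del kept[weakest]
    return list(kept)


# Hypothetical data: demand depends on deseason_temp only, plus a small disturbance.
columns = {
    "deseason_temp": [1, 2, 3, 4, 5, 6, 7, 8],
    "weak_factor": [1, 1, -1, -1, 1, 1, -1, -1],
}
noise = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5]
demand = [10 + 3 * x + e for x, e in zip(columns["deseason_temp"], noise)]
print(backward_eliminate(columns, demand))
```

The sketch requires more weeks of history than fitted coefficients and a nonzero residual; otherwise the standard errors are undefined.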
In step 515 an out-of-sample error calculation is performed to confirm that all the variables contribute to forecast accuracy, i.e., forecast accuracy deteriorates if any of the variables is removed. It is recommended that the process be repeated with different variable sets to confirm that each variable is actually contributing to forecast accuracy.
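An out-of-sample check of this kind could be sketched as below. The example assumes a simple split in which the earliest weeks are used for fitting and the remaining weeks are held out, and it measures error as mean absolute error; the description does not specify either choice, and all names and data are hypothetical.

```python
def solve(a, b):
    """Solve a·x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def holdout_error(columns, demand, names, n_train):
    """Fit on the first n_train weeks; return mean absolute error on the remaining weeks."""
    design = [[1.0] + [columns[nm][i] for nm in names] for i in range(len(demand))]
    train, y_train = design[:n_train], demand[:n_train]
    k = len(names) + 1
    xtx = [[sum(d[i] * d[j] for d in train) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(train, y_train)) for i in range(k)]
    beta = solve(xtx, xty)
    errors = [abs(y - sum(b * v for b, v in zip(beta, d)))
              for d, y in zip(design[n_train:], demand[n_train:])]
    return sum(errors) / len(errors)


def error_increase_when_removed(columns, demand, n_train):
    """For each variable, the change in holdout error when it alone is left out."""
    names = list(columns)
    base = holdout_error(columns, demand, names, n_train)
    return {
        nm: holdout_error(columns, demand, [x for x in names if x != nm], n_train) - base
        for nm in names
    }


# Hypothetical data: 8 weeks for fitting, 4 weeks held out.
columns = {
    "deseason_temp": list(range(1, 13)),
    "weak_factor": [1, 1, -1, -1] * 3,
}
noise = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
demand = [10 + 3 * x + e for x, e in zip(columns["deseason_temp"], noise)]
print(error_increase_when_removed(columns, demand, n_train=8))
```

A variable whose removal leaves the holdout error essentially unchanged, or lowers it, is a candidate for exclusion under the criterion described above.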
A final evaluation to verify coefficient selection is performed in step 517. Tests are performed to verify that the amount of historical data is adequate to support the selection process, e.g. the number of complete weeks of history divided by the number of variables exceeds 20.
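The adequacy test given as an example above reduces to a single ratio, sketched here. The function name is hypothetical, and the default of 20 is taken from the example in the description.

```python
def history_is_adequate(complete_weeks, num_variables, min_ratio=20):
    """True if complete weeks of history per selected variable exceeds min_ratio."""
    if num_variables <= 0:
        raise ValueError("at least one variable is required")
    return complete_weeks / num_variables > min_ratio


print(history_is_adequate(104, 4))  # two years of weekly history, four variables
print(history_is_adequate(52, 4))   # one year of weekly history, four variables
```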
The Figures and description of the invention provided above reveal an improved method and system for forecasting product demand using a causal methodology, based on multiple regression techniques, the improvement including a process for de-seasonalizing causal factors having seasonal variations, such as temperature.
Instructions of the various software routines discussed herein, such as the methods illustrated in
Data and instructions of the various software routines are stored in respective storage modules, which are implemented as one or more machine-readable storage media. The storage media include different forms of memory including semiconductor memory devices such as dynamic or static random access memories (DRAMs or SRAMs), erasable and programmable read-only memories (EPROMs), electrically erasable and programmable read-only memories (EEPROMs) and flash memories; magnetic disks such as fixed, floppy and removable disks; other magnetic media including tape; and optical media such as compact disks (CDs) or digital video disks (DVDs).
The instructions of the software routines are loaded or transported to each device or system in one of many different ways. For example, code segments including instructions stored on floppy disks, CD or DVD media, a hard disk, or transported through a network interface card, modem, or other interface device are loaded into the device or system and executed as corresponding software modules or layers.
The foregoing description of various embodiments of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the above teaching. Accordingly, this invention is intended to embrace all alternatives, modifications, equivalents, and variations that fall within the spirit and broad scope of the attached claims.
This application is related to the following co-pending and commonly-assigned patent applications, which are incorporated by reference herein: Application Ser. No. 11/613,404, entitled “IMPROVED METHODS AND SYSTEMS FOR FORECASTING PRODUCT DEMAND USING A CAUSAL METHODOLOGY,” filed on Dec. 20, 2006, by Arash Bateni, Edward Kim, Philip Liew, and J. P. Vorsanger; Application Ser. No. 11/938,812, entitled “IMPROVED METHODS AND SYSTEMS FOR FORECASTING PRODUCT DEMAND DURING PROMOTIONAL EVENTS USING A CAUSAL METHODOLOGY,” filed on Nov. 13, 2007, by Arash Bateni, Edward Kim, Harmintar Atwal, and J. P. Vorsanger; Application Ser. No. 11/967,645, entitled “TECHNIQUES FOR CAUSAL DEMAND FORECASTING,” filed on Dec. 31, 2007, by Arash Bateni, Edward Kim, J. P. Vorsanger, and Rong Zong; Application Ser. No. 12/255,696, entitled “METHODOLOGY FOR SELECTING CAUSAL VARIABLES FOR USE IN A PRODUCT DEMAND FORECASTING SYSTEM,” filed on Oct. 22, 2008, by Arash Bateni and Edward Kim; and Application Ser. No. 12/512,071, entitled “CAUSAL PRODUCT DEMAND FORECASTING SYSTEM AND METHOD USING WEATHER DATA AS CAUSAL FACTORS IN RETAIL DEMAND FORECASTING,” filed on Jul. 30, 2009, by Arash Bateni and Edward Kim.