Various data forecasting tools exist for enabling a business to make informed business decisions in light of expected future business conditions. For example, revenue forecasting may enable a business to determine an allocation of resources among business units in light of the revenue expected to be generated by each business unit.
Business forecasts are often generated at various aggregation levels. For example, a revenue forecast may be generated for an entire enterprise and for individual units within the enterprise. Generating forecasts separately at each level of aggregation can be problematic because such an approach does not account for the structural relationship between the aggregation levels and, thus, loses the additive property between levels, so that the lower level forecasts need not sum to the upper level forecast. Further, the forecast performance at the lower level can be significantly inferior to that at the upper level, because such an approach has no systematic way to leverage the higher predictability at the more aggregated upper level.
Certain embodiments are described in the following detailed description and in reference to the drawings, in which:
In an embodiment of the invention, a method is provided for generating data forecasts for various levels of data aggregation, for example, for different hierarchical levels of a business entity. The data forecast may include a revenue forecast, a product demand forecast, or a sales forecast, among others. Different levels of the entity to which the forecast applies are referred to as aggregation levels. For example, an upper level may relate to a business entity as a whole, while a lower level may apply to the various divisions within the business entity. Each individual division within the upper level entity may be referred to as a component of the upper level entity.
In embodiments, a lower level data forecast may be computed for each component of the upper level entity based, in part, on a data forecast generated for the upper level entity. In this way, the forecasting techniques described herein build a predictive relationship between different forecast levels. The forecast for the upper level entity may be referred to as the aggregate forecast, while each of the lower level forecasts may be referred to as component forecasts. The result of each component forecast may be expressed as a decomposition unit. For example, if the forecast is a revenue forecast, the decomposition unit may be a dollar value corresponding to the predicted revenue forecast for that component. Each component forecast corresponds to a decomposition rate, which is a percentage of the aggregate forecast represented by the component forecast. For example, a decomposition rate of 50 percent would indicate that 50 percent of the aggregate forecast value at the upper level is forecast to be generated by the corresponding component at the lower level. In an embodiment, the component forecasts are derived from the upper level forecast using decomposition rates observed from empirical time series data that are related to past observations and forward-looking judgment calls that are related to future expectations.
In embodiments, a multinomial distribution is used to model how the units of the aggregate forecast are allocated among the component forecasts. Further, a Dirichlet distribution may be used to model the decomposition rates. For the Dirichlet distribution, judgment information may be specified to determine mean values and coefficient of variation values corresponding to the component percentages. An aggregate forecast may be generated, and component forecasts may be derived from the aggregate forecast using the decomposition rates. In embodiments, the forecasting techniques described herein may be applied, for example, to revenue forecasting from enterprise level to business unit level, and from enterprise level to different currencies or regions. Furthermore, it will be appreciated that the techniques described herein may be applied to forecasting models that include more than two hierarchical levels.
As illustrated in
The computing device 100 can also include one or more input devices 110, such as a mouse, touch screen, and keyboard, among others. In an embodiment, the device 100 includes a network interface controller (NIC) 112, for connecting the device 100 to a network through a local area network (LAN), a wide-area network (WAN), or another network configuration. In an embodiment, the computing device 100 is a general-purpose computing device, for example, a desktop computer, laptop computer, business server, and the like.
The computing device 100 can include a data forecaster 114 that can generate a data forecast, for example, a revenue forecast, a margin forecast, a cash-flow forecast, a product sales forecast in terms of orders or shipments, and the like. The data forecaster 114 may generate an aggregate forecast corresponding to an upper level of an organizational hierarchy such as the whole enterprise. The aggregate forecast may be decomposed into multiple component forecasts corresponding to a lower level of the organizational hierarchy, for example, separate business units within the enterprise. Each component forecast corresponds to a decomposition rate that describes a ratio or percentage of the aggregate forecast that pertains to the corresponding component forecast. In other words, the component forecast for a particular component may be computed by multiplying the aggregate forecast by the component's decomposition rate.
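The multiplication described above can be illustrated with a minimal sketch. The function name `decompose`, the unit names, and the numeric values are hypothetical and chosen only for illustration; the check that the rates sum to one is an added assumption that preserves the additive property between levels.

```python
def decompose(aggregate_forecast, decomposition_rates):
    """Split an aggregate forecast into component forecasts.

    decomposition_rates maps a component name to its share of the
    aggregate forecast (hypothetical names; shares assumed to sum to 1).
    """
    total = sum(decomposition_rates.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("decomposition rates must sum to 1")
    return {name: aggregate_forecast * rate
            for name, rate in decomposition_rates.items()}


# Illustrative values only: a 1,000,000 aggregate revenue forecast
# split across three hypothetical business units.
rates = {"unit_a": 0.5, "unit_b": 0.3, "unit_c": 0.2}
components = decompose(1_000_000.0, rates)
```

Because the rates sum to one, the component forecasts add back up to the aggregate forecast.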
The data forecast may be based, at least in part, on empirical data 116 related to past performance such as sales data, revenue data, and the like. The empirical data may include data for each business level, including data for the entire enterprise and for each component or business unit. Accordingly, the empirical data may be used to assist in determining the most likely decomposition rate for each component of the aggregate forecast. In some embodiments, the decomposition rates for each component of the aggregate forecast may also be based, at least in part, on user input relating to personal judgment calls about a likely distribution of resources in the future. In embodiments, the empirical data 116 and judgment calls may be combined to generate more realistic decomposition rates for each of the component forecasts.
In Eqn. 1, the vector, {right arrow over (θ)}=(θ1, θ2, . . . , θK), is a parameter vector corresponding to the decomposition rates for the K components. The parameter vector, {right arrow over (θ)}, may be referred to herein as the “decomposition vector” and may be derived based on historically observed decomposition rates. The decomposition vector, {right arrow over (θ)}, is in general not static, as it can change from one period to the next. The vector, {right arrow over (X)}=(X1, X2, . . . , XK), is the vector of the decomposition units for all the components. Input to the model may be rounded to the nearest whole number so that the multinomial distribution can be applied. In Eqn. 1, {right arrow over (X)}=(n1, n2, . . . , nK) is a sample observation corresponding to previously observed decomposition units for the K components. For example, the sample observation may be the actual decomposition units, such as revenue amounts, realized during a previous reporting period. Given the sample observation, the maximum likelihood estimator of {right arrow over (θ)} is {right arrow over (X)}/|{right arrow over (X)}|,
where |{right arrow over (X)}| denotes the L1 norm of {right arrow over (X)}. As described further below, the maximum likelihood estimator may be used to help determine the projected values for the future decomposition rates.
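As a minimal sketch of the maximum likelihood estimator, each observed decomposition unit can be divided by the L1 norm of the observation vector. The function name and the sample values are hypothetical; the rounding step mirrors the rounding of model input mentioned above.

```python
def mle_decomposition_rates(observed_units):
    """Maximum likelihood estimate of the decomposition vector, X / |X|.

    observed_units holds non-negative decomposition units (for example,
    revenue amounts); they are rounded to whole numbers first so that
    they can be treated as multinomial counts.
    """
    counts = [round(x) for x in observed_units]
    if any(n < 0 for n in counts):
        raise ValueError("decomposition units must be non-negative")
    total = sum(counts)  # L1 norm of a non-negative vector
    if total == 0:
        raise ValueError("at least one decomposition unit must be positive")
    return [n / total for n in counts]


# Illustrative values only.
mle = mle_decomposition_rates([500.4, 299.6, 200.0])
```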
At block 204, the decomposition vector for the multinomial distribution shown in Eqn. 1 can be modeled by a probability distribution. In some embodiments, the probability distribution is a Dirichlet distribution, {right arrow over (θ)}˜Dir({right arrow over (α)}), which may be completely determined by the probability distribution parameter vector, {right arrow over (α)}. The density function of the Dirichlet distribution describes the likelihood of a random variable taking a particular value among all the possible values that the random variable can take and is defined according to Eqn. 2.
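Eqn. 2 is not reproduced in this text. As a hedged illustration, the sketch below evaluates the standard textbook form of the Dirichlet density, f(θ; α) = Γ(Σαi)/ΠΓ(αi) · Πθi^(αi−1), on a log scale for numerical stability. The function name is hypothetical, and the assumption is that Eqn. 2 matches this standard form.

```python
import math


def dirichlet_log_pdf(theta, alpha):
    """Log of the standard Dirichlet density at theta for parameters alpha."""
    if len(theta) != len(alpha):
        raise ValueError("theta and alpha must have the same length")
    if any(a <= 0 for a in alpha):
        raise ValueError("all alpha values must be positive")
    if any(t <= 0 for t in theta) or abs(sum(theta) - 1.0) > 1e-9:
        raise ValueError("theta must be positive and sum to 1")
    # Normalizing constant: Gamma(sum(alpha)) / prod(Gamma(alpha_i)).
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    # Kernel: prod(theta_i ** (alpha_i - 1)).
    log_kernel = sum((a - 1.0) * math.log(t) for a, t in zip(alpha, theta))
    return log_norm + log_kernel


# With alpha = (1, 1, 1) the density is uniform over the simplex.
uniform_log_density = dirichlet_log_pdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0])
```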
Given the models {right arrow over (X)}|{right arrow over (θ)}˜Mul({right arrow over (θ)}) and {right arrow over (θ)}˜Dir({right arrow over (α)}), it follows that the posterior distribution of the decomposition vector, {right arrow over (θ)}, given the vector of observed decomposition units, {right arrow over (X)}, is also a Dirichlet distribution. Specifically, {right arrow over (θ)}|{right arrow over (X)}˜Dir({right arrow over (α)}+{right arrow over (X)}). Thus, the expected value for the posterior distribution of the decomposition vector, {right arrow over (θ)}, may be determined according to Eqn. 3, that is, as ({right arrow over (α)}+{right arrow over (X)})/(|{right arrow over (α)}|+|{right arrow over (X)}|).
Furthermore, the expected value for the posterior distribution of the decomposition vector, {right arrow over (θ)}, can be expressed as the weighted average of the expected value of the prior distribution, which is {right arrow over (α)}/|{right arrow over (α)}|, and the maximum likelihood estimator from the sample distribution, which is {right arrow over (X)}/|{right arrow over (X)}|, with weights |{right arrow over (α)}|/(|{right arrow over (α)}|+|{right arrow over (X)}|) and |{right arrow over (X)}|/(|{right arrow over (α)}|+|{right arrow over (X)}|), respectively. Thus, the expected values for the posterior distribution of the decomposition vector, {right arrow over (θ)}, may be computed according to Eqn. 4.
The expected value of the posterior distribution provides an updated estimate of the decomposition vector, {right arrow over (θ)}, given the observed historical decomposition units as well as the prior distribution of the decomposition rates. Note that in Eqn. 4, {right arrow over (X)} represents the historical decomposition units, and {right arrow over (α)} is the parameter vector for the prior distribution of the decomposition rates.
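The two ways of writing the posterior mean can be checked against each other with a minimal sketch. The function names and numeric values are hypothetical; the direct form follows from the posterior being Dir(α+X), and the weighted form assumes weights |α|/(|α|+|X|) for the prior mean and |X|/(|α|+|X|) for the maximum likelihood estimator.

```python
def posterior_mean(alpha, observed_units):
    """Posterior mean of the decomposition vector: (alpha + X) / (|alpha| + |X|)."""
    alpha_total = sum(alpha)
    units_total = sum(observed_units)
    return [(a + x) / (alpha_total + units_total)
            for a, x in zip(alpha, observed_units)]


def posterior_mean_weighted(alpha, observed_units):
    """Same quantity written as a weighted average of prior mean and MLE."""
    alpha_total = sum(alpha)
    units_total = sum(observed_units)
    prior_weight = alpha_total / (alpha_total + units_total)
    return [prior_weight * (a / alpha_total)
            + (1.0 - prior_weight) * (x / units_total)
            for a, x in zip(alpha, observed_units)]


# Illustrative values only: a prior favouring a 40/40/20 split and an
# observation with a 50/30/20 split.
prior_alpha = [40.0, 40.0, 20.0]
observed = [500, 300, 200]
direct = posterior_mean(prior_alpha, observed)
weighted = posterior_mean_weighted(prior_alpha, observed)
```

The larger |α| is relative to |X|, the more the result is pulled toward the prior mean, which is how a stronger judgment call outweighs the historical split.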
At block 206, mean values and coefficient of variation values may be determined for the Dirichlet distribution. In an embodiment, a recent window of historical data may be used to derive an estimate for the mean values and coefficient of variation values. In another embodiment, the mean values and coefficient of variation values may be specified by a user. The mean values and coefficient of variation values may be used to compute an estimate for the probability distribution parameter vector, {right arrow over (α)}.
In an embodiment, the user may be prompted to provide expected or planned values of the decomposition vector, {right arrow over (θ)}. The planned values of the decomposition vector are denoted by the vector, {right arrow over (A)}, and may be used as the mean values of the decomposition vector, {right arrow over (θ)}. The planned values, {right arrow over (A)}, of the decomposition vector, {right arrow over (θ)}, may be specified, for example, based on business judgment regarding future planning or desires for the components of the business entity. The user may also be prompted to provide a coefficient of variation value corresponding to the planned values, {right arrow over (A)}, of the decomposition vector, {right arrow over (θ)}. The coefficient of variation value may be denoted by the parameter, λ, and may represent, for example, a degree of confidence that the user's planned values, {right arrow over (A)}, are actually achievable.
At block 208, the probability distribution may be completely specified by computing an estimate of the probability distribution parameter vector, {right arrow over (α)}, using the mean values and coefficient of variation values estimated or specified at block 206.
Given the Dirichlet distribution on the decomposition vector, {right arrow over (θ)}, it follows that E({right arrow over (θ)})={right arrow over (α)}/|{right arrow over (α)}|. Thus, the expected values of the decomposition vector, E({right arrow over (θ)}), may be expressed by the linear equation system of Eqn. 5.
Additionally, the variance for each individual decomposition rate may be expressed by Eqn. 6.
Solving Eqn. 6 for the probability distribution parameters yields Eqn. 7.
Incorporating the coefficient of variation value, λ, into the solution for the probability distribution parameter vector of Eqn. 7 yields Eqn. 8.
In Eqn. 8, Ai refers to the i-th component of {right arrow over (A)}. Thus, an estimate for the probability distribution parameter vector, {right arrow over (α)}, may be computed according to Eqn. 9.
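Eqns. 5 through 9 are not reproduced in this text. The following is a hedged reconstruction of the chain of steps from the standard moments of the Dirichlet distribution, assuming that the coefficient of variation, λ, is the ratio of the standard deviation to the mean of each decomposition rate and that the planned values, Ai, serve as the means. The exact form and numbering of the original equations may differ.

```latex
\[
\begin{aligned}
E(\theta_i) &= \frac{\alpha_i}{|\vec{\alpha}|} = A_i,
  \qquad i = 1, \dots, K
  && \text{(mean of each rate)} \\
\operatorname{Var}(\theta_i)
  &= \frac{\alpha_i \,\bigl(|\vec{\alpha}| - \alpha_i\bigr)}
          {|\vec{\alpha}|^{2}\,\bigl(|\vec{\alpha}| + 1\bigr)}
   = \frac{A_i\,(1 - A_i)}{|\vec{\alpha}| + 1}
  && \text{(variance of each rate)} \\
|\vec{\alpha}|
  &= \frac{A_i\,(1 - A_i)}{\operatorname{Var}(\theta_i)} - 1
  && \text{(solved for } |\vec{\alpha}| \text{)} \\
|\vec{\alpha}|
  &= \frac{1 - A_i}{\lambda^{2} A_i} - 1
  && \text{(using } \operatorname{Var}(\theta_i) = \lambda^{2} A_i^{2} \text{)} \\
\hat{\vec{\alpha}}
  &= \vec{A} \cdot \operatorname{mean}_{i}
     \left( \frac{1 - A_i}{\lambda^{2} A_i} - 1 \right)
  && \text{(estimate averaged over components)}
\end{aligned}
\]
```

Because each component i generally yields a different value of |α|, a single summary across components (here, the mean) is needed to obtain one parameter vector.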
In another embodiment, upper and lower bound estimates for |{right arrow over (α)}| may be obtained by replacing the mean function of Eqn. 9 with maximum and minimum functions on the component values.
With the upper bound and lower bound values determined this way, the robustness of the resulting component level forecasts can be analyzed. For example, if the upper bound and lower bound forecasts are close to each other (for example, within 5% of each other), the forecasting procedure could be considered robust. In another embodiment, the mean function of Eqn. 9 can be replaced by the median function on the component values.
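The estimate of the parameter vector and its mean, minimum, maximum, and median variants can be sketched as follows. Since Eqns. 8 and 9 are not reproduced in this text, the per-component expression (1 − Ai)/(λ²Ai) − 1 is an assumption derived from the standard Dirichlet variance with λ taken as standard deviation over mean; the function names and numeric values are hypothetical.

```python
import statistics


def alpha_total_candidates(planned_rates, cv):
    """Per-component solutions for |alpha| (assumed form, see lead-in)."""
    if cv <= 0:
        raise ValueError("coefficient of variation must be positive")
    if any(not 0.0 < a < 1.0 for a in planned_rates):
        raise ValueError("planned rates must lie strictly between 0 and 1")
    return [(1.0 - a) / (cv ** 2 * a) - 1.0 for a in planned_rates]


def estimate_alpha(planned_rates, cv, summarize=statistics.mean):
    """Estimate alpha as the planned rates scaled by a summary of |alpha|.

    summarize may be statistics.mean, min, max, or statistics.median.
    """
    alpha_total = summarize(alpha_total_candidates(planned_rates, cv))
    if alpha_total <= 0:
        raise ValueError("coefficient of variation too large for these rates")
    return [a * alpha_total for a in planned_rates]


# Illustrative values only: planned rates A and coefficient of variation.
planned = [0.5, 0.3, 0.2]
cv = 0.1
alpha_mean = estimate_alpha(planned, cv)
alpha_lower = estimate_alpha(planned, cv, summarize=min)
alpha_upper = estimate_alpha(planned, cv, summarize=max)
alpha_median = estimate_alpha(planned, cv, summarize=statistics.median)
```

Running the downstream forecast once with `alpha_lower` and once with `alpha_upper` gives the pair of forecasts whose closeness indicates robustness.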
At block 210, the expected future decomposition rates may be determined. Specifically, the posterior distribution of the multinomial distribution parameter vector may be derived using the completely specified Dirichlet distribution computed at block 208. In an embodiment, the posterior mean and variance are computed. The variance measures the variability of all the values that a random variable takes, relative to its mean value. The calculation for the posterior mean is given in Eqn. 4, and the variance for each component follows from the posterior Dirichlet distribution as Var(θi|{right arrow over (X)})=E(θi|{right arrow over (X)})(1−E(θi|{right arrow over (X)}))/(|{right arrow over (α)}|+|{right arrow over (X)}|+1),
where E(θi|{right arrow over (X)}) can be obtained from Eqn. 4. The posterior mean, also referred to as the expected value of the posterior distribution, provides an updated estimate of the future decomposition rates, based on the historical decomposition units as well as the prior distribution. Further, the expected value of the posterior distribution can be expressed as the weighted average of the expected value of the prior distribution and the maximum likelihood estimate from the sample observation as shown in Eqn. 4.
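A minimal sketch of the posterior mean and variance calculation is shown below. It assumes the standard variance of a Dirichlet distribution with parameter vector α+X, namely m(1 − m)/(|α|+|X|+1) for a component with posterior mean m; the function name and the sample values are hypothetical.

```python
def posterior_mean_and_variance(alpha, observed_units):
    """Posterior mean and variance of each decomposition rate.

    The posterior is Dirichlet with parameters alpha + X, so the standard
    Dirichlet moment formulas are applied to the updated parameters.
    """
    total = sum(alpha) + sum(observed_units)
    means = [(a + x) / total for a, x in zip(alpha, observed_units)]
    variances = [m * (1.0 - m) / (total + 1.0) for m in means]
    return means, variances


# Illustrative values only: a uniform prior over two components and one
# observed unit in each gives a Dir(2, 2) posterior.
means, variances = posterior_mean_and_variance([1.0, 1.0], [1, 1])
```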
At block 212, empirical data corresponding to the upper level may be used to generate an aggregate forecast. The aggregate forecast may be generated using techniques known in the art, for example, the ARIMA (autoregressive integrated moving average) models, or the Holt-Winters algorithms, among others.
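As a stand-in for the aggregate forecasting step, the sketch below uses Holt's linear-trend exponential smoothing, which is the non-seasonal relative of the Holt-Winters method. It is not ARIMA and not the full seasonal Holt-Winters algorithm; the function name, smoothing constants, and data are hypothetical, and a statistics library would typically be used in practice.

```python
def holt_linear_forecast(series, alpha=0.5, beta=0.3, horizon=1):
    """Forecast `horizon` periods ahead with Holt's linear-trend method.

    alpha smooths the level and beta smooths the trend (both in (0, 1]).
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    if not (0.0 < alpha <= 1.0 and 0.0 < beta <= 1.0):
        raise ValueError("alpha and beta must be in (0, 1]")
    # Simple initialization: first value as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        level = alpha * y + (1.0 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1.0 - beta) * trend
    return level + horizon * trend


# Illustrative values only: upper level revenue for four past periods.
aggregate_forecast = holt_linear_forecast([100.0, 110.0, 120.0, 130.0])
```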
At block 214, a component forecast may be generated for each lower level component based on the aggregate forecast and the posterior distribution of the future decomposition rates generated at block 210. Each component forecast may be computed by multiplying the aggregate forecast by the corresponding future decomposition rate. The component forecast may be a point forecast or an interval forecast. Specifically, the posterior distribution of the decomposition rates may be used to derive the mean and standard deviation for each of the decomposition rates. The point forecast for each component can be computed by multiplying the mean rate by the aggregate forecast. The bounds of the interval forecast for each component can be computed by multiplying the mean rate, plus or minus a chosen multiple of the standard deviation, by the aggregate forecast. Note that the point forecast and the interval forecast derived this way are conditional forecasts, conditional on the aggregate forecast. The resulting aggregate forecast and component forecasts may be displayed to the user and/or stored to an electronic storage medium such as the storage medium 108 or the memory 106.
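The point and interval forecasts can be sketched as follows. The function name, the default multiple of two standard deviations, and the clipping of rates to the range 0 to 1 are assumptions made for illustration and are not specified in the description.

```python
import math


def component_forecasts(aggregate_forecast, rate_means, rate_variances, z=2.0):
    """Return (point, lower, upper) forecasts for each component.

    Each bound uses the mean rate plus or minus z standard deviations,
    clipped to [0, 1] (an added assumption), times the aggregate forecast.
    """
    results = []
    for mean, variance in zip(rate_means, rate_variances):
        sd = math.sqrt(variance)
        point = aggregate_forecast * mean
        lower = aggregate_forecast * max(mean - z * sd, 0.0)
        upper = aggregate_forecast * min(mean + z * sd, 1.0)
        results.append((point, lower, upper))
    return results


# Illustrative values only: two components with mean rate 0.5 and a
# standard deviation of 0.05 (variance 0.0025).
forecasts = component_forecasts(1000.0, [0.5, 0.5], [0.0025, 0.0025])
```

These values are conditional on the aggregate forecast: uncertainty in the aggregate forecast itself is not reflected in the interval.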
At block 216, the user may optionally provide additional input to adjust the aggregate forecast. For example, with an empirically established price elasticity estimated at 2 for a product line, if the user decides to decrease the price by 5%, then an increase in sales volume of 10% would be expected. This 10% increase in sales can be an input that the user provides once the price cut is planned. A simple updating method in this case would be to take the original aggregate forecast (which does not account for the price cut) and increase it by 10% to account for the new pricing. The aggregate forecast can also be updated based on the user input using other known and more advanced forecasting techniques such as Bayesian forecasting techniques. If the aggregate forecast is adjusted at block 216, the adjustment to the upper level aggregate forecast can be automatically propagated down to the lower level component forecasts, in which case the process flow may return to block 214, and new component forecasts may be generated based, in part, on the new aggregate forecast. If the aggregate forecast is not adjusted, the process flow may advance to block 218, where the process flow terminates.
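The simple updating method and the return to block 214 can be sketched as follows. The function names are hypothetical, and the sign convention (elasticity given as a positive magnitude, so that a price decrease raises volume) is an assumption chosen to match the 5% and 10% figures in the example.

```python
def adjust_for_price_change(aggregate_forecast, price_change_pct, elasticity):
    """Scale the aggregate forecast by the expected volume change.

    elasticity is a positive magnitude: a price change of -5% with an
    elasticity of 2 implies a +10% change in sales volume.
    """
    volume_change_pct = -elasticity * price_change_pct
    return aggregate_forecast * (1.0 + volume_change_pct / 100.0)


def redecompose(aggregate_forecast, decomposition_rates):
    """Regenerate component forecasts from an adjusted aggregate forecast."""
    return [aggregate_forecast * rate for rate in decomposition_rates]


# Illustrative values only.
original_forecast = 1000.0
adjusted_forecast = adjust_for_price_change(original_forecast, -5.0, 2.0)
new_components = redecompose(adjusted_forecast, [0.5, 0.3, 0.2])
```

Because the decomposition rates are unchanged, every component forecast rises by the same 10%, which is the sense in which the adjustment is reflected down automatically.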
As shown in
A region 308 can include a decomposition rate generator configured to determine the decomposition rates applicable to each component of the upper level entity. The decomposition rate generator 308 can be configured to determine mean values and a coefficient of variation for a probability distribution corresponding to future expected decomposition rates for each of the two or more components, as discussed in reference to block 206 of
A region 310 can include a component forecast generator configured to generate component forecasts corresponding to each of the two or more components based on the aggregate forecast and the expected future decomposition rates, as discussed in reference to block 214 of
This application is a continuation of co-pending PCT Patent Application Serial No. PCT/US2011/033941, filed Apr. 26, 2011, the entire contents of which are hereby incorporated by reference as though fully set forth herein.
Relation | Number | Date | Country
---|---|---|---
Parent | PCT/US2011/033941 | Apr 2011 | US
Child | 14063918 | | US