This invention relates generally to analysis of multi-dimensional data and more particularly to dynamic multi-dimensional analysis of consolidated enterprise data that supports the creation and analysis of predicted data.
On-Line Analytical Processing (OLAP) is a category of software technology that enables insight into enterprise data through access to a wide variety of views of the enterprise data. Enterprise data is a large collection of business data, such as historical sales data of commercial items based on such attributes as location, market, product, weather, etc. With the large amount of data available, an analyst typically seeks to discern trends or relationships in the business data, for example, how many units of a product sold over the summer in three Midwestern states. Running such a query directly against enterprise data is typically a laborious task. OLAP seeks to reduce the amount of time involved by pre-calculating common types of queries. The analyst uses the OLAP results to rapidly evaluate the desired historical relationships in data at a more meaningful level. OLAP reduces the enterprise data granularity by aggregating the enterprise data into larger aggregations. For example, if the enterprise data breaks down product sales at the store level for a particular chain, an OLAP pre-calculated query may only return the product sales for the chain.
OLAP has been used to analyze dependent data, such as, but not limited to, sales volume of product(s), revenue, profits, etc. The data for OLAP is typically organized into a volume cube representing sales volume of a product for different locations (or markets, depending on the granularity of the resulting aggregated volume data). OLAP operates across two large, general classes of data: dependent and causal. Dependent data is data that is determined by the values of the causal data. For example, the sales volume of a product in a market at a point in time may be the result of causal data (e.g., price, weather, advertising, etc.). Causal data is a collection of data (e.g., price, advertising, weather, etc.) that affects the dependent data (e.g., sales, revenue, profits, etc.). OLAP uses causal data to develop insights into the factors affecting dependent data, such as product volume. OLAP simultaneously aggregates or determines dependent and causal data. For example, if OLAP aggregated volume in three Midwestern states, OLAP should also calculate an aggregate, or average, price in those states. OLAP is useful to an analyst because it provides the base data from which analysts may make their own predictions of future data by understanding past trends or relationships and drawing conclusions about the future through inference.
However, OLAP typically analyzes past trends and not future trends, because OLAP assumes the existence of historical data in the form of dependent and causal data in order to perform its analyses. In addition, OLAP reduces dependent data granularity by aggregating the dependent data with pre-calculated queries.
Methods and apparatuses for predicting a set of multi-dimensional dependent data and non-measurable data from a set of multi-dimensional historical dependent and causal data are described. In one embodiment, the method comprises receiving input data that comprises multi-dimensional historical dependent data, historical causal data and anticipated causal data, determining a set of multi-dimensional predicted dependent data using a predictive model and the input data, and creating non-measurable data based on the set of multi-dimensional predicted dependent data and the input data.
The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.
In the following detailed description of embodiments of the invention, reference is made to the accompanying drawings in which like references indicate similar elements, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical, functional, and other changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims.
To estimate a predictive model of the data in actual historical dependent data 102, an analyst collects historical causal data 104. Historical causal data 104 includes business drivers that potentially affect actual historical dependent data 102. A business driver is an anticipated activity that could affect actual historical dependent data 102. Examples of business drivers are, but are not limited to, in-store activities (e.g., price, display, etc.), advertising (e.g., targeted rating points, gross rating points, print circulars, etc.), weather (e.g., temperature, change in temperature, precipitation, etc.), distribution, competitive activity (own similar products as well as competition products), etc. Typically, the causal data is employed in a predictive model that predicts the historical dependent data. In addition, the predictive model aids an analyst in better understanding how influential each business driver is in affecting dependent data. For example, one set of dependent data may be sensitive to price, while other sets of dependent data are sensitive to seasonal or weather changes.
The embodiment in
Returning to
$$\mathrm{Volume} = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n \tag{1}$$
where α is the intercept, representing the base level of demand for the product, βi are coefficients that quantify the expected response of the dependent data to xi, and xi are the covariates. Covariates relate to the business drivers as described below. For example, in one embodiment, a simple predictive model for the sales volume of an item is based on display advertising, feature advertising (e.g., print advertising), price, weather and television advertising. The predictive model for this embodiment is:
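$$\mathrm{Volume} = \alpha + \beta_{\mathrm{Display}}\,\mathrm{Display} + \beta_{\mathrm{Ad}}\,\mathrm{Ad} + \beta_{\mathrm{Price}}\,\mathrm{Price} + \beta_{\mathrm{TV}}\,\mathrm{TV} + \beta_{\mathrm{Weather}}\,\mathrm{Weather} \tag{2}$$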
From Equation (1) or some other predictive model, processing logic computes the predicted dependent data.
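As a minimal sketch of how a linear model of the form in Equation (1) might be evaluated, the following Python computes one predicted volume from an intercept, a set of coefficients, and a set of covariate values. The function name `predict_volume` and all numeric values are hypothetical and chosen only for illustration; they are not the values of Table 1.

```python
from typing import Mapping


def predict_volume(intercept: float,
                   coefficients: Mapping[str, float],
                   covariates: Mapping[str, float]) -> float:
    """Evaluate Volume = alpha + sum(beta_i * x_i) for a single observation."""
    missing = set(coefficients) - set(covariates)
    if missing:
        raise KeyError(f"missing covariates: {sorted(missing)}")
    return intercept + sum(beta * covariates[name]
                           for name, beta in coefficients.items())


# Hypothetical coefficients and covariate values (illustration only).
coefficients = {"Display": 4.0, "Ad": 2.5, "Price": -10.0, "TV": 0.5, "Weather": 0.25}
covariates = {"Display": 1.0, "Ad": 0.0, "Price": 2.0, "TV": 10.0, "Weather": 72.0}

volume = predict_volume(50.0, coefficients, covariates)
print(volume)  # 57.0
```

Keeping the coefficients keyed by covariate name makes it straightforward to swap in a transformed covariate without changing the evaluation code.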
As mentioned above, each covariate (xi) relates to business drivers that potentially affect the dependent data. In one embodiment, the covariate is the business driver. Alternatively, processing logic mathematically transforms the business driver into the covariate. This is typically used when changes in the business driver do not affect the dependent data in a linear fashion. For example, the effect of product price on the volume may be large around $1.99/equivalent, but not large if the price were $3.99/equivalent. In this case, processing logic uses a covariate of ln(price) instead of price itself. Taking the example of the simple predictive model presented in Equation (2) above, processing logic would then use the predictive model of:
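$$\mathrm{Volume} = \alpha + \beta_{\mathrm{Display}}\,\mathrm{Display} + \beta_{\mathrm{Ad}}\,\mathrm{Ad} + \beta_{\mathrm{Price}}\,\ln(\mathrm{Price}) + \beta_{\mathrm{TV}}\,\mathrm{TV} + \beta_{\mathrm{Weather}}\,\mathrm{Weather} \tag{3}$$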
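The effect of the logarithmic transform can be illustrated with a short Python sketch. It compares the volume gained from the same $0.50 price reduction at the two price points mentioned above. The coefficient value and the helper name `price_term` are assumptions made for this example only.

```python
import math


def price_term(beta_price: float, price: float) -> float:
    """Contribution of the price covariate when the covariate is ln(price)."""
    return beta_price * math.log(price)


beta_price = -20.0  # hypothetical coefficient; negative because volume falls as price rises
reduction = 0.50    # same absolute price cut applied at both price points

gain_near_199 = price_term(beta_price, 1.99 - reduction) - price_term(beta_price, 1.99)
gain_near_399 = price_term(beta_price, 3.99 - reduction) - price_term(beta_price, 3.99)

# The same cut yields a larger volume gain at the lower price point.
print(gain_near_199 > gain_near_399)  # True
```

With an untransformed price covariate, both gains would be identical, which is why a transform is used when the response is not linear.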
Processing logic supports numerous types of mathematical transforms of business drivers to covariates, such as simple arithmetic transforms. Other covariates have time-delayed effects. For example, an expenditure on advertising in one time period can continue to affect the dependent data for several successive time periods. To model this type of effect, a covariate is a decay function that decreases in time after an initial input value. Furthermore, more than one business driver can affect a covariate. For example, the dependent data of a competing product can increase or decrease a product's dependent data.
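One possible form of such a decay function is a geometric carry-over, sketched below in Python. The geometric form, the `retention` parameter, and the function name `decayed_covariate` are assumptions for illustration; the description above does not prescribe a particular decay function.

```python
from typing import List, Sequence


def decayed_covariate(spend: Sequence[float], retention: float) -> List[float]:
    """Geometric carry-over: c[t] = spend[t] + retention * c[t-1]."""
    if not 0.0 <= retention < 1.0:
        raise ValueError("retention must be in [0, 1)")
    carried = 0.0
    result = []
    for amount in spend:
        carried = amount + retention * carried
        result.append(carried)
    return result


# A single advertising expenditure in the first period keeps contributing,
# at a decreasing level, in the following periods.
print(decayed_covariate([100.0, 0.0, 0.0, 0.0], retention=0.5))
# [100.0, 50.0, 25.0, 12.5]
```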
Processing logic can equivalently use other predictive models known in the art. For example, in one embodiment, processing logic uses a model (Equation (4)) that is a sum of five models related to the five in-store grocery merchandising conditions used in the US:
$$\mathrm{Volume}_{\mathrm{total}} = \mathrm{Volume}_{\mathrm{DispFeat}} + \mathrm{Volume}_{\mathrm{Display}} + \mathrm{Volume}_{\mathrm{Feature}} + \mathrm{Volume}_{\mathrm{TPR}} + \mathrm{Volume}_{\mathrm{NoPromo}} \tag{4}$$
where VolumeDispFeat is the volume due to a product offered with a feature advertisement and display, VolumeDisplay is the volume due to the product offered with a display but no feature advertisement, VolumeFeature is the volume due to the product offered with a feature advertisement but no display, VolumeTPR is the volume due to the product offered with a temporary price reduction (TPR), and VolumeNoPromo is the volume due to the product offered with no display, feature advertising or TPR. Each volume equation has its own intercept, coefficients, and covariates as follows:
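$$\begin{aligned}
\mathrm{Volume}_{\mathrm{DispFeat}} &= \alpha + \beta_1\,\mathrm{ACV}_{\mathrm{DispFeat}} + \beta_2 x_2 + \cdots \\
\mathrm{Volume}_{\mathrm{Display}} &= \alpha + \beta_3\,\mathrm{ACV}_{\mathrm{Display}} + \beta_4 x_4 + \cdots \\
\mathrm{Volume}_{\mathrm{Feature}} &= \alpha + \beta_5\,\mathrm{ACV}_{\mathrm{Feature}} + \beta_6 x_6 + \cdots \\
\mathrm{Volume}_{\mathrm{TPR}} &= \alpha + \beta_7\,\mathrm{ACV}_{\mathrm{TPR}} + \beta_8 x_8 + \cdots \\
\mathrm{Volume}_{\mathrm{NoPromo}} &= \alpha + \beta_9\,\mathrm{ACV}_{\mathrm{NoPromo}} + \beta_{10} x_{10} + \cdots
\end{aligned} \tag{5}$$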
where β2, β4, β6, β8, and β10 are coefficients for other covariates and typically are the same (e.g. weather, price, etc.) for the five sub-volume equations in Equation (5).
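The structure of Equations (4) and (5) can be sketched in Python as five sub-models whose outputs are summed. The class name `SubModel`, the helper `total_volume`, and all numeric values are hypothetical; each sub-model carries its own intercept and coefficients, as the text describes.

```python
from dataclasses import dataclass, field
from typing import Dict

CONDITIONS = ("DispFeat", "Display", "Feature", "TPR", "NoPromo")


@dataclass
class SubModel:
    """One merchandising-condition model: intercept + ACV term + other covariates."""
    intercept: float
    acv_coefficient: float
    other_coefficients: Dict[str, float] = field(default_factory=dict)

    def volume(self, acv: float, covariates: Dict[str, float]) -> float:
        other = sum(beta * covariates[name]
                    for name, beta in self.other_coefficients.items())
        return self.intercept + self.acv_coefficient * acv + other


def total_volume(models: Dict[str, SubModel],
                 acv_by_condition: Dict[str, float],
                 covariates: Dict[str, float]) -> float:
    """Equation (4): the total is the sum of the five sub-volumes."""
    return sum(models[c].volume(acv_by_condition[c], covariates) for c in CONDITIONS)


# Hypothetical parameters; the other covariates (here only Price) are shared.
models = {c: SubModel(1.0, 2.0, {"Price": -0.5}) for c in CONDITIONS}
acv = {"DispFeat": 1.0, "Display": 2.0, "Feature": 3.0, "TPR": 4.0, "NoPromo": 5.0}

print(total_volume(models, acv, {"Price": 2.0}))  # 30.0
```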
At block 206, processing logic determines whether to use predicted causal data or historical causal data. If processing logic uses historical causal data, processing logic generates predicted historical dependent data at block 214. On the other hand, if processing logic uses predicted causal data, processing logic creates the predicted causal data at block 208. The predicted causal data represents the information affecting predicted future dependent data. The predicted causal data is typically the same type of information as historical causal data 104, such as in-store activities, advertising, weather, competitive activity, etc. In one embodiment, processing logic generates the predicted causal data from the historical causal data. In this embodiment, the same values used for in-store activities, advertising, etc., from a similar time period in the past are used for a time period in the future. For example, processing logic uses the historical causal data for a product from March 2005 as the predicted causal data for March 2006. In another embodiment, processing logic uses the historical causal data as the predicted causal data, but makes a change to some or all of the historical causal data. For example, processing logic uses the historical causal data from March 2004 plus an overall three percent (3%) increase for the predicted causal data in March 2006. As another example, processing logic uses the same historical causal data but decreases all marketing business drivers by five percent (5%). In a still further example, processing logic uses the same historical causal data, but predicts an unusually warm summer. In a further embodiment, processing logic generates the predicted causal data from a market researcher's input. In another embodiment, processing logic generates the predicted causal data from another product's historical causal data.
In another embodiment, processing logic generates the predicted causal data from a combination of the ways described above.
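The copy-and-adjust approaches described above might be sketched as follows. The function name `predict_causal`, the driver names, and the values are hypothetical; the sketch covers only copying historical values and scaling some or all of them, not analyst input or data from another product.

```python
from typing import Dict, Iterable, Optional


def predict_causal(historical: Dict[str, float],
                   scale: float = 1.0,
                   drivers: Optional[Iterable[str]] = None) -> Dict[str, float]:
    """Copy historical causal values, scaling the selected drivers (all if None)."""
    selected = set(historical) if drivers is None else set(drivers)
    return {name: (value * scale if name in selected else value)
            for name, value in historical.items()}


# Hypothetical causal data for one product in a past March.
march_history = {"Price": 2.0, "TV": 100.0, "Display": 40.0, "Temperature": 50.0}

copied = predict_causal(march_history)                      # same values reused
uplifted = predict_causal(march_history, scale=1.03)        # overall 3% increase
reduced_marketing = predict_causal(march_history, scale=0.95,
                                   drivers=["TV", "Display"])  # 5% cut to marketing drivers only
```

Leaving non-selected drivers untouched keeps, for example, price and weather at their historical values while only marketing activity is varied.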
To the right of time 108, the timeline 304 progresses into the future. Predicted causal data 302 starts at a specified time 108 and progresses to the right into the future. As stated above, the predicted causal data 302 is copied from the historical causal data 104, derived from the historical causal data 104, derived from the causal data of some other product, generated from user input, or a combination thereof. This embodiment is meant to be an illustration of predicted causal data 302 and does not imply that predicted causal data 302 always starts at present time 108. Other embodiments of predicted causal data 302 can be for any future time period and of varying length, such as days, weeks, months, years, etc. Furthermore, historical causal data 104 and predicted causal data 302 can have different time lengths.
Returning to
At block 214, processing logic generates predicted dependent data from the predictive model and either the historical or predicted causal data. In one embodiment, processing logic generates predicted historical dependent data using historical causal data. Alternatively, processing logic generates predicted future dependent data using predicted causal data.
In one embodiment, processing logic generates the predicted dependent data with the same granularity as the historical dependent data. As an example of dependent data prediction and by way of illustration, assume processing logic uses the simple predictive model in Equation (2). Further assume that business drivers and coefficients have the following values as listed in Table 1
Using the predictive model in Equation (2), processing logic predicts a dependent data of 86.7. If the price were to decrease to $1.99, then the predicted dependent data rises to 87.5. Although this is a simple example, predictive models are typically more complicated involving numerous business drivers and multiple product dependencies. For example, as shown in
Returning to
At block 218, processing logic determines if the predictive model should be validated. Although in one embodiment the analyst signals to the processing logic that the model should be validated, alternate embodiments may determine whether a model should be validated by different means (e.g., processing logic automatically determines whether the model should be validated, processing logic determines whether the model should be validated with input from the analyst, etc.). If so, at block 220, processing logic validates the predictive model by comparing predicted historical dependent data with actual historical dependent data. Processing logic can compare with the actual historical dependent data in two ways: (i) accruing additional actual dependent data and comparing the additional historical dependent data with the predicted dependent data as shown in
Process 200 offers a powerful way to predict future dependent data and gain insight into the business drivers that predominantly affect the predicted dependent data. Because processing logic uses the full granularity of actual historical dependent data 102 and historical causal data 104 and propagates this granularity into the predicted causal data 302, predicted dependent data 306 and predicted historical dependent data 502, processing logic can calculate the analytical reports at any level of granularity supported by the underlying data. Thus, unlike traditional OLAP, processing logic gives an analyst the capability to calculate effects on the dependent data at a very low level of granularity, by marketing variable, for example. In addition, processing logic allows analytical reports based on predicted future dependent data. This is advantageous because future predictions of dependent data are performed on a set of granular dependent data and are not based on predictions from aggregated historical data as with OLAP. Furthermore, process 200 gives an analyst the ability to calculate contributions to dependent data (e.g., volume changes) and data computed from dependent data (e.g., revenue changes). In addition, an analyst can still make inferences and/or speculations based on the predicted historical and/or future dependent data.
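Because the predictions are kept at the finest available granularity, a report at a coarser level can be produced by summing the granular records on demand. The following Python sketch illustrates this idea; the record layout, the field names, and the helper `roll_up` are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def roll_up(records: List[dict], keys: Sequence[str]) -> Dict[Tuple, float]:
    """Sum granular predicted volume over every dimension not listed in keys."""
    totals: Dict[Tuple, float] = defaultdict(float)
    for record in records:
        totals[tuple(record[k] for k in keys)] += record["volume"]
    return dict(totals)


# Hypothetical store-level predicted volumes.
predicted = [
    {"chain": "A", "store": "A1", "product": "P1", "volume": 10.0},
    {"chain": "A", "store": "A2", "product": "P1", "volume": 15.0},
    {"chain": "B", "store": "B1", "product": "P1", "volume": 20.0},
]

print(roll_up(predicted, ["chain"]))    # {('A',): 25.0, ('B',): 20.0}
print(roll_up(predicted, ["product"]))  # {('P1',): 45.0}
```

The same granular records answer both the chain-level and the product-level question, so no aggregation level has to be fixed in advance.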
All eight cube types are present in
Returning to
At block 704, processing logic receives the predicted dependent data. Processing logic uses this information plus other product information such as raw goods costs, manufacturing costs, distribution costs, etc. to generate the analytical reports. At block 706, processing logic calculates due-to reports. A due-to report identifies the amount of dependent data that is due to a specific business driver. Processing logic uses the scenario or a time period as a baseline for the due-to report. Processing logic manipulates the marketing business drivers to determine the dependent data contribution for each marketing business driver. For business drivers that have linear effects on the dependent data, processing logic manipulates that specific business driver to determine the dependent data change. For business drivers that have a non-linear effect and are dependent on other business drivers, processing logic manipulates the specific business driver along with the dependent business drivers to determine a dependent data contribution attributable to each business driver.
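For the linear case, the manipulation described above can be sketched in Python by resetting one business driver at a time to its baseline value and measuring the change in predicted volume. The function names `predict` and `due_to` and all values are hypothetical, and the sketch does not cover non-linear or interdependent drivers, which require the joint manipulation described in the text.

```python
from typing import Dict


def predict(intercept: float, coefficients: Dict[str, float],
            drivers: Dict[str, float]) -> float:
    """Linear model: intercept plus coefficient-weighted drivers."""
    return intercept + sum(coefficients[n] * drivers[n] for n in coefficients)


def due_to(intercept: float, coefficients: Dict[str, float],
           scenario: Dict[str, float], baseline: Dict[str, float]) -> Dict[str, float]:
    """Volume attributable to each driver, relative to the baseline scenario."""
    full = predict(intercept, coefficients, scenario)
    report = {}
    for name in coefficients:
        counterfactual = dict(scenario)
        counterfactual[name] = baseline[name]  # reset only this driver
        report[name] = full - predict(intercept, coefficients, counterfactual)
    return report


# Hypothetical model, baseline, and scenario.
coefficients = {"Display": 4.0, "Price": -10.0, "TV": 0.5}
baseline = {"Display": 0.0, "Price": 2.5, "TV": 0.0}
scenario = {"Display": 1.0, "Price": 2.0, "TV": 10.0}

report = due_to(50.0, coefficients, scenario, baseline)
print(report)  # {'Display': 4.0, 'Price': 5.0, 'TV': 5.0}
```

In a purely linear model the per-driver contributions add up to the total difference between the scenario and the baseline, which gives a simple consistency check on the report.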
At block 708, processing logic generates a volume decomposition report. Similar to the due-to report, the volume decomposition report identifies the amount of dependent data that is due to marketing business drivers. The volume decomposition report is a special case of the due-to report. Processing logic starts from a known point where all marketing business drivers have zero contribution and varies the marketing business drivers to determine the volume contributions from each marketing business driver. Thus, processing logic calculates a baseline that represents no marketing activity. Relating back to the predictive model from block 212 in
At block 710, processing logic generates predicted financial information, typically in the form of a profit and loss statement that utilizes the predicted volume information from a scenario. In one embodiment, processing logic generates a profit and loss statement that includes gross revenue, cost of goods sold, net revenue, gross profit, contribution and operating income. Processing logic calculates the cost from fixed costs (i.e., overhead), variable costs (e.g., raw materials, packaging, etc.) and business driver costs (e.g., advertising costs, etc.). Because processing logic generates the financial information from the predicted volume information, processing logic generates the financial information based on the finest level of granularity available. This allows flexibility in analyzing the result and permits drilling down in the results to examine, for example, a market or financial contribution more closely.
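A profit and loss statement of this kind might be derived from predicted volume as in the Python sketch below. The text names the line items but not their formulas, so the relationships used here (for example, net revenue as gross revenue minus trade discounts) are assumptions, as are the function name `profit_and_loss` and all input values.

```python
from typing import Dict


def profit_and_loss(volume: float, unit_price: float, trade_discounts: float,
                    variable_cost_per_unit: float, driver_costs: float,
                    fixed_costs: float) -> Dict[str, float]:
    """Build P&L line items from predicted volume (line-item formulas are assumed)."""
    gross_revenue = volume * unit_price
    net_revenue = gross_revenue - trade_discounts          # assumed definition
    cost_of_goods_sold = volume * variable_cost_per_unit   # variable costs only
    gross_profit = net_revenue - cost_of_goods_sold
    contribution = gross_profit - driver_costs             # e.g. advertising costs
    operating_income = contribution - fixed_costs          # e.g. overhead
    return {
        "gross_revenue": gross_revenue,
        "net_revenue": net_revenue,
        "cost_of_goods_sold": cost_of_goods_sold,
        "gross_profit": gross_profit,
        "contribution": contribution,
        "operating_income": operating_income,
    }


# Hypothetical scenario inputs.
statement = profit_and_loss(volume=1000.0, unit_price=2.0, trade_discounts=100.0,
                            variable_cost_per_unit=0.5, driver_costs=300.0,
                            fixed_costs=400.0)
print(statement["operating_income"])  # 700.0
```

Applying the same function to each granular predicted-volume record, rather than to a single total, would preserve the drill-down capability described above.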
Predicted causal module 804 processes the historical causal data and generates the predicted causal data by simply using the historical causal data from the same relative time period, applying changes to the corresponding historical causal data (e.g. add three percent to marketing business drivers), using historical causal data from another product and/or allowing the analyst to input the information. Referring back to
Returning to
Returning to
Returning to
Returning to
The processes described herein may constitute one or more programs made up of machine-executable instructions. Describing the process with reference to the flow diagrams in
The web server 908 is typically at least one computer system which operates as a server computer system and is configured to operate with the protocols of the World Wide Web and is coupled to the Internet. Optionally, the web server 908 can be part of an ISP which provides access to the Internet for client systems. The web server 908 is shown coupled to the server computer system 910 which itself is coupled to web content 912, which can be considered a form of a media database. It will be appreciated that while two computer systems 908 and 910 are shown in
Client computer systems 912, 916, 924, and 926 can each, with the appropriate web browsing software, view HTML pages provided by the web server 908. The ISP 904 provides Internet connectivity to the client computer system 912 through the modem interface 914 which can be considered part of the client computer system 912. The client computer system can be a personal computer system, a network computer, a Web TV system, a handheld device, or other such computer system. Similarly, the ISP 906 provides Internet connectivity for client systems 916, 924, and 926, although as shown in
Alternatively, as is well known, a server computer system 928 can be directly coupled to the LAN 922 through a network interface 934 to provide files 936 and other services to the clients 924, 926, without the need to connect to the Internet through the gateway system 920. Furthermore, any combination of client systems 912, 916, 924, 926 may be connected together in a peer-to-peer network using LAN 922, Internet 902 or a combination as a communications medium. Generally, a peer-to-peer network distributes data across a network of multiple machines for storage and retrieval without the use of a central server or servers. Thus, each peer network node may incorporate the functions of both the client and the server described above.
The following description of
Network computers are another type of computer system that can be used with the embodiments of the present invention. Network computers do not usually include a hard disk or other mass storage, and the executable programs are loaded from a network connection into the memory 1008 for execution by the processor 1004. A Web TV system, which is known in the art, is also considered to be a computer system according to the embodiments of the present invention, but it may lack some of the features shown in
It will be appreciated that the computer system 1000 is one example of many possible computer systems, which have different architectures. For example, personal computers based on an Intel microprocessor often have multiple buses, one of which can be an input/output (I/O) bus for the peripherals and one that directly connects the processor 1004 and the memory 1008 (often referred to as a memory bus). The buses are connected together through bridge components that perform any necessary translation due to differing bus protocols.
It will also be appreciated that the computer system 1000 is controlled by operating system software, which includes a file management system, such as a disk operating system, which is part of the operating system software. One example of an operating system software with its associated file management system software is the family of operating systems known as WINDOWS OPERATING SYSTEM from Microsoft Corporation in Redmond, Wash., and their associated file management systems. The file management system is typically stored in the non-volatile storage 1014 and causes the processor 1004 to execute the various acts required by the operating system to input and output data and to store data in memory, including storing files on the non-volatile storage 1014.
In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope of the invention as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.