This disclosure relates generally to consumer modeling, and, more particularly, to methods, systems and apparatus to improve Bayesian posterior generation efficiency.
In recent years, detailed panelist data has been used by market researchers to identify information associated with purchase activity. The panelist data may identify types of consumer segments, while relatively more abundant point-of-sale (POS) data has been used by the market researchers to track sales and estimate price and promotion sensitivity. Although the POS data is relatively more abundant than the panelist data, the POS data does not include segment and/or demographic information associated with the sale information.
Market researchers have traditionally relied upon panelist data and/or U.S. Census Bureau data to determine segment information associated with one or more locations (e.g., trading areas) of interest. Segment information helps to map descriptive segments of consumers (e.g., Hispanic, price sensitive, impulsive purchasers, or other descriptions that may be used to characterize groups of consumers with similar characteristics) to one or more other purchasing categories that may indicate an affinity for certain products, geography, store, brand, etc. Thus, the segment information may provide, for example, an indication that a first percentage of shoppers in a market of interest are Hispanic and a second percentage of the shoppers in a market of interest are non-Hispanic, where the ethnic descriptions may correlate with particular purchasing characteristics.
Armed with segment information and point-of-sale (POS) data, market researchers may multiply the relevant POS data by the fractional segment values corresponding to the demographic segment of interest to determine a decomposition (decomp) of sales of product(s) by segment. For example, POS data includes detailed information associated with sales in each monitored store, and such POS data may include an accurate quantity of products sold per unit of time, a price for which each item was sold and/or whether one or more promotions were present at the store. Such POS data does not, however, typically include information related to demographics and/or segment information related to the consumers that purchased the products/items of interest. Instead, market researchers typically rely on panelist data to reveal details related to consumer demographics. The mathematical product of total sales (e.g., total universal product code (UPC) sales) and the segment percentage of the corresponding location of interest (e.g., a market, a store, a region, a town, a city, a nation, etc.) yields a value indicative of how many units of each of a set of UPCs in the corresponding location are purchased by shoppers associated with each segment.
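The decomposition described above is a simple multiplication of POS sales by fractional segment shares. A minimal sketch follows; the UPC codes, unit counts and segment shares are hypothetical values chosen for illustration only.

```python
# Hypothetical decomposition (decomp) of UPC sales by segment: POS data
# supplies total units sold per UPC, and the fractional segment shares
# come from panelist and/or census data for the location of interest.
pos_sales = {"UPC-001": 1200, "UPC-002": 800}              # units sold (hypothetical)
segment_shares = {"Hispanic": 0.40, "non-Hispanic": 0.60}  # hypothetical shares

decomp = {
    upc: {seg: units * share for seg, share in segment_shares.items()}
    for upc, units in pos_sales.items()
}
print(decomp["UPC-001"])  # {'Hispanic': 480.0, 'non-Hispanic': 720.0}
```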
In some circumstances, the panelist data does not reconcile with the retail sales data. In other words, the abundant and accurate POS data (which is devoid of segment information) identifies values (e.g., dollar amounts, quantities of UPCs sold, etc.) of purchasing behavior, and the associated panelist data (which includes segment information) associated with that same market of interest is inconsistent with the POS data. In view of such discrepancies, one or more techniques may be applied to align the panelist data in a manner that is consistent with the POS data. For example, a Bayesian analysis is applied to anchor the panelist data with the POS data. Generally speaking, a Bayesian analysis traditionally uses one or more starting point data sets, sometimes referred to herein as “priors” (e.g., panelist data indicative of what a portion of the consumers represent (e.g., particular demographics, particular segments, etc.)), to generate a likelihood function to predict a posterior value based on the POS data. The priors represent a starting point of the Bayesian analysis, and represent starting point values associated with segments of interest, relative preferences within segments (e.g., a first product is preferred over a second product), and/or relative sizes of each segment of interest. The posterior value includes a “corrected” or modified representation of the priors. Using the posterior data, decompositions can be calculated in view of actual sales data to identify proportions of the consumers on a segment by segment basis.
The traditional Bayesian analysis introduces substantial computational burdens by, in part, requiring mapping (linking) of the panel data to corresponding POS data (also known as retail measurement sales (RMS) data) for corresponding time periods of interest (e.g., store week). POS/RMS data typically includes a product code, a market code and a time code (e.g., UPC per store per week). When traditional mapping/linking is applied to a Bayesian process, the priors can be modified in an effort to align the starting point estimation with actual empirical store sales data. For example, to allow the Bayesian analysis to generate posteriors capable of estimations for markets of interest, several thousands of panelist data points must be mapped in time, product and/or market to corresponding data points of the POS data. In some examples, the panel data mapping can take days to process, during which iterative verification operations must be performed to identify missing mapping information and/or correct erroneous mapping information. The traditional Bayesian analysis may also fail to adjust modifications and/or corrections of the prior data in a manner that retains one or more valuable insights from the prior data. In some examples, the traditional Bayesian analysis adjusts modeling parameters to align with the POS data without adhering and/or otherwise giving deference to the priors.
However, in some circumstances the quantity of available panelist data is too low to provide statistically significant coverage of how different segments treat and/or otherwise purchase different products (items) of interest. While panelist data includes thorough demographic information and/or information associated with segments of interest, some panelist data lacks a sufficient degree of coverage to obtain detailed granular data regarding product purchases and their respective segments of interest. For example, in relatively large metropolitan areas (e.g., Chicago), several thousand panelists may be used to generate panelist data regarding UPC purchases and to associate those purchases with segment information. However, the number of candidate UPCs that each panelist could purchase greatly outnumbers available panelists, which may lead to inaccuracies and/or lack of coverage for granular data about which segments purchase which UPCs for a given trading area.
Example methods, apparatus, systems and articles of manufacture disclosed herein generate Bayesian posterior estimations with prior data that does not require the rigorous control and/or management that is associated with panelist data. In other words, examples disclosed herein allow Bayesian posterior estimations to occur with any type of prior data, which includes panelist data, non-panelist data, survey data and/or starting point data related to expert judgements (e.g., store manager heuristics, estimations, educated guesses, etc.). Additionally, examples disclosed herein generate Bayesian posterior estimations without computational burdens associated with panel data mapping/linking that is required for traditional Bayesian analysis techniques. Instead, examples disclosed herein employ penalty modifiers to balance modification of iterative estimations of modeling coefficients without any need to merge the prior data (e.g., panelist data) with store-level condition information, thereby improving a computational efficiency when calculating posterior estimations and reducing an amount of time to do the same. Additionally, examples disclosed herein generate and/or otherwise calculate Bayesian posterior estimations that balance (a) recovery of observed store sales with (b) adherence as closely as possible to prior data via penalty functions, as described in further detail below.
In operation, the example prior data store 108 includes and/or otherwise provides prior data to be used in the Bayesian analysis. While the prior data may include panelist data, examples disclosed herein are not limited to the rigorous quality requirements typically associated with panelist data. Generally speaking, panelist data typically requires a requisite amount of panelist control and volume (e.g., a number of data points associated with one or more demographics/segments of interest) to provide results that are statistically significant. In some instances, marketing budgets and/or marketing computing resources preclude this level of control or volume. As such, examples disclosed herein remove such stringent control requirements for large and robust data samples based on panelists. The prior data stored in the example prior data store 108 may include partial panelist data (e.g., relatively low sample sizes), survey data, empirical observation data (e.g., from a store manager), heuristics and/or educated guesses (e.g., from a store manager, an industry expert, etc.). As discussed above, prior data serves as a starting point when generating posterior data, in which the posterior data is a modified result of the prior data in view of truth data.
In the illustrated example of
In the illustrated example of
Based on the summary data calculated by the example raw data summary engine 114, a corresponding percent share of the first segment 218 (see "Segment 1%" showing a value of 35.5%) and a corresponding percent share of the second segment 220 (see "Segment 2%" showing a value of 64.5%) is also calculated. The example percent share of the first segment 218 and the example percent share of the second segment 220 are sometimes referred to as a first panel segment share (PSS1) and a second panel segment share (PSS2), respectively, as described in further detail below. Generally speaking, the example prior data 202 reflects an expectation that the first segment of interest is responsible for 35.5% of the purchases made in the store of interest (PSS1), and that the second segment of interest is responsible for 64.5% of the purchases made in that store of interest (PSS2).
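The panel segment share calculation described above reduces to normalizing per-segment purchase counts. The sketch below is illustrative only; the raw counts are hypothetical values chosen to reproduce the 35.5%/64.5% split described above.

```python
# Hypothetical prior purchase counts per segment, as summarized by the
# raw data summary engine; the counts are illustrative only.
prior_counts = {"segment_1": 355, "segment_2": 645}  # purchases per segment
total_purchases = sum(prior_counts.values())

# Panel segment share (PSS): each segment's fraction of total purchases.
panel_segment_share = {
    seg: count / total_purchases for seg, count in prior_counts.items()
}
print(panel_segment_share)  # {'segment_1': 0.355, 'segment_2': 0.645}
```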
In addition to calculating segment share values, the example raw data summary engine 114 calculates “within segment shares” of each item of interest. In the illustrated example of
Although the prior data may not be derived from tightly controlled panelist data and, consequently, may include a degree of error, market researchers find substantial value in the predictive nature of prior data. At the same time, while the market researchers acknowledge that the prior data may include this degree of error, examples disclosed herein enable the generation of posterior data that is based on the truth data without overreliance upon (a) the truth data or (b) the prior data in a manner that is more computationally efficient than standard Bayesian analysis techniques. In particular, rather than application of one or more Bayesian analysis techniques that applies too much adherence to the truth data, examples disclosed herein enable an estimation that is balanced between both the (a) truth data and (b) the prior data when generating posterior data.
In the illustrated example of
In view of the above-mentioned prior data 202 and truth data 204, consumerization refers to the application of posterior data and observed sales data to generate one or more estimates of which segments are responsible for the observed sales. Traditional techniques to accomplish consumerization require panelist data that must be mapped to corresponding store weeks before accurate modeling can occur. For instance, an example set of panelist-level choice information is shown in the illustrated example of Table 1.
In the illustrated example of Table 1, two separate panelists are shown (e.g., a first panelist “1234” and a second panelist “1235”), in which the first panelist is associated with segment “A” (e.g., a segment associated with young, city dwellers) and the second panelist is associated with segment “B” (e.g., a segment associated with middle aged city dwellers). The illustrated example of Table 1 also indicates which items (products) are purchased on particular dates and in particular locations.
An example manner of consumerizing the panel data of the illustrated example of Table 1 includes applying observed percentages to store sales data. Continuing with the example above, assume that segment “A” is responsible for 30% of purchases of product 1 at Walmart, and that segment “B” is responsible for 70% of purchases of product 1 at Walmart. Thus, in the event that one-thousand sales of item 1 occur at Walmart in a first week, then a straightforward projection would apply 30%/70% of those one-thousand units to segments “A” and “B,” respectively. However, in the event the panel data is too small to permit a projection that aligns with statistical expectations, the panelist data must be mapped and/or otherwise linked to the store data (e.g., mapped to store conditions). In such circumstances, a model is developed to apply segment mixtures as a function of one or more store conditions, which is computationally intensive. For example, for all the panelist data, one or more store level conditions must be identified and correctly mapped to the panelist data.
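The straightforward projection described above can be sketched as follows; the segment percentages and unit count are the hypothetical figures from the continuing example (30%/70% of one-thousand units of item 1).

```python
# Straightforward projection of observed store sales onto segments using
# hypothetical panel-observed percentages for item 1 at the retailer.
observed_mix = {"A": 0.30, "B": 0.70}  # panel-observed segment mix (hypothetical)
store_units = 1000                     # units of item 1 sold in week 1

# Apply each segment's observed percentage to the observed store units.
projected_units = {seg: round(share * store_units) for seg, share in observed_mix.items()}
print(projected_units)  # {'A': 300, 'B': 700}
```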
In the illustrated example of Table 2, the panelist data of example Table 1 is shown with example appended store and time information (mapped data).
In the illustrated example of Table 2, every panelist datapoint is mapped to the store sales data, which is computationally burdensome. For example, Nielsen Homescan data may include several million panel observations that must be mapped to their corresponding store and/or time-period condition observations before a model can be built. As described in further detail below, examples disclosed herein obviate the need for panelist data mapping when performing consumerization, Bayesian analysis and/or posterior data generation.
The example logit model engine 116 builds a logit model by assigning initial logit coefficients for each segment of interest and product of interest. The illustrated example of
After the logit model has been generated by the example logit model engine 116, the example penalty engine 118 invokes the example store market share penalty engine 120 to build a store market share penalty. Generally speaking, prior data can deviate from actual truth data (e.g., POS store sales data) in three ways: (a) the product/item preferences are different, (b) the segments are different, or (c) the sizes of the segments are different. Accordingly, the example penalty engine 118 generates and applies three different penalties, a first of which considers an effect of the prior data deviating from store market share data. In other words, when the prior data deviates from empirical "truth" data 204, the example store market share penalty engine 120 applies a corresponding penalty value. However, examples disclosed herein do not address deviations from the empirical truth data 204 alone, but also consider whether estimated segment sizes of the prior data deviate from the truth data 204. If so, the example segment size penalty engine 122 builds and applies a second penalty (e.g., a segment size penalty) to the MLE process to more closely adhere coefficient modifications to the prior data 202. Additionally, examples disclosed herein also consider whether prior data 202 associated with estimated shares of a product of interest within each segment of interest deviate from the truth data 204. If so, the example within-segment penalty engine 124 builds and applies a third penalty (e.g., a within-segment penalty) to the MLE process to more closely adhere coefficient modifications to the prior data 202.
Taken together, the example penalty engine 118 develops an objective function of three separate penalties as log likelihood functions, the sum of which is maximized with respect to the logit coefficients during the MLE process. In operation, the example store market share penalty engine 120 selects an item of interest and a segment of interest and calculates an item ratio in a manner consistent with example Expression 1.
In the illustrated example of Expression 1, βiS represents an item-segment coefficient associated with respective items (i) and the selected segment (S) of interest, such as the example item-segment coefficients shown in the example first segment logit coefficient column 232 and the example second segment logit coefficient column 234 of the illustrated example of
In the illustrated example of Expression 2, βS represents the segment ratio associated with the selected segment (S) of interest, such as the example coefficient value for the first segment of interest (βS1) 236 and the example coefficient value for the second segment of interest (βS2) 238 of the illustrated example of
The example store market share penalty engine 120 calculates the mathematical product of the example item ratio (Expression 1) and the example segment ratio (Expression 2) in an iterative manner for each segment of interest. When all segments of interest have been calculated, their sum is multiplied with the truth data item share associated with the selected item/product of interest (e.g., a respective RMS item share in column 230 of
In the illustrated example of Equation 1, LLSTORE is the log likelihood store penalty value that is calculated by the example store market share penalty engine 120 as a function of the example item ratio and the example segment ratio. As described above, the example log likelihood store penalty value is one of three penalties that are summed and maximized with respect to the example prior data coefficients.
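The store market share penalty calculation described above can be sketched as follows. Because Expressions 1 and 2 are not reproduced in this excerpt, the sketch assumes the conventional multinomial-logit form: the item ratio as the softmax of the item-segment coefficients within a segment, and the segment ratio as the softmax of the segment coefficients. All numeric values are hypothetical.

```python
import numpy as np

# Hypothetical logit coefficients: rows are items, columns are segments.
beta_item = np.array([[0.2, -0.1],
                      [0.5,  0.3],
                      [-0.4, 0.1]])
beta_seg = np.array([0.1, 0.4])                # one coefficient per segment
rms_item_share = np.array([0.30, 0.45, 0.25])  # truth (POS) item shares

item_ratio = np.exp(beta_item) / np.exp(beta_item).sum(axis=0)  # Expression 1 (assumed form)
seg_ratio = np.exp(beta_seg) / np.exp(beta_seg).sum()           # Expression 2 (assumed form)

# Sum the item-ratio/segment-ratio products over segments, take the natural
# log, weight by the observed (truth) item share, and sum over items.
predicted_item_share = item_ratio @ seg_ratio
ll_store = float(rms_item_share @ np.log(predicted_item_share))
```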
A second of three penalties is built and applied by the example segment size penalty engine 122. In particular, the example prior data 202 may not be numerically consistent with the example truth data 204 in terms of how large (or small) each segment of interest is believed to be. In the illustrated example of
In the illustrated example of Equation 2, LLSEGMENT is the log likelihood segment penalty value that is calculated by the example segment size penalty engine 122 as a function of the example segment ratio and the prior data 202 segment size.
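A corresponding sketch of the segment size penalty is shown below, assuming a cross-entropy form in which the prior panel segment shares (PSS) weight the natural log of the modeled segment ratios; the coefficient values are hypothetical.

```python
import numpy as np

# Hypothetical segment coefficients and the softmax segment ratio
# (Expression 2, assumed form).
beta_seg = np.array([0.1, 0.4])
seg_ratio = np.exp(beta_seg) / np.exp(beta_seg).sum()
prior_segment_size = np.array([0.355, 0.645])  # PSS1 and PSS2 from the prior data

# Segment size penalty: prior segment sizes weight the log segment ratios.
ll_segment = float(prior_segment_size @ np.log(seg_ratio))
```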
A third of three penalties is built and applied by the example within-segment penalty engine 124. In particular, the example within-segment share values (see column 222 and/or 224 in the illustrated example of
In the illustrated example of Equation 3, LLWSEG is the log likelihood within-segment penalty value for the items of interest, and is calculated by the example within-segment penalty engine 124 as a function of the example item ratio and the individualized panel item segment share values.
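The within-segment penalty can be sketched similarly, assuming the prior within-segment item shares weight the natural log of the modeled item ratios; all values are hypothetical.

```python
import numpy as np

# Hypothetical item-segment coefficients (rows: items, columns: segments)
# and the softmax item ratio within each segment (Expression 1, assumed form).
beta_item = np.array([[0.2, -0.1],
                      [0.5,  0.3],
                      [-0.4, 0.1]])
item_ratio = np.exp(beta_item) / np.exp(beta_item).sum(axis=0)
prior_within_segment_share = np.array([[0.2, 0.4],   # prior within-segment shares
                                       [0.5, 0.4],
                                       [0.3, 0.2]])

# Within-segment penalty: prior shares weight the log item ratios,
# summed over items and segments.
ll_wseg = float(np.sum(prior_within_segment_share * np.log(item_ratio)))
```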
The example Bayesian analysis engine 102 initiates a Bayesian optimization using MLE to maximize a sum of penalties in connection with the logit coefficients. In particular, the example Bayesian analysis engine 102 maximizes the sum of penalties in a manner consistent with example Equation 4.
LLTOTAL=LLSTORE+LLSEGMENT+LLWSEG  Equation 4.
In the illustrated example of Equation 4, LLTOTAL is the sum of example Equation 1, Equation 2 and Equation 3. As the example Bayesian analysis engine 102 iterates the MLE, successive iterations of the example logit model item coefficients for each segment of interest (see columns 232 and 234 of
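The overall maximization can be sketched as follows, assuming softmax forms for the item and segment ratios and cross-entropy forms for the three penalties (the original expressions are not reproduced in this excerpt); all numeric inputs are hypothetical. Maximization of LLTOTAL is performed by minimizing its negative.

```python
import numpy as np
from scipy.optimize import minimize

n_items, n_segs = 3, 2
rms_item_share = np.array([0.30, 0.45, 0.25])  # truth (POS) item shares
prior_segment_size = np.array([0.355, 0.645])  # prior segment sizes (PSS)
prior_within = np.array([[0.2, 0.4],           # prior within-segment shares
                         [0.5, 0.4],
                         [0.3, 0.2]])

def neg_ll_total(theta):
    # Unpack item-segment coefficients and segment coefficients.
    beta_item = theta[: n_items * n_segs].reshape(n_items, n_segs)
    beta_seg = theta[n_items * n_segs :]
    item_ratio = np.exp(beta_item) / np.exp(beta_item).sum(axis=0)
    seg_ratio = np.exp(beta_seg) / np.exp(beta_seg).sum()
    ll_store = rms_item_share @ np.log(item_ratio @ seg_ratio)
    ll_segment = prior_segment_size @ np.log(seg_ratio)
    ll_wseg = np.sum(prior_within * np.log(item_ratio))
    return -(ll_store + ll_segment + ll_wseg)  # LLTOTAL, negated for minimization

initial = np.zeros(n_items * n_segs + n_segs)
result = minimize(neg_ll_total, initial, method="Nelder-Mead")
```

Each iteration re-evaluates the three penalties at the current coefficient values, so the optimizer trades off recovery of the observed store shares against adherence to the prior segment sizes and within-segment shares.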
While an example manner of implementing the Bayesian analysis system 100 of
Flowcharts representative of example machine readable instructions for implementing the Bayesian analysis system 100 of
As mentioned above, the example processes of
The program 300 of
The example logit model engine 116 builds a logit model with respective coefficients for each product and segment combination (block 308). Example coefficient values may be initialized by the example logit model engine 116 in any number of ways, as those coefficient values (e.g., see columns 232 and 234, and βS1 and βS2 in the illustrated example of
The example store market share penalty engine 120 calculates the mathematical product of the item ratio and the segment ratio (block 414) and determines if one or more additional segments of interest should be considered (block 416). If so, then the example first nested loop (item 406) iterates and control returns to block 404. On the other hand, if all segments of interest have been considered in connection with the item of interest (block 416), then the example store market share penalty engine 120 calculates the natural log of the sum of segments and multiplies it by an observed item share within the store of interest (block 418). In the event one or more additional items of interest are to be considered (block 420), then the example second nested loop (item 408) iterates and control returns to block 402. If all items of interest have been considered (block 420), then the example store market share penalty engine 120 calculates the store market share penalty value (LLSTORE) as the sum of items through the one or more iterations of the example second nested loop (item 408). As described above, the aforementioned calculations by the example store market share penalty engine 120 may occur in a manner consistent with example Equation 1.
Returning to the illustrated example program 300 of
The processor platform 700 of the illustrated example includes a processor 712. The processor 712 of the illustrated example is hardware. For example, the processor 712 can be implemented by one or more integrated circuits, logic circuits, microprocessors or controllers from any desired family or manufacturer. In the illustrated example of
The processor 712 of the illustrated example includes a local memory 713 (e.g., a cache). The processor 712 of the illustrated example is in communication with a main memory including a volatile memory 714 and a non-volatile memory 716 via a bus 718. The volatile memory 714 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM) and/or any other type of random access memory device. The non-volatile memory 716 may be implemented by flash memory and/or any other desired type of memory device. Access to the main memory 714, 716 is controlled by a memory controller.
The processor platform 700 of the illustrated example also includes an interface circuit 720. The interface circuit 720 may be implemented by any type of interface standard, such as an Ethernet interface, a universal serial bus (USB), and/or a PCI express interface.
In the illustrated example, one or more input devices 722 are connected to the interface circuit 720. The input device(s) 722 permit(s) a user to enter data and commands into the processor 712. The input device(s) can be implemented by, for example, an audio sensor, a microphone, a camera (still or video), a keyboard, a button, a mouse, a touchscreen, a track-pad, a trackball, isopoint and/or a voice recognition system.
One or more output devices 724 are also connected to the interface circuit 720 of the illustrated example. The output devices 724 can be implemented, for example, by display devices (e.g., a light emitting diode (LED), an organic light emitting diode (OLED), a liquid crystal display, a cathode ray tube display (CRT), a touchscreen, a tactile output device, a printer and/or speakers). The interface circuit 720 of the illustrated example, thus, typically includes a graphics driver card, a graphics driver chip or a graphics driver processor.
The interface circuit 720 of the illustrated example also includes a communication device such as a transmitter, a receiver, a transceiver, a modem and/or network interface card to facilitate exchange of data with external machines (e.g., computing devices of any kind) via a network 726 (e.g., an Ethernet connection, a digital subscriber line (DSL), a telephone line, coaxial cable, a cellular telephone system, etc.).
The processor platform 700 of the illustrated example also includes one or more mass storage devices 728 for storing software and/or data. Examples of such mass storage devices 728 include floppy disk drives, hard drive disks, compact disk drives, Blu-ray disk drives, RAID systems, and digital versatile disk (DVD) drives.
The coded instructions 732 of
From the foregoing, it will be appreciated that the above disclosed methods, apparatus, systems and articles of manufacture enable the generation of posterior data that is based on the truth data without overreliance upon (a) the truth data or (b) the prior data in a manner that is more computationally efficient than standard Bayesian analysis techniques. In particular, rather than application of one or more Bayesian analysis techniques that applies too much adherence to the truth data, examples disclosed herein enable an estimation that is balanced between both the (a) truth data and (b) the prior data when generating posterior data.
Although certain example methods, apparatus and articles of manufacture have been disclosed herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all methods, apparatus and articles of manufacture fairly falling within the scope of the claims of this patent.
This patent claims the benefit of U.S. Provisional Patent Application Ser. No. 62/264,440 filed on Dec. 8, 2015, which is hereby incorporated herein by reference in its entirety.