The present disclosure relates generally to monitoring media and, more particularly, to methods and apparatus to determine media impressions.
Audience measurement entities analyze audience engagement levels for media programming based on registered panel members. That is, an audience measurement entity enrolls people who consent to being monitored into a panel. The audience measurement entity then monitors those panel members to determine media (e.g., television programs or radio programs, movies, DVDs, advertisements, etc.) exposed to those panel members. Exposure of an expanded group (e.g., worldwide exposure, nationwide exposure, market-wide exposure, etc.) is then statistically extrapolated from the panelist information.
For example, user access to Internet resources is often monitored through the use of panel software executing on panelist computers. The panel software may be installed by the user, may be installed by the audience measurement entity, may be installed in response to a user visiting a webpage, etc. The panel software transmits information about media (e.g., webpages) accessed by the panelist computers to a central facility for analysis.
Information collected from panelist computers' access to media (e.g., webpage accesses known as pageviews) is often aggregated on a monthly basis for reporting. For example, a report may be generated indicating the number of pageviews for a given brand during the month of June. The monthly pageviews are often compared to determine volatility. This volatility in the number of pageviews may genuinely reflect changes in the number of visits to the webpage (e.g., due to seasonal behavior). For example, a webpage for a flower retailer will likely have a greater number of pageviews in months with holidays like Valentine's Day (February) and Mother's Day (May). Accordingly, it would be expected that a high volatility would be found by comparing April to May for the flower retailer's webpage.
In some instances, the pageview volatility (e.g., month to month volatility) may be caused by a small number of panelists that account for a large percentage of the total panelist pageviews. For example, a small number of panelists may visit a webpage more than the rest of the panelists combined. As used herein, this relatively small number of panelists is known as the tail. For example, the tail may be the top 1% of panelists in terms of pageviews, the top 5% of panelists in terms of pageviews, the top 10% of panelists in terms of pageviews, or any other suitable percentage. If a member of the tail significantly changes their behavior, this change may cause a disproportionate change in the pageviews for the webpage.
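By way of illustration only, the identification of the tail and of its share of total pageviews may be sketched in Python as follows. The function name `tail_share`, the dictionary input, and the rounding rule for the tail size are assumptions made for this sketch and are not taken from the disclosure.

```python
def tail_share(pageviews_by_panelist, tail_fraction=0.01):
    """Return (tail panelist ids, fraction of all pageviews held by the tail).

    pageviews_by_panelist maps a panelist id to a pageview count.
    tail_fraction is the portion of panelists treated as the tail (e.g., 1%).
    """
    # Rank panelists from most to fewest pageviews.
    ranked = sorted(pageviews_by_panelist.items(),
                    key=lambda item: item[1], reverse=True)
    # Assumption: the tail always contains at least one panelist.
    tail_size = max(1, round(len(ranked) * tail_fraction))
    tail = ranked[:tail_size]

    total = sum(pageviews_by_panelist.values())
    tail_total = sum(count for _, count in tail)
    share = tail_total / total if total else 0.0
    return [panelist for panelist, _ in tail], share


# One heavy panelist among 100 accounts for most of the pageviews.
panel = {"p0": 901}
panel.update({f"p{n}": 1 for n in range(1, 100)})
tail_ids, share = tail_share(panel)
```

In this example a single panelist forms the 1% tail and holds roughly 90% of the pageviews, so a change in that one panelist's behavior would dominate the month to month comparison.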
The panelist computers 102 of the illustrated example are computing devices that access and present webpages on the internet. The panelist computers may include personal computers, desktop computers, laptop computers, tablet computers, mobile computers, mobile phones, network enabled televisions, or any other suitable computing device. While two panelist computers 102 are illustrated in
The example panelist computers 102 include panel software 116. The example panel software 116 monitors the usage of the panelist computers 102 and transmits information about the usage to the panelist datastore 104. The panel software 116 may also transmit identifying information about the panelist (e.g., a unique or semi-unique identifier, demographic information, etc.) to the panelist datastore 104. The panel software 116 may be any type of software and may be installed on the panelist computers 102 in any suitable manner. For example, the panel software 116 may be a standalone application, a plugin, a component of a webpage, a script, etc. The panel software 116 may be installed by a user of the panelist computers 102, may be installed by a manufacturer of the panelist computers 102, may be installed by or in response to visiting media such as a webpage, may be installed by an audience monitoring entity, etc. The panel software 116 may monitor any aspect of the panelist computers 102. For example, the panel software 116 may monitor access to media such as a webpage, may monitor input devices such as keyboards and mice, may monitor information displayed on a monitor, may monitor sound output by speakers, may monitor processing performed by the panelist computers 102, etc.
The panelist datastore 104 of the illustrated example is a database that stores monitoring information received from the panelist computers 102. The panelist datastore 104 may be any type of data storage device and may use any type of data structure suitable for storing panelist information. While a single panelist datastore 104 is illustrated in
The network 106 of the illustrated example is the internet. However, any number or type of networks may be employed to communicatively couple the panelist computers 102 to the panelist datastore 104. For example, the network 106 may include one or more of a wireless network, a wired network, a wide area network, a local area network, a personal area network, etc.
The tail adjustment monitor 108 of the illustrated example monitors monitoring information from panelist computers 102 in the panelist datastore 104 to determine if tail adjustment of the monitoring information is to be performed. For example, as described in further detail in conjunction with
The tail adjuster 110 of the illustrated example adjusts the monitoring information in the panelist datastore 104 when triggered by the tail adjustment monitor 108. The tail adjuster 110 adjusts the monitoring information to reduce or eliminate the effects of volatility in the tail that is determined not to be genuine (e.g., volatility that is not representative of the monitoring information as a whole). The tail adjuster 110 may adjust the monitoring information in place in the panelist datastore 104. Alternatively, the tail adjuster 110 may retrieve the monitoring information from the panelist datastore 104, adjust the monitoring information, and store the adjusted monitoring information in the panelist datastore 104. Alternatively, any combination of retrieving, storing, and modifying the data in the panelist datastore 104 may be employed. Example methods that may be performed by the tail adjuster 110 are described in conjunction with
As described above, some volatility in monitoring information from month to month is expected and may be caused by seasonal trends or other factors. The trend factor calculator 112 analyzes the monitoring information in the panelist datastore 104 to determine such trends and provides the information to the tail adjuster 110 for adjusting the monitoring information in a manner that includes the trends. An example trend factor calculated by comparing the pageviews of the current month to the pageviews of the previous 6 months may be computed as:
where $f_{i,j}$ is the trend factor for month i and brand j, $bwpvs_{i,j}$ is the weighted pageviews for the bottom 99% of panelists for month i and brand j determined from the monitoring information in the panelist datastore 104, and $c_{i,j}$ is the count of panelists who visited brand j during month i.
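The following Python sketch illustrates one way such a trend factor might be computed. It assumes, solely for illustration, that the trend factor is the ratio of the current month's bottom-99% weighted pageviews per panelist to the average of the same quantity over the previous months. That exact form, the function name `trend_factor`, and the fallback value of 1.0 are assumptions and are not stated in the text.

```python
def trend_factor(current, history):
    """Illustrative trend factor for one brand.

    current: (bwpvs, c) for the month under analysis, where bwpvs is the
             weighted pageviews of the bottom 99% of panelists and c is the
             count of panelists who visited the brand.
    history: list of (bwpvs, c) pairs for the previous months (e.g., six).
    Assumed form: current per-panelist rate divided by the mean historical
    per-panelist rate. Returns 1.0 (no trend) when data is insufficient.
    """
    bwpvs, count = current
    if count <= 0:
        return 1.0
    current_rate = bwpvs / count

    past_rates = [b / c for b, c in history if c > 0]
    if not past_rates:
        return 1.0
    baseline = sum(past_rates) / len(past_rates)
    return current_rate / baseline if baseline else 1.0


# Per-panelist pageviews rise from 10 (six-month average) to 12.
factor = trend_factor((1200, 100), [(1000, 100)] * 6)
```

Under these assumptions a factor above 1 indicates that typical (non-tail) panelists viewed more pages than in recent months, which the tail adjuster could then carry into its adjustment.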
The tail adjustment monitor 108, the tail adjuster 110, and the trend factor calculator 112 may be separate components (e.g., separate devices) or may be implemented in a single component or apparatus (e.g., an adjustment manager 116). Additionally or alternatively, one or more of the tail adjustment monitor 108, tail adjuster 110, or the trend factor calculator 112 may be implemented with other components of a central facility such as, for example, the panelist datastore 104 and the report generator 114 described below.
The report generator 114 of the illustrated example generates reports of the monitoring information in the panelist datastore 104. For example, the report generator 114 may generate a report of monthly pageviews for a brand, annual pageviews for a brand, etc. The reports may be distributed to representatives of a brand or webpage, publications, industry groups, advertisers, or any other entity. The example report generator 114 generates reports after the tail adjustment monitor 108 has analyzed the monitoring information and any adjustment by the tail adjuster 110 has been performed. The generation of reports of monitoring information is well known to those of ordinary skill and, thus, is not described in further detail herein.
While an example manner of implementing the system 100 is illustrated in
Flowcharts representative of example machine readable instructions for implementing the tail adjustment manager 116 of
As mentioned above, the example processes of
The program of
The tail adjustment monitor 108 then determines if volatility in the pageviews is caused by a tail (e.g., the top 1% of panelists by pageview count) (block 204). For example, volatility may be caused by the tail when a small number of panelists (e.g., a single panelist) change their behavior in a way that is not representative of the behavior of the whole or a larger set of panelists. For example, if a panelist in the tail for a brand were to go on vacation, their pageviews might drop drastically for the time they are on vacation, and this drop would not be representative of a general downward trend for the brand. An example program for determining if volatility is caused by the tail is described in conjunction with
When volatility in the pageviews is determined to be caused by the tail (block 204), the tail adjustment monitor 108 triggers the trend factor calculator 112 to determine a trend factor (block 206) and the tail adjuster 110 to adjust the pageviews (block 208). The trend factor calculator 112 may determine the trend factor by analyzing pageviews for previous time periods (e.g., previous months) to determine trends that are naturally occurring in the pageviews so that the trends can be accounted for by the tail adjuster 110. While the trend factor may not be included in all implementations, inclusion of the trend factor may reduce the chances of the tail adjuster 110 adjusting the data such that actual trends in the data are incorrectly removed. Example programs for implementing the tail adjuster 110 are described in conjunction with
After the pageviews are adjusted by the tail adjuster 110 the program of
The tail adjustment monitor 108 then compares the difference to a first threshold to determine if the difference exceeds the first threshold (block 304). The first threshold is indicative of a maximum amount of volatility that will be acceptable without triggering adjustment. The lower the first threshold, the more aggressive the program will be in triggering adjustment. For example, the first threshold may be 10%, indicating that adjustment will not be triggered if volatility is less than 10%. When the difference or volatility does not exceed the first threshold, the program of
The pageviews may be normalized by the number of days in each month to ensure that pageviews in longer months do not appear as volatility (e.g., 31 days in January compared to 28 days in February). The calculation of volatility and comparison to the first threshold may be computed as:
where $wpvs_{i,j}$ is the weighted pageviews for month i and brand j determined from the panelist datastore 104, $d_i$ is the number of days in month i, $\widehat{wpvs}_{i-1,j}$ is the adjusted weighted pageviews for month i−1 and brand j that was previously adjusted by the adjustment manager 116 and stored in the panelist datastore 104, and Threshold 1 is the first threshold.
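The day normalization and first threshold comparison described above may be sketched in Python as follows. The sketch assumes, for illustration only, that volatility is the relative change in pageviews per day between the current month and the previous month's adjusted value. The function names and the exact form of the comparison are assumptions rather than a reproduction of the disclosed calculation.

```python
import calendar


def daily_rate(weighted_pageviews, year, month):
    """Weighted pageviews per day for the given calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return weighted_pageviews / days_in_month


def exceeds_first_threshold(cur_wpvs, cur_year_month,
                            prev_adj_wpvs, prev_year_month,
                            threshold_1=0.10):
    """Return (triggered, volatility) for one brand.

    Assumed form: |current daily rate - previous adjusted daily rate|
    divided by the previous adjusted daily rate, compared to threshold_1.
    """
    current = daily_rate(cur_wpvs, *cur_year_month)
    previous = daily_rate(prev_adj_wpvs, *prev_year_month)
    if previous == 0:
        # No meaningful baseline; treat any current traffic as volatile.
        return current > 0, float("inf") if current > 0 else 0.0
    volatility = abs(current - previous) / previous
    return volatility > threshold_1, volatility


# January (31 days) vs. February (28 days) at the same daily rate.
steady = exceeds_first_threshold(28_000, (2011, 2), 31_000, (2011, 1))
# February at 1,200 pageviews per day vs. 1,000 per day in January.
jump = exceeds_first_threshold(33_600, (2011, 2), 31_000, (2011, 1))
```

Without the per-day normalization, the first comparison would show a drop of about 10% that merely reflects February being three days shorter than January.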
When the difference or volatility of the pageviews exceeds the first threshold (block 304), the tail adjustment monitor 108 determines the responsibility of the tail for the volatility (block 306). The tail adjustment monitor 108 determines if the responsibility of the tail for the volatility exceeds a second threshold (block 308). When the responsibility of the tail for the volatility does not exceed the second threshold the program of
The determination of the contribution of the tail to the volatility and comparison to the second threshold may be determined as:
where $twpvs_{i,j}$ is the weighted pageviews for the tail of panelists (e.g., the top 1% of panelists by pageview) for month i and brand j determined from the monitoring information in the panelist datastore 104, $d_i$ is the number of days in month i, $\widehat{twpvs}_{i-1,j}$ is the adjusted weighted pageviews for the tail for month i−1 and brand j that was previously adjusted by the adjustment manager 116 and stored in the panelist datastore 104, $wpvs_{i,j}$ is the weighted pageviews for month i and brand j determined from the panelist datastore 104, $\widehat{wpvs}_{i-1,j}$ is the adjusted weighted pageviews for month i−1 and brand j that was previously adjusted by the adjustment manager 116 and stored in the panelist datastore 104, and Threshold 2 is the second threshold.
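One possible reading of the tail responsibility calculation is sketched below in Python. It assumes, for illustration only, that the responsibility of the tail is the day-normalized change in tail pageviews expressed as a fraction of the day-normalized change in total pageviews. The function name `tail_responsibility` and this ratio form are assumptions.

```python
def tail_responsibility(twpvs, wpvs, days,
                        prev_adj_twpvs, prev_adj_wpvs, prev_days):
    """Fraction of the month over month change attributable to the tail.

    twpvs / wpvs: tail and total weighted pageviews for the current month.
    prev_adj_twpvs / prev_adj_wpvs: previously adjusted values for the
    preceding month. days / prev_days: number of days in each month.
    """
    tail_change = twpvs / days - prev_adj_twpvs / prev_days
    total_change = wpvs / days - prev_adj_wpvs / prev_days
    if total_change == 0:
        return 0.0  # no overall volatility to attribute
    return abs(tail_change) / abs(total_change)


# Total pageviews per day rise by 100, of which the tail contributes 80.
responsibility = tail_responsibility(5_400, 15_000, 30, 3_000, 12_000, 30)
second_threshold = 0.60
trigger_adjustment = responsibility > second_threshold
```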
The second threshold controls how aggressively the tail adjustment monitor 108 will trigger adjustment for volatility caused by the tail. The amount of volatility naturally caused by the tail may vary from brand to brand. For example, the tail for a first brand may typically account for 40% of month over month change while the tail for a second brand may typically account for 20% of month over month change. Accordingly, the second threshold of the illustrated example is determined based on a historical view of the brand to be analyzed. In particular, the second threshold of the illustrated example is determined based on an average of the tail contribution to overall weighted pageviews for the past 6 months for the brand with a maximum second threshold of 60%:
where $\widehat{twpvs}_{i,j}$ is the adjusted weighted pageviews for the tail for month i and brand j that was previously adjusted by the adjustment manager 116 and stored in the panelist datastore 104 and $\widehat{wpvs}_{i,j}$ is the adjusted weighted pageviews for month i and brand j that was previously adjusted by the adjustment manager 116 and stored in the panelist datastore 104.
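The brand-specific second threshold described above (a six-month average of the tail's share of overall weighted pageviews, capped at 60%) may be sketched in Python as follows. The function name and the handling of months without data are assumptions made for this sketch.

```python
def second_threshold(adj_tail_history, adj_total_history, cap=0.60):
    """Average tail share of adjusted weighted pageviews, capped at `cap`.

    adj_tail_history:  adjusted tail weighted pageviews for past months.
    adj_total_history: adjusted overall weighted pageviews, same months.
    Months with no overall pageviews are skipped (an assumption).
    """
    shares = [tail / total
              for tail, total in zip(adj_tail_history, adj_total_history)
              if total > 0]
    if not shares:
        return cap  # assumption: fall back to the maximum threshold
    return min(cap, sum(shares) / len(shares))


# A brand whose tail historically holds 40% of weighted pageviews.
typical = second_threshold([400] * 6, [1000] * 6)
# A brand dominated by its tail is limited by the 60% maximum.
capped = second_threshold([800] * 6, [1000] * 6)
```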
When the tail adjustment monitor 108 determines that the responsibility of the tail for volatility of the pageviews exceeds the second threshold (block 308), the tail adjustment monitor 108 triggers adjustment by the tail adjuster 110 (block 310). The program of
The example tail adjuster 110 then determines a logarithm transformation of the weighted pageviews (block 406). The logarithm is applied in the illustrated example to reduce the extent of the tail because the tail can have a very large number of pageviews relative to the rest of the panelists (e.g., the 99th percentile of pageviews may be 157,328 while the tail includes data points as high as 9 million pageviews). The tail adjuster 110 then determines a truncation threshold (block 408).
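The effect of the logarithm transformation on the example values above can be illustrated with a short Python sketch. A base-10 logarithm is assumed here. The text does not state the base, and the function name `log_transform` is hypothetical.

```python
import math


def log_transform(weighted_pageviews):
    """Base-10 logarithm of each positive weighted pageview count."""
    return [math.log10(value) for value in weighted_pageviews if value > 0]


# The 99th percentile value and an extreme tail value from the example.
compressed = log_transform([157_328, 9_000_000])
spread_raw = 9_000_000 / 157_328          # ratio on the original scale
spread_log = compressed[1] - compressed[0]  # difference on the log scale
```

On the original scale the extreme value is more than fifty times the 99th percentile value, whereas after the transformation the two differ by less than two units, which makes the tail easier to model with a fitted distribution.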
An example program for determining the truncation threshold is illustrated in
Returning to
where σ is the scale of the distribution and c is the shape of the distribution. A distribution for the 99th percentile may also be fit to the data. An example 99th percentile Weibull distribution is defined as:
where σ is the scale of the distribution and c is the shape of the distribution. Any other suitable distribution may be used based on the distribution of the data such as, for example, a Burr distribution, an exponential distribution, a Pareto distribution, a Generalized Pareto Distribution, or any other type of parametric distribution, etc.
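For reference, the standard two-parameter Weibull density, cumulative distribution, and percentile functions can be written in Python as shown below, using the same scale σ and shape c. This is a sketch of the textbook Weibull form only. How the parameters are fitted to the data, and whether the disclosed distribution includes additional terms, are not addressed here, and the function names are hypothetical.

```python
import math


def weibull_pdf(x, scale, shape):
    """Standard Weibull density (c/sigma) * (x/sigma)**(c-1) * exp(-(x/sigma)**c).

    For simplicity, x <= 0 is treated as having zero density.
    """
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def weibull_cdf(x, scale, shape):
    """Standard Weibull cumulative distribution 1 - exp(-(x/sigma)**c)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def weibull_percentile(p, scale, shape):
    """Value below which a fraction p (0 <= p < 1) of the distribution lies."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# 95th and 99th percentiles for an illustrative scale and shape.
w95 = weibull_percentile(0.95, scale=0.5, shape=1.5)
w99 = weibull_percentile(0.99, scale=0.5, shape=1.5)
```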
Using the fitted distributions, the tail adjuster 110 determines two thresholds (block 414). A first threshold is determined for $W_{95}$ as:
$$T_{95} = 10^{U + W_{95}}$$
where U is the truncation threshold determined in block 408. A second threshold is determined for $W_{99}$ as:
$$T_{99} = 10^{U + W_{99}}$$
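Reading the two threshold expressions as powers of ten that undo the logarithm transformation (an interpretation that assumes a base-10 logarithm), the thresholds may be computed as in the following Python sketch. The function name `capping_thresholds` and the sample values are hypothetical.

```python
def capping_thresholds(truncation_threshold, w95, w99):
    """Convert log-scale quantities back to pageview counts.

    truncation_threshold: U, expressed on the log10 scale.
    w95, w99: 95th and 99th percentile values from the fitted distributions.
    Returns (T95, T99) on the original pageview scale.
    """
    t95 = 10 ** (truncation_threshold + w95)
    t99 = 10 ** (truncation_threshold + w99)
    return t95, t99


# With U = 5 (100,000 pageviews), W95 = 0.5 and W99 = 1.0.
t95, t99 = capping_thresholds(5.0, 0.5, 1.0)
```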
Next, the tail adjuster 110 determines an expected value for a panelist in the tail and adjusts the pageviews using the thresholds and the determined distributions (block 416). The expected value may be determined from the distribution data as:
where EV is the expected value, U is the truncation threshold determined in block 408, F(x) is the cumulative distribution function of the fitted distribution, and f(x) is the probability density function of the fitted distribution. If the tail volatility is due to the tail being greater than expected, the tail is adjusted downward by capping the weighted pageviews in the tail at one of the thresholds estimated above. The threshold may be selected based on the threshold that results in the least volatility as compared with the previous month's adjusted tail. For example, the adjustment may be performed according to:
where $wpvs_{i,j,k}$ is the weighted pageviews for month i, brand j and panelist k and $\widehat{wpvs}_{i,j,k}$ is the adjusted weighted pageviews for month i, brand j and panelist k.
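The downward adjustment by capping may be sketched in Python as follows. The sketch assumes that "least volatility" means the capped tail total closest to the previous month's adjusted tail total. That selection criterion, along with the function name `cap_tail`, is an assumption made for illustration.

```python
def cap_tail(tail_wpvs, thresholds, prev_adj_tail_total):
    """Cap each tail panelist's weighted pageviews at one of the thresholds.

    tail_wpvs: weighted pageviews of each panelist in the tail.
    thresholds: candidate caps (e.g., T95 and T99).
    prev_adj_tail_total: previous month's adjusted tail weighted pageviews.
    Returns (chosen threshold, capped values).
    """
    best = None
    for threshold in thresholds:
        capped = [min(value, threshold) for value in tail_wpvs]
        gap = abs(sum(capped) - prev_adj_tail_total)
        # Keep the threshold whose capped total is nearest last month's tail.
        if best is None or gap < best[0]:
            best = (gap, threshold, capped)
    return best[1], best[2]


chosen, adjusted = cap_tail([100, 500, 9_000], (1_000, 3_000), 1_500)
```

Values already below the chosen threshold are left unchanged, so only the extreme panelists are affected.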
Alternatively, if the tail volatility is due to the tail being less than expected, the tail is adjusted upward to the expected value based on the trend factor. For example, the adjustment may be performed according to:
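if $twpvs_{i,j} < f_{i,j} \times EV_{i,j} \times 0.01 \times C_{i,j}$
then $\widehat{wpvs}_{i,j,k} = f_{i,j} \times EV_{i,j}$
else $\widehat{wpvs}_{i,j,k} = wpvs_{i,j,k}$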
After the adjustments are performed, the program of
The example tail adjuster 110 then determines the months preceding the month under analysis that include more than a threshold number of panelists (block 606). For example, the threshold according to the illustrated example is 200. Alternatively, a different threshold may be selected based on the relative size of a panel where a higher threshold is selected for larger panels. The tail adjuster 110 averages the pageviews of the panelists in the tail for the months that meet the threshold (block 608). For example, the average may be calculated as:
where $EV_{i,j}$ is the calculated expected value for month i and brand j, $K$ is a list of the indices of the months in the past 6 months for which the number of panelists exceeds the threshold (e.g., 200), $k$ is the number of months for which the number of panelists exceeds the threshold, $\widehat{twpvs}_{i,j}$ is the adjusted weighted pageviews of the tail for month i and brand j, and $C_{i,j}$ is the count of raw panelists who visited brand j during month i.
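The averaging step may be sketched in Python as follows. The sketch assumes, for illustration only, that each qualifying month contributes its adjusted tail weighted pageviews divided by the number of tail panelists (taken as 1% of the raw panelist count) and that these per-panelist values are then averaged. The function name and this exact form are assumptions.

```python
def expected_tail_value(history, min_panelists=200, tail_fraction=0.01):
    """Average per-panelist tail pageviews over qualifying past months.

    history: list of (adjusted tail weighted pageviews, raw panelist count)
             for the months preceding the month under analysis.
    Only months with more than `min_panelists` panelists are used.
    Returns None when no month qualifies.
    """
    per_panelist = [adj_twpvs / (tail_fraction * count)
                    for adj_twpvs, count in history
                    if count > min_panelists]
    if not per_panelist:
        return None
    return sum(per_panelist) / len(per_panelist)


# The third month has too few panelists and is excluded from the average.
ev = expected_tail_value([(6_000, 300), (8_000, 400), (500, 100)])
```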
The tail adjuster 110 then adjusts the weighted pageviews using the calculated expected value (block 610). If the tail volatility is due to the tail being greater than expected, the tail is adjusted downward by the expected value. For example, the adjustment may be performed as:
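if $twpvs_{i,j} > f_{i,j} \times EV_{i,j} \times 0.01 \times C_{i,j}$
then $\widehat{wpvs}_{i,j,k} = f_{i,j} \times EV_{i,j}$
else $\widehat{wpvs}_{i,j,k} = wpvs_{i,j,k}$
where $wpvs_{i,j,k}$ is the weighted pageviews for month i, brand j and panelist k, and $\widehat{wpvs}_{i,j,k}$ is the corresponding adjusted weighted pageviews.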
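If the tail volatility is due to the tail being less than expected, the tail is adjusted upward by the expected value. For example, the adjustment may be performed as:
if $twpvs_{i,j} < f_{i,j} \times EV_{i,j} \times 0.01 \times C_{i,j}$
then $\widehat{wpvs}_{i,j,k} = f_{i,j} \times EV_{i,j}$
else $\widehat{wpvs}_{i,j,k} = wpvs_{i,j,k}$
where $wpvs_{i,j,k}$ is the weighted pageviews for month i, brand j and panelist k, and $\widehat{wpvs}_{i,j,k}$ is the corresponding adjusted weighted pageviews.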
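The two conditional adjustments above may be sketched together in Python as follows. The sketch assumes that the adjustment is applied only to panelists in the tail and that the direction of adjustment is supplied by the caller. The function name `adjust_tail` and the `direction` parameter are hypothetical.

```python
def adjust_tail(tail_wpvs, trend_factor, expected_value, panelist_count,
                direction):
    """Apply the downward or upward adjustment to the tail panelists.

    tail_wpvs: weighted pageviews of each tail panelist for the month.
    trend_factor: f for the month and brand.
    expected_value: EV for the month and brand.
    panelist_count: C, the count of panelists who visited the brand.
    direction: "down" if the tail is greater than expected, "up" if less.
    """
    # Expected aggregate for a tail made up of 1% of the panelists.
    expected_tail_total = trend_factor * expected_value * 0.01 * panelist_count
    tail_total = sum(tail_wpvs)

    if direction == "down":
        triggered = tail_total > expected_tail_total
    elif direction == "up":
        triggered = tail_total < expected_tail_total
    else:
        raise ValueError("direction must be 'down' or 'up'")

    if triggered:
        # Each tail panelist is set to the trend-scaled expected value.
        return [trend_factor * expected_value for _ in tail_wpvs]
    return list(tail_wpvs)


# 200 panelists, so the tail holds 2; expected aggregate is 6,000.
lowered = adjust_tail([5_000, 7_000], 1.0, 3_000, 200, "down")
unchanged = adjust_tail([5_000, 7_000], 1.0, 3_000, 200, "up")
```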
After the adjustment is performed (block 610), the tail adjuster 110 determines if the adjustment was effective in adjusting the tail (block 612). For example, if the pageviews are to be adjusted upward, the tail adjuster 110 determines if the adjustment brings the weighted pageviews up for the aggregate tail. If the adjustment is effective, the adjustment is applied or committed (block 614). For example, the adjusted pageviews may be computed but not saved to the panelist datastore 104 until after the determination that the adjustment was effective. If the adjustment is not effective, the adjustment is not applied and the program of
The system 700 of the instant example includes a processor 712. For example, the processor 712 can be implemented by one or more microprocessors or controllers from any desired family or manufacturer.
The processor 712 is in communication with a main memory including a volatile memory 714 and a non-volatile memory 716 via a bus 718. The volatile memory 714 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM) and/or any other type of random access memory device. The non-volatile memory 716 may be implemented by flash memory and/or any other desired type of memory device. Access to the main memory 714, 716 is controlled by a memory controller.
The computer 700 also includes an interface circuit 720. The interface circuit 720 may be implemented by any type of interface standard, such as an Ethernet interface, a universal serial bus (USB), and/or a PCI express interface.
One or more input devices 722 are connected to the interface circuit 720. The input device(s) 722 permit a user to enter data and commands into the processor 712. The input device(s) can be implemented by, for example, a keyboard, a mouse, a touchscreen, a track-pad, a trackball, isopoint and/or a voice recognition system.
One or more output devices 724 are also connected to the interface circuit 720. The output devices 724 can be implemented, for example, by display devices (e.g., a liquid crystal display, a cathode ray tube display (CRT), a printer, etc.). The interface circuit 720, thus, typically includes a graphics driver card.
The interface circuit 720 also includes a communication device such as a modem or network interface card to facilitate exchange of data with external computers via a network 726 (e.g., an Ethernet connection, a digital subscriber line (DSL), a telephone line, coaxial cable, a cellular telephone system, etc.).
The computer 700 also includes one or more mass storage devices 728 for storing software and data. Examples of such mass storage devices 728 include floppy disk drives, hard drive disks, compact disk drives and digital versatile disk (DVD) drives. The mass storage device 728 may implement the panelist datastore 104.
The coded instructions of
From the foregoing, it will be appreciated that the above disclosed methods, apparatus and articles of manufacture facilitate the adjustment of panelist monitoring information that includes volatility. The adjustments may be performed when the volatility is due to a small number of panelists that account for a large number of records (e.g., pageviews) in the panelist monitoring information. Accordingly, more accurate panelist monitoring information may be determined and reported.
Although certain example methods, apparatus and articles of manufacture have been described herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all methods, apparatus and articles of manufacture fairly falling within the scope of the claims of this patent.
This patent claims priority to U.S. Provisional Patent Application Ser. No. 61/509,009, filed on Jul. 18, 2011, which is hereby incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
61509009 | Jul 2011 | US