Online booking systems include data analysis and data manipulation systems that are used to review, for example, booking data to make informed decisions before purchasing a good or service. Improved data analysis and manipulation systems increase overall performance and the reach of such online booking systems.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
In the drawings, which are not necessarily drawn to scale, like numerals may describe similar components in different views. To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the figure number in which that element is first introduced. Some non-limiting examples are illustrated in the figures of the accompanying drawings in which:
The description that follows includes systems, methods, techniques, instruction sequences, and computing machine program products that embody illustrative embodiments of the disclosure. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide an understanding of various embodiments of the inventive subject matter. It will be evident, however, to those skilled in the art, that embodiments of the inventive subject matter may be practiced without these specific details. In general, well-known instruction instances, protocols, structures, and techniques are not necessarily shown in detail.
Business inquiries frequently revolve around causal inference, for example, seeking to understand the impact of particular business decisions on systems such as a listing network platform. Three approaches commonly employed to address such questions are A/B testing, quasi-experimentation, and observational causal inference methods. While A/B testing and quasi-experimentation are often preferred because they provide exogenous variation for identification, their implementation can be less efficient and more costly, or subject to biases resulting from business or technical constraints. Observational causal inference methods, such as matching methods, synthetic control, and double machine learning, can be used to mitigate biases but are usually not applicable to certain data properties and questions. Furthermore, the aforementioned approaches are more effective at measuring the causal impact of single interventions than at attributing causal impact holistically across multiple interconnected factors that may contribute to the final outcome. In such scenarios, regression approaches become the alternative. Regression methods use aggregated panel data to concurrently identify the causal impact of multiple factors. However, two challenges undermine the ability to confidently affirm that the estimated parameters from regression methods represent the true causal impact: the existence of confounding factors and multicollinearity among covariates. While common solutions such as shrinkage estimators, principal component regressions, and partial linear regressions are helpful in prediction problems, certain limitations hinder their applicability to causal inference problems: they cannot provide the original causal relationships.
The techniques described herein provide a novel approach that specifically addresses the second challenge, namely multicollinearity. To illustrate a practical application of the techniques described herein, an implementation in a marketing measurement scenario (Marketing Mix Modeling, or MMM) within the context of a listing network platform is further described below. In marketing contexts, one question of importance is how to causally attribute sales to spend across channels, such as Google™ Search, YouTube™ Display, and so on. However, advertisers often allocate their expenditures across ad channels in a correlated manner, particularly during peak seasons. When attempting to estimate impacts via a regression model, highly correlated variables result in larger estimate variances and imprecise attribution of channel contributions, for example, to sales. It is not uncommon to observe regression coefficients switching signs when highly correlated inputs are introduced, consequently undermining the confidence of business stakeholders in the model's results.
As further described below, access to panel data consisting of ad impressions (e.g., number of times an advert is displayed) categorized by channel and geographical location (e.g., Designated Market Area or DMA) over a specific time period is analyzed using certain techniques. When analyzing the data by pooling all geographical locations together, a high level of cross-channel correlation is observed. However, it is worth noting that certain geographical locations exhibit higher cross-channel correlations compared to others. To address the issue of multicollinearity, the techniques described herein leverage the variations in correlation patterns across different geographical locations. The objective is to restructure the data in a way that significantly reduces the multicollinearity problem. In certain examples, systems and methods involve utilizing hierarchical clustering to group the geographical locations based on their correlation patterns. An important aspect of this approach lies in defining the distance metric used in the clustering algorithm. In the described novel techniques, the distance between two DMAs is defined as the sum of channel-specific distances. Each channel-specific distance measures the similarity in the cross-channel correlation between the two geographical locations.
By incorporating this distance metric (e.g., sum of channel-specific distances) into the hierarchical clustering process, the DMAs can be more effectively grouped in a manner that minimizes multicollinearity across channels. The hierarchical clustering approach allows for an improved practical application of certain models (e.g., marketing measurement models) that mitigates the challenges posed by multicollinearity. Accordingly, improvements in the understanding of the causal relationships between channels and more accurate attribution of their impact on sales or other relevant outcomes are realized. Illustrative improvements with both data-descriptive results and regression results are provided herein.
Each user system 102 can include multiple user devices, such as a mobile device 114 and a computer client device 116 that are communicatively connected to exchange data and messages.
A client application 104 interacts with other client applications 104 and with the server system 110 via the network 108. The data exchanged between the client applications 104 and between the client applications 104 and the server system 110 includes functions (e.g., commands to invoke functions) and payload data (e.g., text, audio, video, or other multimedia data).
In some example embodiments, the client application 104 is a reservation application for temporary stays or experiences at hotels, motels, or residences managed by other end users, such as a posting end user who owns a home and rents out the entire home or a private room in the home. In some implementations, the client application 104 includes various components operable to present information to the user and communicate with the networked system 102. In some embodiments, if the reservation application is included in the client device 110, then this application is configured to locally provide the user interface and at least some of the functionalities with the application configured to communicate with the networked system 102, on an as-needed basis, for data or processing capabilities not locally available (e.g., access to a database of items available for sale, to authenticate a user, to verify a method of payment). Conversely, if the reservation application is not included in the client device 110, the client device 110 can use its web browser to access the e-commerce site (or a variant thereof) hosted on the networked system 102.
The server system 110 provides server-side functionality via the network 108 to the client applications 104. While certain functions of the networked system 100 are described herein as being performed by either a client application 104 or by the server system 110, the location of certain functionality either within the client application 104 or the server system 110 may be a design choice. For example, it may be technically preferable to initially deploy particular technology and functionality within the server system 110 but to later migrate this technology and functionality to the client application 104 where a user system 102 has sufficient processing capacity.
The server system 110 supports various services and operations that are provided to the client application 104. Such operations include transmitting data to, receiving data from, and processing data generated by the client applications 104. This data can include message content, client device information, geolocation information, reservation information, and transaction information. Data exchanges within the networked system 100 are invoked and controlled through functions available via user interfaces (UIs) of the client application 104.
Turning now specifically to the server system 110, an Application Program Interface (API) server 118 is coupled to and provides programmatic interfaces to the application server 120, making the functions of the application server 120 accessible to the client application 104, other applications 106, and the third-party server 112. The application server 120 is communicatively coupled to a database server 122, facilitating access to a database 124 that stores data associated with interactions processed by the application server 120. Similarly, a web server 126 is coupled to the application server 120 and provides web-based interfaces to the application server 120. To this end, the web server 126 processes incoming network requests over the Hypertext Transfer Protocol (HTTP) and several other related protocols.
The Application Program Interface (API) server 118 receives and transmits interaction data (e.g., commands and message payloads) between the application server 120 and the user systems 102 (and, for example, interaction clients 104 and other applications 106) and the third-party server 112. Specifically, the Application Program Interface (API) server 118 provides a set of interfaces (e.g., routines and protocols) that can be called or queried by the client application 104 and other applications 106 to invoke functionality of the application server 120. The Application Program Interface (API) server 118 exposes various functions supported by the application server 120, including account registration and login functionality.
The application server 120 hosts the listing network platform 128 and a multicollinearity system 130, each of which comprises one or more modules or applications and each of which can be embodied as hardware, software, firmware, or any combination thereof. The application server 120 is shown to be coupled to a database server 122 that facilitates access to one or more information storage repositories or database(s) 124.
The listing network platform 128 provides a number of publication functions and listing services to the users who access the networked system 100. While the listing network platform 128 is shown in
The multicollinearity system 130 uses certain techniques, such as hierarchical clustering and distance metrics, to reduce multicollinearity and to derive causal relationships between certain products and their users and sales. For example, anonymized listing data collected via the listing network platform 128 is analyzed via the multicollinearity system 130 resulting in practical applications, such as systems and methods, that improve the listing network platform's engagement by increasing a number of users and impression impacts. For example, marketing products and services can now be more easily tracked to determine their impact on the listing network platform 128. Further details of the multicollinearity system 130 are described below.
The techniques described herein include a solution for the multicollinearity problem described above, by using hierarchical clustering. Hierarchically clustered data sets are then applied to an industry-standard formulation of Marketing Mix Modeling (MMM), for example, by using Google's lightweight MMM model structure with constrained marketing channels. Marketing channels include a variety of advertising distribution systems, such as a television channel, a social network platform, a website, a mobile application, and so on. A marketing or media channel is provided in one or more Designated Market Areas (DMAs), each corresponding to a geographic location. That is, a country, such as the United States, is split into various contiguous but non-overlapping geographic units or regions, such as cities or metro areas. In certain examples, a standard list of DMAs is used, such as the Nielsen ranking DMAs, which includes approximately 210 DMAs.
Model Setup: Sales y are modeled as a nonlinear function of seasonality and advertisement impressions of each channel k with a Bayesian model shown in Equation 1 below. Let g denote a DMA from a list of DMAs, and let t=1, . . . , T denote time periods (weekly data points are used in the examples below).
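As a hypothetical illustration only (the specific form of Equation 1 may differ), a lightweight MMM structure of the kind referenced above commonly takes a form along the lines of:

```latex
y_{g,t} = \mu_g + \mathrm{trend}_t + \mathrm{seasonality}_t
        + \gamma\, Z_{g,t}
        + \sum_{k} \beta_k \,\mathrm{saturation}\!\big(\mathrm{adstock}(x_{g,t,k};\, \theta_k)\big)
        + \varepsilon_{g,t}
```

Here yg,t denotes sales in DMA g at week t, xg,t,k the impressions of channel k, Zg,t the search-volume covariate described below, βk the channel-specific impact parameter, and θk the carryover (delayed realization of impact) coefficient; the baseline term μg, coefficient γ, and the particular saturation and adstock functional forms are assumptions of this sketch rather than elements recited in this description.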
Confounding factors that obfuscate or otherwise occlude results are taken into account using the techniques described herein when modeling natural trend and seasonality, in order to more properly capture organic demand. When modeling natural trend and seasonality, it is beneficial to strike the right balance between flexibility and strictness: excessive flexibility may lead to overfitting in the model, whereas overly rigid parametric formulations can result in a poor fit of the model. In a marketing use case, a model can easily be overfit and perform poorly out of sample (e.g., outside of training data) because of a high-dimensional parameter space, such as when estimating four parameters per channel. Keeping this tradeoff in mind, in addition to including exponential trend and sinusoidal seasonality, it is beneficial to include, as an additional covariate, an index of search (e.g., internet search) query volume for travel and accommodation brands excluding some listing sites (e.g., listing network platform 128), which is represented by Zg,t in the above equation for yg,t, to capture confounding factors that affect organic demand contemporaneously.
Pre-processing data: Two steps are taken to pre-process data in preparation for descriptive analysis and modeling. First is a normalization step. For example, bookings, channel impressions, and a desired covariate are normalized to establish a common scale across DMAs of different sizes. Normalization is a statistical technique used to adjust diverse data points to a common scale, making them comparable without distorting differences in the range of values or losing information. There are several methods to normalize data, such as Min-Max scaling, Z-score normalization or standardization, and unit vector normalization. Min-Max scaling rescales a feature to a fixed range, usually 0 to 1. Z-score normalization or standardization involves rescaling the data to have a mean (average) of 0 and a standard deviation of 1. Unit vector normalization scales the data such that the length of the vector of the data points (in multi-dimensional space) is 1; it is often used in text classification and clustering. Normalization makes it easier to interpret the impact of a certain level of marketing activity, and to model a common trend and seasonality later. Second, channel impressions are decomposed into a common trend, seasonality, and a residual variation across DMAs so that the focus is on correlation in the residual variation.
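The three normalization methods described above can be sketched as follows; the impression values here are hypothetical and stand in for a per-DMA series of weekly channel impressions.

```python
import numpy as np

# Hypothetical weekly impression counts for one channel in one DMA.
impressions = np.array([120.0, 340.0, 560.0, 410.0, 230.0])

# Min-Max scaling: rescale to the fixed range [0, 1].
min_max = (impressions - impressions.min()) / (impressions.max() - impressions.min())

# Z-score standardization: rescale to mean 0 and standard deviation 1.
z_score = (impressions - impressions.mean()) / impressions.std()

# Unit vector normalization: scale so the vector has length (L2 norm) 1.
unit_vector = impressions / np.linalg.norm(impressions)
```

After any of these transforms, series from DMAs of very different sizes share a common scale and can be compared or pooled directly.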
In the context of statistical analysis, particularly in time series data like channel impressions across different Designated Market Areas (DMAs), “decomposed” refers to the process of breaking down the observed data into distinct components that each represent underlying patterns or influences in the data. This decomposition typically includes the common trend, seasonality, and residual variation. Common Trend: This component captures the long-term progression of the data, showing how the channel impressions increase or decrease over time, independent of seasonal fluctuations or irregular occurrences. The trend component is used for understanding the overall direction and growth or decline in channel impressions. Seasonality: This component reflects regular and predictable cycles or patterns that repeat over a specific period, such as weekly, monthly, or annually. In the context of marketing, seasonality might capture increased advertising impressions during holiday seasons, back-to-school periods, or other cyclic events that predictably affect consumer behavior. Residual Variation: After the trend and seasonality are accounted for, the residual variation includes the randomness or irregularities left in the data. These are the fluctuations that cannot be attributed to the systematic trend or seasonal effects. Decomposition can be achieved through various statistical methods depending on the nature of the time series data and the specific requirements of the analysis. For example, Principal Component Analysis (PCA) can be used to decompose the data into several components. PCA is used to reduce the dimensionality of a dataset by transforming it into a new set of variables called principal components. Decomposition can also be achieved via additive decomposition or multiplicative decomposition. Additive decomposition assumes that the components are added together to form the observed data. It is used when seasonal variations are roughly constant through the series.
Multiplicative decomposition assumes that the components are multiplied together. It is useful when seasonal variations change proportionally over time.
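The additive/multiplicative distinction above can be sketched on a synthetic series; the trend and seasonal components here are hypothetical, and in each case the residual variation is recovered by subtracting (additive) or dividing out (multiplicative) the systematic components.

```python
import numpy as np

t = np.arange(104)                        # two years of weekly data points
trend = 100 + 0.5 * t                     # common trend component
season = 10 * np.sin(2 * np.pi * t / 52)  # annual sinusoidal seasonality
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, t.size)      # residual variation

# Additive model: components sum to the observation.
additive = trend + season + noise

# Multiplicative model: components multiply (seasonal swing grows with trend).
multiplicative = trend * (1 + season / 100) * (1 + noise / 100)

# Given trend and seasonality estimates, recover the residual variation.
resid_add = additive - trend - season                  # equals noise
resid_mult = multiplicative / (trend * (1 + season / 100))  # equals 1 + noise/100
```

It is the residual series (resid_add or resid_mult), with trend and seasonality removed, that the correlation analysis described above operates on.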
Both a horizontal and vertical axes 202, 204, respectively, list the advertising channels, which are anonymized and labeled as channel a, channel b, channel c, channel d, and channel e (referred to as “channel_e_upperfunnel” in the figure). Each cell in the matrix shows the correlation coefficient between the channels specified by the corresponding row and column. The coefficients range from −1.0 to 1.0, where: 1.0 indicates a perfect positive correlation; 0.0 indicates no correlation; −1.0 indicates a perfect negative correlation. In some examples the cells are color-coded to visually emphasize the strength and direction of the correlation. For example, cells can be shaded more intensely for higher absolute values of correlation, with different colors representing positive and negative correlations. As graph 200 in
DMAs 302 are in the Xth ventile of baseline sales and exhibit extremely high correlation for four out of the five marketing channels. For example, graphs 306, 308, 310, and 314 representative of channels A, B, C, and E respectively, show very high correlation across all DMAs while only graph 312 representative of channel D shows a mix of low to high correlation across DMAs. DMAs 304 in the Yth ventile exhibit relatively little correlation for all channels. That is, all graphs 316, 318, 320, 322, and 324 show non-uniform coloration across cells. Accordingly,
Multicollinearity poses a challenge in separately measuring the impact of different marketing channels. While common solutions such as shrinkage estimators, principal component regressions, and partial linear regression are helpful in prediction problems, a limitation hinders their applicability to causal inference problems: they typically cannot provide the original causal relationships for business interpretability. To help overcome this limitation, the techniques described herein use a distance metric, such as a pairwise distance metric, that defines correlation-based distances between DMAs. The distance metric is then used to build hierarchical clusters of geographical areas in a way that more effectively mitigates cross-channel multicollinearity. In certain examples, the distance metric includes a pairwise distance or dissimilarity metric that is calculated to summarize marketing spend correlation between DMAs. That pairwise metric is used to cluster the DMAs having moderate to strong correlation.
Turning now to
Distanceijk is a pairwise channel-based distance metric that calculates, for each channel k, a distance between two DMAs i and j, where Xik denotes the time series of residual impressions after eliminating common trend and seasonality, for channel k in DMA i. Calculating an overall distance across all channels is then done using the Distanceijk formula above. More specifically, the overall distance across all channels is the square-root of the sum of squared distances across the channels, and is referred to as Distanceij. Distanceij is a measure that reflects correlation between DMAs across multiple channels, while Correlation (Xik, Xjk) is a statistical correlation coefficient between Xik and Xjk, such as a Pearson correlation coefficient. An equation to calculate Distanceij is below.
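The distance computation described above can be sketched as follows. The sketch assumes the channel-specific distance takes the form Distanceijk = 1 − Correlation(Xik, Xjk); this particular form is an assumption, though it is consistent with the cutoff of 1.5 corresponding to an average per-channel correlation of at least 0.33 over five channels, since sqrt(5 × (1 − 0.33)²) ≈ 1.5.

```python
import numpy as np

def channel_distance(x_ik, x_jk):
    """Channel-specific distance: one minus the Pearson correlation
    between the residual impression series of two DMAs (assumed form)."""
    return 1.0 - np.corrcoef(x_ik, x_jk)[0, 1]

def dma_distance(X_i, X_j):
    """Overall Distance_ij: the square root of the sum of squared
    channel-specific distances across all channels.

    X_i, X_j: arrays of shape (num_channels, num_weeks) holding the
    residual impressions (trend and seasonality removed) per channel
    for DMAs i and j."""
    d_k = np.array([channel_distance(xi, xj) for xi, xj in zip(X_i, X_j)])
    return np.sqrt(np.sum(d_k ** 2))
```

Two DMAs whose residual impressions are perfectly correlated in every channel have distance 0; a perfectly anti-correlated channel contributes a channel-specific distance of 2.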
An example complete-linkage hierarchical clustering algorithm used to hierarchically cluster DMAs based on Distanceij is as follows:
Distance or dissimilarity between two clusters is based on the farthest pair as calculated via Distanceij. This algorithm produces the dendrogram 400 in
More specifically,
A cutoff distance can be chosen to determine the number of clusters to retain. For example, cutting the dendrogram 400 at a specific height will define the clusters that are used in subsequent analysis. The cutoff distance is chosen to correspond to a desired correlation threshold, thus providing that DMAs within the same cluster have a moderate to strong correlation. In one non-limiting example for demonstration purposes, a cutoff distance of 1.5 is used herein, which corresponds to a correlation of at least 0.33 on average for each channel and produces 42 clusters out of the 210 or so Nielsen DMAs. Accordingly, DMAs that feature moderate to strong correlation are grouped into the same cluster.
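Complete-linkage clustering with such a cutoff can be sketched as below; the small distance matrix here is hypothetical, standing in for the Distance_ij matrix over the ~210 Nielsen DMAs, and the SciPy routines are one possible implementation rather than the recited one.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical symmetric pairwise Distance_ij matrix over a few DMAs.
rng = np.random.default_rng(1)
n_dmas = 8
d = rng.uniform(0.5, 2.5, size=(n_dmas, n_dmas))
dist = (d + d.T) / 2
np.fill_diagonal(dist, 0.0)

# Complete linkage: the distance between two clusters is the distance
# between their farthest pair of members.
Z = linkage(squareform(dist), method="complete")

# Cut the dendrogram at the cutoff distance to obtain flat cluster labels.
labels = fcluster(Z, t=1.5, criterion="distance")
```

A useful property of complete linkage is that cutting at 1.5 guarantees every pair of DMAs within a cluster is at most 1.5 apart, which is what ties the cutoff back to the per-channel correlation threshold.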
The hierarchical clustering approach, e.g., via the pairwise Distanceij metric between DMAs, produces more intuitive results. It would be beneficial to visualize channel impressions over time across DMAs within each cluster, confirming that DMAs within the same cluster tend to have highly correlated impressions over time for at least some channels.
Each heat map 506, 508, 510, 512, 514 includes a horizontal axis representing time, with the depicted examples spanning several years (e.g., 2020 to 2023). This horizontal axis shows how the channel impressions change over time for a given channel. Each heat map 506, 508, 510, 512, 514 also includes a vertical axis representing different DMAs within a given cluster, e.g., cluster 1. Each row in a heat map corresponds to a specific DMA, showing a pattern of impressions for that DMA over the specified time period. The heat maps 506, 508, 510, 512, 514 use color to represent the intensity or volume of channel impressions. The scale might range from low (e.g., cooler colors like blue) to high (e.g., warmer colors like red). Consistency in color across a single row indicates stability or similarity in advertising impressions for that DMA over time. Variations in color might indicate fluctuations in advertising activity.
In the depicted embodiment, channel b, represented by heatmap 508, shows that compared to the other channels a, c, d, e, channel b is more dissimilar in impressions. This heatmap visualization helps in visualizing and analyzing the consistency and variability of advertising impressions across different channels and DMAs within cluster 1. It aids in understanding how clustered DMAs behave in terms of advertising over time.
Similar to
Each heat map 604, 606, 608, 610, 612 includes a horizontal axis representing time, with the depicted examples spanning several years (e.g., 2020 to 2023). This horizontal axis shows how the channel impressions change over time for a given channel. Each heat map 604, 606, 608, 610, 612 also includes a vertical axis representing different DMAs within a given cluster, e.g., cluster 4. Each row in a heat map corresponds to a specific DMA, showing a pattern of impressions for that DMA over the specified time period. The heat maps 604, 606, 608, 610, 612 also use color to represent the intensity or volume of channel impressions. The scale might range from low (e.g., cooler colors like blue) to high (e.g., warmer colors like red). Consistency in color across a single row indicates stability or similarity in advertising impressions for that DMA over time. Variations in color might indicate fluctuations in advertising activity.
In the depicted embodiment, channel a, represented by heatmap 604, shows that compared to the other channels b, c, d, e, channel a is more dissimilar in impressions for cluster 4. By examining each channel separately, stakeholders can assess which channels are more consistently utilized across DMAs within a cluster and which channels exhibit more variability. This detailed view confirms the effectiveness of the approach in clustering DMAs that share similar patterns in marketing activities over time across different channels. As shown in
Further, as
The curves displayed in the graphs 804, 806, 808, 810, 812 are centered at 0 of the vertical axes via removal of the global average. The curves are then displayed to illustrate the effectiveness of the hierarchical clustering in creating groups (clusters) of DMAs that exhibit similar advertising behaviors. For example, graph 812 shows a spike at week 814 representative of a week of very high impressions for channel e. This may be due to channel e being a television channel and the spike at 814 is when a sports event, such as the Super Bowl, is occurring. In comparison, graph 804 shows a range of weeks 816 with impressions remaining relatively high for channel a. It is to be noted that graphs 804, 806, 808, 810, 812 are provided, in some examples, via the web server 126 and/or the user system 102 through one or more graphical user interfaces.
Panel linear regression analysis affirms the effectiveness of this hierarchical clustering process in mitigating collinearity and facilitating the separate identification of the impact of different marketing channels. After clustering, the problem of channel coefficient estimates flipping signs when included together with other channels is minimized or eliminated, and the clustered data instead produces more intuitive results as shown in Table 1 below.
Table 1 summarizes panel linear regression results before and after clustering, with geographic (DMA or cluster) fixed effects and week fixed effects included throughout the different specifications. As Column 1 summarizes, most channel coefficients are negative if DMA-level data is used, while the channel coefficients would be positive if included individually. After clustering, in Column 2, the results become mostly intuitive: all four lower-funnel channels have a positive estimated impact on sales (three of which are significant at the 0.001 level). The remaining channel with a negative coefficient is upper-funnel, where extreme difficulty in detecting a lower-funnel impact is expected. Finally, these findings are robust to weighting the clusters based on the natural log of their baseline size.
Now that both descriptive evidence and frequentist regression analysis affirm that the hierarchical clustering technique more effectively mitigates collinearity, an estimate of the Bayesian model described above using data at the cluster level is performed. Similar to the frequentist regressions, the Bayesian model also produces intuitive results, even with uninformative priors.
Likewise, graphs 906, 908 include a horizontal axis representing the range of possible values for a carryover metric. This horizontal axis is also scaled to show the entire range of values that the carryover metric or parameters could take based on the posterior distribution. Graphs 906 and 908 also include a vertical axis representing the density or probability of each metric value, indicating how likely each value (e.g., carryover) is given the data and the model. A carryover metric, also known as a lag effect or persistence effect, describes how the influence of an advertising campaign extends beyond the immediate period during which the advertisement is run. This effect acknowledges that consumer behavior might not change instantly but may be influenced over time due to factors like increased brand recognition, improved brand perception, or delayed purchasing decisions.
Consistent with earlier findings, there is an estimate of a higher impact for channels a and b than for the other channels, as shown in graphs 902, 904. Note that the impact parameter estimates here should be interpreted differently from the frequentist panel linear regressions above, because the Bayesian structural model also estimates parameters that transform the impressions for each channel into adstock based on the lag, carryover, and shape parameters. Broad comparisons of the impact parameter across channels are made, taking into account the other parameters. Furthermore, the carryover estimates are also intuitive and consistent with previous knowledge about the different channels. For example, some carryover for channels c, d, and e is expected, but not for channels a and b, and it is reassuring that the estimates confirm that understanding even when using uninformative priors. It is to be noted that graphs 902, 904, 906, 908 are provided, in some examples, via the web server 126 and/or the user system 102 through one or more graphical user interfaces.
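The adstock transformation referenced above can be sketched with a geometric carryover, a common modeling choice (the specific functional form and the carryover rate used here are assumptions of this sketch): each week's effective impressions equal the current impressions plus a decayed fraction of the previous week's adstock.

```python
import numpy as np

def geometric_adstock(impressions, carryover):
    """Geometric adstock: adstock[t] = impressions[t] + carryover * adstock[t-1]."""
    adstock = np.zeros_like(impressions, dtype=float)
    for t, x in enumerate(impressions):
        adstock[t] = x + (carryover * adstock[t - 1] if t > 0 else 0.0)
    return adstock

# A one-week burst of 100 impressions decays geometrically in later weeks.
burst = np.array([100.0, 0.0, 0.0, 0.0])
print(geometric_adstock(burst, carryover=0.5))  # [100., 50., 25., 12.5]
```

A carryover of 0 reduces the transform to the raw impressions, matching the intuition above that channels a and b (with little expected carryover) are driven mostly by contemporaneous impressions.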
While the application in marketing mix modeling demonstrates how and why hierarchical clustering of geographical areas can reduce multicollinearity, the techniques described herein are not constrained to only the marketing setting and are instead generally applicable to observational causal inference problems featuring multicollinearity. This section overviews some other areas where the techniques presented herein can be applied, and where the dimension to cluster does not always have to be geographical. For example, in the context of customer service, one would like to understand how customers' interactions with support agents contribute to long-term customer retention. Often, however, these interaction experience metrics are highly correlated, such as wait time and abandon rate, and so on. In this case, clustering is leveraged to segment customer issue types into groups that have different degrees of correlation between wait time and abandon rate.
In another example, in the context of pricing and product optimization in any highly differentiated marketplace like listings having a large number of product segments, one often needs to use observational and quasi-experimental data to understand how elastic demand is to prices and how much customers value various product attributes. Among other challenges, different features often exhibit correlation with each other as well as with prices. While there are other solutions such as instrumental variables to this problem, hierarchical clustering is performed in a way that reduces correlation and thus complements existing methods, especially for features or settings where one cannot find valid instruments.
Accordingly, the utilization of clustering as an innovative and effective approach to address the issue of multicollinearity in regressional causal inference studies is used in various other problem areas. Clustering has several advantages. Firstly, hierarchical clustering provides a systematic and comprehensive method for identifying clusters that exhibit varying levels of multicollinearity, thus reducing the covariate correlation across clusters. Furthermore, clustering circumvents the need to transform data into non-interpretable entities, as required by techniques such as Principal Component Analysis or Partial Linear Regressions. This ensures that the interpretability and meaningfulness of the variables are preserved throughout the analysis. In addition to its effectiveness, the proposed methodology is characterized by its ease of implementation. It can be readily applied to diverse applications facing similar challenges related to multicollinearity. The key lies in understanding the inherent properties of the data to define an appropriate distance metric for clustering that effectively reduces multicollinearity. This contributes to enhancing the robustness and reliability of regressional causal inference studies.
At block 1004, the computing system pre-processes the retrieved data and prepares a model, such as the MMM model described earlier with respect to Equation 1. Model preparation involves the selection of one or more parameters, such as the covariate coefficient α, the channel-specific impact parameter β, the delayed realization of impact coefficient θk, and so on, based on the characteristics of the data. In some examples, two steps are taken to pre-process data in preparation for descriptive analysis and modeling. First, the computing system normalizes bookings, channel impressions, and the covariate to establish a common scale across DMAs of different sizes. Normalization is a statistical technique used to adjust diverse data points to a common scale. To normalize this data, the computing system applies one or more normalization algorithms, such as Min-Max scaling, Z-score normalization (standardization), and/or unit vector normalization. Min-Max scaling rescales a desired feature to a fixed range, usually 0 to 1. Z-score normalization, or standardization, involves rescaling the data to have a mean (average) of 0 and a standard deviation of 1. Unit vector normalization scales the data such that the length of the vector of the data points (in multi-dimensional space) is 1. Normalization then results in data that has a common scale. Normalizing the data improves the ability to interpret the impact of a certain level of marketing activity.
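The three normalization algorithms named above can be sketched as follows. This is a minimal illustration; the function names are ours, and the computing system may equally use standard library implementations.

```python
import numpy as np

def min_max_scale(x):
    """Min-Max scaling: rescale a feature to the fixed range [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def z_score(x):
    """Z-score normalization: rescale to mean 0, standard deviation 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def unit_vector(x):
    """Unit vector normalization: scale so the data vector has length 1."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)
```

Each transform places DMAs of different sizes on a common scale, which is the property the pre-processing step relies on.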
Second, the computing system decomposes channel impressions into common trend, seasonality, and residual variation across DMAs, to focus on correlation in the residual variation. For example, Principal Component Analysis (PCA) can be used to decompose the data into several components, namely common trend, seasonality, and residual variation. Common trend captures the long-term progression of the data, showing how the channel impressions increase or decrease over time, independent of seasonal fluctuations or irregular occurrences. The trend component is used for understanding the overall direction and growth or decline in channel impressions. Seasonality reflects regular and predictable cycles or patterns that repeat over a specific period, such as weekly, monthly, or annually. In the context of marketing, seasonality might capture increased advertising impressions during holiday seasons, back-to-school periods, or other cyclic events that predictably affect consumer behavior. After the trend and seasonality are accounted for, the residual variation includes the randomness or irregularities left in the data. These are the fluctuations that cannot be attributed to the systematic trend or seasonal effects. Decomposition can be achieved through various statistical methods depending on the nature of the time series data and the specific requirements of the analysis. PCA is used to reduce the dimensionality of a dataset by transforming it into a new set of variables called principal components. Decomposition can also be achieved via additive decomposition or multiplicative decomposition. Additive decomposition assumes that the components are added together to form the observed data. It is used when seasonal variations are roughly constant through the series. Multiplicative decomposition assumes that the components are multiplied together. Multiplicative decomposition is useful when seasonal variations change proportionally over time.
At block 1006, the computing system clusters the data. In certain embodiments, hierarchical clustering is performed. For example, the computing system utilizes hierarchical clustering to group geographical locations based on their correlation patterns as determined by a distance metric. As mentioned earlier, each DMA is placed inside its own cluster, so there are N clusters at level 0. Next, a second clustering step occurs, in which DMAs whose Distanceij of Equation 4 is smallest are grouped together. That is, the level 0 DMAs having the smallest Distanceij between them are merged into a single cluster. This process is repeated, progressively reducing the number of clusters and increasing the distance at which clusters are combined. As distances increase, clusters merge together, indicating the grouping of DMAs into clusters based on their Distanceij similarity.
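The merge loop described above can be sketched as an agglomerative procedure. This is a minimal illustration: the single-linkage rule is an assumption, and the pairwise Distanceij values are supplied externally, as defined by Equation 4.

```python
import itertools

def hierarchical_cluster(names, distance, cutoff):
    """Agglomerative clustering over DMAs.

    Start with one cluster per DMA (level 0) and repeatedly merge the
    closest pair of clusters until the smallest inter-cluster distance
    exceeds the cutoff.

    names:    list of DMA identifiers.
    distance: dict mapping frozenset({i, j}) of DMA names to Distance_ij.
    Single linkage: cluster distance = smallest pairwise DMA distance.
    """
    clusters = [{n} for n in names]  # level 0: N singleton clusters
    while len(clusters) > 1:
        best = None
        # Find the closest pair of clusters.
        for a, b in itertools.combinations(range(len(clusters)), 2):
            d = min(distance[frozenset({i, j})]
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        if d > cutoff:            # stop once every merge would exceed cutoff
            break
        clusters[a] |= clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters
```

Running the loop with progressively larger cutoffs reproduces the behavior described above: the number of clusters falls as the distance at which clusters combine rises.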
An aspect of this approach lies in defining the distance metric used in the clustering algorithm. In the described methodology, the distance between two DMAs is defined as the sum of channel-specific distances, such as Distanceij. Each channel-specific distance measures the similarity in the cross-channel correlation between the two geographical locations. Clustering at block 1006 additionally includes applying a desired cutoff distance to realize a desired correlation, thus limiting the total number of clusters. For example, a cutoff of 1.5 corresponds to correlation of at least 0.33.
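Equation 4 itself is defined elsewhere in the specification; one plausible reading of the metric, which sums over channel pairs how differently the channels correlate within the two DMAs, can be sketched as follows. The function name and the exact form of each channel-specific term are assumptions for illustration.

```python
import itertools
import numpy as np

def dma_distance(X_i, X_j):
    """Distance_ij between two DMAs as a sum of channel-specific distances.

    Each term compares how a pair of channels correlates within DMA i
    versus within DMA j (an assumed form; the controlling definition is
    Equation 4 of the specification).

    X_i, X_j: arrays of shape (channels, time) holding the residual
    impression series for each channel.
    """
    C_i = np.corrcoef(X_i)  # channel-by-channel correlation, DMA i
    C_j = np.corrcoef(X_j)  # channel-by-channel correlation, DMA j
    K = C_i.shape[0]
    return sum(abs(C_i[k, l] - C_j[k, l])
               for k, l in itertools.combinations(range(K), 2))
```

Under this reading, two DMAs whose channels co-move in the same way have distance near zero, so the cutoff applied at block 1006 directly limits how dissimilar the cross-channel correlation structure within a cluster can be.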
Once the data is hierarchically clustered, the computing system analyzes the data at block 1008 based on the resulting clusters now having reduced multicollinearity. That is, each cluster now has DMAs that have reduced multicollinearity among the DMAs. Consider a scenario where a company advertises across multiple channels and wants to understand the impact of each channel on sales. If expenditures on these channels are highly correlated (e.g., increased spending on social media ads correlates with increased spending on online banners), traditional regression analysis might suffer from multicollinearity. By clustering geographic regions into groups that exhibit similar spending patterns across these channels, the company can analyze each cluster separately. This approach not only reduces multicollinearity but also allows for marketing strategies tailored to the characteristics of each cluster.
The analysis at block 1008 incorporates the use of MMM models, such as Equations 1 and 2. For example, DMAs and related data for a given cluster or set of clusters can be analyzed by calculating yg,t = μt + seasonalityg,t + αZg,t + Σk=1K βk AdStockk(xk,g,t) + ϵg,t, as well as AdStockk,g, and the resultant calculations can then be used to create certain visualizations, for example, of impression impact and carryover. That is, individual clusters are analyzed in isolation via MMM techniques, thus providing for an analysis that has reduced multicollinearity. While only certain MMM techniques are described herein as non-limiting examples, such as via Equations 1 and 2, it is to be understood that any MMM technique can be applied to each individual cluster.
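The AdStock term captures the delayed realization of impact via the carryover parameter θk. A common geometric-decay formulation, assumed here for illustration (the specification's own AdStock definition appears in Equation 2), can be sketched as:

```python
def adstock(impressions, theta):
    """Geometric-decay adstock: today's effective exposure equals
    today's impressions plus theta times yesterday's adstock,
    with carryover parameter 0 <= theta < 1."""
    out = []
    carry = 0.0
    for x in impressions:
        carry = x + theta * carry
        out.append(carry)
    return out
```

For example, a single burst of 100 impressions with theta = 0.5 decays to 50 and then 25 effective impressions over the next two periods, which is the carryover behavior the visualizations at block 1008 summarize.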
In certain examples, the computing system additionally generates visualizations helpful in determining, for example, spending effects on each individual cluster. For example, percent change relative to DMA-level graphs, such as the graph 704, are created and presented to the user. Likewise, impact parameter graphs and carryover parameter graphs, such as impact graphs 902, 904 and carryover parameter graphs 906, 908, are created and presented. It is to be noted that any number of other graph types can be created by analyzing each cluster, including radar graphs to visualize spend, pie charts to look at individual channel carryover effects, bar charts to describe impression impacts, and so on. By reducing multicollinearity, the techniques described herein result in an improved analysis, including geographic analysis, such as via MMM.
The machine 1100 may include processors 1104, memory 1106, and input/output I/O components 1108, which may be configured to communicate with each other via a bus 1110. In an example, the processors 1104 (e.g., a Central Processing Unit (CPU), a Reduced Instruction Set Computing (RISC) Processor, a Complex Instruction Set Computing (CISC) Processor, a Graphics Processing Unit (GPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Radio-Frequency Integrated Circuit (RFIC), another processor, or any suitable combination thereof) may include, for example, a processor 1112 and a processor 1114 that execute the instructions 1102. The term “processor” is intended to include multi-core processors that may comprise two or more independent processors (sometimes referred to as “cores”) that may execute instructions contemporaneously. Although
The memory 1106 includes a main memory 1116, a static memory 1118, and a storage unit 1120, each accessible to the processors 1104 via the bus 1110. The main memory 1116, the static memory 1118, and the storage unit 1120 store the instructions 1102 embodying any one or more of the methodologies or functions described herein. The instructions 1102 may also reside, completely or partially, within the main memory 1116, within the static memory 1118, within machine-readable medium 1122 within the storage unit 1120, within at least one of the processors 1104 (e.g., within the processor's cache memory), or any suitable combination thereof, during execution thereof by the machine 1100.
The I/O components 1108 may include a wide variety of components to receive input, provide output, produce output, transmit information, exchange information, capture measurements, and so on. The specific I/O components 1108 that are included in a particular machine will depend on the type of machine. For example, portable machines such as mobile phones may include a touch input device or other such input mechanisms, while a headless server machine will likely not include such a touch input device. It will be appreciated that the I/O components 1108 may include many other components that are not shown in
In further examples, the I/O components 1108 may include biometric components 1128, motion components 1130, environmental components 1132, or position components 1134, among a wide array of other components. For example, the biometric components 1128 include components to detect expressions (e.g., hand expressions, facial expressions, vocal expressions, body gestures, or eye-tracking), measure biosignals (e.g., blood pressure, heart rate, body temperature, perspiration, or brain waves), identify a person (e.g., voice identification, retinal identification, facial identification, fingerprint identification, or electroencephalogram-based identification), and the like. The biometric components may include a brain-machine interface (BMI) system that allows communication between the brain and an external device or machine. This may be achieved by recording brain activity data, translating this data into a format that can be understood by a computer, and then using the resulting signals to control the device or machine.
Example types of BMI technologies include:
Any biometric data collected by the biometric components is captured and stored only with user approval and deleted on user request. Further, such biometric data may be used for very limited purposes, such as identification verification. To ensure limited and authorized use of biometric information and other personally identifiable information (PII), access to this data is restricted to authorized personnel only, if at all. Any use of biometric data may strictly be limited to identification verification purposes, and the data is not shared or sold to any third party without the explicit consent of the user. In addition, appropriate technical and organizational measures are implemented to ensure the security and confidentiality of this sensitive information.
The motion components 1130 include acceleration sensor components (e.g., accelerometer), gravitation sensor components, and rotation sensor components (e.g., gyroscope).
The environmental components 1132 include, for example, one or more cameras (with still image/photograph and video capabilities), illumination sensor components (e.g., photometer), temperature sensor components (e.g., one or more thermometers that detect ambient temperature), humidity sensor components, pressure sensor components (e.g., barometer), acoustic sensor components (e.g., one or more microphones that detect background noise), proximity sensor components (e.g., infrared sensors that detect nearby objects), gas sensors (e.g., gas detection sensors to detect concentrations of hazardous gases for safety or to measure pollutants in the atmosphere), or other components that may provide indications, measurements, or signals corresponding to a surrounding physical environment.
With respect to cameras, the user system 102 may have a camera system comprising, for example, front cameras on a front surface of the user system 102 and rear cameras on a rear surface of the user system 102. The front cameras may, for example, be used to capture still images and video of a user of the user system 102 (e.g., “selfies”), which may then be augmented with augmentation data (e.g., filters) described above. The rear cameras may, for example, be used to capture still images and videos in a more traditional camera mode, with these images similarly being augmented with augmentation data. In addition to front and rear cameras, the user system 102 may also include a 360° camera for capturing 360° photographs and videos.
Further, the camera system of the user system 102 may include dual rear cameras (e.g., a primary camera as well as a depth-sensing camera), or even triple, quad, or penta rear camera configurations on the front and rear sides of the user system 102. These multiple camera systems may include a wide camera, an ultra-wide camera, a telephoto camera, a macro camera, and a depth sensor, for example.
The position components 1134 include location sensor components (e.g., a GPS receiver component), altitude sensor components (e.g., altimeters or barometers that detect air pressure from which altitude may be derived), orientation sensor components (e.g., magnetometers), and the like.
Communication may be implemented using a wide variety of technologies. The I/O components 1108 further include communication components 1136 operable to couple the machine 1100 to a network 1138 or devices 1140 via respective coupling or connections. For example, the communication components 1136 may include a network interface component or another suitable device to interface with the network 1138. In further examples, the communication components 1136 may include wired communication components, wireless communication components, cellular communication components, Near Field Communication (NFC) components, Bluetooth® components (e.g., Bluetooth® Low Energy), Wi-Fi® components, and other communication components to provide communication via other modalities. The devices 1140 may be another machine or any of a wide variety of peripheral devices (e.g., a peripheral device coupled via a USB).
Moreover, the communication components 1136 may detect identifiers or include components operable to detect identifiers. For example, the communication components 1136 may include Radio Frequency Identification (RFID) tag reader components, NFC smart tag detection components, optical reader components (e.g., an optical sensor to detect one-dimensional bar codes such as Universal Product Code (UPC) bar code, multi-dimensional bar codes such as Quick Response (QR) code, Aztec code, Data Matrix, Dataglyph™, MaxiCode, PDF417, Ultra Code, UCC RSS-2D bar code, and other optical codes), or acoustic detection components (e.g., microphones to identify tagged audio signals). In addition, a variety of information may be derived via the communication components 1136, such as location via Internet Protocol (IP) geolocation, location via Wi-Fi® signal triangulation, location via detecting an NFC beacon signal that may indicate a particular location, and so forth.
The various memories (e.g., main memory 1116, static memory 1118, and memory of the processors 1104) and storage unit 1120 may store one or more sets of instructions and data structures (e.g., software) embodying or used by any one or more of the methodologies or functions described herein. These instructions (e.g., the instructions 1102), when executed by processors 1104, cause various operations to implement the disclosed examples.
The instructions 1102 may be transmitted or received over the network 1138, using a transmission medium, via a network interface device (e.g., a network interface component included in the communication components 1136) and using any one of several well-known transfer protocols (e.g., hypertext transfer protocol (HTTP)). Similarly, the instructions 1102 may be transmitted or received using a transmission medium via a coupling (e.g., a peer-to-peer coupling) to the devices 1140.
The operating system 1212 manages hardware resources and provides common services. The operating system 1212 includes, for example, a kernel 1224, services 1226, and drivers 1228. The kernel 1224 acts as an abstraction layer between the hardware and the other software layers. For example, the kernel 1224 provides memory management, processor management (e.g., scheduling), component management, networking, and security settings, among other functionalities. The services 1226 can provide other common services for the other software layers. The drivers 1228 are responsible for controlling or interfacing with the underlying hardware. For instance, the drivers 1228 can include display drivers, camera drivers, BLUETOOTH® or BLUETOOTH® Low Energy drivers, flash memory drivers, serial communication drivers (e.g., USB drivers), WI-FI® drivers, audio drivers, power management drivers, and so forth.
The libraries 1214 provide a common low-level infrastructure used by the applications 1218. The libraries 1214 can include system libraries 1230 (e.g., C standard library) that provide functions such as memory allocation functions, string manipulation functions, mathematic functions, and the like. In addition, the libraries 1214 can include API libraries 1232 such as media libraries (e.g., libraries to support presentation and manipulation of various media formats such as Moving Picture Experts Group-4 (MPEG4), Advanced Video Coding (H.264 or AVC), Moving Picture Experts Group Layer-3 (MP3), Advanced Audio Coding (AAC), Adaptive Multi-Rate (AMR) audio codec, Joint Photographic Experts Group (JPEG or JPG), or Portable Network Graphics (PNG)), graphics libraries (e.g., an OpenGL framework used to render graphic content in two dimensions (2D) and three dimensions (3D) on a display), database libraries (e.g., SQLite to provide various relational database functions), web libraries (e.g., WebKit to provide web browsing functionality), and the like. The libraries 1214 can also include a wide variety of other libraries 1234 to provide many other APIs to the applications 1218.
The frameworks 1216 provide a common high-level infrastructure that is used by the applications 1218. For example, the frameworks 1216 provide various graphical user interface (GUI) functions, high-level resource management, and high-level location services. The frameworks 1216 can provide a broad spectrum of other APIs that can be used by the applications 1218, some of which may be specific to a particular operating system or platform.
In an example, the applications 1218 may include a home application 1236, a contacts application 1238, a browser application 1240, a book reader application 1242, a location application 1244, a media application 1246, a messaging application 1248, a game application 1250, and a broad assortment of other applications such as a third-party application 1252. The applications 1218 are programs that execute functions defined in the programs. Various programming languages can be employed to create one or more of the applications 1218, structured in a variety of manners, such as object-oriented programming languages (e.g., Objective-C, Java, or C++) or procedural programming languages (e.g., C or assembly language). In a specific example, the third-party application 1252 (e.g., an application developed using the ANDROID™ or IOS™ software development kit (SDK) by an entity other than the vendor of the particular platform) may be mobile software running on a mobile operating system such as IOS™, ANDROID™, WINDOWS® Phone, or another mobile operating system. In this example, the third-party application 1252 can invoke the API calls 1220 provided by the operating system 1212 to facilitate functionalities described herein.
This application claims the benefit of priority to U.S. Provisional Patent Application Ser. No. 63/528,349, filed Jul. 21, 2023, which is incorporated by reference herein in its entirety.
Number | Date | Country
---|---|---
63528349 | Jul 2023 | US