The invention disclosed herein relates generally to marketing systems, and more particularly to a method and system for modeling consumers' activity areas based on social media and mobile activity data to enable a business to better communicate with potential customers.
In today's highly competitive business world, advertising to customers, both potential and previous, is a necessity. Businesses are always looking for ways to increase revenue, and increasing sales to customers through advertising plays a large part in many businesses' plans for growth. Advertising has been shown to be an effective method to inform, persuade or remind target buyers of a business's goods, services or goodwill, with the ultimate goal being that an advertisement will result in the sale of the goods or services.
Due to the costs associated with marketing campaigns, it is not possible for a business to send advertising material to an unlimited number of potential customers. It would be beneficial for a business to target its advertising to those people that may actually be potential customers, and to provide those potential customers with advertisements that are relevant and timely.
The present invention provides a system and method for modeling consumer activity based on social media and mobile activity data. The location of a consumer and time at which the consumer is present at such location can be obtained using social media and mobile activity data. This data can be analyzed to identify areas frequented by a consumer, and those areas can be classified into different groups based on, for example, the time of the day, the day of week, and other information that can be obtained from the social media and mobile activity data. The resulting analysis can provide an estimated home area and work area, and other insights into the consumer (e.g., where the consumer travels). This consumer activity can be used by a business to provide relevant, targeted advertising to potential customers based on the models obtained from the social media and mobile activity data. The present invention utilizes social media and mobile data to help a business to augment its customer profiling capabilities and provide a new source of customer insight.
The accompanying drawings illustrate presently preferred embodiments of the invention, and together with the general description given above and the detailed description given below, serve to explain the principles of the invention. As shown throughout the drawings, like reference numerals designate like or corresponding parts.
In describing the present invention, reference is made to the drawings, wherein there is seen in
The data points include a time that is preferably in the form of a date and time of day, as well as a location. The location can be provided as longitude and latitude coordinates, or also provided as clear text, e.g., an address such as 27 Waterview Drive, Shelton, Conn. In the latter situation, the address will be geocoded into a longitude/latitude coordinate. Such data can be stored in the database 14 or a memory device (not shown) within computer system 10. An example of a simplified location history is illustrated in Table 1.
In step 52, computer system 10 processes the location history input data to spatially cluster the data points. Clustering, as is well known, is the process of assigning a set of objects into groups (called clusters) so that objects in the same cluster are similar (in some sense or another) to each other, and objects in different clusters are different from each other. There are multiple processes that can be used to spatially cluster a set of location points. The selection of an appropriate clustering process and parameter settings (including values such as the similarity measurement to use, process stop criteria, and the number of expected clusters) typically depends on the individual data set and intended use of the results. In one embodiment of the present invention, a modified density-based clustering process can be used. Such a modified process includes the steps of calculating the heat value of each data point by (i) counting the number of neighbor data points within a predetermined radius r (e.g., 20 km, a relatively long distance for human mobility) of the current data point, and (ii) calculating the heat value y for the current data point using a kernel function. There are multiple types of kernel functions that can be used. For the sake of speed, a uniform kernel of y = Σ_i 1 was selected, where i ranges over the neighbor points within radius r (in effect, counting how many neighbor points lie within radius r). Once the heat value y has been calculated for each data point, those data points that have a heat value greater than a predetermined threshold (e.g., 10) can be identified as a Potential Cluster Center (PCC). This threshold is set according to the expected density in the cluster. For instance, if r = 20 km in the above equation, a threshold of 10 can be reached if there are 10 neighbors within the 20 km radius.
In the clustering stage, first, one cluster is created for each PCC, and all of its neighbor points are assigned to the cluster. Second, if a PCC is within the radius r of another PCC, then the two clusters to which they belong are merged. The PCCs of the previous clusters then become the PCCs of the merged cluster. The clustering process stops when no more clusters can be merged. It should be understood that the clustering process used is not limited to the above described process, and any clustering process can be utilized as part of the present invention.
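By way of illustration only, the heat-value and merging steps described above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names, the use of the haversine great-circle distance as the distance measure, and the union-find bookkeeping used to merge clusters are all assumptions that do not appear in the description. A data point that neighbors two unmerged PCCs is placed in both clusters here, since the description does not specify otherwise.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees

def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance in km (an assumed distance measure)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def heat_values(points: List[Point], r_km: float = 20.0) -> List[int]:
    """Uniform kernel: heat y is the count of other points within r_km."""
    return [sum(1 for j, q in enumerate(points)
                if j != i and haversine_km(p, q) <= r_km)
            for i, p in enumerate(points)]

def cluster(points: List[Point], r_km: float = 20.0, threshold: int = 10):
    """Return (clusters as lists of point indices, heat values)."""
    heat = heat_values(points, r_km)
    # Points whose heat exceeds the threshold are Potential Cluster Centers.
    pccs = [i for i, y in enumerate(heat) if y > threshold]

    # Union-find over PCCs: PCCs within r_km of each other share a cluster.
    parent = {i: i for i in pccs}

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for k, a in enumerate(pccs):
        for b in pccs[k + 1:]:
            if haversine_km(points[a], points[b]) <= r_km:
                parent[find(a)] = find(b)

    # Each merged cluster holds its PCCs and all of their neighbor points.
    clusters = {}
    for c in pccs:
        members = clusters.setdefault(find(c), set())
        members.update(j for j, q in enumerate(points)
                       if haversine_km(points[c], q) <= r_km)
    return [sorted(m) for m in clusters.values()], heat

# Hypothetical sample: three points near New York, three near Toronto,
# and one isolated point that never becomes part of a cluster.
sample = [(40.71, -74.00), (40.72, -74.01), (40.73, -73.99),
          (43.65, -79.38), (43.66, -79.39), (43.67, -79.40),
          (0.0, 0.0)]
clusters, heat = cluster(sample, r_km=20.0, threshold=1)
```

With the hypothetical sample above and a deliberately low threshold of 1, the two groups of nearby points fall into separate clusters and the isolated point is left unclustered.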
Once the clustering process has been performed, the result will provide several location clusters of a person. Location points close to each other (e.g., within one residential area, along one street, etc.) are usually clustered together while remote locations (New York vs. Toronto) are usually separated into different clusters.
Using the clustering results from step 52, in step 54 the computer system 10 next identifies a “home” area and a “travel” area by time filtering. The processing performed in step 54 will identify a person's “home” clusters, defined as where one regularly lives and works, from these several clusters. To do this, the longevity and frequency of the person's location history data points (also referred to as check-in records) are used as key filters. Thus, within each cluster, the longevity of a cluster and frequency of a cluster are determined by the computer system 10 using the following equations:
Using the results from these calculations, a cluster is defined as a “home” area cluster if the following is true: (i) the longevity of the cluster is greater than or equal to one-half of the longevity of the total check-in history, i.e., if a person has used check-in for 200 days, the time of his/her most recent check-in in the cluster minus the time of his/her earliest check-in in the cluster should be greater than or equal to 100 days, and (ii) the frequency of check-ins in this cluster is greater than the average frequency of the total check-in history multiplied by an adjustment factor (to ensure that this cluster is a cluster in which the user regularly checked in). An exemplary adjustment factor that generates satisfactory results is 0.7. Any cluster that meets the requirements of equations (1) and (2) above is identified as a home area cluster, while those clusters that do not are identified as travel area clusters. Once a home area and travel areas have been identified for a consumer, this information can be used by marketers in various ways. For example, a travel company can identify where a person lives and where he/she prefers to go for vacation from these results. A local grocery store may only want to market to persons who are local.
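The two conditions above might be applied as in the following Python sketch. It is illustrative only and rests on assumptions: longevity is taken as the time between the earliest and most recent check-in, expressed in days, and frequency is assumed to be the number of check-ins divided by that longevity. The function names and sample data are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List

def longevity_days(times: List[datetime]) -> float:
    """Days between the earliest and the most recent check-in."""
    return (max(times) - min(times)).total_seconds() / 86400.0

def frequency(times: List[datetime]) -> float:
    """ASSUMED definition: check-ins per day over the longevity span."""
    span = longevity_days(times)
    return len(times) / span if span > 0 else 0.0

def label_clusters(clusters: Dict[str, List[datetime]],
                   all_times: List[datetime],
                   adjustment: float = 0.7) -> Dict[str, str]:
    """Label each cluster 'home' or 'travel' using conditions (i) and (ii)."""
    total_longevity = longevity_days(all_times)
    total_frequency = frequency(all_times)
    labels = {}
    for name, times in clusters.items():
        is_home = (longevity_days(times) >= 0.5 * total_longevity
                   and frequency(times) > adjustment * total_frequency)
        labels[name] = "home" if is_home else "travel"
    return labels

# Hypothetical history spanning 200 days.
base = datetime(2012, 1, 1)
cluster_a = [base + timedelta(days=10 * k) for k in range(21)]  # days 0..200
cluster_b = [base + timedelta(days=50 + k) for k in range(5)]   # days 50..54
labels = label_clusters({"A": cluster_a, "B": cluster_b},
                        cluster_a + cluster_b)
```

In this hypothetical history, cluster A spans the full 200 days with regular check-ins and is labeled a home area cluster, while cluster B covers only a short visit and is labeled a travel area cluster.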
Once a home cluster has been identified, then in step 56 a convex hull polygon of all the points in this cluster is generated. The convex hull or convex envelope of a set X of points is the smallest convex set that contains X. An object is convex if for every pair of points within the object, every point on the straight line segment that joins them is also within the object. The resulting polygon is defined as the “home area” polygon.
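One way the convex hull polygon could be computed is Andrew's monotone chain algorithm, sketched below in Python. The description does not name a particular hull algorithm, so this choice is an assumption, as is the treatment of coordinates as planar (x, y) pairs, which is a reasonable approximation only for a cluster covering a small area.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # planar (x, y), e.g. (longitude, latitude)

def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points: List[Point]) -> List[Point]:
    """Hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop points that would make a right turn or lie on a straight line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain repeats the first point of the other.
    return lower[:-1] + upper[:-1]

# Hypothetical cluster: four corner points and one interior point.
hull = convex_hull([(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)])
```

For the hypothetical cluster above, the interior point is excluded and only the four corner points remain as vertices of the polygon.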
Next, the hottest point is defined as the “overall” activity centroid of this person. A time filtering process is then utilized to further define the clusters as a living place or working place. The cluster with the highest number of data points having a time (check-in) within regular working hours, which can be defined, for example, as being between 9 am and 5 pm on weekdays, can be labeled as the working cluster and its hottest point as the working centroid. Note that this point may not necessarily be the actual office location of the person; it could instead be a nearby coffee shop or lunch place. The cluster with the highest number of data points having a time (check-in) within non-working hours, which can be defined, for example, as being between 8 pm and 5 am on weekdays plus all weekends, can be labeled as the living cluster, which is usually where a person lives. Note that it is possible for the living cluster to be the same as the working cluster. Additionally, the overall activity centroid is usually either the centroid of the living cluster or the working cluster. The resulting determinations can then be output by the computer system 10 in the form of a printed or displayed report.
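The time filtering described above might be sketched in Python as follows, using the exemplary hour ranges given in the description. The function names, the representation of check-ins as (cluster identifier, timestamp) pairs, and the sample data are assumptions made for illustration. Ties between clusters are resolved in favor of the cluster counted first, a detail the description leaves open.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Optional, Tuple

def is_working_time(t: datetime) -> bool:
    """Weekdays (Monday to Friday), 9 am up to 5 pm."""
    return t.weekday() < 5 and 9 <= t.hour < 17

def is_non_working_time(t: datetime) -> bool:
    """All weekends, plus 8 pm to 5 am on weekdays."""
    if t.weekday() >= 5:
        return True
    return t.hour >= 20 or t.hour < 5

def label_living_and_working(
    checkins: Iterable[Tuple[str, datetime]]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (living cluster id, working cluster id), or None if no data."""
    work_counts: Counter = Counter()
    live_counts: Counter = Counter()
    for cluster_id, t in checkins:
        if is_working_time(t):
            work_counts[cluster_id] += 1
        elif is_non_working_time(t):
            live_counts[cluster_id] += 1
    working = work_counts.most_common(1)[0][0] if work_counts else None
    living = live_counts.most_common(1)[0][0] if live_counts else None
    return living, working

# Hypothetical check-ins; 2 January 2012 was a Monday.
checkins = [
    ("A", datetime(2012, 1, 2, 10, 0)),  # Monday morning
    ("A", datetime(2012, 1, 3, 14, 0)),  # Tuesday afternoon
    ("A", datetime(2012, 1, 4, 22, 0)),  # Wednesday night
    ("B", datetime(2012, 1, 2, 21, 0)),  # Monday night
    ("B", datetime(2012, 1, 7, 11, 0)),  # Saturday
]
living, working = label_living_and_working(checkins)
```

Check-ins that fall outside both time windows (for example, 6 pm on a weekday) are simply not counted toward either label in this sketch.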
From the processes above, it is possible to obtain one overall activity centroid of a person or multiple activity centroids with different context labels. These clusters and centroids can be used in various context-aware mobile marketing campaigns. For instance, a business can send home-related advertisements to a person at their home location, and send work-related advertisements to a person at their work location.
While preferred embodiments of the invention have been described and illustrated above, it should be understood that these are exemplary of the invention and are not to be considered as limiting. Additions, deletions, substitutions, and other modifications can be made without departing from the spirit or scope of the present invention. Accordingly, the invention is not to be considered as limited by the foregoing description but is only limited by the scope of the appended claims.