The present disclosure relates to augmenting a sparse user profile.
Many systems and services rely on user profiles of their users. However, oftentimes, users fail to adequately complete their user profiles, thereby resulting in incomplete or sparse user profiles. In order to increase the effectiveness of systems and services that rely on user profiles, there is a need for a system and method for augmenting sparse user profiles.
Systems and methods are provided for augmenting a user profile of a subject user. In general, the user profile of the subject user is augmented based on aggregate profile data for a group of users relevant to a current location of the subject user. In one embodiment, the group of users is a crowd of users currently located at a location that is relevant to the current location of the subject user. In another embodiment, the group of users is a number of users historically, or previously, located at locations relevant to the current location of the subject user.
In one embodiment, a profile augmentation function obtains an aggregate profile of a crowd of users currently located at or near the current location of the subject user. The aggregate profile of the crowd includes a number of keywords and a number of user matches, or occurrences, of each of the keywords in user profiles of the users in the crowd. The profile augmentation function then augments the user profile of the subject user based on the keywords and the number of user matches for the keywords in the aggregate profile of the crowd.
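As an illustration only, the following minimal sketch shows one way the keywords and user-match counts of a crowd's aggregate profile could drive augmentation. The function names, the flat keyword lists, and the `min_fraction` threshold are assumptions made for this example and are not specified by the disclosure.

```python
from collections import Counter


def aggregate_profile(crowd_profiles):
    """Count, for each keyword, how many users in the crowd list it."""
    counts = Counter()
    for keywords in crowd_profiles:
        counts.update(set(keywords))  # at most one match per user per keyword
    return counts


def augment_profile(user_keywords, aggregate, crowd_size, min_fraction=0.5):
    """Add crowd keywords matched by at least min_fraction of the crowd.

    The threshold rule is a hypothetical policy chosen for illustration.
    """
    augmented = set(user_keywords)
    for keyword, matches in aggregate.items():
        if crowd_size and matches / crowd_size >= min_fraction:
            augmented.add(keyword)
    return augmented


# Example crowd of four users near the subject user.
crowd = [
    ["jazz", "politics"],
    ["jazz", "books"],
    ["jazz", "politics", "photography"],
    ["technology"],
]
agg = aggregate_profile(crowd)
result = augment_profile(["books"], agg, len(crowd))
```

In this example, "jazz" (3 of 4 users) and "politics" (2 of 4 users) meet the assumed threshold and are added to the sparse profile, while keywords with a single match are not.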
In another embodiment, the profile augmentation function obtains a historical aggregate profile for users previously, or historically, located at or near the current location of the subject user. In one embodiment, the historical aggregate profile includes a number of keywords and a number of user matches, or occurrences, of each of the keywords in user profiles of the users historically located at or near the current location of the subject user. The profile augmentation function then augments the user profile of the subject user based on the keywords and the number of user matches for the keywords in the historical aggregate profile.
Those skilled in the art will appreciate the scope of the present invention and realize additional aspects thereof after reading the following detailed description in association with the accompanying drawings.
The accompanying drawings incorporated in and forming a part of this specification illustrate several aspects of the invention, and together with the description serve to explain the principles of the invention.
The embodiments set forth below represent the necessary information to enable those skilled in the art to practice the invention and illustrate the best mode of practicing the invention. Upon reading the following description in light of the accompanying drawings, those skilled in the art will understand the concepts of the invention and will recognize applications of these concepts not particularly addressed herein. It should be understood that these concepts and applications fall within the scope of the disclosure and the accompanying claims.
As discussed below in detail, the MAP server 12 operates to obtain current locations, including location updates, and user profiles of the users 20-1 through 20-N of the mobile devices 18-1 through 18-N. The current locations of the users 20-1 through 20-N can be expressed as positional geographic coordinates such as latitude-longitude pairs, and a height vector (if applicable), or any other similar information capable of identifying a given physical point in space in a two-dimensional or three-dimensional coordinate system. Using the current locations and user profiles of the users 20-1 through 20-N, the MAP server 12 is enabled to provide a number of features such as, but not limited to, maintaining a historical record of anonymized user profile data by location, generating aggregate profile data over time for a Point of Interest (POI) or Area of Interest (AOI) using the historical record of anonymized user profile data, identifying crowds of users using current locations and/or user profiles of the users 20-1 through 20-N, and generating aggregate profiles for crowds of users at a POI or in an AOI using the current user profiles of users in the crowds. While not essential, for additional information regarding the MAP server 12, the interested reader is directed to U.S. patent application Ser. No. 12/645,535 entitled MAINTAINING A HISTORICAL RECORD OF ANONYMIZED USER PROFILE DATA BY LOCATION FOR USERS IN A MOBILE ENVIRONMENT, U.S. patent application Ser. No. 12/645,532 entitled FORMING CROWDS AND PROVIDING ACCESS TO CROWD DATA IN A MOBILE ENVIRONMENT, U.S. patent application Ser. No. 12/645,539 entitled ANONYMOUS CROWD TRACKING, U.S. patent application Ser. No. 12/645,544 entitled MODIFYING A USER'S CONTRIBUTION TO AN AGGREGATE PROFILE BASED ON TIME BETWEEN LOCATION UPDATES AND EXTERNAL EVENTS, U.S. patent application Ser. No. 12/645,546 entitled CROWD FORMATION FOR MOBILE DEVICE USERS, U.S. patent application Ser. No. 12/645,556 entitled SERVING A REQUEST FOR DATA FROM A HISTORICAL RECORD OF ANONYMIZED USER PROFILE DATA IN A MOBILE ENVIRONMENT, and U.S. patent application Ser. No. 12/645,560 entitled HANDLING CROWD REQUESTS FOR LARGE GEOGRAPHIC AREAS, all of which were filed on Dec. 23, 2009 and are hereby incorporated herein by reference in their entireties. Note that while the MAP server 12 is illustrated as a single server for simplicity and ease of discussion, it should be appreciated that the MAP server 12 may be implemented as a single physical server or multiple physical servers operating in a collaborative manner for purposes of redundancy and/or load sharing.
In general, the one or more profile servers 14 operate to store user profiles for a number of persons including the users 20-1 through 20-N of the mobile devices 18-1 through 18-N. For example, the one or more profile servers 14 may be servers providing social network services such as the Facebook® social networking service, the MySpace® social networking service, the LinkedIN® social networking service, and/or the like. As discussed below, using the one or more profile servers 14, the MAP server 12 is enabled to directly or indirectly obtain the user profiles of the users 20-1 through 20-N of the mobile devices 18-1 through 18-N. The location server 16 generally operates to receive location updates from the mobile devices 18-1 through 18-N and make the location updates available to entities such as, for instance, the MAP server 12. In one exemplary embodiment, the location server 16 is a server operating to provide Yahoo!'s FireEagle service. Before proceeding, it should be noted that while the system 10 of
The mobile devices 18-1 through 18-N may be mobile smart phones, portable media player devices, mobile gaming devices, or the like. Some exemplary mobile devices that may be programmed or otherwise configured to operate as the mobile devices 18-1 through 18-N are the Apple® iPhone, the Palm Pre, the Samsung Rogue, the Blackberry Storm, and the Apple® iPod Touch® device. However, this list of exemplary mobile devices is not exhaustive and is not intended to limit the scope of the present disclosure.
The mobile devices 18-1 through 18-N include MAP clients 30-1 through 30-N, MAP applications 32-1 through 32-N, third-party applications 34-1 through 34-N, and location functions 36-1 through 36-N, respectively. Using the mobile device 18-1 as an example, the MAP client 30-1 is preferably implemented in software. In general, in the preferred embodiment, the MAP client 30-1 is a middleware layer operating to interface an application layer (i.e., the MAP application 32-1 and the third-party applications 34-1) to the MAP server 12. More specifically, the MAP client 30-1 enables the MAP application 32-1 and the third-party applications 34-1 to request and receive data from the MAP server 12. In addition, the MAP client 30-1 enables applications, such as the MAP application 32-1 and the third-party applications 34-1, to access data from the MAP server 12.
The MAP application 32-1 is also preferably implemented in software. The MAP application 32-1 generally provides a user interface component between the user 20-1 and the MAP server 12. More specifically, among other things, the MAP application 32-1 enables the user 20-1 to initiate historical requests for historical data (e.g., historical aggregate profile data) or crowd requests for crowd data (e.g., aggregate profile data and/or crowd characteristics data) from the MAP server 12 for a POI or AOI. The MAP application 32-1 also enables the user 20-1 to configure various settings. For example, the MAP application 32-1 may enable the user 20-1 to select a desired social networking service (e.g., Facebook, MySpace, LinkedIN, etc.) from which to obtain the user profile of the user 20-1 and provide any necessary credentials (e.g., username and password) needed to access the user profile from the social networking service.
The third-party applications 34-1 are preferably implemented in software. The third-party applications 34-1 operate to access the MAP server 12 via the MAP client 30-1. The third-party applications 34-1 may utilize data obtained from the MAP server 12 in any desired manner. As an example, one of the third-party applications 34-1 may be a gaming application that utilizes historical aggregate profile data to notify the user 20-1 of POIs or AOIs where persons having an interest in the game have historically congregated.
The location function 36-1 may be implemented in hardware, software, or a combination thereof. In general, the location function 36-1 operates to determine or otherwise obtain the location of the mobile device 18-1. For example, the location function 36-1 may be or include a Global Positioning System (GPS) receiver.
The subscriber device 22 is a physical device such as a personal computer, a mobile computer (e.g., a notebook computer, a netbook computer, a tablet computer, etc.), a mobile smart phone, or the like. The subscriber 24 associated with the subscriber device 22 is a person or entity. In general, the subscriber device 22 enables the subscriber 24 to access the MAP server 12 via a web browser 38 to obtain various types of data, preferably for a fee. For example, the subscriber 24 may pay a fee to have access to historical aggregate profile data for one or more POIs and/or one or more AOIs, pay a fee to have access to crowd data such as aggregate profiles for crowds located at one or more POIs and/or located in one or more AOIs, pay a fee to track crowds, or the like. Note that the web browser 38 is exemplary. In another embodiment, the subscriber device 22 is enabled to access the MAP server 12 via a custom application.
The third-party server 26 is a physical server that has access to data from the MAP server 12 such as historical aggregate profile data for one or more POIs or one or more AOIs or crowd data such as aggregate profiles for one or more crowds at one or more POIs or within one or more AOIs. Based on the data from the MAP server 12, the third-party server 26 operates to provide a service such as, for example, targeted advertising. For example, the third-party server 26 may obtain anonymous aggregate profile data for one or more crowds located at a POI and then provide targeted advertising to known users located at the POI based on the anonymous aggregate profile data. Note that while targeted advertising is mentioned as an exemplary service provided by the third-party server 26, other types of services may additionally or alternatively be provided. Other types of services that may be provided by the third-party server 26 will be apparent to one of ordinary skill in the art upon reading this disclosure.
Lastly, in this embodiment, the MAP server 12 includes a profile augmentation function 40. The profile augmentation function 40 is preferably implemented in software, but is not limited thereto. As discussed below in detail, the profile augmentation function 40 operates to augment user profiles of users, such as but not limited to the users 20-1 through 20-N, based on aggregate profile data for crowds of users and/or historical aggregate profile data. Using the user 20-1 as an example, the profile augmentation function 40 operates to augment the user profile of the user 20-1 based on aggregate profiles of crowds of users in which the user 20-1 is included, aggregate profiles of crowds of users that are near the user 20-1, and/or historical aggregate profile data for locations visited by the user 20-1.
As discussed below in detail, in this embodiment, the profile augmentation function 40 operates to augment user profiles of system users (i.e., one or more of the users 20-1 through 20-N), third-party users (e.g., one or more users associated with the profile server 14 other than the users 20-1 through 20-N, the subscriber 24, and/or one or more users associated with the third-party server 26), or a combination thereof. More specifically, for a particular user, the profile augmentation function 40 augments the user profile of the user based on aggregate profile data for crowds located at or near the current location of the user and/or historical aggregate profile data for the current location of the user.
Before describing the operation of the profile augmentation function 40 in detail,
The business logic layer 44 includes a profile manager 54, a location manager 56, a history manager 58, a crowd analyzer 60, and an aggregation engine 62, each of which is preferably implemented in software. In addition, in the embodiment of
The history manager 58 generally operates to maintain a historical record of anonymized user profile data by location. The crowd analyzer 60 operates to form crowds of users. In one embodiment, the crowd analyzer 60 utilizes a spatial crowd formation algorithm. However, the present disclosure is not limited thereto. In addition, the crowd analyzer 60 may further characterize crowds to reflect degree of fragmentation, best-case and worst-case degree of separation (DOS), and/or degree of bi-directionality of relationships. Still further, the crowd analyzer 60 may also operate to track crowds. The aggregation engine 62 generally operates to provide aggregate profile data in response to requests from the mobile devices 18-1 through 18-N, the subscriber device 22, the profile server 14, and/or the third-party server 26. The aggregate profile data may be historical aggregate profile data for one or more geographic locations (e.g., one or more POIs) or one or more geographic areas (e.g., one or more AOIs) or aggregate profile data for crowd(s) currently at one or more geographic locations or in one or more geographic areas.
The persistence layer 46 includes an object mapping layer 64 and a datastore 66. The object mapping layer 64 is preferably implemented in software. The datastore 66 is preferably a relational database, which is implemented in a combination of hardware (i.e., physical data storage hardware) and software (i.e., relational database software). In this embodiment, the business logic layer 44 is implemented in an object-oriented programming language such as, for example, Java. As such, the object mapping layer 64 operates to map objects used in the business logic layer 44 to relational database entities stored in the datastore 66. Note that, in one embodiment, data is stored in the datastore 66 in a Resource Description Framework (RDF) compatible format.
In an alternative embodiment, rather than being a relational database, the datastore 66 may be implemented as an RDF datastore. More specifically, the RDF datastore may be compatible with RDF technology adopted by Semantic Web activities. Namely, the RDF datastore may use the Friend-Of-A-Friend (FOAF) vocabulary for describing people, their social networks, and their interests. In this embodiment, the MAP server 12 may be designed to accept raw FOAF files describing persons, their friends, and their interests. These FOAF files are currently output by some social networking services such as Livejournal and Facebook. The MAP server 12 may then persist RDF descriptions of the users 20-1 through 20-N as a proprietary extension of the FOAF vocabulary that includes additional properties desired for the system 10.
At some point after authentication is complete, a user profile process is performed such that a user profile of the user 20-1 is obtained from the profile server 14 and delivered to the MAP server 12 (step 1002). In this embodiment, the MAP client 30-1 of the mobile device 18-1 sends a profile request to the profile server 14 (step 1002A). In response, the profile server 14 returns the user profile of the user 20-1 to the mobile device 18-1 (step 1002B). The MAP client 30-1 of the mobile device 18-1 then sends the user profile of the user 20-1 to the MAP server 12 (step 1002C). Note that while in this embodiment the MAP client 30-1 sends the complete user profile of the user 20-1 to the MAP server 12, in an alternative embodiment, the MAP client 30-1 may filter the user profile of the user 20-1 according to criteria specified by the user 20-1. For example, the user profile of the user 20-1 may include demographic information, general interests, music interests, and movie interests, and the user 20-1 may specify that the demographic information or some subset thereof is to be filtered, or removed, before sending the user profile to the MAP server 12.
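The filtering alternative described above can be sketched as follows. This is a hedged illustration, not the MAP client's actual logic: the dictionary layout, the category names, and the `filter_profile` function are hypothetical.

```python
def filter_profile(profile, excluded_categories=(), excluded_keywords=None):
    """Return a copy of the profile with user-specified content removed.

    profile maps a category name to a list of keywords (assumed layout).
    excluded_categories drops whole categories; excluded_keywords maps a
    category to the subset of its keywords that should be removed.
    """
    excluded_keywords = excluded_keywords or {}
    filtered = {}
    for category, keywords in profile.items():
        if category in excluded_categories:
            continue  # the whole category is withheld from the server
        blocked = set(excluded_keywords.get(category, ()))
        filtered[category] = [k for k in keywords if k not in blocked]
    return filtered


profile = {
    "demographic": ["35-44", "College Graduate"],
    "general_interests": ["politics", "technology"],
    "music_interests": ["jazz"],
    "movie_interests": ["drama"],
}

# Remove all demographic information before sending.
without_demographics = filter_profile(profile, excluded_categories={"demographic"})

# Remove only a subset of the demographic information.
partial = filter_profile(profile, excluded_keywords={"demographic": ["35-44"]})
```

The original profile is left unmodified, so the client could keep the complete profile locally while sending only the filtered copy.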
Upon receiving the user profile of the user 20-1 from the MAP client 30-1 of the mobile device 18-1, the profile manager 54 of the MAP server 12 processes the user profile (step 1002D). More specifically, in the preferred embodiment, the profile manager 54 includes social network handlers for the social network services supported by the MAP server 12. Thus, for example, if the MAP server 12 supports user profiles from Facebook®, MySpace®, and LinkedIN®, the profile manager 54 may include a Facebook handler, a MySpace handler, and a LinkedIN handler. The social network handlers process user profiles to generate user profiles for the MAP server 12 that include lists of keywords for each of a number of profile categories. The profile categories may be the same for each of the social network handlers or different for each of the social network handlers. Thus, for this example, assume that the user profile of the user 20-1 is from Facebook. The profile manager 54 uses a Facebook handler to process the user profile of the user 20-1 to map the user profile of the user 20-1 from Facebook to a user profile for the MAP server 12 including lists of keywords for a number of predefined profile categories. For example, for the Facebook handler, the profile categories may be a demographic profile category, a social interaction profile category, a general interests profile category, a music interests profile category, and a movie interests profile category. As such, the user profile of the user 20-1 from Facebook may be processed by the Facebook handler of the profile manager 54 to create a list of keywords such as, for example, liberal, High School Graduate, 35-44, College Graduate, etc. for the demographic profile category, a list of keywords such as Seeking Friendship for the social interaction profile category, a list of keywords such as politics, technology, photography, books, etc. for the general interests profile category, a list of keywords including music genres, artist names, album names, or the like for the music interests profile category, and a list of keywords including movie titles, actor or actress names, director names, movie genres, or the like for the movie interests profile category. In one embodiment, the profile manager 54 may use natural language processing or semantic analysis. For example, if the Facebook user profile of the user 20-1 states that the user 20-1 is 20 years old, semantic analysis may result in the keyword of 18-24 years old being stored in the user profile of the user 20-1 for the MAP server 12.
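A minimal sketch of a handler that maps a raw social network profile to per-category keyword lists, including the age-to-range step from the example above, might look like the following. Only the 18-24 and 35-44 ranges appear in this description; the remaining ranges, the raw field names, and the function names are assumptions for illustration, and a simple range lookup stands in for natural language processing or semantic analysis.

```python
# Age ranges used as demographic keywords. Only 18-24 and 35-44 come from
# the description; the other ranges are assumed for this sketch.
AGE_BUCKETS = [(18, 24), (25, 34), (35, 44), (45, 54), (55, 64)]


def age_keyword(age):
    """Map an exact age to an age-range keyword, or None if no range fits."""
    if age is None:
        return None
    for low, high in AGE_BUCKETS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None


def handle_profile(raw_profile):
    """Map a raw profile (hypothetical field names) to category keyword lists."""
    demographic = list(raw_profile.get("education", []))
    bucket = age_keyword(raw_profile.get("age"))
    if bucket is not None:
        demographic.insert(0, bucket)
    return {
        "demographic": demographic,
        "general_interests": list(raw_profile.get("interests", [])),
        "music_interests": list(raw_profile.get("music", [])),
        "movie_interests": list(raw_profile.get("movies", [])),
    }


map_profile = handle_profile(
    {"age": 20, "education": ["College Graduate"], "interests": ["photography"]}
)
```

Storing a range rather than the exact age keeps the keyword comparable across users, which matters when keywords are later counted across a crowd.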
After processing the user profile of the user 20-1, the profile manager 54 of the MAP server 12 stores the resulting user profile for the user 20-1 (step 1002E). More specifically, in one embodiment, the MAP server 12 stores user records for the users 20-1 through 20-N in the datastore 66 (
Note that while the discussion herein focuses on an embodiment where the user profiles of the users 20-1 through 20-N are obtained from the one or more profile servers 14, the user profiles of the users 20-1 through 20-N may be obtained in any desired manner. For example, in one alternative embodiment, the user 20-1 may identify one or more favorite websites. The profile manager 54 of the MAP server 12 may then crawl the one or more favorite websites of the user 20-1 to obtain keywords appearing in the one or more favorite websites of the user 20-1. These keywords may then be stored as the user profile of the user 20-1.
At some point, a process is performed such that a current location of the mobile device 18-1 and thus a current location of the user 20-1 is obtained by the MAP server 12 (step 1004). In this embodiment, the MAP application 32-1 of the mobile device 18-1 obtains the current location of the mobile device 18-1 from the location function 36-1 of the mobile device 18-1. The MAP application 32-1 then provides the current location of the mobile device 18-1 to the MAP client 30-1, and the MAP client 30-1 then provides the current location of the mobile device 18-1 to the MAP server 12 (step 1004A). Note that step 1004A may be repeated periodically or in response to a change in the current location of the mobile device 18-1 in order for the MAP application 32-1 to provide location updates for the user 20-1 to the MAP server 12.
In response to receiving the current location of the mobile device 18-1, the location manager 56 of the MAP server 12 stores the current location of the mobile device 18-1 as the current location of the user 20-1 (step 1004B). More specifically, in one embodiment, the current location of the user 20-1 is stored in the user record of the user 20-1 maintained in the datastore 66 of the MAP server 12. Note that in the preferred embodiment only the current location of the user 20-1 is stored in the user record of the user 20-1. In this manner, the MAP server 12 maintains privacy for the user 20-1 since the MAP server 12 does not maintain a historical record of the location of the user 20-1. As discussed below in detail, historical data maintained by the MAP server 12 is anonymized in order to maintain the privacy of the users 20-1 through 20-N.
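The privacy property described above, where a user record holds only the current location, can be illustrated with the following sketch. The `UserRecord` and `LocationManager` classes and their fields are hypothetical stand-ins and do not reflect the actual schema of the datastore 66.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class UserRecord:
    """Hypothetical user record: a profile plus a single current location."""
    user_id: str
    profile: dict = field(default_factory=dict)
    current_location: Optional[Tuple[float, float]] = None


class LocationManager:
    """Keeps one record per user in memory (a stand-in for the datastore)."""

    def __init__(self):
        self.records = {}

    def update_location(self, user_id, latitude, longitude):
        record = self.records.setdefault(user_id, UserRecord(user_id))
        # Overwrite rather than append: no per-user location history is kept.
        record.current_location = (latitude, longitude)
        return record


manager = LocationManager()
manager.update_location("user-20-1", 35.78, -78.64)
record = manager.update_location("user-20-1", 35.99, -78.90)
```

After the second update, the earlier coordinates are no longer recoverable from the record.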
In addition to storing the current location of the user 20-1, the location manager 56 sends the current location of the user 20-1 to the location server 16 (step 1004C). In this embodiment, by providing location updates to the location server 16, the MAP server 12 in return receives location updates for the user 20-1 from the location server 16. This is particularly beneficial when the mobile device 18-1 does not permit background processes, which is the case for the Apple® iPhone. As such, if the mobile device 18-1 is an Apple® iPhone or similar device that does not permit background processes, the MAP application 32-1 will not be able to provide location updates for the user 20-1 to the MAP server 12 unless the MAP application 32-1 is active.
Therefore, when the MAP application 32-1 is not active, other applications running on the mobile device 18-1 (or some other device of the user 20-1) may directly or indirectly provide location updates to the location server 16 for the user 20-1. This is illustrated in step 1006 where the location server 16 receives a location update for the user 20-1 directly or indirectly from another application running on the mobile device 18-1 or an application running on another device of the user 20-1 (step 1006A). The location server 16 then provides the location update for the user 20-1 to the MAP server 12 (step 1006B). In response, the location manager 56 updates and stores the current location of the user 20-1 in the user record of the user 20-1 (step 1006C). In this manner, the MAP server 12 is enabled to obtain location updates for the user 20-1 even when the MAP application 32-1 is not active at the mobile device 18-1.
At some point after authentication is complete, a user profile process is performed such that a user profile of the user 20-1 is obtained from the profile server 14 and delivered to the MAP server 12 (step 1102). In this embodiment, the profile manager 54 of the MAP server 12 sends a profile request to the profile server 14 (step 1102A). In response, the profile server 14 returns the user profile of the user 20-1 to the profile manager 54 of the MAP server 12 (step 1102B). Note that while in this embodiment the profile server 14 returns the complete user profile of the user 20-1 to the MAP server 12, in an alternative embodiment, the profile server 14 may return a filtered version of the user profile of the user 20-1 to the MAP server 12. The profile server 14 may filter the user profile of the user 20-1 according to criteria specified by the user 20-1. For example, the user profile of the user 20-1 may include demographic information, general interests, music interests, and movie interests, and the user 20-1 may specify that the demographic information or some subset thereof is to be filtered, or removed, before sending the user profile to the MAP server 12.
Upon receiving the user profile of the user 20-1, the profile manager 54 of the MAP server 12 processes the user profile (step 1102C). More specifically, as discussed above, in the preferred embodiment, the profile manager 54 includes social network handlers for the social network services supported by the MAP server 12. The social network handlers process user profiles to generate user profiles for the MAP server 12 that include lists of keywords for each of a number of profile categories. The profile categories may be the same for each of the social network handlers or different for each of the social network handlers.
After processing the user profile of the user 20-1, the profile manager 54 of the MAP server 12 stores the resulting user profile for the user 20-1 (step 1102D). More specifically, in one embodiment, the MAP server 12 stores user records for the users 20-1 through 20-N in the datastore 66 (
Note that while the discussion herein focuses on an embodiment where the user profiles of the users 20-1 through 20-N are obtained from the one or more profile servers 14, the user profiles of the users 20-1 through 20-N may be obtained in any desired manner. For example, in one alternative embodiment, the user 20-1 may identify one or more favorite websites. The profile manager 54 of the MAP server 12 may then crawl the one or more favorite websites of the user 20-1 to obtain keywords appearing in the one or more favorite websites of the user 20-1. These keywords may then be stored as the user profile of the user 20-1.
At some point, a process is performed such that a current location of the mobile device 18-1 and thus a current location of the user 20-1 is obtained by the MAP server 12 (step 1104). In this embodiment, the MAP application 32-1 of the mobile device 18-1 obtains the current location of the mobile device 18-1 from the location function 36-1 of the mobile device 18-1. The MAP application 32-1 then provides the current location of the user 20-1 of the mobile device 18-1 to the location server 16 (step 1104A). Note that step 1104A may be repeated periodically or in response to changes in the location of the mobile device 18-1 in order to provide location updates for the user 20-1 to the MAP server 12. The location server 16 then provides the current location of the user 20-1 to the MAP server 12 (step 1104B). The location server 16 may provide the current location of the user 20-1 to the MAP server 12 automatically in response to receiving the current location of the user 20-1 from the mobile device 18-1 or in response to a request from the MAP server 12.
In response to receiving the current location of the mobile device 18-1, the location manager 56 of the MAP server 12 stores the current location of the mobile device 18-1 as the current location of the user 20-1 (step 1104C). More specifically, in one embodiment, the current location of the user 20-1 is stored in the user record of the user 20-1 maintained in the datastore 66 of the MAP server 12. Note that in the preferred embodiment only the current location of the user 20-1 is stored in the user record of the user 20-1. In this manner, the MAP server 12 maintains privacy for the user 20-1 since the MAP server 12 does not maintain a historical record of the location of the user 20-1. As discussed below in detail, historical data maintained by the MAP server 12 is anonymized in order to maintain the privacy of the users 20-1 through 20-N.
As discussed above, the use of the location server 16 is particularly beneficial when the mobile device 18-1 does not permit background processes, which is the case for the Apple® iPhone. As such, if the mobile device 18-1 is an Apple® iPhone or similar device that does not permit background processes, the MAP application 32-1 will not provide location updates for the user 20-1 to the location server 16 unless the MAP application 32-1 is active. However, other applications running on the mobile device 18-1 (or some other device of the user 20-1) may provide location updates to the location server 16 for the user 20-1 when the MAP application 32-1 is not active. This is illustrated in step 1106 where the location server 16 receives a location update for the user 20-1 from another application running on the mobile device 18-1 or an application running on another device of the user 20-1 (step 1106A). The location server 16 then provides the location update for the user 20-1 to the MAP server 12 (step 1106B). In response, the location manager 56 updates and stores the current location of the user 20-1 in the user record of the user 20-1 (step 1106C). In this manner, the MAP server 12 is enabled to obtain location updates for the user 20-1 even when the MAP application 32-1 is not active at the mobile device 18-1.
Using the current locations of the users 20-1 through 20-N and the user profiles of the users 20-1 through 20-N, the MAP server 12 can provide a number of features. A first feature that may be provided by the MAP server 12 is historical storage of anonymized user profile data by location. This historical storage of anonymized user profile data by location is performed by the history manager 58 of the MAP server 12. More specifically, as illustrated in
As discussed below in detail, at a predetermined time interval such as, for example, 15 minutes, the history manager 58 makes a copy of the lists of users in the location buckets, anonymizes the user profiles of the users in the lists to provide anonymized user profile data for the corresponding location buckets, and stores the anonymized user profile data in a number of history objects. In one embodiment, a history object is stored for each location bucket having at least one user. In another embodiment, a quadtree algorithm is used to efficiently create history objects for geographic regions (i.e., groups of one or more adjoining location buckets).
After determining the location bucket for the location of the user 20-1, the history manager 58 determines whether the user 20-1 is new to the location bucket (step 1204). In other words, the history manager 58 determines whether the user 20-1 is already on the list of users for the location bucket. If the user 20-1 is new to the location bucket, the history manager 58 creates an entry for the user 20-1 in the list of users for the location bucket (step 1206). Returning to step 1204, if the user 20-1 is not new to the location bucket, the history manager 58 updates the entry for the user 20-1 in the list of users for the location bucket (step 1208). At this point, whether proceeding from step 1206 or 1208, the user 20-1 is flagged as active in the list of users for the location bucket (step 1210).
The history manager 58 then determines whether the user 20-1 has moved from another location bucket (step 1212). More specifically, the history manager 58 determines whether the user 20-1 is included in the list of users for another location bucket and is currently flagged as active in that list. If the user 20-1 has not moved from another location bucket, the process proceeds to step 1216. If the user 20-1 has moved from another location bucket, the history manager 58 flags the user 20-1 as inactive in the list of users for the other location bucket from which the user 20-1 has moved (step 1214).
At this point, whether proceeding from step 1212 or 1214, the history manager 58 determines whether it is time to persist (step 1216). More specifically, as mentioned above, the history manager 58 operates to persist history objects at a predetermined time interval such as, for example, every 15 minutes. Thus, the history manager 58 determines that it is time to persist if the predetermined time interval has expired. If it is not time to persist, the process returns to step 1200 and is repeated for a next received location update, which will typically be for another user. If it is time to persist, the history manager 58 creates a copy of the lists of users for the location buckets and passes the copy of the lists to an anonymization and storage process (step 1218). In this embodiment, the anonymization and storage process is a separate process performed by the history manager 58. The history manager 58 then removes inactive users from the lists of users for the location buckets (step 1220). The process then returns to step 1200 and is repeated for a next received location update, which will typically be for another user.
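By way of illustration only, the handling of a location update in steps 1204 through 1220 may be sketched as follows. The class name `BucketLists`, the in-memory dictionaries, and the `snapshots` list (which stands in for the separate anonymization and storage process) are assumptions made for this sketch and are not part of the disclosure.

```python
import copy
from dataclasses import dataclass


@dataclass
class BucketEntry:
    user_id: str
    active: bool = True


class BucketLists:
    """Hypothetical in-memory lists of users, one list per location bucket."""

    def __init__(self, persist_interval_s=15 * 60):
        self.buckets = {}          # bucket_id -> {user_id: BucketEntry}
        self.persist_interval_s = persist_interval_s
        self.last_persist = 0.0
        self.snapshots = []        # stand-in for the anonymization and storage process

    def on_location_update(self, user_id, bucket_id, now):
        users = self.buckets.setdefault(bucket_id, {})
        entry = users.get(user_id)
        if entry is None:                      # steps 1204 and 1206: new to the bucket
            entry = users[user_id] = BucketEntry(user_id)
        entry.active = True                    # steps 1208 and 1210: update and flag active

        # Steps 1212 and 1214: flag the user inactive in any bucket the user left.
        for other_id, other_users in self.buckets.items():
            if other_id != bucket_id and user_id in other_users:
                other_users[user_id].active = False

        # Step 1216: persist only when the predetermined interval has expired.
        if now - self.last_persist >= self.persist_interval_s:
            self.snapshots.append(copy.deepcopy(self.buckets))   # step 1218
            for bucket_users in self.buckets.values():           # step 1220
                for uid in [u for u, e in bucket_users.items() if not e.active]:
                    del bucket_users[uid]
            self.last_persist = now
```

Note that, in this sketch, the copy passed on in step 1218 still contains the inactive users, so that a user who was in a location bucket at any point during the interval is reflected in the stored data before being removed in step 1220.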
For anonymization, an anonymous user record 96 is created from the user record 92. In the anonymous user record 96, the user ID is replaced with a new user ID that is not connected back to the user, which is also referred to herein as an anonymous user ID. This new user ID is different from any other user ID used for anonymous user records created from the user record of the user for any previous or subsequent time periods. In this manner, anonymous user records for a single user created over time cannot be linked to one another.
In addition, anonymous profile category records 98-1 through 98-M are created for the profile category records 94-1 through 94-M. In the anonymous profile category records 98-1 through 98-M, the user ID is replaced with a new user ID, which may be the same new user ID included in the anonymous user record 96. The anonymous profile category records 98-1 through 98-M include the same category IDs and lists of keywords as the corresponding profile category records 94-1 through 94-M. Note that the location of the user is not stored in the anonymous user record 96. With respect to location, it is sufficient that the anonymous user record 96 is linked to a location bucket.
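As a minimal sketch of this anonymization, and assuming a random identifier is an acceptable anonymous user ID, the records may be rewritten as shown below. The `ProfileCategoryRecord` type and the `anonymize` function are hypothetical names introduced only for illustration.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileCategoryRecord:
    user_id: str
    category_id: str
    keywords: tuple


def anonymize(category_records):
    """Return (anonymous_user_id, anonymous category records).

    A fresh random ID is generated on every call, so records produced for
    the same user in different time periods share no identifier. No location
    is copied; the caller links the result to a location bucket instead.
    """
    anon_id = uuid.uuid4().hex
    anon_records = [
        ProfileCategoryRecord(anon_id, r.category_id, r.keywords)
        for r in category_records
    ]
    return anon_id, anon_records
```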
In another embodiment, the history manager 58 performs anonymization in a manner similar to that described above with respect to
In yet another embodiment, rather than creating anonymous user records 96 for the users in the lists maintained for the location buckets, the history manager 58 may perform anonymization by storing an aggregate user profile for each location bucket, or each group of location buckets representing a node in a quadtree data structure (see below). The aggregate user profile may include a list of all keywords and potentially the number of occurrences of each keyword in the user profiles of the corresponding group of users. In this manner, the data stored by the history manager 58 is not connected back to the users 20-1 through 20-N.
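One possible form of such an aggregate user profile is sketched below. The function name `aggregate_profile` and the choice to count a keyword at most once per user are assumptions of this sketch; the disclosure states only that the aggregate may include the keywords and potentially the number of occurrences of each.

```python
from collections import Counter


def aggregate_profile(user_keyword_lists):
    """Build an aggregate profile from per-user keyword lists.

    Only keyword counts and a user total are kept, so nothing in the
    result refers back to an individual user.
    """
    counts = Counter()
    for keywords in user_keyword_lists:
        counts.update(set(keywords))   # assumption: one count per user per keyword
    return {"total_users": len(user_keyword_lists), "keyword_counts": dict(counts)}
```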
Each history object includes location information, timing information, data, and quadtree data structure information. The location information included in the history object defines a combined geographic area of the location bucket(s) forming the corresponding node of the quadtree data structure. For example, the location information may be latitude and longitude coordinates for a northeast corner of the combined geographic area of the node of the quadtree data structure and a southwest corner of the combined geographic area for the node of the quadtree data structure. The timing information includes information defining a time window for the history object, which may be, for example, a start time for the corresponding time interval and an end time for the corresponding time interval. The data includes the anonymized user profile data for the users in the list(s) maintained for the location bucket(s) forming the node of the quadtree data structure for which the history object is stored. In addition, the data may include a total number of users in the location bucket(s) forming the node of the quadtree data structure. Lastly, the quadtree data structure information includes information defining a quadtree depth of the node in the quadtree data structure.
In order to form the quadtree data structure, the history manager 58 determines whether there are any more base quadtree regions to process (step 1500). If there are more base quadtree regions to process, the history manager 58 sets a current node to the next base quadtree region to process, which for the first iteration is the first base quadtree region (step 1502). The history manager 58 then determines whether the number of users in the current node is greater than a predefined maximum number of users and whether a current quadtree depth is less than a maximum quadtree depth (step 1504). In one embodiment, the maximum quadtree depth may be reached when the current node corresponds to a single location bucket. However, the maximum quadtree depth may be set such that the maximum quadtree depth is reached before the current node reaches a single location bucket.
If the number of users in the current node is greater than the predefined maximum number of users and the current quadtree depth is less than a maximum quadtree depth, the history manager 58 creates a number of child nodes for the current node (step 1506). More specifically, the history manager 58 creates a child node for each quadrant of the current node. The users in the current node are then assigned to the appropriate child nodes based on the location buckets in which the users are located (step 1508), and the current node is then set to the first child node (step 1510). At this point, the process returns to step 1504 and is repeated.
Once the number of users in the current node is not greater than the predefined maximum number of users or the maximum quadtree depth has been reached, the history manager 58 determines whether the current node has any more sibling nodes (step 1512). Sibling nodes are child nodes of the same parent node. If so, the history manager 58 sets the current node to the next sibling node of the current node (step 1514), and the process returns to step 1504 and is repeated. Once there are no more sibling nodes to process, the history manager 58 determines whether the current node has a parent node (step 1516). If so, since the parent node has already been processed, the history manager 58 determines whether the parent node has any sibling nodes that need to be processed (step 1518). If the parent node has any sibling nodes that need to be processed, the history manager 58 sets the next sibling node of the parent node to be processed as the current node (step 1520). From this point, the process returns to step 1504 and is repeated. Returning to step 1516, if the current node does not have a parent node, the process returns to step 1500 and is repeated until there are no more base quadtree regions to process. Once there are no more base quadtree regions to process, the finished quadtree data structure is returned to the process of
Next, the history manager 58 determines whether the number of users in the child node 102-1 is greater than the predetermined maximum, which again for this example is 3. Since the number of users in the child node 102-1 is greater than 3, the history manager 58 divides the child node 102-1 into four child nodes 104-1 through 104-4, as illustrated in
The history manager 58 then determines whether the number of users in the child node 106-1 is greater than the predetermined maximum number of users, which again is 3. Since the number of users in the child node 106-1 is not greater than the predetermined maximum number of users, the child node 106-1 is identified as a node for the finished quadtree data structure, and the history manager 58 proceeds to process the sibling nodes of the child node 106-1, which are the child nodes 106-2 through 106-4. Since the number of users in each of the child nodes 106-2 through 106-4 is less than the predetermined maximum number of users, the child nodes 106-2 through 106-4 are also identified as nodes for the finished quadtree data structure.
Once the history manager 58 has finished processing the child nodes 106-1 through 106-4, the history manager 58 identifies the parent node of the child nodes 106-1 through 106-4, which in this case is the child node 104-1. The history manager 58 then processes the sibling nodes of the child node 104-1, which are the child nodes 104-2 through 104-4. In this example, the number of users in each of the child nodes 104-2 through 104-4 is less than the predetermined maximum number of users. As such, the child nodes 104-2 through 104-4 are identified as nodes for the finished quadtree data structure.
Once the history manager 58 has finished processing the child nodes 104-1 through 104-4, the history manager 58 identifies the parent node of the child nodes 104-1 through 104-4, which in this case is the child node 102-1. The history manager 58 then processes the sibling nodes of the child node 102-1, which are the child nodes 102-2 through 102-4. More specifically, the history manager 58 determines that the child node 102-2 includes more than the predetermined maximum number of users and, as such, divides the child node 102-2 into four child nodes 108-1 through 108-4, as illustrated in
As discussed above, the history manager 58 stores a history object for each of the nodes in the quadtree data structure including at least one user. As such, in this example, the history manager 58 stores history objects for the child nodes 106-2 and 106-3, the child nodes 104-2 and 104-4, the child nodes 108-1 and 108-4, and the child node 102-3. However, no history objects are stored for the nodes that do not have any users (i.e., the child nodes 106-1 and 106-4, the child node 104-3, the child nodes 108-2 and 108-3, and the child node 102-4).
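The subdivision rule of steps 1504 through 1510 and the selection of nodes for which history objects are stored may be sketched as follows. This sketch uses recursion in place of the iterative sibling and parent traversal of steps 1512 through 1520, and it assumes a square region whose side is a power-of-two number of location buckets; the names `build_quadtree` and `leaves_with_users` are hypothetical.

```python
def build_quadtree(users, x0, y0, size, max_users, max_depth, depth=0):
    """users: list of (bucket_x, bucket_y) positions inside the square region
    that starts at (x0, y0) and spans `size` location buckets per side."""
    node = {"x0": x0, "y0": y0, "size": size, "depth": depth,
            "users": users, "children": []}
    # Split only while the node is over the user limit, the maximum depth
    # has not been reached, and the node is larger than one location bucket.
    if len(users) > max_users and depth < max_depth and size > 1:
        half = size // 2
        for cx, cy in ((x0, y0), (x0 + half, y0), (x0, y0 + half), (x0 + half, y0 + half)):
            subset = [(bx, by) for bx, by in users
                      if cx <= bx < cx + half and cy <= by < cy + half]
            node["children"].append(
                build_quadtree(subset, cx, cy, half, max_users, max_depth, depth + 1))
    return node


def leaves_with_users(node):
    """Finished nodes that contain at least one user, i.e. the nodes
    for which a history object would be stored."""
    if not node["children"]:
        return [node] if node["users"] else []
    return [leaf for child in node["children"] for leaf in leaves_with_users(child)]
```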
In another embodiment, the historical request is for an AOI and a time window, where the AOI may be an AOI of a geographic area of a predefined shape and size centered at the current location of the user 20-1, an AOI selected from a list of AOIs defined by the user 20-1, an AOI selected from a list of AOIs defined by the MAP application 32-1 or the MAP server 12, an AOI selected by the user 20-1 from a map, an AOI implicitly defined via a separate application (e.g., AOI is implicitly defined as an area of a predefined shape and size centered at the location of the nearest Starbucks coffee house in response to the user 20-1 performing a Google search for “Starbucks”), or the like. If the AOI is selected from a list of AOIs, the list of AOIs may include static AOIs, dynamic AOIs which may be defined as areas of a predefined shape and size centered at the current locations of one or more friends of the user 20-1, or both. Note that the POI or AOI of the historical request may be selected by the user 20-1 via the MAP application 32-1. In yet another embodiment, the MAP application 32-1 automatically uses the current location of the user 20-1 as the POI or as a center point for an AOI of a predefined shape and size.
The time window for the historical request may be relative to the current time. For example, the time window may be the last hour, the last day, the last week, the last month, or the like. Alternatively, the time window may be an arbitrary time window selected by the user 20-1 such as, for example, yesterday from 7 pm-9 pm, last Friday, last week, or the like. Note that while in this example the historical request includes a single POI or AOI and a single time window, the historical request may include multiple POIs or AOIs and/or multiple time windows.
In one embodiment, the historical request is made in response to user input from the user 20-1 of the mobile device 18-1. For instance, in one embodiment, the user 20-1 selects either a POI or an AOI and a time window and then instructs the MAP application 32-1 to make the historical request by, for example, selecting a corresponding button on a graphical user interface. In another embodiment, the historical request is made automatically in response to some event such as, for example, opening the MAP application 32-1.
Upon receiving the historical request from the MAP application 32-1, the MAP client 30-1 forwards the historical request to the MAP server 12 (step 1602). Note that the MAP client 30-1 may, in some cases, process the historical request from the MAP application 32-1 before forwarding the historical request to the MAP server 12. For example, if the historical request from the MAP application 32-1 is for multiple POIs/AOIs and/or for multiple time windows, the MAP client 30-1 may process the historical request from the MAP application 32-1 to produce multiple historical requests to be sent to the MAP server 12. For instance, a separate historical request may be produced for each POI/AOI and time window combination. However, for this discussion, the historical request is for a single POI or AOI for a single time window.
Upon receiving the historical request from the MAP client 30-1, the MAP server 12 processes the historical request (step 1604). More specifically, the historical request is processed by the history manager 58 of the MAP server 12. First, the history manager 58 obtains history objects that are relevant to the historical request from the datastore 66 of the MAP server 12. The relevant history objects are those recorded for locations relevant to the POI or AOI and the time window for the historical request. The history manager 58 then processes the relevant history objects to provide historical aggregate profile data for the POI or AOI. In this embodiment, the historical aggregate profile data is based on the user profiles of the anonymous user records in the relevant history objects as compared to the user profile of the user 20-1 or a select subset thereof. In another embodiment, the historical aggregate profile data is based on the user profiles of the anonymous user records in the relevant history objects as compared to a target user profile defined or otherwise specified by the user 20-1 or as compared to one another. Once the MAP server 12 has processed the historical request, the MAP server 12 returns the resulting historical aggregate profile data to the MAP client 30-1 (step 1606). Upon receiving the historical aggregate profile data, the MAP client 30-1 passes the historical aggregate profile data to the MAP application 32-1 (step 1608). The MAP application 32-1 then presents the historical aggregate profile data to the user 20-1 (step 1610).
First, the crowd analyzer 60 establishes a bounding box for the crowd formation process (step 1700). Note that while a bounding box is used in this example, other geographic shapes may be used to define a bounding region for the crowd formation process (e.g., a bounding circle). In one embodiment, if crowd formation is performed in response to a specific request, the bounding box is established based on the POI or the AOI of the request. If the request is for a POI, then the bounding box is a geographic area of a predetermined size centered at the POI. If the request is for an AOI, the bounding box is the AOI. Alternatively, if the crowd formation process is performed proactively, the bounding box is a bounding box of a predefined size.
The crowd analyzer 60 then creates a crowd for each individual user in the bounding box (step 1702). More specifically, the crowd analyzer 60 queries the datastore 66 of the MAP server 12 to identify users currently located within the bounding box. Then, a crowd of one user is created for each user currently located within the bounding box. Next, the crowd analyzer 60 determines the two closest crowds in the bounding box (step 1704) and determines a distance between the two crowds (step 1706). The distance between the two crowds is a distance between crowd centers of the two crowds. Note that the crowd center of a crowd of one is the current location of the user in the crowd. The crowd analyzer 60 then determines whether the distance between the two crowds is less than an optimal inclusion distance (step 1708). In this embodiment, the optimal inclusion distance is a predefined static distance. If the distance between the two crowds is less than the optimal inclusion distance, the crowd analyzer 60 combines the two crowds (step 1710) and computes a new crowd center for the resulting crowd (step 1712). The crowd center may be computed based on the current locations of the users in the crowd using a center of mass algorithm. At this point, the process returns to step 1704 and is repeated until the distance between the two closest crowds is not less than the optimal inclusion distance. At that point, the crowd analyzer 60 discards any crowds with fewer than three users (step 1714). Note that throughout this disclosure crowds are only maintained if the crowds include three or more users. However, while three users is the preferred minimum number of users in a crowd, the present disclosure is not limited thereto. The minimum number of users in a crowd may be defined as any number greater than or equal to two users.
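For illustration, steps 1702 through 1714 may be sketched as a simple pairwise merging loop. The sketch assumes planar (x, y) coordinates in meters rather than latitude and longitude, takes the center of mass to be the unweighted mean of the user locations, and uses the hypothetical names `center_of_mass` and `form_crowds`.

```python
import math
from itertools import combinations


def center_of_mass(points):
    xs, ys = zip(*points)
    return (sum(xs) / len(points), sum(ys) / len(points))


def form_crowds(locations, inclusion_distance, min_users=3):
    crowds = [[p] for p in locations]            # step 1702: one crowd per user

    def gap(pair):
        i, j = pair
        return math.dist(center_of_mass(crowds[i]), center_of_mass(crowds[j]))

    while len(crowds) >= 2:
        i, j = min(combinations(range(len(crowds)), 2), key=gap)   # step 1704
        if gap((i, j)) >= inclusion_distance:                      # steps 1706, 1708
            break
        merged = crowds[i] + crowds[j]                             # step 1710
        crowds = [c for k, c in enumerate(crowds) if k not in (i, j)] + [merged]
        # Step 1712: the new crowd center is recomputed from the merged
        # member list the next time gap() is evaluated.
    return [c for c in crowds if len(c) >= min_users]              # step 1714
```

The exhaustive search for the closest pair is quadratic in the number of crowds per iteration, which is acceptable for a sketch but would likely be replaced by a spatial index for large bounding boxes.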
Next, the crowd analyzer 60 determines whether the new and old bounding boxes overlap (step 1808). If so, the crowd analyzer 60 creates a bounding box encompassing the new and old bounding boxes (step 1810). For example, if the new and old bounding boxes are 40×40 meter regions and a 1×1 meter square at the northeast corner of the new bounding box overlaps a 1×1 meter square at the southwest corner of the old bounding box, the crowd analyzer 60 may create a 79×79 meter square bounding box encompassing both the new and old bounding boxes.
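The arithmetic of this example can be checked with a short sketch. Boxes are represented here as (min_x, min_y, max_x, max_y) tuples in meters, which is an assumption of the sketch, as are the function names.

```python
def overlaps(a, b):
    """True if two axis-aligned boxes share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def encompassing_box(a, b):
    """Smallest axis-aligned box containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


old_box = (0, 0, 40, 40)
new_box = (-39, -39, 1, 1)   # its northeast 1x1 corner overlaps the old box's southwest corner

if overlaps(new_box, old_box):                   # step 1808
    box = encompassing_box(new_box, old_box)     # step 1810
    width, height = box[2] - box[0], box[3] - box[1]
    print(width, height)                         # 79 79
```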
The crowd analyzer 60 then determines the individual users and crowds relevant to the bounding box created in step 1810 (step 1812). The crowds relevant to the bounding box are crowds that are within or overlap the bounding box (e.g., have at least one user located within the bounding box). The individual users relevant to the bounding box are users that are currently located within the bounding box and not already part of a crowd. Next, the crowd analyzer 60 computes an optimal inclusion distance for individual users based on user density within the bounding box (step 1814). More specifically, in one embodiment, the optimal inclusion distance for individuals, which is also referred to herein as an initial optimal inclusion distance, is set according to the following equation:
where a is a number between 0 and 1, A_BoundingBox is an area of the bounding box, and number_of_users is the total number of users in the bounding box. The total number of users in the bounding box includes both individual users that are not already in a crowd and users that are already in a crowd. In one embodiment, a is ⅔.
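As a hedged illustration of how these three quantities can yield a distance based on user density, the sketch below assumes that the initial optimal inclusion distance is a multiplied by the square root of the bounding box area per user. That functional form is an assumption of this sketch chosen for dimensional consistency; only the definitions of a, the bounding box area, and the number of users are taken from the text.

```python
import math


def initial_optimal_inclusion_distance(bounding_box_area, number_of_users, a=2 / 3):
    """ASSUMED form: a * sqrt(area / users).

    Denser bounding boxes (more users per unit area) give a smaller
    distance; sparser ones give a larger distance.
    """
    if number_of_users <= 0:
        raise ValueError("number_of_users must be positive")
    return a * math.sqrt(bounding_box_area / number_of_users)
```

For example, under this assumed form, a 40×40 meter bounding box containing 16 users gives ⅔ × 10 meters, or roughly 6.7 meters.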
The crowd analyzer 60 then creates a crowd for each individual user within the bounding box that is not already included in a crowd and sets the optimal inclusion distance for the crowds to the initial optimal inclusion distance (step 1816). At this point, the process proceeds to
Next, the crowd analyzer 60 determines the two closest crowds for the bounding box (step 1824) and a distance between the two closest crowds (step 1826). The distance between the two closest crowds is the distance between the crowd centers of the two closest crowds. The crowd analyzer 60 then determines whether the distance between the two closest crowds is less than the optimal inclusion distance of the larger of the two closest crowds (step 1828). If the two closest crowds are of the same size (i.e., have the same number of users), then the optimal inclusion distance of either of the two closest crowds may be used. Alternatively, if the two closest crowds are of the same size, the optimal inclusion distances of both of the two closest crowds may be used such that the crowd analyzer 60 determines whether the distance between the two closest crowds is less than the optimal inclusion distances of both of the two closest crowds. As another alternative, if the two closest crowds are of the same size, the crowd analyzer 60 may compare the distance between the two closest crowds to an average of the optimal inclusion distances of the two closest crowds.
If the distance between the two closest crowds is not less than the optimal inclusion distance, the process proceeds to step 1838. If the distance between the two closest crowds is less than the optimal inclusion distance, the two closest crowds are combined or merged (step 1830), and a new crowd center for the resulting crowd is computed (step 1832). Again, a center of mass algorithm may be used to compute the crowd center of a crowd. In addition, a new optimal inclusion distance for the resulting crowd is computed (step 1834). In one embodiment, the new optimal inclusion distance for the resulting crowd is computed as:
where n is the number of users in the crowd and d_i is a distance between the i-th user and the crowd center. In other words, the new optimal inclusion distance is the average of the initial optimal inclusion distance and the distances between the users in the crowd and the crowd center, plus one standard deviation.
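The verbal description above may be sketched as follows. The sketch assumes that the average is taken over n + 1 values (the initial optimal inclusion distance and the n user distances) and that the standard deviation is the root mean square deviation of the n user distances from that average; both choices, like the function name, are assumptions of this sketch rather than details confirmed by the text.

```python
import math


def new_optimal_inclusion_distance(initial_dist, user_points, crowd_center):
    # d_i: distance between the i-th user and the crowd center
    distances = [math.dist(p, crowd_center) for p in user_points]
    n = len(distances)
    # Average of the initial distance and the n user distances (assumed n + 1 terms).
    average = (initial_dist + sum(distances)) / (n + 1)
    # One standard deviation of the user distances about that average (assumed form).
    std_dev = math.sqrt(sum((d - average) ** 2 for d in distances) / n)
    return average + std_dev
```

Under these assumptions, a tightly packed crowd yields a distance close to the average, while a crowd whose members are spread unevenly about the center receives a larger, more tolerant inclusion distance.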
At this point, the crowd analyzer 60 determines whether a maximum number of iterations has been performed (step 1836). The maximum number of iterations is a predefined number that ensures that the crowd formation process does not loop over steps 1818 through 1834 indefinitely or more than a desired maximum number of times. If the maximum number of iterations has not been reached, the process returns to step 1818 and is repeated until either the distance between the two closest crowds is not less than the optimal inclusion distance of the larger crowd or the maximum number of iterations has been reached. At that point, the crowd analyzer 60 discards crowds with fewer than three users, or members (step 1838), and the process ends.
Returning to step 1808 in
where a is a number between 0 and 1, A_BoundingBox is an area of the bounding box, and number_of_users is the total number of users in the bounding box. The total number of users in the bounding box includes both individual users that are not already in a crowd and users that are already in a crowd. In one embodiment, a is ⅔.
The crowd analyzer 60 then creates a crowd of one user for each individual user within the bounding box that is not already included in a crowd and sets the optimal inclusion distance for the crowds to the initial optimal inclusion distance (step 1846). At this point, the crowd analyzer 60 analyzes the crowds for the bounding box to determine whether any crowd members (i.e., users in the crowds) violate the optimal inclusion distance of their crowds (step 1848). Any crowd member that violates the optimal inclusion distance of his or her crowd is then removed from that crowd (step 1850). The crowd analyzer 60 then creates a crowd of one user for each of the users removed from their crowds in step 1850 and sets the optimal inclusion distance for the newly created crowds to the initial optimal inclusion distance (step 1852).
Next, the crowd analyzer 60 determines the two closest crowds in the bounding box (step 1854) and a distance between the two closest crowds (step 1856). The distance between the two closest crowds is the distance between the crowd centers of the two closest crowds. The crowd analyzer 60 then determines whether the distance between the two closest crowds is less than the optimal inclusion distance of the larger of the two closest crowds (step 1858). If the two closest crowds are of the same size (i.e., have the same number of users), then the optimal inclusion distance of either of the two closest crowds may be used. Alternatively, if the two closest crowds are of the same size, the optimal inclusion distances of both of the two closest crowds may be used such that the crowd analyzer 60 determines whether the distance between the two closest crowds is less than the optimal inclusion distances of both of the two closest crowds. As another alternative, if the two closest crowds are of the same size, the crowd analyzer 60 may compare the distance between the two closest crowds to an average of the optimal inclusion distances of the two closest crowds.
If the distance between the two closest crowds is less than the optimal inclusion distance, the two closest crowds are combined or merged (step 1860), and a new crowd center for the resulting crowd is computed (step 1862). Again, a center of mass algorithm may be used to compute the crowd center of a crowd. In addition, a new optimal inclusion distance for the resulting crowd is computed (step 1864). As discussed above, in one embodiment, the new optimal inclusion distance for the resulting crowd is computed as:
where n is the number of users in the crowd and d_i is a distance between the i-th user and the crowd center. In other words, the new optimal inclusion distance is the average of the initial optimal inclusion distance and the distances between the users in the crowd and the crowd center, plus one standard deviation.
At this point, the crowd analyzer 60 determines whether a maximum number of iterations has been performed (step 1866). If the maximum number of iterations has not been reached, the process returns to step 1848 and is repeated until either the distance between the two closest crowds is not less than the optimal inclusion distance of the larger crowd or the maximum number of iterations has been reached. At that point, the crowd analyzer 60 discards crowds with fewer than three users, or members (step 1868). The crowd analyzer 60 then determines whether the crowd formation process for the new and old bounding boxes is done (step 1870). In other words, the crowd analyzer 60 determines whether both the new and old bounding boxes have been processed. If not, the bounding box is set to the new bounding box (step 1872), and the process returns to step 1842 and is repeated for the new bounding box. Once both the new and old bounding boxes have been processed, the crowd formation process ends.
The crowd analyzer 60 then identifies the two closest crowds 126 and 128 in the bounding box 122 and determines a distance between the two closest crowds 126 and 128. In this example, the distance between the two closest crowds 126 and 128 is less than the optimal inclusion distance. As such, the two closest crowds 126 and 128 are merged and a new crowd center and new optimal inclusion distance are computed, as illustrated in
Since the old bounding box 132 and the new bounding box 134 overlap, the crowd analyzer 60 creates a bounding box 140 that encompasses both the old bounding box 132 and the new bounding box 134, as illustrated in
Next, the crowd analyzer 60 analyzes the crowds 136, 138, and 142 through 148 to determine whether any members of the crowds 136, 138, and 142 through 148 violate the optimal inclusion distances of the crowds 136, 138, and 142 through 148. In this example, as a result of the user leaving the crowd 136 and moving to his new location, both of the remaining members of the crowd 136 violate the optimal inclusion distance of the crowd 136. As such, the crowd analyzer 60 removes the remaining users from the crowd 136 and creates crowds 150 and 152 of one user each for those users, as illustrated in
The crowd analyzer 60 then identifies the two closest crowds in the bounding box 140, which in this example are the crowds 146 and 148. Next, the crowd analyzer 60 computes a distance between the two crowds 146 and 148. In this example, the distance between the two crowds 146 and 148 is less than the initial optimal inclusion distance and, as such, the two crowds 146 and 148 are combined. In this example, crowds are combined by merging the smaller crowd into the larger crowd. Since the two crowds 146 and 148 are of the same size, the crowd analyzer 60 merges the crowd 148 into the crowd 146, as illustrated in
At this point, the crowd analyzer 60 repeats the process and determines that the crowds 138 and 144 are now the two closest crowds. In this example, the distance between the two crowds 138 and 144 is less than the optimal inclusion distance of the larger of the two crowds 138 and 144, which is the crowd 138. As such, the crowd 144 is merged into the crowd 138 and a new crowd center and optimal inclusion distance are computed for the crowd 138, as illustrated in
More specifically, as illustrated in
As illustrated in
Before proceeding, a variation of the spatial formation process discussed above with respect to
Then, in steps 1802 and 1804, sizes of the new and old bounding boxes centered at the new and old locations of the user 20-1 are set as a function of the location accuracy of the new and old locations of the user 20-1. If the new location of the user 20-1 is inaccurate, then the new bounding box will be large. If the new location of the user 20-1 is accurate, then the new bounding box will be small. For example, the length and width of the new bounding box may be set to M times the location accuracy of the new location of the user 20-1, where the location accuracy is expressed as a radius in meters from the new location of the user 20-1. The number M may be any desired number. For example, the number M may be 5. In a similar manner, the location accuracy of the old location of the user 20-1 may be used to set the length and width of the old bounding box.
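A minimal sketch of this sizing rule is given below, assuming planar coordinates in meters and a square box centered at the reported location. The function name and the tuple layout (min_x, min_y, max_x, max_y) are hypothetical.

```python
def bounding_box_for(location, accuracy_radius_m, m=5):
    """Square box whose side is M times the location accuracy radius."""
    x, y = location
    half_side = (m * accuracy_radius_m) / 2
    return (x - half_side, y - half_side, x + half_side, y + half_side)


precise = bounding_box_for((0, 0), accuracy_radius_m=10)    # 50 m x 50 m box
coarse = bounding_box_for((0, 0), accuracy_radius_m=100)    # 500 m x 500 m box
```

A less accurate location (a larger accuracy radius) thus produces a proportionally larger bounding box.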
In addition, the location accuracy may be considered when computing the initial optimal inclusion distances used for crowds of one user in steps 1814 and 1844. As discussed above, the initial optimal inclusion distance is computed based on the following equation:
where a is a number between 0 and 1, A_BoundingBox is an area of the bounding box, and number_of_users is the total number of users in the bounding box. The total number of users in the bounding box includes both individual users that are not already in a crowd and users that are already in a crowd. In one embodiment, a is ⅔. However, if the computed initial optimal inclusion distance is less than the location accuracy of the current location of the individual user in a crowd, then the location accuracy, rather than the computed value, is used for the initial optimal inclusion distance for that crowd. As such, as location accuracy decreases, crowds become larger and more inclusive. In contrast, as location accuracy increases, crowds become smaller and less inclusive. In other words, the granularity with which crowds are formed is a function of the location accuracy.
Likewise, when new optimal inclusion distances for crowds are recomputed in steps 1834 and 1864, location accuracy may also be considered. As discussed above, the new optimal inclusion distance may first be computed based on the following equation:
where n is the number of users in the crowd and d_i is a distance between the i-th user and the crowd center. In other words, the new optimal inclusion distance is the average of the initial optimal inclusion distance and the distances between the users in the crowd and the crowd center, plus one standard deviation. However, if the computed value for the new optimal inclusion distance is less than an average location accuracy of the users in the crowd, the average location accuracy of the users in the crowd, rather than the computed value, is used as the new optimal inclusion distance.
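Both accuracy adjustments amount to using the location accuracy as a lower bound on the computed distance, which may be sketched as follows. The function names are hypothetical, and the computed distances are taken as inputs so that the sketch does not depend on any particular form of the equations.

```python
def adjusted_initial_distance(computed_initial_dist, user_accuracy_m):
    """Steps 1814 and 1844: for a crowd of one user, use the location
    accuracy when the computed value falls below it."""
    return max(computed_initial_dist, user_accuracy_m)


def adjusted_new_distance(computed_new_dist, member_accuracies_m):
    """Steps 1834 and 1864: for a merged crowd, use the average location
    accuracy of its members when the computed value falls below it."""
    average_accuracy = sum(member_accuracies_m) / len(member_accuracies_m)
    return max(computed_new_dist, average_accuracy)
```

Because the accuracy is expressed as a radius in meters, a larger value means a less accurate location, and the resulting lower bound makes crowds larger and more inclusive.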
First, the MAP application 32-1 sends a crowd request to the MAP client 30-1 (step 1900). The crowd request is a request for crowd data for crowds currently formed near a specified POI or within a specified AOI. The crowd request may be initiated by the user 20-1 of the mobile device 18-1 via the MAP application 32-1 or may be initiated automatically by the MAP application 32-1 in response to an event such as, for example, start-up of the MAP application 32-1, movement of the user 20-1, or the like. In one embodiment, the crowd request is for a POI, where the POI is a POI corresponding to the current location of the user 20-1, a POI selected from a list of POIs defined by the user 20-1, a POI selected from a list of POIs defined by the MAP application 32-1 or the MAP server 12, a POI selected by the user 20-1 from a map, a POI implicitly defined via a separate application (e.g., POI is implicitly defined as the location of the nearest Starbucks coffee house in response to the user 20-1 performing a Google search for “Starbucks”), or the like. If the POI is selected from a list of POIs, the list of POIs may include static POIs which may be defined by street addresses or latitude and longitude coordinates, dynamic POIs which may be defined as the current locations of one or more friends of the user 20-1, or both. Note that in some embodiments, the user 20-1 may be enabled to define a POI by selecting a crowd center of a crowd as a POI, where the POI would thereafter remain static at that point and would not follow the crowd.
In another embodiment, the crowd request is for an AOI, where the AOI may be an AOI of a predefined shape and size centered at the current location of the user 20-1, an AOI selected from a list of AOIs defined by the user 20-1, an AOI selected from a list of AOIs defined by the MAP application 32-1 or the MAP server 12, an AOI selected by the user 20-1 from a map, an AOI implicitly defined via a separate application (e.g., AOI is implicitly defined as an area of a predefined shape and size centered at the location of the nearest Starbucks coffee house in response to the user 20-1 performing a Google search for “Starbucks”), or the like. If the AOI is selected from a list of AOIs, the list of AOIs may include static AOIs, dynamic AOIs which may be defined as areas of a predefined shape and size centered at the current locations of one or more friends of the user 20-1, or both. Note that in some embodiments, the user 20-1 may be enabled to define an AOI by selecting a crowd such that an AOI is created of a predefined shape and size centered at the crowd center of the selected crowd. The AOI would thereafter remain static and would not follow the crowd. The POI or the AOI of the crowd request may be selected by the user 20-1 via the MAP application 32-1. In yet another embodiment, the MAP application 32-1 automatically uses the current location of the user 20-1 as the POI or as a center point for an AOI of a predefined shape and size.
Upon receiving the crowd request, the MAP client 30-1 forwards the crowd request to the MAP server 12 (step 1902). Note that in some embodiments, the MAP client 30-1 may process the crowd request before forwarding the crowd request to the MAP server 12. For example, in some embodiments, the crowd request may include more than one POI or more than one AOI. As such, the MAP client 30-1 may generate a separate crowd request for each POI or each AOI.
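As a non-limiting illustration, the splitting of a crowd request that names more than one POI or AOI may be sketched as follows. The `CrowdRequest` structure and its field names are hypothetical and are used only to show one separate request being generated for each POI and each AOI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrowdRequest:
    """Hypothetical request shape: identifiers of the POIs and AOIs of interest."""
    pois: tuple = ()
    aois: tuple = ()


def split_crowd_request(request):
    """Return one single-target request per POI and per AOI in the request."""
    per_poi = [CrowdRequest(pois=(poi,)) for poi in request.pois]
    per_aoi = [CrowdRequest(aois=(aoi,)) for aoi in request.aois]
    return per_poi + per_aoi
```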
In response to receiving the crowd request from the MAP client 30-1, the MAP server 12 identifies one or more crowds relevant to the crowd request (step 1904). More specifically, in one embodiment, the crowd analyzer 60 performs a crowd formation process such as that described above in
Once the crowd analyzer 60 has identified the crowds relevant to the crowd request, the MAP server 12 generates crowd data for the identified crowds (step 1906). As discussed below in detail, the crowd data for the identified crowds may include aggregate profiles for the crowds, information characterizing the crowds, or both. In addition, the crowd data may include spatial information defining the locations of the crowds, the number of users in the crowds, the amount of time the crowds have been located at or near the POI or within the AOI of the crowd request, or the like. The MAP server 12 then returns the crowd data to the MAP client 30-1 (step 1908).
Upon receiving the crowd data, the MAP client 30-1 forwards the crowd data to the MAP application 32-1 (step 1910). Note that in some embodiments the MAP client 30-1 may process the crowd data before sending the crowd data to the MAP application 32-1. The MAP application 32-1 then presents the crowd data to the user 20-1 (step 1912). The manner in which the crowd data is presented depends on the particular implementation of the MAP application 32-1. In one embodiment, the crowd data is overlaid upon a map. For example, the crowds may be represented by corresponding indicators overlaid on a map. The user 20-1 may then select a crowd in order to view additional crowd data regarding that crowd such as, for example, the aggregate profile of that crowd, characteristics of that crowd, or the like.
Note that in one embodiment, the MAP application 32-1 may operate to roll-up the aggregate profiles for multiple crowds into a rolled-up aggregate profile for those crowds. The rolled-up aggregate profile may be the average of the aggregate profiles of the crowds. For example, the MAP application 32-1 may roll-up the aggregate profiles for multiple crowds at a POI and present the rolled-up aggregate profile for the multiple crowds at the POI to the user 20-1. In a similar manner, the MAP application 32-1 may provide a rolled-up aggregate profile for an AOI. In another embodiment, the MAP server 12 may roll-up crowds for a POI or an AOI and provide the rolled-up aggregate profile in addition to or as an alternative to the aggregate profiles for the individual crowds.
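As a non-limiting illustration, a roll-up that averages the aggregate profiles of multiple crowds may be sketched as follows. Each aggregate profile is represented here as a mapping from keyword to number of user matches, and a keyword that is absent from a crowd is treated as having zero user matches for that crowd; both the representation and the treatment of absent keywords are assumptions of this sketch.

```python
from collections import defaultdict


def roll_up(aggregate_profiles):
    """Average per-keyword user matches over several crowd aggregate profiles."""
    if not aggregate_profiles:
        return {}
    totals = defaultdict(int)
    for profile in aggregate_profiles:
        for keyword, user_matches in profile.items():
            totals[keyword] += user_matches
    crowd_count = len(aggregate_profiles)
    # ASSUMPTION: a crowd lacking a keyword contributes zero matches for it.
    return {keyword: total / crowd_count for keyword, total in totals.items()}
```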
Next, the profile augmentation function 40 obtains an aggregate profile for a crowd of users currently located at or near the current location of the subject user (step 2002). Crowds of users currently located at or near the current location of the subject user are referred to herein as crowds that are currently located at locations that are relevant to the current location of the subject user. More specifically, depending on the particular implementation, the crowd currently located at or near the current location of the subject user is a crowd in which the subject user is included, a crowd closest to the current location of the subject user, or a crowd that is within a geographic region of a predefined shape and size encompassing (e.g., centered at) the current location of the subject user. The aggregate profile of the crowd is preferably generated by comparing the user profiles of the users in the crowd to one another, and the aggregate profile of the crowd preferably includes a number of keywords and, for each keyword, a number of user matches, or occurrences, for that keyword in the user profiles of the users in the crowd. The profile augmentation function 40 then augments a user profile of the subject user based on the aggregate profile of the crowd (step 2004).
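As a non-limiting illustration, the generation of an aggregate profile containing keywords and, for each keyword, a number of user matches may be sketched as follows. Each user profile is represented here as a list of keywords, and each user is counted at most once per keyword; the representation and the function name are assumptions of this sketch.

```python
from collections import Counter


def aggregate_profile(user_profiles):
    """Count, for each keyword, how many user profiles in the crowd contain it."""
    user_matches = Counter()
    for keywords in user_profiles:
        # A set ensures a user contributes at most one match per keyword.
        user_matches.update(set(keywords))
    return dict(user_matches)
```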
More specifically, in order to augment the user profile of the subject user based on the aggregate profile of the crowd at or near the current location of the subject user, the profile augmentation function 40 identifies, or selects, a predetermined number (NMAX) of keywords from the aggregate profile of the crowd having the highest number of user matches and not having a fixed probability in the user profile of the subject user (step 2100). Keywords in the user profile of the subject user having fixed probabilities are keywords that have been entered by the subject user into the user profile. In addition, as discussed below, the keywords having fixed probabilities may also include keywords in which the subject user has expressed an interest or disinterest in response to questions from the profile augmentation function 40.
For example, assume that the user profile of the subject user is:
Also, assume that the aggregate profile for the crowd is:
As such, the profile augmentation function 40 selects, at most, the predetermined number (NMAX) of the keywords from the aggregate profile having the highest number of user matches and not having a fixed probability in the user profile of the subject user. In this example, assume that NMAX is three (3). Here, the three keywords having the highest number of user matches and not having fixed probabilities in the user profile of the subject user are the keywords Sports, China, and Coffee. Note that the keyword Tennis has a fixed probability in the user profile of the subject user and as such is not selected by the profile augmentation function 40.
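As a non-limiting illustration, the selection of step 2100 may be sketched as follows. The numbers of user matches shown are hypothetical values chosen only so that the selection reproduces the keywords named in the example above, and the function name is likewise hypothetical.

```python
def select_keywords(aggregate, fixed_keywords, n_max=3):
    """Pick up to n_max keywords with the most user matches, skipping fixed ones."""
    candidates = [
        (keyword, matches)
        for keyword, matches in aggregate.items()
        if keyword not in fixed_keywords
    ]
    # Highest number of user matches first; ties keep their original order.
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [keyword for keyword, _ in candidates[:n_max]]


# Hypothetical counts; Tennis has a fixed probability in the user profile.
aggregate = {"Tennis": 12, "Sports": 10, "China": 8, "Coffee": 6, "Europe": 3}
selected = select_keywords(aggregate, fixed_keywords={"Tennis"}, n_max=3)
```

With these hypothetical counts, Tennis is skipped even though it has the most user matches, and Sports, China, and Coffee are selected.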
Next, the profile augmentation function 40 generates a question for each of the keywords identified in step 2100 (step 2102). In general, the questions are automatically generated in order to determine whether the subject user is interested in the identified keywords. Returning to our example, if the identified keywords are Sports, China, and Coffee, the profile augmentation function 40 may generate the following questions: “Do you have an interest in sports?,” “Do you have an interest in China?,” and “Do you like coffee?.” Alternatively, the questions for the identified keywords may be a list of the identified keywords where the subject user will be enabled to select “Yes” or “No” for each of the identified keywords, or the like. The profile augmentation function 40 then provides the questions for the identified keywords to the subject user (step 2104), and receives answers to the questions from the subject user (step 2106). Note that the subject user may choose to answer all, some, or none of the questions. Lastly, the profile augmentation function 40 updates the user profile of the subject user based on any answers received from the subject user and, in this embodiment, the number of user matches for the keywords in the aggregate profile of the crowd (step 2108).
where PROBABILITY is the probability of the keyword, USER_MATCHES is the number of user matches for the keyword from the aggregate profile, and TOTAL_USER_MATCHES is the sum of the number of user matches for all of the keywords in the aggregate profile. Note that the equation above is exemplary and is not intended to limit the scope of the present disclosure. The profile augmentation function 40 then adds the keyword and the probability computed for the keyword to the user profile of the subject user (step 2208). At this point, the process proceeds to step 2222, which is described below.
Returning to step 2204, if the keyword is already in the user profile of the subject user, the profile augmentation function 40 determines whether the keyword has a fixed probability in the user profile of the subject user (step 2210).
In this embodiment, the keyword will have a fixed probability if the subject user added the keyword to the user profile or if the subject user previously answered a question for the keyword. More specifically, in this embodiment, the keyword will have a fixed probability of 100% if the subject user previously added the keyword to the user profile, a fixed probability of 100% if the subject user answered a question for the keyword in a previous iteration of the profile augmentation process in a manner that indicated that the subject user has an interest in the keyword, or a fixed probability of 0% if the subject user answered a question for the keyword in a previous iteration of the profile augmentation process in a manner that indicated that the subject user does not have an interest in the keyword. If the keyword does not have a fixed probability, the profile augmentation function 40 computes a new probability for the keyword based on the probability for the keyword in the user profile (i.e., the old probability for the keyword) and the number of user matches for the keyword in the aggregate profile (step 2212). In one embodiment, the new probability is computed based on the following equation:
PROBABILITY_NEW = AVG(PROBABILITY_OLD, PROBABILITY_TEMP),
where
and PROBABILITY_NEW is the new probability for the keyword, PROBABILITY_OLD is the old probability for the keyword, PROBABILITY_TEMP is a temporary probability used for purposes of this exemplary calculation, USER_MATCHES is the number of user matches for the keyword from the aggregate profile, and TOTAL_USER_MATCHES is the sum of the number of user matches for all of the keywords in the aggregate profile. The function AVG(x,y) is the average of the values x and y. Note that other techniques may be used to combine the old probability and the temporary probability to provide the new probability (e.g., summing, weighted averaging, or the like). The profile augmentation function 40 then updates the user profile of the subject user to include the new probability for the keyword (step 2214). At this point, the process proceeds to step 2222, which is described below.
Returning to step 2202, if a question was asked for the keyword and an answer was received from the subject user for the question asked for the keyword, the profile augmentation function 40 determines whether the keyword is already in the user profile of the subject user (step 2216). If not, the profile augmentation function 40 adds the keyword to the user profile of the subject user along with a fixed probability for the keyword that reflects the answer given by the subject user to the corresponding question (step 2218). In this example, the fixed probability for the keyword is 100% if the subject user gave a positive answer indicating that the subject user has an interest in the keyword. In contrast, the fixed probability for the keyword is 0% if the subject user gave a negative answer indicating that the subject user does not have an interest in the keyword. At this point, the process proceeds to step 2222, which is described below.
Returning to step 2216, if the keyword is already in the user profile of the subject user, then the profile augmentation function 40 updates the user profile of the subject user with a fixed probability for the keyword that reflects the answer given by the subject user to the corresponding question (step 2220). Again, in this example, the fixed probability for the keyword is 100% if the subject user gave a positive answer indicating that the subject user has an interest in the keyword. In contrast, the fixed probability for the keyword is 0% if the subject user gave a negative answer indicating that the subject user does not have an interest in the keyword. The process then proceeds to step 2222, which is described below.
At this point, whether proceeding from step 2208, step 2210, step 2214, step 2218, or step 2220, the profile augmentation function 40 determines whether the last keyword in the aggregate profile has been processed (step 2222). If not, the process returns to step 2200 and is repeated for the next keyword in the aggregate profile. Once all of the keywords in the aggregate profile have been processed, the process is complete.
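As a non-limiting illustration, the keyword-by-keyword update of steps 2200 through 2222 may be sketched as follows. This sketch assumes that the probability computed from the aggregate profile is the ratio of USER_MATCHES to TOTAL_USER_MATCHES expressed as a percentage; that form, the dictionary layout of the user profile, and the function name are assumptions of the sketch.

```python
def augment_profile(profile, aggregate, answers):
    """Update a user profile in place from an aggregate profile and any answers.

    profile:   {keyword: {"probability": percent, "fixed": bool}}
    aggregate: {keyword: number of user matches}
    answers:   {keyword: True (interested) or False (not interested)}
    """
    total_user_matches = sum(aggregate.values())
    for keyword, user_matches in aggregate.items():           # steps 2200, 2222
        # ASSUMPTION: probability is the match ratio expressed as a percentage.
        temp = 100.0 * user_matches / total_user_matches if total_user_matches else 0.0
        if keyword in answers:                                 # step 2202
            fixed_value = 100.0 if answers[keyword] else 0.0
            profile[keyword] = {"probability": fixed_value, "fixed": True}  # 2218, 2220
        elif keyword not in profile:                           # step 2204
            profile[keyword] = {"probability": temp, "fixed": False}        # 2206, 2208
        elif not profile[keyword]["fixed"]:                    # step 2210
            old = profile[keyword]["probability"]
            profile[keyword]["probability"] = (old + temp) / 2.0            # 2212, 2214
    return profile
```

Passing an empty `answers` mapping corresponds to the case in which the subject user answers none of the questions, so only the steps that depend on the numbers of user matches are exercised.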
Again, as an example, assume that the user profile of the subject user is:
Also, assume that the aggregate profile is:
and that questions were asked for the keywords Sports, China, and Coffee from the aggregate profile. Further assume that the subject user answered the question for Sports in a manner that indicated that the subject user has an interest in sports, answered the question for Coffee in a manner that indicated that the subject user is not interested in coffee, and did not answer the question for China. As such, using the process described above, the user profile of the subject user is updated as follows:
Note that future iterations of the profile augmentation process may be performed in order to further augment the user profile of the subject user.
For instance, if the subject user thereafter moves to a new location and the process is repeated, an aggregate profile for a new crowd that is at or near the new location of the subject user may be:
In this example, assuming again that NMAX is 3, Photography, Hiking, and Technology are selected and corresponding questions are provided to the subject user. For this example, assume that the subject user does not respond to any of the questions. Since the keyword Photography is already in the user profile of the subject user, a new probability is computed for the keyword Photography based on the old probability, which in this case is 8, and the number of user matches for the keyword Photography in the aggregate profile for the new crowd. The keywords Hiking and Technology are not already in the user profile of the subject user. As such, the keywords Hiking and Technology are added to the user profile of the subject user along with corresponding probabilities computed based on the number of user matches for those keywords in the aggregate profile of the new crowd. The updated user profile of the subject user may then be:
More specifically, in order to augment the user profile of the subject user based on the aggregate profile of the crowd at or near the current location of the subject user, the profile augmentation function 40 first provides a preliminary question to the subject user (step 2300). The preliminary question is preferably an open-ended question regarding the interests of the subject user. For example, the preliminary question may be “What is one of your current topics of interest?.” The profile augmentation function 40 then receives an answer from the subject user (step 2302). The answer from the subject user is referred to herein as a topic of interest. The profile augmentation function 40 then selects a desired number of keywords from the aggregate profile that have the highest number of user matches, do not have fixed probabilities in the user profile of the subject user, and have the highest affinity to the topic of interest provided by the subject user (step 2304). Affinity between the topic of interest and the keywords in the aggregate profile may be determined using any suitable technique for determining the affinity between two keywords. For example, an ontology or similar data structure that defines relationships between words and topics may be used to determine the degree of relationship between two keywords (e.g., degrees of separation of the two keywords in the ontology), where the degree of relationship is utilized as the affinity of the two keywords.
For example, assume that the user profile of the subject user is:
Also, assume that the aggregate profile is:
Further assume that the topic of interest provided by the subject user is History and that the desired number of keywords to be selected is three (3). As such, the profile augmentation function 40 selects the Sports keyword because it has the highest number of user matches and is not already included in the user profile of the subject user with a fixed probability. The profile augmentation function 40 then selects two more keywords from the keywords China, Coffee, and Europe based on the affinities of those keywords to the topic of interest, which is History. As such, in this example, the profile augmentation function 40 selects the keywords China and Europe because they have higher affinities with the topic of interest than the keyword Coffee.
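As a non-limiting illustration, a selection that combines the number of user matches with affinity to the topic of interest may be sketched as follows. Here the ontology is modeled as a hypothetical adjacency mapping, affinity is taken to be higher when the degrees of separation are fewer, and the number of keywords chosen purely by user matches is exposed as a parameter; the counts, the ontology contents, and all names are assumptions of this sketch.

```python
from collections import deque


def degrees_of_separation(ontology, start, goal):
    """Breadth-first search distance between two terms, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        term, depth = queue.popleft()
        for neighbor in ontology.get(term, ()):
            if neighbor == goal:
                return depth + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return None


def select_with_affinity(aggregate, fixed_keywords, topic, ontology,
                         n_total=3, n_by_matches=1):
    """Choose keywords first by user matches, then by affinity to the topic."""
    candidates = sorted(
        (k for k in aggregate if k not in fixed_keywords),
        key=lambda k: aggregate[k],
        reverse=True,
    )
    chosen = candidates[:n_by_matches]
    remaining = candidates[n_by_matches:]

    def separation(keyword):
        distance = degrees_of_separation(ontology, topic, keyword)
        return float("inf") if distance is None else distance

    # Fewer degrees of separation means higher affinity; ties keep match order.
    remaining.sort(key=separation)
    return chosen + remaining[: max(0, n_total - len(chosen))]


# Hypothetical ontology and counts, for illustration only.
ontology = {
    "History": ["China", "Europe"],
    "China": ["History"],
    "Europe": ["History"],
    "Coffee": ["Food"],
    "Food": ["Coffee"],
}
aggregate = {"Tennis": 12, "Sports": 10, "China": 8, "Coffee": 6, "Europe": 3}
selected = select_with_affinity(aggregate, {"Tennis"}, "History", ontology)
```

With these hypothetical inputs, Sports is chosen by user matches and China and Europe are chosen by affinity, matching the example above.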
Next, the profile augmentation function 40 generates a question for each of the selected keywords (step 2306). In general, the questions are automatically generated in order to determine whether the subject user is interested in the identified keywords. Returning to our example, if the identified keywords are Sports, China, and Europe, the profile augmentation function 40 may generate the following questions: “Do you have an interest in sports?,” “Do you have an interest in China?,” and “Do you have an interest in Europe?.” Alternatively, the questions for the identified keywords may be a list of the identified keywords where the subject user will be enabled to select “Yes” or “No” for each of the identified keywords, or the like. The profile augmentation function 40 then provides the questions for the identified keywords to the subject user (step 2308), and receives answers to the questions from the subject user (step 2310). Note that the subject user may choose to answer all, some, or none of the questions. Lastly, the profile augmentation function 40 updates the user profile of the subject user based on any answers received from the subject user and, in this embodiment, the number of user matches for the keywords in the aggregate profile of the crowd (step 2312). More specifically, in one embodiment, the profile augmentation function 40 updates the user profile of the subject user using the process of
Next, the MAP server 12 obtains an aggregate profile for the crowd (step 2406). More specifically, the MAP server 12 compares the user profiles of the users in the crowd to one another to generate the aggregate profile of the crowd. As discussed above, in this embodiment, the aggregate profile of the crowd preferably includes a number of keywords and, for each keyword, a number of user matches for the keyword in the user profiles of the users in the crowd. The profile augmentation function 40, which for this embodiment is implemented on the MAP server 12, then augments the user profile of the system user based on the aggregate profile of the crowd in the manner described above (step 2408). Preferably, the process returns to step 2402 and is periodically or otherwise repeated to further augment the user profile of the system user over time.
Next, the profile augmentation function 40 obtains a historical aggregate profile for users historically located at or near the current location of the subject user (step 2602). In one embodiment, the users historically located at or near the current location of the subject user are users that were located at or near the current location of the subject user within one or more defined time windows. Each time window may be a relative time window that is relative to the current time (e.g., past 6 months, past 2 years, or the like) or an absolute time window (e.g., Friday, Feb. 12, 2010 or the like). In general, the users historically located at or near the current location of the subject user are users that were historically located within a defined geographic region encompassing (e.g., centered at) the current location of the subject user. In the preferred embodiment, the historical aggregate profile includes a number of keywords and, for each keyword, a number of user matches, or occurrences, of the keyword in the user profiles of the users historically located at or near the current location of the subject user. The profile augmentation function 40 then augments a user profile of the subject user based on the historical aggregate profile (step 2604).
More specifically, in order to augment the user profile of the subject user based on the historical aggregate profile for the current location of the subject user, the profile augmentation function 40 identifies, or selects, a predetermined number (NMAX) of keywords from the historical aggregate profile having the highest number of user matches and not having a fixed probability in the user profile of the subject user (step 2700). Keywords in the user profile of the subject user having fixed probabilities are keywords that have been entered by the subject user into the user profile. In addition, as discussed below, the keywords having fixed probabilities may also include keywords in which the subject user has expressed an interest or disinterest in response to questions from the profile augmentation function 40.
For example, assume that the user profile of the subject user is:
Also, assume that the historical aggregate profile is:
As such, the profile augmentation function 40 selects, at most, the predetermined number (NMAX) of the keywords from the historical aggregate profile having the highest number of user matches and not having a fixed probability in the user profile of the subject user. In this example, assume that NMAX is three (3). Here, the three keywords having the highest number of user matches and not having fixed probabilities in the user profile of the subject user are the keywords Books, Photography, and Fishing. Note that the keyword Tennis has a fixed probability in the user profile of the subject user.
Next, the profile augmentation function 40 generates a question for each of the identified keywords (step 2702). In general, the questions are automatically generated in order to determine whether the subject user is interested in the identified keywords. Returning to our example, if the identified keywords are Books, Photography, and Fishing, the profile augmentation function 40 may generate the following questions: “Do you have an interest in books?,” “Do you have an interest in photography?,” and “Do you like fishing?.” Alternatively, the questions for the identified keywords may be a list of the identified keywords where the subject user will be enabled to select “Yes” or “No” for each of the identified keywords, or the like. The profile augmentation function 40 then provides the questions for the identified keywords to the subject user (step 2704), and receives answers to the questions from the subject user (step 2706). Note that the subject user may choose to answer all, some, or none of the questions. Lastly, the profile augmentation function 40 updates the user profile of the subject user based on any answers received from the subject user and, in this embodiment, the number of user matches for the keywords in the historical aggregate profile (step 2708). In one embodiment, the profile augmentation function 40 updates the user profile of the subject user using the process of
Continuing our example, assume that the subject user responds positively to the question for the keyword Books, responds negatively to the question for the keyword Photography, and does not respond to the question for the keyword Fishing. After the process of
Note that future iterations of the profile augmentation process may be performed in order to further augment the user profile of the subject user.
More specifically, in order to augment the user profile of the subject user based on the historical aggregate profile of users historically located at or near the current location of the subject user, the profile augmentation function 40 first provides a preliminary question to the subject user (step 2800). The preliminary question is preferably an open-ended question regarding the interests of the subject user. For example, the preliminary question may be “What is one of your current topics of interest?.” The profile augmentation function 40 then receives an answer from the subject user (step 2802). The answer from the subject user is referred to herein as a topic of interest. The profile augmentation function 40 then selects a desired number of keywords from the historical aggregate profile that have the highest number of user matches, do not have fixed probabilities in the user profile of the subject user, and have the highest affinity to the topic of interest provided by the subject user (step 2804). Affinity between the topic of interest and the keywords in the historical aggregate profile may be determined using any suitable technique for determining the affinity between two keywords. For example, an ontology or similar data structure that defines relationships between words and topics may be used to determine the degree of relationship between two keywords (e.g., degrees of separation of the two keywords in the ontology), where the degree of relationship is utilized as the affinity of the two keywords.
For example, assume that the user profile of the subject user is:
Also, assume that the historical aggregate profile is:
Further assume that the topic of interest provided by the subject user is the Winter Olympics and that the desired number of keywords to be selected is three (3). As such, the profile augmentation function 40 selects Books and Photography because they have the highest number of user matches for keywords that do not have fixed probabilities in the user profile of the subject user. The profile augmentation function 40 then selects one more keyword from the keywords Coffee, China, Skiing, Europe, and Fishing based on the affinities of those keywords to the topic of interest, which is the Winter Olympics. As such, in this example, the profile augmentation function 40 selects the keyword Skiing because it has a higher affinity with the topic of interest than the other keywords.
Next, the profile augmentation function 40 generates a question for each of the selected keywords (step 2806). In general, the questions are automatically generated in order to determine whether the subject user is interested in the identified keywords. Returning to our example, if the identified keywords are Books, Photography, and Skiing, the profile augmentation function 40 may generate the following questions: “Do you have an interest in books?,” “Do you have an interest in photography?,” and “Do you have an interest in skiing?.” Alternatively, the questions for the identified keywords may be a list of the identified keywords where the subject user will be enabled to select “Yes” or “No” for each of the identified keywords, or the like. The profile augmentation function 40 then provides the questions for the identified keywords to the subject user (step 2808), and receives answers to the questions from the subject user (step 2810). Note that the subject user may choose to answer all, some, or none of the questions. Lastly, the profile augmentation function 40 updates the user profile of the subject user based on any answers received from the subject user and, in this embodiment, the number of user matches for the keywords in the historical aggregate profile for the current location of the subject user (step 2812). More specifically, in one embodiment, the profile augmentation function 40 updates the user profile of the subject user using the process of
Next, the history manager 58 obtains history objects relevant to the bounding box and the time window for the historical aggregate profile request from the datastore 66 of the MAP server 12 (step 3104). The relevant history objects are history objects recorded for time periods within or intersecting the time window and for locations, or geographic areas, within or intersecting the bounding box for the historical aggregate profile request. At this point, the history manager 58 gets the next history object from the relevant history objects identified in step 3104 (step 3106). The history manager 58 then generates an aggregate profile for the history object (step 3108). In order to generate the aggregate profile for the history object, the history manager 58 compares the user profiles of the anonymous user records stored in the history object to one another. The resulting aggregate profile for the history object includes a number of keywords and, for each keyword, a number of user matches for that keyword. The history manager 58 then determines whether there are more history objects to be processed (step 3110). If so, the process returns to step 3106 and is repeated until all of the history objects that are relevant to the historical aggregate profile request have been processed. Once all of the history objects have been processed, the history manager 58 combines the aggregate profiles of the history objects to provide the historical aggregate profile for the current location of the subject user (step 3112). More specifically, in this embodiment, the history manager 58 combines the aggregate profiles of the history objects to provide a list of keywords and, for each keyword, a number of user matches for that keyword over all of the history objects.
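As a non-limiting illustration, steps 3104 through 3112 may be sketched as follows. The `HistoryObject` structure, its field names, the rectangular representation of geographic areas, and the numeric representation of time are assumptions of this sketch.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class HistoryObject:
    """Hypothetical record: an area, a time period, and anonymous user profiles."""
    box: tuple            # (min_x, min_y, max_x, max_y)
    start: float          # start of the recorded time period
    end: float            # end of the recorded time period
    user_profiles: list   # one keyword list per anonymous user record


def boxes_intersect(a, b):
    """True if two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def historical_aggregate_profile(history, box, window_start, window_end):
    """Combine per-object aggregate profiles into keyword -> total user matches."""
    combined = Counter()
    for obj in history:
        # Step 3104: keep objects within or intersecting the box and the window.
        if not boxes_intersect(obj.box, box):
            continue
        if obj.end < window_start or obj.start > window_end:
            continue
        # Step 3108: aggregate profile for this history object.
        per_object = Counter()
        for keywords in obj.user_profiles:
            per_object.update(set(keywords))
        # Step 3112: accumulate user matches over all relevant history objects.
        combined.update(per_object)
    return dict(combined)
```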
Before proceeding, it should be noted that, in the embodiments described above, the profile augmentation function 40 utilizes questions asked of the subject user. However, the profile augmentation function 40 is not limited thereto. In another embodiment, no questions are asked of the subject user. Rather, the profile augmentation function 40 operates to update the user profile of the subject user based on the aggregate profile data (i.e., the aggregate profile of the crowd at or near the current location of the subject user or the historical aggregate profile for the current location of the subject user) according to steps 2200, 2204-2214, and 2222 of
It should also be noted that the profile augmentation function 40 may utilize information in addition to the aggregate profile data when augmenting a user profile of a subject user. For example, in the embodiments of
Those skilled in the art will recognize improvements and modifications to the embodiments of the present invention. All such improvements and modifications are considered within the scope of the concepts disclosed herein and the claims that follow.
This application claims the benefit of provisional patent application Ser. No. 61/163,091, filed Mar. 25, 2009, the disclosure of which is hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
61/163,091 | Mar 2009 | US