This application claims priority under 35 U.S.C. § 119 from European Patent Application No. 18170964.3, filed May 7, 2018, the entire disclosure of which is herein expressly incorporated by reference.
The present disclosure relates to a method and system for modeling user and location, in particular for use in a location-based services environment. In a preferred embodiment, there is provided a method and system for modeling user and location for use in a location-based recommendation service.
Location-based services (LBS) have become highly relevant in many application domains, in particular with respect to automotive digital services. An effective and efficient use of location-based services can very much benefit from proper modeling of users and location, as well as of relations between users and locations.
“User modeling for point-of-interest recommendations in location-based social networks: the state-of-the-art”, Shudong Liu, School of Information & Security Engineering, Zhongnan University of Economics & Law, Wuhan 430073, China, describes that the rapid growth of location-based services has greatly enriched people's urban lives and attracted millions of users in recent years. Location-based social networks (LBSN) allow users to check in at a physical location and share daily tips on points-of-interest (POI) with other users. Such check-in behavior can make daily real-life experiences spread quickly through the Internet. Moreover, such check-in data in LBSNs can be exploited to understand the basic laws of humans' daily movement and mobility. The author focuses on reviewing the taxonomy of user modeling for POI recommendations through the data analysis of LBSNs. The structure and data characteristics of LBSNs are introduced, then a formalization of user modeling for POI recommendations in LBSNs is presented. Depending on which type of LBSN data was fully utilized in user modeling approaches for POI recommendations, user modeling algorithms can be divided into four categories: pure check-in data-based user modeling, geographical information-based user modeling, spatio-temporal information-based user modeling, and geo-social information-based user modeling. The author mentions that spatial clustering results from users' tendency to visit nearby places rather than distant ones in their daily lives, thereby generating clusters containing different visited locations within that same cluster. This may make it difficult to reliably identify single locations visited by a user.
“Mining Interesting Locations and Travel Sequences from GPS Trajectories”, Yu Zheng, et al., Microsoft Research Asia, WWW 2009, Apr. 20-24, 2009, Madrid, Spain, proposes a hypertext-induced topic search based inference model, which regards an individual's access on a location as a directed link from the user to that location. This model infers the interest of a location by taking into account three factors: the notion that the interest of a location depends not only on the number of users visiting the location but also on the users' travel experiences; the notion that users' travel experiences and location interests have a mutual reinforcement relationship; and the notion that the interest of a location and the travel experience of a user are relative values and are region-related. The authors have not, however, studied an algorithm framework in the semantic space that can not only improve interpretability but also bring user and location into the same measurable space for computation, for example of a similarity for location recommendations or of user and location profiles, which are fundamental for smart location-based services.
For the application domain of automotive digital services, the visits of a user to a certain location play an important role. Thus, it would be beneficial if the general relationship between user and location could be determined across a large set of users and locations, and if the user and location could be scored as user and location profiles.
Therefore, there is a need for an efficient and effective algorithm for achieving user and location scoring that can be used, for example, in smart location mining and recommendation, gamification, and for user and location profiles.
In particular, there is a need for an algorithm modeling framework that allows both user and location to be brought into a measurable semantic space.
One or more of the objects specified above are substantially achieved by methods and system for modeling user and location in accordance with any one of the appended claims, which alleviate or eliminate one or more of the disadvantages described above and which realize one or more of the aforementioned advantages.
According to the invention, there is provided a method for determining a preference of a user for a location. The method includes determining a user-location relation based on a plurality of relations of a plurality of users with a plurality of locations, determining a plurality of POItags indicative of one or more properties of the plurality of locations, determining a user-POItag relation based on the plurality of users and the plurality of POItags, determining a location-POItag relation based on the plurality of locations and the plurality of POItags, and determining the preference of the user for the location based on at least one of the user-location relation, the user-POItag relation, and the location-POItag relation.
In a preferred embodiment, determining the user-POItag relation and/or determining the location-POItag relation is based on a probabilistic matrix factorization model.
In a preferred embodiment, the method further includes determining the locations based on a clustering of a plurality of geolocations. In some embodiments, each geolocation of the plurality of geolocations is indicative of a visit of a user of the plurality of users to a location of the plurality of locations.
In a preferred embodiment, each relation of the plurality of relations of the users with the locations is determined based on a visit of a user of the plurality of users to a location of the plurality of locations. In some embodiments, the visit is determined based on $r_{i,j} = f(\mathrm{visit\_count}(u_i, l_j))$, where visit_count represents the number of visits of the user $u_i$ to the location $l_j$.
In a preferred embodiment, the plurality of users, the plurality of locations, and/or the plurality of POItags include latent factors. The latent factors may be represented in semantic space.
In a preferred embodiment, the relation of a user with respect to a location is determined as $r_{ij} = u_i^{T} \times l_j$.
In a preferred embodiment, the method further includes normalizing parameter values indicative of each user of the plurality of users and/or normalizing parameter values indicative of each location of the plurality of locations. Normalizing may be based on the following compression function: f(x)=√x.
In a preferred embodiment, one or more of the following steps is based on an estimation algorithm that is convex optimizable: determining a plurality of POItags indicative of one or more properties of the plurality of locations, determining a user-POItag relation based on the plurality of users and the plurality of POItags, and determining a location-POItag relation based on the plurality of locations and the plurality of POItags. The determining may be performed in semantic space.
According to the invention there is further provided a system for determining a preference of a user, the system including a control unit configured for performing the method in accordance with the present invention.
According to the invention there is further provided a vehicle including the system according to the present disclosure.
Other objects, advantages and novel features of the present invention will become apparent from the following detailed description of one or more preferred embodiments when considered in conjunction with the accompanying drawings.
The accompanying drawings disclose exemplifying and non-limiting aspects in accordance with embodiments of the present invention.
Proper modeling of users and locations is key to providing relevant location-based services. Within this disclosure, point of interest (POI) tags are applied to individual locations in order to illustrate a user's interest and to provide structure to a location's profile. A profile is a collection of data indicative of properties of an entity, for example a user or location. The methods and systems disclosed herein propose a novel approach to modeling user and location, incorporating relationships between users and POItags as well as between locations and POItags, and employing a user-location matrix in collaborative filtering to establish a user-location modeling framework.
When planning to visit a certain place, users typically select a corresponding POI or address in order to identify the place in terms of navigation and destination management. Therefore, it can be assumed that users have or can easily accumulate a collection of POItags and/or addresses from navigation and destination management (e.g. from a navigation system in a vehicle or a corresponding app running on a mobile device) which can be used to reveal the users' interests over time. Simultaneously, locations can be provided with POItags, for example by a map provider or from different users (e.g. crowdsourcing). Such POItags can be used as input for a location's profile.
Therefore, there are several data sets available for processing: user (interest) profiles, location profiles, and the modeling of respective relationships. An exemplary use case benefits from such data sets in that user-POItag and location-POItag relationships (i.e. matrixes) can be employed in a recommendation system. For example, a user may define a type of place they intend to visit and the methods and systems disclosed herein can be used in order to provide the user with a corresponding recommendation.
In a first step, a clustering algorithm is applied in order to determine significant locations from a set of locations l visited by users u over a period of time. In some embodiments, density-based clustering is used. In density-based clustering, clusters may be defined as areas of higher density than the remainder of a data set. Elements located in sparsely populated areas, which are necessary to separate different clusters, may be considered noise or border points. It is noted that other clustering methods or algorithms are applicable. As a result, several clusters can be determined, each denoting a (relatively small) area containing a number of visited geolocations, without these geolocations needing to be identical.
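The clustering step can be sketched as follows. This is a minimal, illustrative density-based clustering in the style of DBSCAN, not necessarily the algorithm used in any embodiment. The function name `dbscan`, the parameters `eps` and `min_pts`, and the use of planar Euclidean distance on already-projected coordinates are assumptions made for this example.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal density-based clustering over 2-D points.

    Returns one label per point: 0, 1, ... for clusters, -1 for noise.
    eps and min_pts are illustrative parameters, not values from the text.
    """
    NOISE, UNSEEN = -1, None
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # All points (including i itself) within eps of point i.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point of the current cluster
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, keep expanding
    return labels

# Hypothetical visit coordinates (e.g. metres in a local planar projection).
visits = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
          (50.0, 50.0), (51.0, 50.0), (50.0, 51.0),
          (200.0, 200.0)]
labels = dbscan(visits, eps=2.0, min_pts=3)
```

In this sketch, nearby but non-identical geolocations receive the same cluster label, and an isolated geolocation is labeled as noise.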
Next, a user and location interaction matrix R is built as:
$r_{i,j} = f(\mathrm{visit\_count}(\mathrm{user}_i, \mathrm{location}_j))$.
Here, f is a monotonic function (e.g. a linear function f(x)=x). In order to avoid users showing a large number of visits being overly dominant with respect to other users in the data set, a compression function can be applied. In one example the compression function is defined as f(x)=√x.
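A minimal sketch of building the interaction matrix R with the square-root compression function is given below. The function name `build_interaction_matrix`, the representation of visits as (user, location) pairs, and the sample data are assumptions made for illustration.

```python
import math
from collections import Counter

def build_interaction_matrix(visits, users, locations, f=math.sqrt):
    """Build R with R[i][j] = f(visit_count(user_i, location_j)).

    visits is a list of (user, location) pairs, one per visit.
    f defaults to the square-root compression function.
    """
    counts = Counter(visits)  # (user, location) -> number of visits
    return [[f(counts[(u, l)]) for l in locations] for u in users]

# Hypothetical visit log: alice visits the cafe nine times, the gym once.
visit_log = [("alice", "cafe")] * 9 + [("alice", "gym")] + [("bob", "gym")] * 4
R = build_interaction_matrix(visit_log, ["alice", "bob"], ["cafe", "gym"])
```

With the linear function f(x)=x, the first row would be 9 and 1. With the square-root compression it becomes 3 and 1, so the frequent visitor dominates less.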
Subsequently, a user experience score, i.e. a score indicative of the experience of a user u with respect to a location l, is determined based on a geolocation and location significance score, i.e. a combined score of a user and that of a location. A user score vector may be denoted as $u = (u_1, \ldots, u_N)$ across N users and a location score vector may be denoted as $l = (l_1, \ldots, l_M)$ across M locations. The vector $u_0$ can be initialized as
Therefore, for the n-th iteration:
$l_n = u_{n-1} \cdot R$ and
$u_n = l_n \cdot R^{T}$
The iteration is terminated if $|u_n - u_{n-1}| < \varepsilon$.
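The mutual-reinforcement iteration can be sketched as below. Two points are assumptions of this example rather than statements of the disclosure: the user score vector is started from a uniform value, and both vectors are rescaled to unit length after each step so that the scores stay bounded. The function name and the parameters `eps` and `max_iter` are likewise hypothetical.

```python
import math

def score_users_and_locations(R, eps=1e-9, max_iter=1000):
    """Iterate l = u . R and u = l . R^T until u changes by less than eps."""
    n_users, n_locs = len(R), len(R[0])

    def normalize(v):
        # Assumption: rescale to unit Euclidean length to keep scores bounded.
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v] if norm > 0 else v

    u = normalize([1.0] * n_users)  # assumed uniform starting vector
    l = [0.0] * n_locs
    for _ in range(max_iter):
        # Location scores from the previous user scores: l_n = u_{n-1} . R
        l = normalize([sum(u[i] * R[i][j] for i in range(n_users))
                       for j in range(n_locs)])
        # User scores from the new location scores: u_n = l_n . R^T
        u_next = normalize([sum(l[j] * R[i][j] for j in range(n_locs))
                            for i in range(n_users)])
        delta = math.sqrt(sum((a - b) ** 2 for a, b in zip(u_next, u)))
        u = u_next
        if delta < eps:  # termination criterion
            break
    return u, l

# Hypothetical interaction matrix: two users, two locations.
user_scores, location_scores = score_users_and_locations([[3.0, 1.0],
                                                          [0.0, 2.0]])
```

In this example the first user, who has more visits overall, ends up with the higher score, and the first location is ranked above the second.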
Using merely POItag data in order to semantically describe a given location may introduce ambiguity in some cases, since users may make use of different POItags in connection with a single location. The POItag includes, for example, the geolocation region information (e.g. city, state and country) due to the high relevance within the application domain as a proven way to identify certain locations. Further, POItags may also include category information (e.g. restaurant, shopping, recreational). It is noted that a score, for either a user or a location, may be provided with a factor or function indicative of a decay. In some applications, relevance of a user score or a location score may be rather short-lived (e.g. users change their behavior and/or locations being visited less frequently). In such applications, it is desirable to let scores decay over time in order to quickly adapt to changing situations.
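One possible decay function is sketched below. The disclosure does not specify the form of the decay, so the exponential form, the half-life parameter, and the function name `decayed_score` are assumptions chosen only to illustrate the idea.

```python
def decayed_score(score, age_days, half_life_days=30.0):
    """Apply an assumed exponential decay to a user or location score.

    The score halves every half_life_days; both the exponential form and
    the default half-life are illustrative choices.
    """
    return score * 0.5 ** (age_days / half_life_days)

fresh = decayed_score(8.0, age_days=0)    # no time elapsed
month_old = decayed_score(8.0, age_days=30)  # one half-life elapsed
```

A shorter half-life makes the scores adapt more quickly to changed behavior, at the cost of forgetting long-term habits sooner.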
So-called latent factors can be used to represent user, location, and POItag. Each factor can be a topic that can illustrate the user's preference, location profile, or POItag's semantic meaning, all in latent but semantic space. Latent variables, as opposed to observable variables, are variables that are not directly observed but are rather inferred through a mathematical model from other variables that are observed or directly measured. Machine learning models that aim to explain observed variables in terms of latent variables are called latent variable models. Those variables can be semantically measurable using a suitable distance measure in the latent space. Therefore, variables having similar semantics are close to one another, in terms of the selected distance measure, in the latent space. By incorporating the POItag into the latent factor model, POItags can be used to reveal a factor's semantic meaning.
Table 210 contains POItags Tk, table 220 contains users Ui, and table 230 contains locations Lj. The concept includes applying collaborative filtering in order to determine matrixes 240, 250, and 260 from tables 210, 220, and 230. “Collaborative Filtering with User Ratings and Tags”, Tengfei Bao, Yong Ge, Enhong Chen, Hui Xiong, Jilei Tian, which is incorporated herein by reference in its entirety, describes a collaborative filtering model based on probabilistic matrix factorization in order to predict users' interests in items by simultaneously utilizing both tag and rating information. This method can be applied in the present example. It is noted that the article mentions “latent features” instead of “latent factors” as used in the present disclosure. These terms are used interchangeably as they refer to the same concept. Using the method, a low-rank approximation for three matrices is performed at the same time to learn the low-dimensional latent factors of users, items, and tags. Then, one user's preference for an item is predicted as the product of the user and item latent factors. It has been shown that the proposed method can significantly outperform benchmark methods, which is beneficial in the present application domain.
Generally, user and location are represented with POItag distributions, for example:
User u: POItag 1: 20, POItag 2: 30, . . . .
Location l: POItag 1: 0, POItag 2: 100, . . . .
The respective values may be normalized in order to project the values into a common interval (e.g. [0, 1.0]), for example:
User u: POItag 1: 0.02, POItag 2: 0.03, . . . .
Location l: POItag 1: 0, POItag 2: 0.07, . . . .
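A minimal sketch of such a normalization follows. It assumes that each value is divided by the sum over all POItags of the profile, which is one way to project values into [0, 1.0]; the disclosure does not fix the normalization rule. The function name and the tag names are hypothetical.

```python
def normalize_profile(tag_counts):
    """Scale POItag counts so they sum to 1 (assumed sum normalization)."""
    total = sum(tag_counts.values())
    if total == 0:
        # Profile without any tagged visits: keep all weights at zero.
        return {tag: 0.0 for tag in tag_counts}
    return {tag: count / total for tag, count in tag_counts.items()}

# Hypothetical user profile over three category POItags.
user_profile = normalize_profile({"restaurant": 20,
                                  "shopping": 30,
                                  "recreational": 50})
```

After normalization, user and location profiles are comparable even when the underlying visit counts differ by orders of magnitude.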
Further, user-POItag, location-POItag, and user-location relationships may be modeled within a probabilistic matrix factorization model. The user, location, and POItag are in semantic spaces, and the user-POItag, location-POItag, and user-location matrix are generated as the inner product in semantic space. In particular, the user and location on POItag space are the projections from semantic space.
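To illustrate the factorization idea, the sketch below learns user and location latent factors for the user-location matrix alone, using stochastic gradient descent on a squared-error objective with L2 regularization. This is a simplification: the model referenced above factorizes three matrices jointly, and the optimizer, the hyperparameters, and all names used here are assumptions for this example.

```python
import random

def factorize(R, rank=2, lr=0.05, reg=0.01, epochs=2000, seed=0):
    """Learn factors U (users) and L (locations) with R[i][j] ~ U[i] . L[j].

    Entries of R that are None are treated as unobserved and skipped.
    """
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    U = [[0.1 * rng.random() for _ in range(rank)] for _ in range(n)]
    L = [[0.1 * rng.random() for _ in range(rank)] for _ in range(m)]
    for _ in range(epochs):
        for i in range(n):
            for j in range(m):
                if R[i][j] is None:
                    continue
                pred = sum(U[i][z] * L[j][z] for z in range(rank))
                err = R[i][j] - pred
                for z in range(rank):
                    u, l = U[i][z], L[j][z]
                    # Gradient step on squared error plus L2 penalty.
                    U[i][z] += lr * (err * l - reg * u)
                    L[j][z] += lr * (err * u - reg * l)
    return U, L

def squared_error(R, U, L):
    """Sum of squared errors over the observed entries of R."""
    return sum((R[i][j] - sum(a * b for a, b in zip(U[i], L[j]))) ** 2
               for i in range(len(R)) for j in range(len(R[0]))
               if R[i][j] is not None)

# Hypothetical normalized user-location matrix with two unobserved entries.
R_obs = [[1.0, 0.0, None],
         [0.9, 0.1, 0.0],
         [0.0, None, 1.0]]
U0, L0 = factorize(R_obs, epochs=0)   # untrained starting factors
U1, L1 = factorize(R_obs)             # trained factors
```

The unobserved entries can then be predicted as the inner product of the corresponding user and location factors.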
One or more of determining the plurality of POItags 210 indicative of one or more properties of the plurality of locations 230, determining the user-POItag relation 240 based on the plurality of users 220 and the plurality of POItags 210, and determining the location-POItag relation 250 based on the plurality of locations 230 and the plurality of POItags 210, may be based on an estimation algorithm that is convex optimizable. One option for an estimation algorithm includes estimating one parameter while keeping other parameters fixed, in order to arrive at a convex optimizable problem. The relation $r_{ij}$ of a user $u_i$ with a location $l_j$ is determined as $r_{ij} = u_i^{T} \times l_j$. Both user and location are transformed into latent space, such that a distance or relevancy between them can be measured using the selected distance measure (see above). Here, each dimension of the semantic space is represented by a corresponding latent variable. In order to interpret the semantic meaning of a dimension z, the top k POItags in that dimension of the semantic space can be used.
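The inner-product relation and the interpretation of a latent dimension through its top k POItags can be sketched as follows. The factor values, the tag names, and the function names `preference` and `top_k_tags` are hypothetical and serve only as an illustration.

```python
def preference(user_factors, location_factors):
    """Relation of a user with a location as the inner product of factors."""
    return sum(a * b for a, b in zip(user_factors, location_factors))

def top_k_tags(tag_factors, z, k):
    """Return the k POItags with the largest weight in latent dimension z."""
    ranked = sorted(tag_factors, key=lambda tag: tag_factors[tag][z],
                    reverse=True)
    return ranked[:k]

# Hypothetical two-dimensional latent factors.
user = [0.9, 0.1]
location = [0.8, 0.2]
tags = {"restaurant": [0.9, 0.1], "cafe": [0.7, 0.2], "gym": [0.1, 0.8]}

score = preference(user, location)
dimension_0_meaning = top_k_tags(tags, z=0, k=2)
```

Here dimension 0 would be read as a food-related topic, because its highest-weighted POItags are "restaurant" and "cafe".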
In accordance with the present invention, it is, thus, very easy to determine a similarity of users across locations and, likewise, a similarity of locations across the users. Based on a similarity matrix so generated, the recommendation can be determined and users can be provided with a suggestion to visit locations preferred by similar users. Further, a user score can be used for gamification. For example, an opinion leader or other important user (e.g. a local person as compared to a visitor) will typically exhibit a high score, due to frequent visits to locations within a region, effectively making that person an expert user for the region.
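A similarity-based recommendation of this kind can be sketched as below. Cosine similarity over rows of the user-location matrix is an assumption of this example, since the disclosure leaves the similarity measure open. The function names and the sample matrix are also hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equally long vectors (0.0 if one is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def recommend(R, user, locations):
    """Suggest locations visited by the most similar user but not by `user`."""
    others = [i for i in range(len(R)) if i != user]
    best = max(others, key=lambda i: cosine(R[user], R[i]))
    return [locations[j] for j, r in enumerate(R[best])
            if r > 0 and R[user][j] == 0]

# Hypothetical user-location matrix: three users, three locations.
R_demo = [[3.0, 1.0, 0.0],
          [2.0, 1.0, 2.0],
          [0.0, 0.0, 1.0]]
suggestions = recommend(R_demo, user=0, locations=["cafe", "gym", "park"])
```

The same similarity function can be applied to the columns of the matrix in order to compare locations across users.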
The foregoing disclosure has been set forth merely to illustrate the invention and is not intended to be limiting. Since modifications of the disclosed embodiments incorporating the spirit and substance of the invention may occur to persons skilled in the art, the invention should be construed to include everything within the scope of the appended claims and equivalents thereof.
Number | Date | Country | Kind |
---|---|---|---|
18170964 | May 2018 | EP | regional |
Number | Name | Date | Kind |
---|---|---|---|
8762302 | Spivack | Jun 2014 | B1 |
8983494 | Onnen | Mar 2015 | B1 |
9237543 | Karr | Jan 2016 | B2 |
9282161 | Hill | Mar 2016 | B1 |
9369988 | Johnson | Jun 2016 | B1 |
9516467 | Cronin | Dec 2016 | B1 |
9710873 | Hill | Jul 2017 | B1 |
9774696 | Calvert | Sep 2017 | B1 |
20040198386 | Dupray | Oct 2004 | A1 |
20070287473 | Dupray | Dec 2007 | A1 |
20080301727 | Cristofalo | Dec 2008 | A1 |
20100030578 | Siddique | Feb 2010 | A1 |
20100156933 | Jones | Jun 2010 | A1 |
20110010364 | Ahtisaari et al. | Jan 2011 | A1 |
20110238517 | Ramalingam | Sep 2011 | A1 |
20110302124 | Cai | Dec 2011 | A1 |
20110302162 | Xiao | Dec 2011 | A1 |
20120036015 | Sheikh | Feb 2012 | A1 |
20120058775 | Dupray | Mar 2012 | A1 |
20130124449 | Pinckney | May 2013 | A1 |
20130268357 | Heath | Oct 2013 | A1 |
20130290106 | Bradley | Oct 2013 | A1 |
20140052527 | Roundtree | Feb 2014 | A1 |
20140074639 | Tian | Mar 2014 | A1 |
20140129331 | Spivack | May 2014 | A1 |
20140278992 | Roundtree | Sep 2014 | A1 |
20140279727 | Baraniuk | Sep 2014 | A1 |
20140297617 | Rajakarunanayake | Oct 2014 | A1 |
20140297669 | Rajakarunanayake | Oct 2014 | A1 |
20140370844 | Lara | Dec 2014 | A1 |
20160210602 | Siddique | Jul 2016 | A1 |
20160316325 | Sadr | Oct 2016 | A1 |
20180165554 | Zhang | Jun 2018 | A1 |
20190303807 | Gueye | Oct 2019 | A1 |
Number | Date | Country |
---|---|---|
WO 2013107669 | Jul 2013 | WO |
Entry |
---|
Extended European Search Report issued in counterpart European Application No. 18170964.3 dated Sep. 13, 2018 (nine (9) pages). |
Zheng Y. et al.,“Mining Interesting Locations and Travel Sequences from GPS Trajectories”, Microsoft Research Asia, Apr. 20, 2009, pp. 791-800, XP58025650, Madrid, Spain (10 pages). |
Number | Date | Country | |
---|---|---|---|
20190342698 A1 | Nov 2019 | US |