The present disclosure relates to the field of information processing technologies, and in particular, to a method for pushing recommendation information, a server, and a storage medium.
By recommending candidate objects such as movies, music, books, friends, groups, or goods to users, the users can obtain information about the recommended candidate objects without actively searching, which gives the users a way to obtain information passively. Currently, one kind of method for pushing recommendation information is mainly implemented based on a social-network relationship between users. For example, if a user A watches a movie M, and the user A and a user B are in a friend relationship, the user A may recommend the movie M to the user B.
However, the current method for pushing recommendation information considers only the social-network relationship between the users, and users having a social-network relationship do not necessarily have the same recommendation needs. For example, a user A and a user B are in a friend relationship, but the user A and the user B may have totally different movie preferences. As a result, recommending a movie M watched by the user A to the user B may be inaccurate. As can be seen, for the current method for pushing recommendation information based on the social-network relationship, the recommendation result is often inaccurate, and such a method needs to be improved.
Embodiments of the present invention provide a method for pushing recommendation information, a server, and a non-transitory computer-readable storage medium.
The method for pushing recommended information includes: obtaining a meta path that connects a candidate user and a target user in a heterogeneous information network, the meta path comprising a connection between the candidate user and a candidate recommendation-object and having an attribute value; obtaining a user similarity between the target user and the candidate user relative to the meta path; estimating an attribute value of a connection between the candidate recommendation-object and the target user according to the attribute value of the connection between the candidate user and the candidate recommendation-object, an attribute value constraint condition of the meta path, and the user similarity; and sending recommendation information of the candidate recommendation-object to a terminal corresponding to the target user when the estimated attribute value meets a recommendation condition.
The server includes a memory storing instructions, and a processor coupled to the memory. The processor executes the instructions and is configured for: obtaining a meta path that connects a candidate user and a target user in a heterogeneous information network, the meta path comprising a connection between the candidate user and a candidate recommendation-object and having an attribute value; obtaining a user similarity between the target user and the candidate user relative to the meta path; estimating an attribute value of a connection between the candidate recommendation-object and the target user according to the attribute value of the connection between the candidate user and the candidate recommendation-object, an attribute value constraint condition of the meta path, and the user similarity; and sending recommendation information of the candidate recommendation-object to a terminal corresponding to the target user when the estimated attribute value meets a recommendation condition.
The non-transitory computer-readable storage medium contains computer-executable instructions for, when executed by one or more processors, performing a method for pushing recommendation information. The method includes: obtaining a meta path that connects a candidate user and a target user in a heterogeneous information network, the meta path comprising a connection between the candidate user and a candidate recommendation-object and having an attribute value; obtaining a user similarity between the target user and the candidate user relative to the meta path; estimating an attribute value of a connection between the candidate recommendation-object and the target user according to the attribute value of the connection between the candidate user and the candidate recommendation-object, an attribute value constraint condition of the meta path, and the user similarity; and sending recommendation information of the candidate recommendation-object to a terminal corresponding to the target user when the estimated attribute value meets a recommendation condition.
Details of one or more embodiments of the present invention are provided in the following accompanying drawings and descriptions. Other features, objectives, and advantages of the present disclosure become more obvious from the specification, accompanying drawings, and claims.
To make the objectives, technical solutions, and advantages of the present disclosure clearer, the following further describes the present disclosure in detail with reference to the accompanying drawings and embodiments. It should be understood that, specific embodiments described herein are only used for explaining the present disclosure, but not used for limiting the present disclosure.
For ease of understanding a method for pushing recommendation information provided in the present disclosure, related concepts of a heterogeneous information network are explained and described first.
Referring to
The heterogeneous information network may be used for recommending movies. In this case, the heterogeneous information network includes different types of objects, for example, users and movies; and further includes various relationships between these objects, such as movie watching information, social networks, and attribute information of objects. The heterogeneous information network can effectively integrate various types of information that may be useful for recommendation. In addition, exploring different semantics of objects and relationships in the network can reveal subtle relationships between the objects.
For example, a meta path “user-movie-user” in
Further, unlike a conventional heterogeneous information network and a conventional meta path, where an attribute value of a connection is not considered, the disclosed heterogeneous information network used for recommending movies may include a connection with an attribute value. Specifically, a user may give a rating value of 1 to 5 to a movie that the user has watched (as a connection between a user and a movie shown in
For example, based on a path “user-movie-user”, the similarity between Tom and Mary is the same as the similarity between Tom and Bob because they have watched the same movies. However, they may give different rating values due to totally different interests. In fact, Tom and Bob give very high rating values to the same movies. Therefore, they are similar based on the rating values. Mary has a totally different taste because she does not like these movies at all. The attribute value of the connection is not considered in the conventional meta path. Therefore, these differences cannot be revealed. However, these differences are important, especially when a candidate recommendation-object is recommended. Therefore, concepts such as the existing heterogeneous information network and the existing meta path need to be extended to introduce the attribute values of the connections.
Specifically, a heterogeneous information network is defined. A network mode $S_n = (\mathcal{A}, \mathcal{R}, \mathcal{W})$ is given and includes an object type set $\mathcal{A} = \{A\}$, a relationship set $\mathcal{R} = \{Re\}$ of connected object pairs, and an attribute value type set $\mathcal{W} = \{W\}$ of relationships. An information network with an attribute value is a directed graph $G = (V, E, W)$, and includes an object type mapping function $\varphi: V \to \mathcal{A}$, a connection type mapping function $\psi: E \to \mathcal{R}$, and an attribute value type mapping function $\theta: W \to \mathcal{W}$. Each object $v \in V$ belongs to a particular object type $\varphi(v) \in \mathcal{A}$, each connection $e \in E$ belongs to a particular relationship $\psi(e) \in \mathcal{R}$, and each attribute value $w \in W$ belongs to a particular attribute value type $\theta(w) \in \mathcal{W}$. When the number of object types $|\mathcal{A}| = 1$ and the number of relationship types $|\mathcal{R}| = 1$, the network may be referred to as a homogeneous information network. When $|\mathcal{A}| > 1$ (or $|\mathcal{R}| > 1$) and the number of attribute value types $|\mathcal{W}| = 0$, the network may be referred to as a heterogeneous information network without an attribute value. When $|\mathcal{A}| > 1$ (or $|\mathcal{R}| > 1$) and $|\mathcal{W}| > 0$, the network may be referred to as a heterogeneous information network with an attribute value.
The conventional heterogeneous information network generally has no attribute value, that is, a relationship in the network has no attribute value or the attribute value is not considered. For the heterogeneous information network with an attribute value, some relationships in the network have attribute values, and these attribute values may be continuous or discrete. The continuous attribute values may be converted to discrete attribute values for processing. For example, in the heterogeneous information network used for recommending movies, the user may give a rating value of 1 to 5 to a movie that the user has watched. In a heterogeneous information network of scientific literature, a relationship between an author and a paper may have different attribute values to represent the position of the author in the author order of the paper.
Referring to
Two objects in the heterogeneous information network may be connected through different meta paths, and these paths have different meanings. For example, in
Next, a meta path in a heterogeneous network with an attribute value is defined. The meta path refers to a meta path whose connection has an attribute value constraint, and may be represented as $A_1 \xrightarrow{\delta_1(Re_1)} A_2 \xrightarrow{\delta_2(Re_2)} \cdots \xrightarrow{\delta_l(Re_l)} A_{l+1} \mid C$ (also may be written as A1(δ1(Re1))A2(δ2(Re2)) . . . (δl(Rel))Al+1|C for short), where the subscript l represents a number of the meta path. If a connection of a relationship Re has an attribute value, an attribute value function δ(Re) is a set of attribute values in the relationship Re; otherwise, δ(Re) is an empty set. $A_l \xrightarrow{\delta_l(Re_l)} A_{l+1}$ represents that the relationship Rel between Al and Al+1 is based on the attribute value δl(Rel). An attribute value constraint condition C of the attribute value functions is an associated constraint set between attribute value functions. If all attribute value functions of the meta path are empty sets (corresponding attribute value constraint conditions are also empty sets), the meta path is referred to as a meta path without an attribute value; otherwise, the meta path is referred to as a meta path with an attribute value or an extended meta path.
Using
(that is, U(1)M) represents that users rate 1 point for the movie, and means that the users do not like the movie. A meta path
(that is, U(1,2)M(1,2)U) represents candidate users who do not like the same movie as the target user, and the meta path UMU without an attribute value may only reflect users who have same movie watching records. In addition, attribute values of different relationships in the meta path can be limited by flexibly setting associated constraints. Referring to
As shown in
As shown in
As shown in
Step 502. Obtaining a meta path that connects a candidate user and a target user in a heterogeneous information network, where the meta path includes a connection that is between the candidate user and a candidate recommendation-object and that has an attribute value.
Specifically, the heterogeneous information network includes various types of objects, the objects include at least a candidate user, a target user, and a candidate recommendation-object, and connections between the objects represent relationships between connected objects. Here, the user is a data object obtained through mapping from a natural person, the target user represents a receiver of recommendation information, and the candidate user is a user whose relationship and attribute value are known. The candidate recommendation-object refers to an object that may be recommended to the target user, and includes at least one of movies, music, books, friends, groups, and goods.
The heterogeneous information network includes a meta path, and the meta path is connected to a candidate user and a target user, and includes a candidate recommendation-object. Object types of the meta path are symmetric. For example, in the heterogeneous information network shown in
Step 504. Obtaining a user similarity between the target user and the candidate user relative to the meta path.
The similarity refers to similarity measurement, and represents a similarity degree of two objects. The user similarity between the target user and the candidate user relative to the meta path is a similarity that is between the target user and the candidate user and that is calculated based on the meta path. Because an excessively long meta path has no meaning and generates an undesired similarity, the length of the meta path may be limited to a certain number, for example, not larger than 4. The length of the meta path is equal to the number of connections in the meta path.
In an embodiment, Step 504 includes: according to similarities between the target user and the candidate user relative to atomic meta paths of the meta path, obtaining the user similarity between the target user and the candidate user relative to the meta path.
If each attribute value function δl(Rel) in a meta path A1(δ1(Re1))A2(δ2(Re2)) . . . (δl(Rel))Al+1|C takes a fixed value, the path is referred to as an atomic meta path. One meta path is a set of all atomic meta paths that meet an attribute value constraint condition of the meta path. For an atomic meta path, an existing similarity measurement method may be directly used. The existing similarity measurement method is, for example, PathSim (Y. Sun, J. Han, X. Yan, P. Yu, and T. Wu. PathSim: Meta path-based top-k similarity search in heterogeneous information networks. In VLDB, pages 992-1003, 2011), PCRW (N. Lao and W. Cohen. Fast query execution for retrieval models based on path constrained random walks. In KDD, pages 881-888, 2010), or HeteSim (C. Shi, X. Kong, Y. Huang, P. S. Yu, and B. Wu. HeteSim: A general framework for relevance measure in heterogeneous networks. IEEE Transactions on Knowledge and Data Engineering, 26(10):2479-2492, 2014).
Step 506. Estimating an attribute value of a connection between the candidate recommendation-object and the target user according to the attribute value of the connection between the candidate user and the candidate recommendation-object, an attribute value constraint condition of the meta path, and the user similarity.
Specifically, the meta path further includes an attribute value constraint condition used for restricting a mathematical relationship between two attribute values: the attribute value of the connection between the candidate user and the candidate recommendation-object, and the attribute value of the connection between the candidate recommendation-object and the target user. The attribute value constraint condition may be that the two attribute values are equal, or that the difference between the two attribute values is within a preset range. When the two attribute values are required to be equal, the attribute value constraint is a strict attribute value constraint, so that a recommendation result is more accurate; when the difference is only required to be within the preset range, broader meta path semantics can be found.
Using a meta path U(i)M(j)U|i=j as an example, an attribute value i and an attribute value j are variables whose values range from 1 to 5, and the attribute value i and the attribute value j need to meet an attribute value constraint condition: i=j. Alternatively, using a meta path U(i)M(j)U| |i−j|≤1 as an example, an attribute value i and an attribute value j need to meet an attribute value constraint condition: |i−j|≤1.
In this way, the user similarity may reflect a similarity degree between the candidate user and the target user, and therefore may be used for determining how much weight the attribute value of the corresponding candidate user carries when the attribute value of the connection between the candidate recommendation-object and the target user is estimated. The attribute value constraint condition may be used for limiting a specific value of an estimated attribute value, so that the estimated attribute value conforms to the semantics of the meta path.
Step 508. Sending recommendation information of the candidate recommendation-object to a terminal corresponding to the target user when the estimated attribute value meets a recommendation condition.
Specifically, the recommendation condition is a determining condition of whether to recommend a corresponding candidate recommendation-object to a target user. The recommendation condition, for example, may be that the estimated attribute value is greater than a preset threshold, or the estimated attribute value is equal to a preset threshold, or the estimated attribute value is less than a preset threshold, and is specifically determined according to the meaning of the attribute value and the recommendation requirement.
For example, when the attribute value represents a rating value given to a movie by a user and the rating value is positively correlated with the attitude of the user toward the movie (that is, a higher rating value indicates that the user likes the movie more), the recommendation information of the candidate recommendation-object is sent to the terminal corresponding to the target user when the estimated attribute value is greater than or equal to a preset threshold. If the rating value is negatively correlated with the attitude of the user toward the movie (that is, a lower rating value indicates that the user likes the movie more), the recommendation information of the candidate recommendation-object is sent to the terminal corresponding to the target user when the estimated attribute value is less than the preset threshold. The preset threshold may be flexibly configured according to a recommendation precision requirement.
The recommendation information may include description information of the candidate recommendation-object, and may further include an access address of the candidate recommendation-object. For example, when the candidate recommendation-object is a movie, the description information may include information such as a movie name, a movie overview, a director, an actor, and a promotion poster, and the access address may be an access address of a ticket website or an access address of an online video website.
In the foregoing method for pushing recommendation information, a new heterogeneous information network is used for recommending an object, and the heterogeneous information network includes a meta path that connects a candidate user and a target user, and may represent a social-network relationship between the target user and the candidate user. A connection between the candidate user and a candidate recommendation-object in the meta path has an attribute value, so as to quantize a relationship between the candidate user and the candidate recommendation-object. After obtaining a user similarity between the target user and the candidate user relative to the meta path, an attribute value of a connection between the candidate recommendation-object and the target user may be estimated according to the user similarity and in combination with the attribute value of the connection between the candidate user and the candidate recommendation-object and an attribute value constraint condition, and the estimated attribute value may reflect a quantized relationship between the target user and the candidate recommendation-object. In this way, when the heterogeneous information network is used for recommendation, not only a social-network relationship of the target user is considered, but also a quantized relationship between the target user and the candidate recommendation-object is considered, so that a push result is more accurate.
As shown in
Step 602. Splitting the meta path into multiple atomic meta paths according to the attribute value constraint condition of the meta path.
Specifically, for a meta path A1(δ1(Re1))A2(δ2(Re2)) . . . (δl(Rel))Al+1|C, a discrete value range of the attribute value may be traversed when an attribute value constraint condition C is met, to obtain multiple atomic meta paths through splitting. The number of atomic meta paths obtained through splitting is related to the discrete value range and the attribute value constraint condition C.
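The splitting in Step 602 can be illustrated for a two-connection meta path U(i)M(j)U. The following is a minimal sketch rather than the disclosed implementation: the function name `split_into_atomic_paths`, the representation of an atomic meta path as an `(i, j)` tuple, and the constraint condition C passed as a Python predicate are all assumptions made for the example.

```python
from itertools import product
from typing import Callable, List, Tuple


def split_into_atomic_paths(
    values: range,
    constraint: Callable[[int, int], bool],
) -> List[Tuple[int, int]]:
    """Enumerate atomic meta paths U(i)M(j)U as (i, j) pairs.

    Every combination of discrete attribute values is traversed and kept
    only if it satisfies the attribute value constraint condition C.
    """
    return [(i, j) for i, j in product(values, repeat=2) if constraint(i, j)]


ratings = range(1, 6)  # discrete value range 1..5, as in the movie example

strict = split_into_atomic_paths(ratings, lambda i, j: i == j)
relaxed = split_into_atomic_paths(ratings, lambda i, j: abs(i - j) <= 1)

print(strict)        # [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
print(len(relaxed))  # 13
```

As the two calls suggest, the number of atomic meta paths depends on both the discrete value range and the constraint condition: the strict constraint i=j yields 5 atomic meta paths, and the looser constraint |i−j|≤1 yields 13.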
For example, using the heterogeneous information network shown in
Step 604. Obtaining similarities between the target user and the candidate user relative to the atomic meta paths.
Specifically, any of these similarity measurement methods: PathSim, PCRW, and HeteSim may be used for calculating similarities between the target user and the candidate user relative to the atomic meta paths. When PathSim is used for calculating the similarity, specifically, the number of instances of the meta paths that connect the target user and the candidate user is calculated first along an atomic meta path, and the number is then regularized, to obtain a corresponding similarity.
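As a rough illustration of the PathSim calculation described above, the following sketch counts path instances along one atomic meta path U(r)M(r)U and normalizes the count by the self-path counts of the two users. It is a minimal sketch under stated assumptions, not the disclosed implementation: the dense `ratings` matrix (with 0 meaning no rating), the function name `pathsim_atomic`, and the restriction to the symmetric case i=j=r are choices made for the example.

```python
import numpy as np


def pathsim_atomic(ratings: np.ndarray, r: int) -> np.ndarray:
    """PathSim similarities along the atomic meta path U(r)M(r)U.

    ratings[u, m] is the rating user u gave movie m (0 = no rating).
    """
    a = (ratings == r).astype(float)       # users x movies, 1 where rating == r
    counts = a @ a.T                       # path instances between each user pair
    diag = np.diag(counts)                 # path instances from each user to itself
    denom = diag[:, None] + diag[None, :]
    sim = np.zeros_like(counts)
    # PathSim: 2 * |paths(u, v)| / (|paths(u, u)| + |paths(v, v)|); 0 if undefined
    np.divide(2.0 * counts, denom, out=sim, where=denom > 0)
    return sim


# Hypothetical data: rows are Tom, Bob, Mary; columns are three movies.
ratings = np.array([
    [5, 5, 0],
    [5, 0, 5],
    [1, 1, 0],
])
sim = pathsim_atomic(ratings, r=5)
print(sim[0, 1])  # Tom vs. Bob: 0.5
print(sim[0, 2])  # Tom vs. Mary: 0.0
```

In this hypothetical data, Tom and Mary have watched the same movies but share no rating of 5, so their similarity along this atomic meta path is zero.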
Step 606. According to the obtained similarities relative to the atomic meta paths, calculating the user similarity that is between the target user and the candidate user relative to the meta path.
The similarities relative to the atomic meta paths are similarities based on the atomic meta paths. Because the meta path may be split into a group of corresponding atomic meta paths, the user similarity based on the meta path may be seen as a comprehensive similarity based on the similarities of all the corresponding atomic meta paths. The user similarity may be obtained in a summation manner or a weighted summation manner.
In an embodiment, Step 606 includes: calculating a sum of the obtained similarities relative to the atomic meta paths; and using the sum of the similarities directly as the user similarity that is between the target user and the candidate user relative to the meta path, or using the sum of the similarities as the user similarity after performing a positive correlation operation on the sum of the similarities.
Specifically, after the similarities between the target user and the candidate user relative to the atomic meta paths are calculated, similarities corresponding to all atomic meta paths of the meta path are summed. Then, the sum of these similarities may be directly used as the user similarity between the target user and the candidate user relative to the meta path, or a positive correlation operation may be performed on the sum of these similarities and the result is then used as the user similarity between the target user and the candidate user relative to the meta path. The positive correlation operation refers to an operation in which variation trends of a dependent variable and an independent variable are consistent, for example, adding, subtracting, multiplying by, or dividing by a positive value. The positive correlation operation includes regularization processing. The regularization processing needs to be performed on user similarities calculated by using the two similarity measurement methods: PathSim and HeteSim, to limit a value range of the calculated similarities.
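The summation in Step 606 can be sketched as follows. This is only an illustration: the function name `combine_atomic_similarities` is hypothetical, and dividing the sum by the number of atomic meta paths is one assumed example of a positive correlation operation that limits the value range; the text does not prescribe this particular operation.

```python
import numpy as np


def combine_atomic_similarities(atomic_sims, normalize=True):
    """Sum per-atomic-path similarity matrices into one user similarity matrix."""
    total = np.sum(np.stack(atomic_sims), axis=0)
    if normalize:
        # Dividing by a positive constant keeps the ordering of similarities
        # while bounding the result by the largest atomic similarity.
        total = total / len(atomic_sims)
    return total


# Hypothetical similarities of two users along two atomic meta paths.
sims_r5 = np.array([[1.0, 0.5], [0.5, 1.0]])
sims_r4 = np.array([[1.0, 0.0], [0.0, 1.0]])

plain_sum = combine_atomic_similarities([sims_r5, sims_r4], normalize=False)
scaled = combine_atomic_similarities([sims_r5, sims_r4])
print(plain_sum[0, 1])  # 0.5
print(scaled[0, 1])     # 0.25
```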
Herein, a heterogeneous information network used for recommending movies is used as an example to describe a process of calculating the user similarity. Referring to
Referring to the lower part of
As shown in
Step 802. Obtaining a discrete value range of the attribute value of the connection between the target user and the candidate recommendation-object.
Specifically, it is assumed that a user set is $U$, and both a candidate user $v$ and a target user $u$ belong to the user set $U$; a set of candidate recommendation-objects is $X$, and a candidate recommendation-object $x \in X$; $P$ represents a set of meta paths; $R \in \mathbb{R}^{|U| \times |X|}$ is an attribute value matrix, where $R_{u,x}$ represents the attribute value of the connection between the target user $u$ and the candidate recommendation-object $x$, and $\mathbb{R}$ represents the real number set; a discrete value range of an attribute value $R_{u,x}$ is the positive integers from 1 to $N$, for example, $N$ may be 5.
Step 804. Separately obtaining, for each value in the discrete value range, a connection that is between the candidate user and the candidate recommendation-object and that has an attribute value whose value meets the attribute value constraint condition, and calculating, according to a user similarity that is between the candidate user corresponding to the obtained connection and the target user, an attribute value strength corresponding to the value.
Specifically, it is assumed that $S \in \mathbb{R}^{|U| \times |U|}$ is a user similarity matrix, representing a similarity between every two users in the user set $U$. $S_{u,v}^{(l)}$ represents the user similarity between the target user $u$ and the candidate user $v$ relative to the meta path $P_l$. Herein, an attribute value strength $Q \in \mathbb{R}^{|U| \times |X| \times N}$ is defined. $Q_{u,x,r}^{(l)}$ represents the strength with which the attribute value of the connection between the target user $u$ and the candidate recommendation-object $x$ is $r$ in a given meta path $P_l$. The attribute value strength $Q_{u,x,r}^{(l)}$ is related to the user similarity $S_{u,v}^{(l)}$, and is further related to the number of candidate users $v$ that have attribute values that meet the attribute value constraint condition. When the attribute value constraint condition is that two attribute values are equal, the attribute value strength $Q_{u,x,r}^{(l)}$ is related to the number of candidate users $v$ whose attribute value is $r$. $Q_{u,x,r}^{(l)}$ may be calculated by using the following formula (1):
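Written out from the definitions in this paragraph, formula (1) takes the following form. The strength is indexed here by the target user u, the candidate recommendation-object x, and the value r, matching the dimensions |U|×|X|×N; that the sum runs over all candidate users v in the user set U is an assumption.

```latex
Q_{u,x,r}^{(l)} = \sum_{v \in U} S_{u,v}^{(l)} \times E_{v,x,r} \tag{1}
```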
In the formula (1), $E_{v,x,r}$ indicates whether the attribute value of the connection between the candidate user $v$ and the candidate recommendation-object $x$ is $r$; if yes, $E_{v,x,r}$ is 1; otherwise, $E_{v,x,r}$ is 0. Herein, the case in which the attribute value constraint condition is that two attribute values are equal is used as an example only. When the attribute value constraint condition is another condition, $E_{v,x,r}$ may be correspondingly modified so that its value is 1 only when $R_{v,x}$ and $r$ meet the attribute value constraint condition.
In this way, for each value $r$ in the discrete value range, the user similarities $S_{u,v}^{(l)}$ in the meta path $P_l$ between the target user $u$ and each candidate user $v$ whose connection to the candidate recommendation-object $x$ has an attribute value that meets the attribute value constraint condition with respect to $r$ are obtained and summed, to obtain the attribute value strength $Q_{u,x,r}^{(l)}$ corresponding to the value $r$.
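The accumulation of Step 804 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `attribute_value_strength`, the dense matrices, the use of 0 for "no rating", and the equality constraint (the two attribute values must be equal) are all choices made for the example.

```python
import numpy as np


def attribute_value_strength(sim: np.ndarray, ratings: np.ndarray, n_values: int = 5) -> np.ndarray:
    """Return q with q[u, x, r - 1] = sum over v of sim[u, v] * E[v, x, r].

    sim[u, v]     -- user similarity relative to one meta path
    ratings[v, x] -- known attribute value of the connection (0 = none)
    E[v, x, r]    -- 1 if ratings[v, x] == r, else 0 (equality constraint assumed)
    """
    n_users, n_items = ratings.shape
    q = np.zeros((n_users, n_items, n_values))
    for r in range(1, n_values + 1):
        e = (ratings == r).astype(float)  # indicator matrix for value r
        q[:, :, r - 1] = sim @ e          # sum similarities of matching candidate users
    return q


# Hypothetical data: user 0 is the target user and has not rated the single movie.
sim = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
])
ratings = np.array([[0], [5], [1]])

q = attribute_value_strength(sim, ratings)
print(q[0, 0])  # strengths for values 1..5: 0.1 for value 1, 0.8 for value 5
```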
Step 806. Calculating a weighted average of values in the discrete value range separately by using a corresponding attribute value strength as a weight.
Specifically, each value $r$ from 1 to $N$ is separately multiplied by the corresponding attribute value strength $Q_{u,x,r}^{(l)}$ as a weight to calculate a weighted average. In an embodiment, after regularization processing is performed on the attribute value strength $Q_{u,x,r}^{(l)}$, the weighted average may be calculated on the values in the discrete value range by using the corresponding regularized attribute value strength as a weight. Performing regularization processing on the attribute value strength $Q_{u,x,r}^{(l)}$ specifically means dividing the attribute value strength $Q_{u,x,r}^{(l)}$ by the sum of all attribute value strengths in the corresponding meta path.
Step 808. Obtaining an estimated attribute value of the connection between the candidate recommendation-object and the target user according to the calculated weighted average.
Specifically, when there is only one meta path, the weighted average calculated in step 806 may be directly used as the estimated attribute value. Specifically, the estimated attribute value may be calculated by using the following formula (2):
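Consistent with the weighted average of Step 806 and with the symbols explained immediately below, formula (2) can be written as follows. This is a reconstruction from the surrounding definitions; the summation limits 1 to N are taken from the discrete value range.

```latex
\hat{R}_{u,x}^{(l)} = \sum_{r=1}^{N} r \times \frac{Q_{u,x,r}^{(l)}}{\sum_{k=1}^{N} Q_{u,x,k}^{(l)}} \tag{2}
```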
where $\hat{R}_{u,x}^{(l)}$ represents the estimated attribute value of a connection between a candidate recommendation-object $x$ and a target user $u$ in a meta path $P_l$, $N$ represents an upper limit of a discrete value range, $Q_{u,x,r}^{(l)}$ represents an attribute value strength corresponding to a value $r$, and $Q_{u,x,k}^{(l)}$ represents an attribute value strength corresponding to a value $k$.
In this embodiment, an attribute value of a connection between a target user and a candidate recommendation-object may be predicted according to a given meta path, to recommend the candidate recommendation-object that meets a recommendation condition to the target user. Moreover, the formula (2) has one additional advantage, that is, it may eliminate a bias of user similarities obtained through calculation in different meta paths. Considering that user similarities obtained through calculation based on different meta paths have different value ranges and, therefore, it is difficult to compare similarity calculations between different meta paths and attribute value strengths, regularized or normalized attribute value strengths in the formula (2) may eliminate the difference of the value range.
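The strength-weighted average for a single meta path can be sketched as follows. The function name `estimate_attribute_values` and the convention of returning 0 where no candidate user contributes any strength are assumptions made for this illustration.

```python
import numpy as np


def estimate_attribute_values(q: np.ndarray) -> np.ndarray:
    """Weighted average of the values 1..N using normalized strengths as weights.

    q[u, x, r - 1] is the attribute value strength for value r.
    """
    n_values = q.shape[-1]
    values = np.arange(1, n_values + 1)
    total = q.sum(axis=-1)                # sum of strengths over all values
    weighted = (q * values).sum(axis=-1)  # sum of r * strength
    est = np.zeros(total.shape)
    np.divide(weighted, total, out=est, where=total > 0)
    return est


# Hypothetical strengths for one target user and one movie:
# weak support (0.1) for value 1 and strong support (0.8) for value 5.
q = np.array([[[0.1, 0.0, 0.0, 0.0, 0.8]]])
est = estimate_attribute_values(q)
print(est[0, 0])  # (1 * 0.1 + 5 * 0.8) / 0.9, about 4.56
```

Because each strength is divided by the total strength, the estimate always stays within the discrete value range regardless of the scale of the similarities, which is the bias-eliminating property noted above.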
In an embodiment, Step 808 includes: calculating a weighted average by separately multiplying a weighted average calculated in each meta path by a path weight of a corresponding meta path, to obtain the estimated attribute value of the connection between the candidate recommendation-object and the target user.
Specifically, a unified path weight of each meta path is set for all users, and represents a preference of the users for the meta path. Specifically, as shown in the formula (3):
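A form of formula (3) consistent with the description of the unified path weights is given below. The summation over all |P| meta paths is assumed from the definition of the meta path set P.

```latex
\hat{R}_{u,x} = \sum_{l=1}^{|P|} w^{(l)} \times \hat{R}_{u,x}^{(l)} \tag{3}
```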
where $w^{(l)}$ represents a path weight of a meta path $P_l$. A comprehensive estimated attribute value $\hat{R}_{u,x}$ based on all meta paths may be represented by using a weighted average of the attribute values $\hat{R}_{u,x}^{(l)}$ estimated on the individual meta paths. The sum of path weights of meta paths obtained after target optimization is performed is 1. Therefore, $\hat{R}_{u,x}$ is a weighted average calculated by separately multiplying a weighted average calculated in each meta path by a path weight of the corresponding meta path.
To make the estimated attribute value matrix $\hat{R} \in \mathbb{R}^{|U| \times |X|}$ close to the real attribute value matrix $R$, herein, a target function is defined based on a squared error between the real attribute value and the estimated attribute value, as shown in the formula (4):
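One possible form of formula (4), consistent with the symbols explained below (a Hadamard product, an indicating matrix Y, a matrix norm, a control parameter λ0, and a constraint introduced by s.t.), is sketched here. The name L_1, the factors of 1/2, the choice of the 2-norm, and the non-negativity constraint on w are assumptions made for illustration and are not stated in the text.

```latex
\min_{w} \; L_1(w) = \frac{1}{2} \left\| Y \odot \left( R - \sum_{l=1}^{|P|} w^{(l)} \hat{R}^{(l)} \right) \right\|_2^2 + \frac{\lambda_0}{2} \left\| w \right\|_2^2
\quad \text{s.t.} \; w \ge 0 \tag{4}
```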
The symbol $\odot$ represents a Hadamard product of matrices, that is, a product of corresponding elements; $\|\cdot\|_p$ represents a $p$-norm of a matrix. $Y$ represents an indicating matrix: $Y_{u,x} = 1$ represents that a connection between the target user $u$ and the candidate recommendation-object $x$ has an attribute value; otherwise, $Y_{u,x} = 0$. $\lambda_0$ is a control parameter, and s.t. represents a constraint condition. When a real attribute value of the connection between the target user and the candidate recommendation-object is known, the target function of the foregoing formula (4) may be optimized to calculate a path weight vector $w \in \mathbb{R}^{1 \times |P|}$.
In an embodiment, Step 808 includes: calculating a weighted average by separately multiplying a weighted average calculated in each meta path by path weights corresponding to the target user and the corresponding meta path, to obtain the estimated attribute value of the connection between the candidate recommendation-object and the target user.
Specifically, in many real application scenarios, each user has a personalized interest preference, and a unified path weight cannot provide a personalized recommendation to the user. To implement the personalized recommendation, a path weight vector may be set for each user. It is assumed that a path weight matrix is represented as $W \in \mathbb{R}^{|U| \times |P|}$, where each element $W_u^{(l)}$ represents the path weight corresponding to the target user $u$ and the meta path $P_l$. A column vector $W^{(l)} \in \mathbb{R}^{|U| \times 1}$ represents the path weights of all users in a meta path $P_l$. Therefore, the estimated attribute value $\hat{R}_{u,x}$ of the connection between the target user $u$ and the candidate recommendation-object $x$ over all meta paths is given by the formula (5):
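Following the definition of the per-user path weights in this paragraph, formula (5) can be written as below. It is a reconstruction in which the unified weight of each meta path is replaced by the weight of the target user u for that meta path; the summation over all |P| meta paths is assumed.

```latex
\hat{R}_{u,x} = \sum_{l=1}^{|P|} W_u^{(l)} \times \hat{R}_{u,x}^{(l)} \tag{5}
```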
A corresponding target function is also defined, as shown in the formula (6):
The symbol ⊙ represents a Hadamard product of matrices, that is, a product of corresponding elements; ‖·‖_p represents a p-norm of matrices. Y represents an indicating matrix: Y_{u,x}=1 represents that a connection between the target user u and the candidate recommendation-object x has an attribute value; otherwise, Y_{u,x}=0. λ0 is a control parameter. diag(W^(l)) represents a diagonal matrix converted from the vector W^(l). s.t. introduces a constraint condition. When a real attribute value of the connection between the target user u and the candidate recommendation-object x is known, the target function of the foregoing formula (6) may be optimized to calculate the path weight matrix W.
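The formulas (5) and (6) are not reproduced in this text. The sketch below assumes that formula (5) scales each user's row of a per-path estimate by that user's own weight for the path, which is what left-multiplication by diag(W^(l)) does, and that formula (6) is the masked squared error with an L2 penalty on W. The function names and the array layout are illustrative assumptions.

```python
import numpy as np

def personalized_estimate(W, R_hat_paths):
    """Combine per-path estimates with per-user path weights.

    W           : shape (U, P), W[u, l] weights meta path l for user u
    R_hat_paths : shape (P, U, X), estimated attribute values per meta path
    Equivalent to sum over l of diag(W[:, l]) @ R_hat_paths[l].
    """
    return np.einsum("ul,lux->ux", W, R_hat_paths)

def personalized_objective(W, R, Y, R_hat_paths, lam0):
    """Assumed form of the personalized target function."""
    residual = Y * (R - personalized_estimate(W, R_hat_paths))
    return 0.5 * np.sum(residual ** 2) + 0.5 * lam0 * np.sum(W ** 2)
```

When every row of W is identical, the estimate reduces to the one obtained with a single shared path weight vector.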
In an embodiment, the method for pushing recommendation information further includes: obtaining a real attribute value of the connection between the candidate recommendation-object and the target user; initializing the path weights corresponding to the target user and the meta path; and adjusting, according to the user similarity, the initialized path weights in a direction towards an average of the path weights corresponding to the candidate user and the meta path, so that a difference between the real attribute value and the estimated attribute value meets a minimization condition.
Specifically, although the user personalized path weight is considered in the formula (6), it is difficult to perform effective weight learning for users who have only a small amount of attribute value information. In total, |U|×|P| weights need to be learned, but the number of training samples is far less than |U|×|X|. The training samples are therefore usually insufficient for weight learning, and the problem is particularly severe for cold-start users and goods. A path weight of a user should be relatively consistent with the path weights of similar users. For users who have only a small quantity of attribute values, their path weights may be learned from the path weights of other similar users, because a user similarity based on a meta path is more effective for these users.
The path weights corresponding to the target user and the meta path are initialized, for example to 0 or to a value greater than 0, and the initialized path weights are then adjusted in a direction towards an average of the path weights corresponding to the candidate user and the meta path. The approaching speed is positively correlated with the user similarity: a larger user similarity indicates a faster approaching speed, and a smaller user similarity indicates a slower approaching speed. When the difference between the real attribute value and the estimated attribute value meets the minimization condition, the adjusting is stopped. The minimization condition may be given by the foregoing formula (4) or (6) or the following formula (9).
Therefore, a path weight regularizer is defined, as shown in the formula (7):
where |U| represents a total quantity of users, |P| represents a total quantity of meta paths, W_u^(l) represents the path weight corresponding to a target user u and a path P_l, W_v^(l) represents the path weight corresponding to a candidate user v and the path P_l, and S̄_{u,v}^(l) is the normalized user similarity, based on the path P_l, between the target user u and the candidate user v. For convenience, the path weight regularizer may be represented in a matrix form of the following formula (8):
where W^(l) is the path weight vector of all users on the path P_l.
Based on the foregoing formula (6), the path weight regularizer is added to obtain a target function, as shown in the following formula (9):
where the symbol ⊙ represents a Hadamard product of matrices, that is, a product of corresponding elements; ‖·‖_p represents a p-norm of matrices; Y represents an indicating matrix: Y_{u,x}=1 represents that a connection between the target user u and the candidate recommendation-object x has an attribute value; otherwise, Y_{u,x}=0; λ0 is a control parameter; λ1 is another control parameter; W^(l) represents the path weight vector of all users on a path P_l; diag(W^(l)) represents a diagonal matrix converted from the vector W^(l); W represents the path weight matrix of all users; R̂^(l) represents an estimated attribute value matrix based on a meta path P_l; and R represents a real attribute value matrix.
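The formulas (7) to (9) are not reproduced in this text. One form consistent with the symbols defined above is sketched below. The normalized similarity symbol \bar{S}^{(l)}, its row normalization, and the 1/2 scaling factors are assumptions rather than the exact notation of the original formulas.

```latex
% Assumed row normalization of the meta-path-based user similarity
\bar{S}^{(l)}_{u,v} = \frac{S^{(l)}_{u,v}}{\sum_{v'} S^{(l)}_{u,v'}}

% Assumed form of the path weight regularizer, formulas (7) and (8)
\sum_{u=1}^{|U|} \sum_{l=1}^{|P|}
  \Bigl( W^{(l)}_{u} - \sum_{v} \bar{S}^{(l)}_{u,v} W^{(l)}_{v} \Bigr)^{2}
  = \sum_{l=1}^{|P|} \bigl\| W^{(l)} - \bar{S}^{(l)} W^{(l)} \bigr\|_{2}^{2}

% Assumed form of the target function, formula (9)
\min_{W} L_{3}(W) =
  \frac{1}{2} \Bigl\| Y \odot \Bigl( R - \sum_{l=1}^{|P|}
      \operatorname{diag}\bigl(W^{(l)}\bigr) \hat{R}^{(l)} \Bigr) \Bigr\|_{2}^{2}
  + \frac{\lambda_{1}}{2} \sum_{l=1}^{|P|}
      \bigl\| W^{(l)} - \bar{S}^{(l)} W^{(l)} \bigr\|_{2}^{2}
  + \frac{\lambda_{0}}{2} \| W \|_{2}^{2}
  \quad \text{s.t. } W \ge 0
```

In this form, the regularizer penalizes the gap between each user's path weight and the similarity-weighted average of the other users' weights on the same path, so users with few attribute values borrow weights from similar users.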
The foregoing formula (9) is a non-negative quadratic programming problem, that is, a simple form of non-negative matrix factorization. It may be solved by using a projected gradient method for optimization problems with non-negative bound constraints. For this method, reference may be made to “C. J. Lin. Projected gradient methods for non-negative matrix factorization. In Neural Computation, pages 2756-2779, 2007”. The gradient for W in the formula (9) is shown in the formula (10):
where the symbol T represents a transposition. The update formula for W_u^(l) is shown in the formula (11):
where α is a step size, and may be set according to a user requirement.
Specifically, the path weights corresponding to the target user and the meta path may be learned by using the following step (1) to step (7). Step (1) to step (7) may be referred to as a SemRec method (Semantic path based personalized Recommendation method).
Step (1). Obtaining a heterogeneous information network G with an attribute value, a meta path set P of connected users, a control parameter λ0, a control parameter λ1, a step size α used when a parameter is updated, and a convergence threshold ε.
Step (2). Separately calculating, relative to each meta path in the meta path set P, a user similarity matrix S^(l), an attribute value strength matrix Q^(l), and an estimated attribute value matrix R̂^(l).
Step (3). Initializing a path weight matrix W>0.
Repeatedly performing the following steps (4), (5), and (6), until |W−W_old|<ε is met.
Step (4). W_old:=W.
Step (5). Calculating ∂L3(W)/∂W.
Step (6). W:=max(0, W−α·∂L3(W)/∂W).
Step (7). Outputting a path weight matrix W of all users.
where W_old:=W represents assigning W to W_old, ∂L3(W)/∂W represents calculating a partial differential of the formula (9), max(0, W−α·∂L3(W)/∂W) represents choosing a maximum value between 0 and W−α·∂L3(W)/∂W, and |W−W_old|<ε represents that a difference between the values of W calculated in two neighboring iterations is less than the convergence threshold ε.
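Steps (3) to (7) can be sketched as a projected gradient loop. This is a minimal sketch under stated assumptions: the target function is taken to be the masked squared error plus a similarity regularizer and an L2 penalty (the formulas themselves are not reproduced in this text), the similarity matrices are row-normalized, and the names `row_normalize`, `semrec_objective`, `semrec_gradient`, `learn_path_weights`, the array layout, and the `max_iter` guard are all hypothetical.

```python
import numpy as np

def row_normalize(S):
    """Scale each user's similarities to sum to 1 (all-zero rows stay zero)."""
    S = np.asarray(S, dtype=float)
    sums = S.sum(axis=-1, keepdims=True)
    return S / np.where(sums > 0, sums, 1.0)

def semrec_objective(W, R, Y, R_hat, S_bar, lam0, lam1):
    """Assumed target function.
    Shapes: W (U, P); R and Y (U, X); R_hat (P, U, X); S_bar (P, U, U)."""
    err = Y * (R - np.einsum("ul,lux->ux", W, R_hat))
    # Gap between each weight and the similarity-weighted average of others.
    dev = W - np.einsum("luv,vl->ul", S_bar, W)
    return (0.5 * np.sum(err ** 2)
            + 0.5 * lam1 * np.sum(dev ** 2)
            + 0.5 * lam0 * np.sum(W ** 2))

def semrec_gradient(W, R, Y, R_hat, S_bar, lam0, lam1):
    """Partial derivative of the assumed target function for every W[u, l]."""
    err = Y * (R - np.einsum("ul,lux->ux", W, R_hat))
    g_fit = -np.einsum("ux,lux->ul", Y * err, R_hat)
    dev = W - np.einsum("luv,vl->ul", S_bar, W)
    g_reg = lam1 * (dev - np.einsum("lvu,vl->ul", S_bar, dev))
    return g_fit + g_reg + lam0 * W

def learn_path_weights(R, Y, R_hat, S, lam0, lam1, alpha, eps, max_iter=10000):
    S_bar = row_normalize(S)
    n_users, n_paths = R.shape[0], R_hat.shape[0]
    W = np.full((n_users, n_paths), 1.0 / n_paths)   # Step (3): W > 0
    for _ in range(max_iter):                        # hypothetical safety guard
        W_old = W.copy()                             # Step (4)
        grad = semrec_gradient(W, R, Y, R_hat, S_bar, lam0, lam1)  # Step (5)
        W = np.maximum(0.0, W - alpha * grad)        # Step (6): keep W >= 0
        if np.abs(W - W_old).max() < eps:            # |W - W_old| < eps
            break
    return W                                         # Step (7)
```

The step size α has to be small relative to the scale of the attribute values and of λ1, otherwise the iteration can oscillate instead of converging.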
As can be seen from the target functions, the unified path weight learning method (L1, shown in the formula (4)) is a special case of the personalized weight learning method (L2, shown in the formula (6)) in which the path weights (that is, W^(l)) of all users on the path P_l are equal. In addition, both weight learning methods are special cases of the weight learning method with the path weight regularizer (L3, shown in the formula (9)). The optimized target function L3 changes to L2 when λ1 is 0, and changes to L1 when λ1 approaches +∞. Therefore, the control parameter λ1 actually controls the personalization level: a smaller λ1 gives more strongly personalized path weights, but makes weight learning more difficult. In a real application, a suitable λ1 may therefore need to be set according to the application scenario.
In an embodiment, the candidate recommendation-object is a network resource; and the attribute value is a rating value. The network resource includes resources that can be obtained from a network, such as movies, videos, and novels, and the rating value may be used for reflecting a quantized attitude of a user to the network resources.
To verify a recommendation effect of a heterogeneous information network with an attribute value, two data sets are obtained from the network. A first data set includes 13367 users, 12677 movies, and 1068278 rating values ranging from 1 to 5. The first data set further includes a social-network relationship between the users and attribute information of the users and the movies. A second data set includes rating values given by users to local merchants, and attribute information of the users and the merchants. The second data set includes 16239 users, 14284 local merchants, and 198397 rating values ranging from 1 to 5. Table 1 shows detailed statistical information of the two data sets. The two data sets have some different properties. The rating value relationships of the first data set are denser but its social-network relationships are very sparse, and the rating value relationships of the second data set are relatively sparse but its social-network relationships are denser.
Here, two general evaluation indicators, a root mean squared error (RMSE) and a mean absolute error (MAE), are used to evaluate the quality of the estimated attribute values.
where Rtest represents an entire test set. Further, a data set is divided into a training set and a test set, the training set is used for training a heterogeneous information network with an attribute value, and the test set is used for testing an effect of the trained heterogeneous information network. A smaller MAE or RMSE indicates a better effect.
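The RMSE and MAE formulas are not reproduced in this text. The sketch below uses their standard definitions over the test set; the function names and the representation of the test set as (real value, estimated value) pairs are illustrative assumptions.

```python
import math

def rmse(pairs):
    """Root mean squared error over (real, estimated) attribute value pairs."""
    pairs = list(pairs)
    return math.sqrt(sum((real - est) ** 2 for real, est in pairs) / len(pairs))

def mae(pairs):
    """Mean absolute error over (real, estimated) attribute value pairs."""
    pairs = list(pairs)
    return sum(abs(real - est) for real, est in pairs) / len(pairs)
```

RMSE squares each error before averaging, so it penalizes a few large estimation errors more heavily than MAE does.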
To show the effectiveness of the provided SemRec method, four variants of SemRec are compared. In addition to the personalized path weight learning method with a path weight regularizer (referred to as SemRecReg), three special versions of SemRec are further considered: a method based on a single meta path (referred to as SemRecSgl), a method for learning a unified path weight for all users (referred to as SemRecAll), and a method for learning a personalized path weight for each user (referred to as SemRecInd).
Because an excessively long meta path carries little meaning and produces a poor similarity, five meta paths whose length does not exceed 4 are used in each data set. Table 2 shows these meta paths, with or without a weight. A user similarity is calculated in SemRec by using PathSim. The parameter λ0 in SemRec is set to 0.01, and λ1 is set to 10^3.
For the first data set, different training data proportions (20%, 40%, 60%, and 80%) are set to show comparison results under different levels of data sparsity. A training data proportion of 20% means that 20% of the rating values in the user-candidate recommendation-object rating value matrix are used as a training set to perform model training, and the remaining 80% of the rating values are predicted. The first data set has denser rating value relationships, whereas the rating value relationships of the second data set are sparser. Therefore, more data in the second data set is used as training sets (60%, 70%, 80%, and 90%). For each experiment result, ten training sets and test sets are obtained through division independently and randomly according to a given proportion, and the average is used as the result shown in Table 3.
Analysis of the test results in Table 3 shows that different versions of SemRec perform differently. Generally, SemRec with multiple paths (such as SemRecAll and SemRecReg) has a better effect than SemRec with a single path (that is, SemRecSgl), with the exception of SemRecInd. This indicates that the path weight learning method of SemRec can effectively integrate similarity information generated on different paths. Because of the sparsity of rating values, the recommendation effect of SemRecInd is worse than that of SemRecAll in most cases. In addition, SemRecReg achieves a better effect than the other versions in all cases. This is because SemRecReg not only implements personalized path weight learning for all users, but also uses path weight regularization to avoid the problem brought by the sparsity of rating values.
In addition, the average running time of these methods in the learning process is recorded. The running time of the four versions of SemRec increases with the complexity of the path weight learning method. SemRecSgl and SemRecAll are very fast, and may be directly applied to online learning. The running times of SemRecInd and SemRecReg are also acceptable. In actual application, a suitable SemRec method may be selected according to a requirement to balance efficiency and performance.
As shown in
The meta path obtaining module 901 is configured to obtain a meta path that connects a candidate user and a target user in a heterogeneous information network, where the meta path includes a connection that is between the candidate user and a candidate recommendation-object and that has an attribute value.
The user similarity obtaining module 902 is configured to obtain a user similarity between the target user and the candidate user relative to the meta path.
The attribute value estimation module 903 is configured to estimate an attribute value of a connection between the candidate recommendation-object and the target user according to the attribute value of the connection between the candidate user and the candidate recommendation-object, an attribute value constraint condition of the meta path, and the user similarity.
The push module 904 is configured to send recommendation information of the candidate recommendation-object to a terminal corresponding to the target user when the estimated attribute value meets a recommendation condition.
As shown in
The splitting module 902a is configured to split the meta path into multiple atomic meta paths according to the attribute value constraint condition of the meta path.
The similarity calculation module 902b is configured to obtain similarities between the target user and the candidate user relative to the atomic meta paths.
The user similarity combination module 902c is configured to calculate, according to the obtained similarities relative to the atomic meta paths, the user similarity between the target user and the candidate user relative to the meta path.
In an embodiment, the user similarity combination module 902c is further configured to calculate a sum of the obtained similarities relative to the atomic meta paths; and use the sum of the similarities directly as the user similarity between the target user and the candidate user relative to the meta path, or use the sum of the similarities as the user similarity after performing a positive correlation operation on the sum of the similarities.
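As a minimal sketch of how modules 902a to 902c could cooperate, assume a meta path of the form user, rating, object, rating, user whose attribute value constraint requires the two ratings to be equal. Splitting on the constraint then gives one atomic meta path per rating value. The sketch computes a PathSim similarity on each atomic path and sums the results. The data layout, the function names, and the choice of this particular meta path are assumptions for illustration.

```python
def split_into_atomic_paths(ratings, value_range):
    """Group rated objects by rating value, one atomic meta path per value.

    ratings: dict user -> dict object -> rating value.
    Returns dict value -> dict user -> set of objects rated with that value.
    """
    atomic = {value: {} for value in value_range}
    for user, rated in ratings.items():
        for obj, value in rated.items():
            if value in atomic:
                atomic[value].setdefault(user, set()).add(obj)
    return atomic

def atomic_pathsim(objects_by_user, u, v):
    """PathSim on one atomic path: 2 * paths(u, v) / (paths(u, u) + paths(v, v))."""
    a = objects_by_user.get(u, set())
    b = objects_by_user.get(v, set())
    denom = len(a) + len(b)
    return 2.0 * len(a & b) / denom if denom else 0.0

def user_similarity(ratings, value_range, u, v):
    """Sum of the similarities relative to the atomic meta paths."""
    atomic = split_into_atomic_paths(ratings, value_range)
    return sum(atomic_pathsim(atomic[value], u, v) for value in value_range)
```

The sum can exceed 1 when two users agree on several rating values, which is one reason a positive correlation operation, for example dividing by a constant, might be applied afterwards.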
In an embodiment, the attribute value estimation module 903 includes: a discrete value range obtaining module 903a, an attribute value strength calculation module 903b, a weighted averaging module 903c, and an estimation result generation module 903d.
The discrete value range obtaining module 903a is configured to obtain a discrete value range of the attribute value of the connection between the target user and the candidate recommendation-object.
The attribute value strength calculation module 903b is configured to separately obtain, for each value in the discrete value range, a connection that is between the candidate user and the candidate recommendation-object and that has an attribute value whose value meets the attribute value constraint condition, and calculate, according to a user similarity that is between the candidate user corresponding to the obtained connection and the target user, an attribute value strength corresponding to the value.
The weighted averaging module 903c is configured to calculate a weighted average of values in the discrete value range separately by using a corresponding attribute value strength as a weight.
The estimation result generation module 903d is configured to obtain an estimated attribute value of the connection between the candidate recommendation-object and the target user according to the calculated weighted average.
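The cooperation of modules 903a to 903d on a single meta path can be sketched as follows. This is a minimal sketch, assuming the attribute value strength for a value is the sum of the user similarities of the candidate users who gave the candidate recommendation-object exactly that value. The function name, the dictionary layout, and returning `None` when no strength is available are illustrative assumptions.

```python
def estimate_attribute_value(value_range, candidate_values, similarity):
    """Estimate the target user's attribute value for one object on one meta path.

    value_range      : iterable of the discrete values, for example range(1, 6)
    candidate_values : dict candidate user -> value that user gave the object
    similarity       : dict candidate user -> user similarity to the target user
    """
    # Attribute value strength per discrete value (assumed: sum of similarities).
    strength = {value: 0.0 for value in value_range}
    for user, value in candidate_values.items():
        if value in strength:
            strength[value] += similarity.get(user, 0.0)
    total = sum(strength.values())
    if total == 0:
        return None  # no similar candidate user has a value for this object
    # Weighted average of the discrete values, using the strengths as weights.
    return sum(value * q for value, q in strength.items()) / total
```

For example, if two candidate users with similarities 0.5 and 0.25 rated an object 5 and a third with similarity 0.25 rated it 3, the strengths are 0.75 and 0.25 and the estimate is 4.5.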
In an embodiment, the estimation result generation module 903d is further configured to calculate a weighted average by separately multiplying a weighted average calculated in each meta path by a path weight of a corresponding meta path, to obtain the estimated attribute value of the connection between the candidate recommendation-object and the target user.
In an embodiment, the estimation result generation module 903d is further configured to calculate a weighted average by separately multiplying a weighted average calculated in each meta path by path weights corresponding to the target user and the corresponding meta path, to obtain the estimated attribute value of the connection between the candidate recommendation-object and the target user.
In an embodiment, the apparatus 900 for pushing recommendation information further includes: a path weight learning module 905, configured to obtain a real attribute value of the connection between the candidate recommendation-object and the target user; initialize the path weights corresponding to the target user and the meta path; and adjust, according to the user similarity, the initialized path weights in a direction towards an average of the path weights corresponding to the candidate user and the meta path, so that a difference between the real attribute value and the estimated attribute value meets a minimization condition.
In an embodiment, the candidate recommendation-object is a network resource; and the attribute value is a rating value.
A person of ordinary skill in the art may understand that all or some processes for implementing the foregoing embodiment methods may be completed by a computer program instructing related hardware. The program may be stored in a computer readable storage medium. When the program is running, the processes in the embodiments of the foregoing methods may be included. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or may be a random access memory (RAM).
The foregoing technical characteristics of the embodiments may be combined arbitrarily. For ease of description, not all possible combinations of the technical characteristics in the foregoing embodiments are described. However, as long as these combinations of the technical characteristics do not conflict, the combinations shall be considered to fall within the scope recorded in this specification.
The foregoing embodiments only describe several implementations of the present disclosure, and the descriptions are relatively specific and detailed, but should not be construed as a limitation on the patent scope of the present disclosure. It should be noted that a person of ordinary skill in the art may further make variations and improvements without departing from the conception of the present disclosure, and these all fall within the protection scope of the present disclosure. Therefore, the patent protection scope of the present disclosure should be subject to the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
201510567428.9 | Sep 2015 | CN | national |
This application is a continuation application of PCT Patent Application No. PCT/CN2016/084284, filed on Jun. 1, 2016, which claims priority to Chinese Patent Application No. 201510567428.9, entitled “METHOD AND APPARATUS FOR PUSHING RECOMMENDATION INFORMATION” filed on Sep. 8, 2015, which is incorporated by reference in its entirety.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/CN2016/084284 | Jun 2016 | US |
Child | 15715840 | | US |