The present invention relates to the field of knowledge graph rules, in particular to a combined commodity mining method based on knowledge graph rule embedding.
In a knowledge graph, triples (head, relation, tail) are used to represent knowledge. Such knowledge can be represented with one-hot vectors, but there are too many entities and relations, so the dimensions become too large, and one-hot vectors cannot capture the similarity between two entities or relations that are very close. Inspired by the Word2Vec model, many methods for representing entities and relations with distributed representations (knowledge graph embedding, KGE) have been proposed in the academic community, such as TransE, TransH, TransR and so on. The basic idea of these models is that, by learning the graph structure, the head, relation and tail can each be represented by a low-dimensional dense vector. For example, TransE makes the sum of the head vector and the relation vector as close as possible to the tail vector. In TransE, a triple is scored as:

f(h, r, t) = ‖h + r − t‖,

wherein h, r and t are the embeddings of the head, relation and tail, and ‖·‖ is the L1 or L2 norm.
For a correct triple (h, r, t) ∈ Δ, there should be a lower score, while for a wrong triple (h′, r′, t′) ∈ Δ′, there should be a higher score, and the final loss function is:

L = Σ_{(h, r, t)∈Δ} Σ_{(h′, r′, t′)∈Δ′} max(0, γ + f(h, r, t) − f(h′, r′, t′)),

wherein γ > 0 is a margin hyperparameter.
The knowledge graph only contains correct triples (golden triples), so a negative instance can be generated by corrupting the head entity or tail entity of a correct triple; that is, one of the head entity, tail entity and relation is randomly replaced with another entity or relation, resulting in a set of negative instances Δ′. By continuously optimizing this loss function, the representations of h, r and t can finally be learned.
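The TransE scoring and margin loss described above can be sketched in a few lines of NumPy. This is a minimal illustration on a toy graph, not the training procedure of the invention; the entity and relation names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Randomly initialized embeddings for a toy graph (illustrative only).
entities = {name: rng.normal(size=dim) for name in ["shirt", "cotton", "tie"]}
relations = {name: rng.normal(size=dim) for name in ["material"]}

def transe_score(h, r, t):
    """TransE score: distance between h + r and t (lower = more plausible)."""
    return np.linalg.norm(entities[h] + relations[r] - entities[t])

def margin_loss(pos, neg, gamma=1.0):
    """Margin-based ranking loss over one golden triple and one corrupted triple."""
    return max(0.0, gamma + transe_score(*pos) - transe_score(*neg))

loss = margin_loss(("shirt", "material", "cotton"),   # golden triple
                   ("shirt", "material", "tie"))      # tail corrupted into Δ′
print(loss)
```

In a full implementation the embeddings would be updated by gradient descent until golden triples score lower than their corrupted counterparts.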
In the field of e-commerce, likewise, there is also a commodity knowledge graph. In the commodity knowledge graph, the head entity refers to a commodity, the relation refers to a commodity attribute, and the tail entity refers to an attribute value of the commodity. Therefore, embeddings of the commodity, commodity attribute and commodity attribute value can be learned through the KGE method, and then used in a downstream task.
In the field of e-commerce, merchants sometimes need to bundle and sell several commodities together. On the one hand, the total price of the bundled commodities is generally lower than the sum of the selling prices of the individual commodities, so users receive a discount and are more motivated to buy. On the other hand, a seller can make more profit by selling several commodities at the same time than by selling one commodity. Therefore, there is great demand for combined commodity sales in practical applications, which calls for a method that can automatically help sellers combine several commodities that can be sold together.
However, the KGE-based method has the disadvantage that, although it can predict whether two commodities belong to a combination, the seller does not know why the two commodities are combined, so it is necessary to provide interpretability for the combination. Based on this, it is urgent to design a method so that sellers can intuitively know why two commodities can be sold together.
The present invention provides a combined commodity mining method based on knowledge graph rule embeddings. By expressing the combined commodity rules as embeddings, and then parsing the learned rule embeddings into specific rules, it can help merchants to construct combined commodities that can be sold together.
A combined commodity mining method based on knowledge graph rule embeddings, comprising:
In step (1), the composition of each triple in the commodity knowledge graph is (I, P, V), which represents that the attribute value of the commodity I under the attribute P is V. Different commodities are associated with the same attribute or attribute value, thus forming the structure of the graph.
In step (2), the commodity I, the commodity attribute P, the commodity attribute value V and a plurality of rules are respectively numbered as ids; each id is then represented as a one-hot vector, which is mapped into an embedding that is continuously optimized during model training.
In steps (3) to (5), in the three neural networks, the activation function of each layer of neurons is calculated as:

ReLU(z) = max(0, z),

wherein the ReLU function examines each element of the matrix in turn: if the value of the element is greater than 0, the value is kept; otherwise the value is set to 0.
In the three neural networks, the calculation of each layer of each neural network is:

h_l = σ(h_{l−1} · W_l + b_l), l = 1, 2, …, L,

wherein σ is the activation function of the layer,
wherein W_1, W_2, …, W_L and b_1, b_2, …, b_L are all parameters that need to be learned; W_1, W_2, …, W_L are randomly initialized matrices of sizes dim_emb × dim_1, dim_1 × dim_2, dim_2 × dim_3, …, dim_{L−1} × dim_L respectively; b_1, b_2, …, b_L are randomly initialized vectors of sizes dim_1, dim_2, …, dim_L; L is the number of layers of the neural network; the nonlinear activation function

sigmoid(z) = 1 / (1 + e^(−z))

limits the output value to the (0, 1) interval.
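The layer-by-layer computation above can be sketched as a small NumPy forward pass. The layer sizes and random weights are illustrative assumptions; the sketch only shows the structure (ReLU on hidden layers, sigmoid on the last layer), not the trained networks of the invention.

```python
import numpy as np

def relu(z):
    # Keep elements greater than 0, set the rest to 0.
    return np.maximum(z, 0.0)

def sigmoid(z):
    # Limits the output to the (0, 1) interval.
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, weights, biases):
    """L-layer network: h_l = sigma(h_{l-1} W_l + b_l), sigmoid on the last layer."""
    a = x
    for l, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        a = sigmoid(z) if l == len(weights) - 1 else relu(z)
    return a

rng = np.random.default_rng(0)
dims = [16, 8, 4, 1]  # dim_emb, dim_1, dim_2, dim_3 (illustrative sizes)
Ws = [rng.normal(size=(dims[i], dims[i + 1])) * 0.1 for i in range(3)]
bs = [rng.normal(size=dims[i + 1]) * 0.1 for i in range(3)]

score = mlp_forward(rng.normal(size=16), Ws, bs)
print(score)  # a single value in the (0, 1) interval
```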
In step (6), the similarity scores s21, s22 and s2 are all calculated by cosine similarity, with the formula:

cos(u, v) = (u · v) / (‖u‖ ‖v‖),

wherein u and v are the two embeddings being compared.
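The cosine similarity used for s21, s22 and s2 can be computed as follows; the two vectors here are arbitrary example values.

```python
import numpy as np

def cosine_similarity(u, v):
    """cos(u, v) = u.v / (|u||v|); closer to 1 means the vectors are more similar."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

v_actual = np.array([1.0, 2.0, 3.0])       # embedding of the actual attribute value
v_predicted = np.array([2.0, 4.0, 6.0])    # predicted embedding, same direction
print(cosine_similarity(v_actual, v_predicted))  # → 1.0
```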
In step (10), the cross entropy loss function is:

loss = − Σ_{i=0}^{K−1} y(i) · log(prob(i)),

wherein prob(i) and y(i) are both probability distributions, 0 ≤ i < K and i is an integer; y(i) ∈ {0, 1} is the real probability distribution and 0 ≤ prob(i) ≤ 1 is the probability distribution predicted by the model, with Σ_i y(i) = 1 and Σ_i prob(i) = 1; K refers to the total number of categories, and herein K is 2. This cross entropy function is used to measure the difference between two distributions: the larger the value calculated by this formula, the greater the difference between the two distributions.
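For K = 2 the cross entropy reduces to the following small computation; the example probabilities are illustrative.

```python
import math

def cross_entropy(y, prob, eps=1e-12):
    """H(y, prob) = -sum_i y(i) * log(prob(i)); larger value = bigger difference."""
    return -sum(yi * math.log(pi + eps) for yi, pi in zip(y, prob))

# K = 2 categories: "not a combined commodity" vs "combined commodity".
y_true = [0.0, 1.0]                          # real label: the pair IS a combination
good = cross_entropy(y_true, [0.1, 0.9])     # confident, correct prediction -> small loss
bad = cross_entropy(y_true, [0.9, 0.1])      # confident, wrong prediction -> large loss
print(good, bad)
```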
Preferably, the gradient descent optimization algorithm is SGD or Adam.
The specific process of step (11) is:
Compared with the prior art, the present disclosure has the following beneficial effects:
The present invention integrates the learning of the rules into the training process of the model, and finally parses the learned rule embeddings into rules. Based on the rules, the seller can know why the two commodities can be combined for sale, which can bring great benefits for e-commerce sales of commodities.
The present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be pointed out that the following embodiments are intended to facilitate the understanding of the present invention, but do not have any limiting effect on it.
As shown in the accompanying drawings, the method comprises the following steps:
S01. constructing a knowledge graph of commodities, wherein for each triple, the head entity is a commodity, the relation is a commodity attribute, and the tail entity is a commodity attribute value. The task of combining commodities is defined as: given two commodities in the commodity knowledge graph, and a plurality of attributes and attribute values of each commodity, determine whether the two commodities form a combined commodity. The innovation of the present invention is that rule learning is integrated into the model training process, so that a seller can be provided with interpretability through the learned rules.
S02. expressing each commodity, commodity attribute, commodity attribute value and rule as an id, and then indexing each id to an embedding. For each sample, the two inputted commodities have n attributes and attribute values; together with the n inputted rules, the present invention predicts on this basis whether the two commodities are a combined commodity.
S03. calculating first a score for each attribute: splicing the embedding of each rule with the embedding of each commodity attribute and inputting the result into a first neural network to obtain an importance score s1 of the attribute. The formula for each layer of the first neural network is:
Specifically, by splicing the embedding of each rule with the embedding of the commodity attribute and feeding the result through fully connected layers, increasingly high-level semantics are obtained, and finally the importance score s1 of the attribute under the rule is predicted from these high-level semantics; a larger value means that the attribute is more likely to be included in this rule. A threshold thres1 is preset; when the value of s1 is greater than thres1, the attribute is included in this rule.
S04. then calculating a score for each attribute value: splicing the embedding of each rule with the embedding of each commodity attribute and inputting the result into a second neural network to obtain a predicted embedding of the attribute value. The formula for each layer of the second neural network is:
Specifically, the rules and attributes are inputted into a multi-layer neural network, and the embedding of the attribute value that should be taken under the attribute is predicted. Next, there are two cases. If the inputted attribute values of the two commodities under the attribute are the same, then the similarity between that attribute value and the predicted attribute value can be calculated; the higher the similarity, the higher the score of the attribute value. The method for calculating the similarity of the attribute value is as follows:
Meanwhile, there is a possibility that, under this rule, the value under this attribute is "same". In this case, the embedding of each rule and the embedding of each commodity attribute are spliced and inputted into a third neural network to obtain the probability that the value under this attribute is "same". The formula of the third neural network is:
If the inputted attribute values of the two commodities under the attribute are different, then the degree of similarity between each of the two attribute values and the predicted attribute value can be calculated separately, and the two similarity scores can then be combined to obtain the final score for the two attribute values. The method for calculating the similarity of the attribute values is as follows:
S05. next, solving the score of an attribute-attribute value pair. There are three cases: when the score s1 of the attribute is less than or equal to the preset threshold thres1, the score of the attribute value under the attribute is 0; if the score s1 of the attribute is greater than the preset threshold thres1 and the attribute values of the two commodities under this attribute are the same, then the score of this attribute-attribute value pair is
if the score s1 of the attribute is greater than the preset threshold thres1 and the attribute values of the two commodities under this attribute are different, then the score of this attribute-attribute value pair is
S06. after obtaining the score of each attribute-attribute value pair, calculating the score of a commodity pair under a certain rule, wherein the calculation formula is:
S07. after obtaining the score of a commodity pair under a certain rule, summing up the scores of the commodity pair under all the rules to obtain a final score of the commodity pair, wherein the calculation formula is:
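The aggregation in steps S05–S07 can be sketched as follows. The per-attribute inputs `s1` and `value_score` are assumed to have been computed by the networks above; `value_score` is a hypothetical stand-in for the formulas combining s21, s22 and s2, and the threshold value is illustrative.

```python
THRES1 = 0.5  # preset attribute-importance threshold (illustrative value)

def pair_score(s1, value_score, thres1=THRES1):
    """Score of one attribute-attribute value pair under one rule (S05)."""
    if s1 <= thres1:
        return 0.0           # attribute not included in the rule -> score 0
    return value_score       # otherwise use the attribute-value score

def rule_score(attr_scores, thres1=THRES1):
    """Sum the pair scores of all attributes to score the commodity pair under one rule (S06)."""
    return sum(pair_score(s1, vs, thres1) for s1, vs in attr_scores)

def commodity_pair_score(per_rule_scores):
    """Sum the scores of the commodity pair under all rules for the final score (S07)."""
    return sum(per_rule_scores)

# One rule over three attributes, given as (s1, value_score) pairs.
r1 = rule_score([(0.9, 0.8), (0.3, 0.7), (0.6, 0.5)])  # middle attribute is excluded
final = commodity_pair_score([r1, 0.4])                # two rules in total
print(r1, final)
```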
S08. comparing the obtained score of the commodity pair with the label 0 or 1 indicating whether the pair is a combined commodity, to obtain a cross entropy loss;
This loss function is then optimized with an Adam optimizer.
S09. after the rules are learned, parsing the rules, wherein the way of parsing the rules is similar to that during training. First, the embedding of the rule and the embedding of each possible attribute are spliced and inputted into the first network to obtain the importance score of each attribute; if the score s1 of an attribute is greater than the threshold thres1, this attribute is included in this rule. Then, for each attribute included in the rule, it is determined whether the value under the rule should be "same" or a specific value.
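Step S09 can be sketched structurally as below. The callables `net1` and `net3`, the attribute dictionary and the thresholds are hypothetical stand-ins for the trained first and third networks; splicing is shown as list concatenation, and the toy "networks" are simple lambdas just to make the control flow runnable.

```python
def parse_rule(rule_emb, attributes, net1, net3, thres1, thres_same):
    """Return {attribute: "same" | "specific value"} for one learned rule (S09)."""
    parsed = {}
    for attr, attr_emb in attributes.items():
        spliced = rule_emb + attr_emb            # splice rule and attribute embeddings
        s1 = net1(spliced)                       # importance score from the first network
        if s1 <= thres1:
            continue                             # attribute not included in this rule
        p_same = net3(spliced)                   # "same"-probability from the third network
        parsed[attr] = "same" if p_same > thres_same else "specific value"
    return parsed

# Toy stand-ins: 1-d "embeddings" and fake scoring networks.
rule = [0.2]
attrs = {"brand": [0.9], "color": [0.1]}
net1 = lambda v: sum(v)        # fake importance scorer
net3 = lambda v: sum(v) / 2    # fake "same" probability
print(parse_rule(rule, attrs, net1, net3, thres1=0.5, thres_same=0.5))
```

Here "brand" clears the importance threshold while "color" does not, so the parsed rule keeps only the brand attribute.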
In this way, the combination commodity rules can be obtained. In the final application, there are mainly two ways:
commodities belong to a combined commodity. If none of the rules can determine that the two commodities belong to a combined commodity, then the two commodities do not constitute a combined commodity.
Next, a specific example is used to illustrate the construction process of the present invention.
First, Table 1 shows a sample of the model input, which contains two commodities; each commodity contains a plurality of attributes and attribute values, and under each attribute the attribute values of the two commodities may or may not be the same.
First, all attributes and attribute values of the two commodities are represented as embeddings. Each attribute is then passed through the first neural network to obtain the importance score of the attribute; each attribute value is then inputted into the second neural network to obtain the attribute value score. Next, the attribute and attribute-value scores are combined to obtain the attribute-attribute value pair score. Then, the scores of all attribute-attribute value pairs are summed to obtain the score that the two commodities form a combined commodity under this rule. Finally, the scores of all the rules are summed to obtain the final score that the two commodities form a combined commodity.
During the testing phase, the rules need to be parsed. Table 2 shows a rule parsed by the model based on the sample shown in Table 1.
The way of parsing the rule is similar to that in the training process. It also determines which attributes the rule contains, and then determines which attribute values should be contained under each attribute, and finally the rule can be parsed.
The above-mentioned embodiments describe the technical solutions and beneficial effects of the present invention in detail. It should be understood that the above-mentioned embodiments are only specific embodiments of the present invention and are not intended to limit the present invention. Any modifications, additions and equivalent replacements made within the principle of the present invention shall be included within the protection scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
202011538259.3 | Dec 2020 | CN | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2021/135500 | 12/3/2021 | WO |