The present disclosure relates to artificial intelligence (AI) technologies, and in particular, to an item recommendation method and a related device thereof.
With rapid development of computer technologies, to meet an Internet access requirement of a user, developers are increasingly inclined to display content of interest on a page of an application. In view of this, for an application, it is usually necessary to predict which items the user is likely to purchase, that is, to identify items of interest to the user, and present these items on the page of the application, thereby providing a service for the user.
Currently, a neural network model of an AI technology can be used to predict an item that can be recommended to the user. Specifically, historical information of the user may be first collected, and the historical information indicates items that the user has interacted with and behaviors of the user for these items. Because there are a plurality of types of behaviors of the user for the item, the historical information of the user may be classified based on the types of behaviors, and various types of information are separately processed by using the neural network model, to obtain processing results of the various types of information. Finally, the processing results of the various types of information may be superimposed to obtain an item recommendation result, thereby determining a target item to be recommended to the user.
In the foregoing process, when the neural network model processes information, mutual impact of a plurality of behaviors belonging to a same category is mainly considered, and the factors that are considered are limited. As a result, accuracy of the item recommendation result finally output by the model is not high, affecting user experience.
Embodiments of the present disclosure provide an item recommendation method and a related device thereof. An item recommendation result output by a neural network model used by the method can have high accuracy, thereby helping optimize user experience.
A first aspect of embodiments of the present disclosure provides an item recommendation method. The method includes the following steps.
When a user uses an application, to display an item of interest on a page of the application, some historical data of previously using the application by the user may be first acquired, and N pieces of first information may be obtained based on the historical data. An ith piece of first information indicates an ith first item (that is, a historical item) that has been operated by the user when the user uses the application and an ith behavior. The ith behavior may be understood as a behavior performed by the user when the user operates the ith item. N behaviors performed by the user may be classified into M categories. i=1, . . . , N, N≥M, and M>1. For example, when the user uses shopping software, to predict a commodity that can be recommended to the user, five pieces of first information generated when the user previously used the software may be obtained. A 1st piece of first information indicates a piece of clothing and a tap behavior of the user on the clothing. A 2nd piece of first information indicates a pair of shoes and an add-to-favorites behavior of the user on the shoes. A 3rd piece of first information indicates a hat and a purchase behavior of the user on the hat. A 4th piece of first information indicates a pair of trousers and a tap behavior of the user on the trousers. A 5th piece of first information indicates another pair of trousers and a purchase behavior of the user on the trousers. It can be learned that the five behaviors of the user on the five commodities may be classified into three types: a tap behavior, an add-to-favorites behavior, and a purchase behavior.
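To make the structure of the input concrete, the following is a minimal sketch of how the five pieces of first information in the example above could be represented; the class and field names are illustrative and are not part of the disclosure.

```python
from dataclasses import dataclass

@dataclass
class FirstInfo:
    """One piece of first information: a historical item and the behavior on it."""
    item: str       # the i-th first item (historical item)
    behavior: str   # the i-th behavior, falling into one of M categories

# The N=5 pieces of first information from the shopping example (M=3 categories).
history = [
    FirstInfo("clothing",   "tap"),
    FirstInfo("shoes",      "add_to_favorites"),
    FirstInfo("hat",        "purchase"),
    FirstInfo("trousers_1", "tap"),
    FirstInfo("trousers_2", "purchase"),
]

categories = {f.behavior for f in history}
assert len(history) >= len(categories) > 1  # N >= M and M > 1
```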
After the N pieces of first information are obtained, the N pieces of first information may be input into a target model, so that the target model may process the N pieces of first information based on a multi-head self-attention mechanism, to correspondingly obtain N pieces of second information. After the N pieces of second information are obtained, the target model may obtain an item recommendation result based on the N pieces of second information. The item recommendation result may be used to determine, from K second items (that is, candidate items), a target item recommended to the user, and K≥1.
It can be learned from the foregoing method that, when the target item of interest needs to be recommended to the user, the N pieces of first information may be first input to the target model. The ith piece of first information indicates the ith first item and the ith behavior. The ith behavior is a behavior of the user for the ith item. The N behaviors of the user correspond to M categories. i=1, . . . , N, N≥M, and M>1. Then, the N pieces of first information may be processed by using the target model based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Finally, the item recommendation result can be obtained by using the target model based on the N pieces of second information. The item recommendation result is used to determine, from the K second items, the target item recommended to the user, and K≥1. In the foregoing process, the N pieces of first information not only indicate N first items, but also indicate the N behaviors that can be classified into the M categories. Therefore, in a process in which the target model processes the N pieces of first information to correspondingly obtain the N pieces of second information, not only mutual impact of a plurality of behaviors belonging to a same category and mutual impact of a plurality of first items may be considered, but also mutual impact of a plurality of behaviors belonging to different categories may be considered. Factors that are considered are comprehensive. Therefore, the item recommendation result output by the target model based on the N pieces of second information can have high accuracy, thereby helping optimize user experience.
In a possible implementation, processing the N pieces of first information based on the multi-head self-attention mechanism, to obtain the N pieces of second information includes: performing linear processing on the ith piece of first information, to obtain an ith piece of Q information, an ith piece of K information, and an ith piece of V information; and performing an operation on the ith piece of Q information, N pieces of K information, N pieces of V information, and N pieces of weight information corresponding to the ith behavior, to obtain an ith piece of second information, where a jth piece of weight information corresponding to the ith behavior is determined based on the ith behavior and a jth behavior, and j=1, . . . , N. In the foregoing implementation, after the N pieces of first information are received, for any one of the N pieces of first information, that is, the ith piece of first information, the target model may first perform linear processing on the ith piece of first information, to obtain the ith piece of Q information, the ith piece of K information, and the ith piece of V information. For the remaining first information other than the ith piece of first information, the target model may also perform an operation similar to that performed on the ith piece of first information. Therefore, a total of N pieces of Q information, N pieces of K information, and N pieces of V information can be obtained. To be specific, the target model may perform linear processing on the 1st piece of first information, to obtain a 1st piece of Q information, a 1st piece of K information, and a 1st piece of V information, may further perform linear processing on the 2nd piece of first information, to obtain a 2nd piece of Q information, a 2nd piece of K information, a 2nd piece of V information, . . . , and may further perform linear processing on an Nth piece of first information, to obtain an Nth piece of Q information, an Nth piece of K information, and an Nth piece of V information.
For the ith piece of Q information, the target model may perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, and the N pieces of weight information corresponding to the ith behavior, to obtain the ith piece of second information. The jth piece of weight information corresponding to the ith behavior is determined based on the ith behavior and the jth behavior, and j=1, . . . , N. For the remaining Q information other than the ith piece of Q information, the target model may also perform an operation similar to that performed on the ith piece of Q information. Therefore, the N pieces of second information can be obtained. To be specific, the target model may first perform an operation on the 1st piece of Q information, the N pieces of K information, the N pieces of V information, and N pieces of weight information corresponding to a 1st behavior, to obtain a 1st piece of second information, the target model may further perform an operation on the 2nd piece of Q information, the N pieces of K information, the N pieces of V information, and N pieces of weight information corresponding to a 2nd behavior, to obtain a 2nd piece of second information, . . . , and the target model may further perform an operation on the Nth piece of Q information, the N pieces of K information, the N pieces of V information, and N pieces of weight information corresponding to an Nth behavior, to obtain an Nth piece of second information. A 1st piece of weight information corresponding to the 1st behavior is determined based on the 1st behavior, a 2nd piece of weight information corresponding to the 1st behavior is determined based on the 1st behavior and the 2nd behavior, . . . , an Nth piece of weight information of the 1st behavior is determined based on the 1st behavior and the Nth behavior, . . . , a 1st piece of weight information corresponding to the Nth behavior is determined based on the Nth behavior and the 1st behavior, a 2nd piece of weight information corresponding to the Nth behavior is determined based on the Nth behavior and the 2nd behavior, . . . , and an Nth piece of weight information of the Nth behavior is determined based on the Nth behavior.
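The text above fixes what enters the operation (the ith piece of Q information, the N pieces of K and V information, and the N pieces of weight information) but not how they are combined. A common instantiation, assumed here purely for illustration, adds a learned per-category-pair bias to the scaled dot-product attention logits before the softmax; all names, shapes, and the single-head simplification below are assumptions rather than the disclosed formula.

```python
import numpy as np

def behavior_aware_attention(X, behaviors, Wq, Wk, Wv, pair_weight):
    """Single-head sketch. X holds the N pieces of first information (N x d),
    behaviors[i] is the category index of the i-th behavior, and
    pair_weight[bi, bj] is the weight information for the behavior pair
    (bi, bj); adding it to the logits is an assumed combination rule."""
    N, d = X.shape
    Q, K, V = X @ Wq, X @ Wk, X @ Wv               # linear processing -> Q/K/V info
    logits = Q @ K.T / np.sqrt(d)                  # ordinary attention scores
    logits += pair_weight[np.ix_(behaviors, behaviors)]  # weight info for (i, j)
    A = np.exp(logits - logits.max(axis=-1, keepdims=True))
    A /= A.sum(axis=-1, keepdims=True)             # softmax over j
    return A @ V                                   # N pieces of second information

# Toy usage: N=5 behaviors in M=3 categories, d=8 features per piece.
rng = np.random.default_rng(0)
N, d, M = 5, 8, 3
X = rng.normal(size=(N, d))
behaviors = np.array([0, 1, 2, 0, 2])              # tap / favorites / purchase
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
pair_weight = rng.normal(size=(M, M))              # learned in practice
second_info = behavior_aware_attention(X, behaviors, Wq, Wk, Wv, pair_weight)
print(second_info.shape)                           # (5, 8)
```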
In a possible implementation, the method further includes: obtaining N pieces of third information, where an ith piece of third information indicates the ith behavior; and performing an operation on the ith piece of third information and the N pieces of third information, to obtain N pieces of fourth information corresponding to the ith behavior, where a jth piece of fourth information corresponding to the ith behavior indicates a distance between the ith behavior and the jth behavior. The performing an operation on the ith piece of Q information, N pieces of K information, N pieces of V information, and N pieces of weight information corresponding to the ith behavior, to obtain an ith piece of second information includes: performing an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the ith behavior, and the N pieces of fourth information corresponding to the ith behavior, to obtain the ith piece of second information. In the foregoing implementation, after the N pieces of third information are received, for any one of the N pieces of third information, that is, the ith piece of third information, the target model may perform an operation on the ith piece of third information and the N pieces of third information, to obtain the N pieces of fourth information corresponding to the ith behavior. The jth piece of fourth information corresponding to the ith behavior indicates a distance between the ith behavior and the jth behavior. For the remaining third information other than the ith piece of third information, the target model may also perform an operation similar to that performed on the ith piece of third information. Therefore, a total of N pieces of fourth information corresponding to the 1st behavior, N pieces of fourth information corresponding to the 2nd behavior, . . . , and N pieces of fourth information corresponding to the Nth behavior can be obtained. To be specific, the target model may perform an operation on the 1st piece of third information and the 1st piece of third information to obtain a 1st piece of fourth information (indicating a distance between 1st behaviors) corresponding to the 1st behavior, may further perform an operation on the 1st piece of third information and a 2nd piece of third information to obtain a 2nd piece of fourth information (indicating a distance between the 1st behavior and the 2nd behavior) corresponding to the 1st behavior, . . . , may further perform an operation on the 1st piece of third information and an Nth piece of third information to obtain an Nth piece of fourth information (indicating a distance between the 1st behavior and the Nth behavior) corresponding to the 1st behavior, . . . , may further perform an operation on the Nth piece of third information and the 1st piece of third information to obtain a 1st piece of fourth information (indicating a distance between the Nth behavior and the 1st behavior) corresponding to the Nth behavior, may further perform an operation on the Nth piece of third information and the 2nd piece of third information to obtain a 2nd piece of fourth information (indicating a distance between the Nth behavior and the 2nd behavior) corresponding to the Nth behavior, . . . , and may further perform an operation on the Nth piece of third information and the Nth piece of third information to obtain an Nth piece of fourth information (indicating a distance between Nth behaviors) corresponding to the Nth behavior. In this case, for the ith piece of Q information, the target model may perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the ith behavior, and the N pieces of fourth information corresponding to the ith behavior, to obtain the ith piece of second information. For the remaining Q information other than the ith piece of Q information, the target model may also perform an operation similar to that performed on the ith piece of Q information. Therefore, the N pieces of second information can be obtained. To be specific, the target model may first perform an operation on the 1st piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the 1st behavior, and the N pieces of fourth information corresponding to the 1st behavior to obtain the 1st piece of second information, the target model may further perform an operation on the 2nd piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the 2nd behavior, and the N pieces of fourth information corresponding to the 2nd behavior to obtain the 2nd piece of second information, . . . , and the target model may further perform an operation on the Nth piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the Nth behavior, and the N pieces of fourth information corresponding to the Nth behavior to obtain the Nth piece of second information. It can be learned that, in a process of processing the N pieces of first information based on the multi-head self-attention mechanism, the target model further considers impact caused by a distance between orders of different behaviors. Factors that are considered are more comprehensive in comparison with a related technology. The item recommendation result output by the target model may also accurately fit a real intention of the user, thereby further improving accuracy of the item recommendation result.
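Concretely, the fourth information behaves like a relative-position bias: each (i, j) pair gets a term derived from the gap between the orders of the two behaviors, and that term is added to the same attention logits as the weight information. The lookup-table form below is an assumption; the disclosure only requires that the fourth information indicate the distance.

```python
import numpy as np

def distance_bias(N, rel_table):
    """Fourth-information sketch: rel_table[g] is a learned scalar for an order
    gap of g, so entry (i, j) indicates the distance between behaviors i and j.
    The lookup-table form is an assumption."""
    idx = np.arange(N)
    gaps = np.abs(idx[:, None] - idx[None, :])   # |order_i - order_j|
    return rel_table[gaps]                        # (N, N) bias matrix

# Folded into the previous sketch, the logits would become:
#   logits = Q @ K.T / np.sqrt(d)
#   logits += pair_weight[np.ix_(behaviors, behaviors)]  # weight information
#   logits += distance_bias(N, rel_table)                # fourth information
N = 5
rel_table = np.linspace(0.5, 0.0, N)             # nearer behaviors weigh more
print(distance_bias(N, rel_table))
```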
In a possible implementation, the distance between the ith behavior and the jth behavior includes an interval between an order of the ith behavior and an order of the jth behavior, for example, an interval between a time at which the user performs the ith behavior and a time at which the user performs the jth behavior.
In a possible implementation, obtaining the item recommendation result based on the N pieces of second information includes: performing feature extraction on the N pieces of second information to obtain fifth information and sixth information, where the fifth information indicates a difference between the N behaviors, and the sixth information indicates a same point between the N behaviors; fusing the fifth information and the sixth information to obtain seventh information, where the seventh information indicates interest distribution of the user; and calculating matching degrees between the seventh information and K pieces of eighth information, where the matching degree is used as the item recommendation result, a tth piece of eighth information indicates a tth second item, and t=1, . . . , K. In the foregoing implementation, after the N pieces of second information are obtained, the target model may perform feature extraction on the N pieces of second information in one manner, to obtain the fifth information. The fifth information includes an exclusive characteristic of each of the N behaviors. Therefore, the fifth information may be used to indicate a difference between the N behaviors. At the same time, the target model may further perform feature extraction on the N pieces of second information in another manner, to obtain the sixth information. The sixth information includes a common characteristic of the N behaviors. Therefore, the sixth information may indicate a same point between the N behaviors. After the fifth information and the sixth information are obtained, the target model may perform weighted summation on the fifth information and the sixth information to obtain the seventh information. The seventh information is a behavior representation of the user. Therefore, the seventh information may indicate the interest distribution of the user. After the seventh information is obtained, the target model may further obtain the K pieces of eighth information. The tth piece of eighth information indicates the tth second item, and t=1, . . . , K. For the tth piece of eighth information among the K pieces of eighth information, the target model may calculate a matching degree between the seventh information and the tth piece of eighth information. For eighth information other than the tth piece of eighth information, the target model may also perform an operation similar to that performed on the tth piece of eighth information. Therefore, the matching degrees between the seventh information and the K pieces of eighth information can be obtained. These matching degrees can then be used as the final item recommendation result output by the target model. It can be learned that the target model may perform deeper information mining on a processing result of the multi-head self-attention mechanism, to mine exclusive information of a plurality of behaviors of the user and common information of the plurality of behaviors, thereby constructing a behavior representation of the user. In this case, an item that can match the behavior representation of the user may be used as the target item recommended to the user, so that accuracy of item recommendation can be improved.
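The two feature extractors and the fusion are left open by the text above; the sketch below assumes the simplest reading: two projection-and-pooling paths over the N pieces of second information (one for the exclusive characteristics, one for the common characteristics), a weighted summation into the seventh information, and dot products against K candidate-item embeddings as the matching degrees. Every operator choice here is an assumption.

```python
import numpy as np

def recommend(second_info, W_diff, W_common, alpha, item_emb):
    """fifth  : pooled behavior-specific features (difference between behaviors)
    sixth  : pooled shared features (same point between behaviors)
    seventh: weighted fusion, i.e. the user's interest representation
    returns: K matching degrees against the eighth information (item embeddings)."""
    fifth = (second_info @ W_diff).mean(axis=0)      # exclusive characteristics
    sixth = (second_info @ W_common).mean(axis=0)    # common characteristics
    seventh = alpha * fifth + (1.0 - alpha) * sixth  # weighted summation
    return item_emb @ seventh                        # matching degree per item

rng = np.random.default_rng(1)
N, d, K = 5, 8, 10
second_info = rng.normal(size=(N, d))
W_diff, W_common = rng.normal(size=(d, d)), rng.normal(size=(d, d))
item_emb = rng.normal(size=(K, d))                   # one piece of eighth info per item
scores = recommend(second_info, W_diff, W_common, alpha=0.5, item_emb=item_emb)
print(int(scores.argmax()))                          # index of the target item
```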
In a possible implementation, the K second items include the N first items.
A second aspect of embodiments of the present disclosure provides a model training method. The method includes: inputting N pieces of first information into a to-be-trained model to obtain a predicted item recommendation result, where the to-be-trained model is configured to: obtain the N pieces of first information, where an ith piece of first information indicates an ith first item and an ith behavior, the ith behavior is a behavior of a user for the ith item, N behaviors of the user correspond to M categories, i=1, . . . , N, N≥M, and M>1; process the N pieces of first information based on a multi-head self-attention mechanism, to obtain N pieces of second information; and obtain the predicted item recommendation result based on the N pieces of second information, where the predicted item recommendation result is used to determine, from K second items, a target item recommended to the user, and K≥1; obtaining a target loss based on the predicted item recommendation result and a real item recommendation result, where the target loss indicates a difference between the predicted item recommendation result and the real item recommendation result; and updating a parameter of the to-be-trained model based on the target loss until a model training condition is met, to obtain a target model.
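The second aspect fixes only the structure: a forward pass, a target loss measuring the difference between the predicted and the real item recommendation results, and parameter updates until a training condition is met. A minimal PyTorch-style sketch of that loop follows; the stub model, the binary cross-entropy loss, and the fixed step count all stand in for details the text leaves unspecified.

```python
import torch
import torch.nn as nn

class ToBeTrainedModel(nn.Module):
    """Stub for the to-be-trained model; the real model implements the
    behavior-aware attention of the first aspect (omitted here)."""
    def __init__(self, d=8, K=10):
        super().__init__()
        self.encoder = nn.Linear(d, d)      # placeholder for the attention stack
        self.item_emb = nn.Embedding(K, d)  # eighth information per second item

    def forward(self, first_info):          # first_info: (N, d) tensor
        seventh = self.encoder(first_info).mean(dim=0)  # user representation
        return self.item_emb.weight @ seventh           # K matching degrees

model = ToBeTrainedModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()            # assumed form of the target loss

first_info = torch.randn(5, 8)              # N=5 pieces of first information
real = torch.zeros(10); real[3] = 1.0       # real item recommendation result

for step in range(100):                     # "until a model training condition is met"
    predicted = model(first_info)           # predicted item recommendation result
    loss = loss_fn(predicted, real)         # difference between predicted and real
    optimizer.zero_grad()
    loss.backward()                         # back propagation of the target loss
    optimizer.step()                        # update the model parameters
```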
The target model obtained through training in the foregoing method has a function of recommending an item to the user. When the target item of interest needs to be recommended to the user, the N pieces of first information may be first input to the target model. The ith piece of first information indicates the ith first item and the ith behavior. The ith behavior is a behavior of the user for the ith item. The N behaviors of the user correspond to M categories. i=1, . . . , N, N≥M, and M>1. Then, the N pieces of first information may be processed by using the target model based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Finally, the item recommendation result can be obtained by using the target model based on the N pieces of second information. The item recommendation result is used to determine, from the K second items, the target item recommended to the user, and K≥1. In the foregoing process, the N pieces of first information not only indicate N first items, but also indicate the N behaviors that can be classified into the M categories. Therefore, in a process in which the target model processes the N pieces of first information to correspondingly obtain the N pieces of second information, not only mutual impact of a plurality of behaviors belonging to a same category and mutual impact of a plurality of first items may be considered, but also mutual impact of a plurality of behaviors belonging to different categories may be considered. Factors that are considered are comprehensive. Therefore, the item recommendation result output by the target model based on the N pieces of second information can have high accuracy, thereby helping optimize user experience.
In a possible implementation, the to-be-trained model is configured to: perform linear processing on the ith piece of first information, to obtain an ith piece of Q information, an ith piece of K information, and an ith piece of V information; and perform an operation on the ith piece of Q information, N pieces of K information, N pieces of V information, and N pieces of weight information corresponding to the ith behavior, to obtain an ith piece of second information, where a jth piece of weight information corresponding to the ith behavior is determined based on the ith behavior and a jth behavior, and j=1, . . . , N.
In a possible implementation, the to-be-trained model is further configured to: obtain N pieces of third information, where an ith piece of third information indicates the ith behavior; and perform an operation on the ith piece of third information and the N pieces of third information, to obtain N pieces of fourth information corresponding to the ith behavior, where a jth piece of fourth information corresponding to the ith behavior indicates a distance between the ith behavior and the jth behavior. The to-be-trained model is configured to perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the ith behavior, and the N pieces of fourth information corresponding to the ith behavior, to obtain the ith piece of second information.
In a possible implementation, the distance between the ith behavior and the jth behavior includes an interval between an order of the ith behavior and an order of the jth behavior.
In a possible implementation, the to-be-trained model is configured to: perform feature extraction on the N pieces of second information to obtain fifth information and sixth information, where the fifth information indicates a difference between the N behaviors, and the sixth information indicates a same point between the N behaviors; fuse the fifth information and the sixth information to obtain seventh information, where the seventh information indicates interest distribution of the user; and calculate matching degrees between the seventh information and K pieces of eighth information, where the matching degree is used as the item recommendation result, a tth piece of eighth information indicates a tth second item, and t=1, . . . , K.
In a possible implementation, the K second items include the N first items.
A third aspect of embodiments of the present disclosure provides an item recommendation apparatus. The apparatus includes: a first obtaining module configured to obtain N pieces of first information by using a target model, where an ith piece of first information indicates an ith first item and an ith behavior, the ith behavior is a behavior of a user for the ith item, N behaviors of the user correspond to M categories, i=1, . . . , N, N≥M, and M>1; a processing module configured to process the N pieces of first information by using the target model based on a multi-head self-attention mechanism, to obtain N pieces of second information; and a second obtaining module configured to obtain an item recommendation result by using the target model based on the N pieces of second information, where the item recommendation result is used to determine, from K second items, a target item recommended to the user, and K≥1.
It can be learned from the foregoing apparatus that, when the target item of interest needs to be recommended to the user, the N pieces of first information may be first input to the target model. The ith piece of first information indicates the ith first item and the ith behavior. The ith behavior is a behavior of the user for the ith item. The N behaviors of the user correspond to M categories. i=1, . . . , N, N≥M, and M>1. Then, the N pieces of first information may be processed by using the target model based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Finally, the item recommendation result can be obtained by using the target model based on the N pieces of second information. The item recommendation result is used to determine, from the K second items, the target item recommended to the user, and K≥1. In the foregoing process, the N pieces of first information not only indicate N first items, but also indicate the N behaviors that can be classified into the M categories. Therefore, in a process in which the target model processes the N pieces of first information to correspondingly obtain the N pieces of second information, not only mutual impact of a plurality of behaviors belonging to a same category and mutual impact of a plurality of first items may be considered, but also mutual impact of a plurality of behaviors belonging to different categories may be considered. Factors that are considered are comprehensive. Therefore, the item recommendation result output by the target model based on the N pieces of second information can have high accuracy, thereby helping optimize user experience.
In a possible implementation, the processing module is configured to: perform linear processing on the ith piece of first information by using the target model, to obtain an ith piece of Q information, an ith piece of K information, and an ith piece of V information; and perform an operation on the ith piece of Q information, N pieces of K information, N pieces of V information, and N pieces of weight information corresponding to the ith behavior by using the target model, to obtain an ith piece of second information, where a jth piece of weight information corresponding to the ith behavior is determined based on the ith behavior and a jth behavior, and j=1, . . . , N.
In a possible implementation, the apparatus further includes: a third obtaining module configured to obtain N pieces of third information by using the target model, where an ith piece of third information indicates the ith behavior; an operation module configured to perform an operation on the ith piece of third information and the N pieces of third information by using the target model to obtain N pieces of fourth information corresponding to the ith behavior, where a jth piece of fourth information corresponding to the ith behavior indicates a distance between the ith behavior and the jth behavior; and a processing module configured to perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the ith behavior, and the N pieces of fourth information corresponding to the ith behavior by using the target model, to obtain the ith piece of second information.
In a possible implementation, the distance between the ith behavior and the jth behavior includes an interval between an order of the ith behavior and an order of the jth behavior.
In a possible implementation, the second obtaining module is configured to: perform feature extraction on the N pieces of second information by using the target model to obtain fifth information and sixth information, where the fifth information indicates a difference between the N behaviors, and the sixth information indicates a same point between the N behaviors; fuse the fifth information and the sixth information by using the target model to obtain seventh information, where the seventh information indicates interest distribution of the user; and calculate matching degrees between the seventh information and K pieces of eighth information by using the target model, where the matching degree is used as the item recommendation result, a tth piece of eighth information indicates a tth second item, and t=1, . . . , K.
In a possible implementation, the K second items include the N first items.
A fourth aspect of embodiments of the present disclosure provides a model training apparatus, where the apparatus includes: a processing module configured to input N pieces of first information into a to-be-trained model to obtain a predicted item recommendation result, where the to-be-trained model is configured to: obtain the N pieces of first information, where an ith piece of first information indicates an ith first item and an ith behavior, the ith behavior is a behavior of a user for the ith item, N behaviors of the user correspond to M categories, i=1, . . . , N, N≥M, and M>1; process the N pieces of first information based on a multi-head self-attention mechanism, to obtain N pieces of second information; and obtain the predicted item recommendation result based on the N pieces of second information, where the predicted item recommendation result is used to determine, from K second items, a target item recommended to the user, and K≥1; an obtaining module configured to obtain a target loss based on the predicted item recommendation result and a real item recommendation result, where the target loss indicates a difference between the predicted item recommendation result and the real item recommendation result; and an updating module configured to update a parameter of the to-be-trained model based on the target loss until a model training condition is met, to obtain a target model.
The target model obtained through training by the apparatus has a function of recommending an item to the user. When the target item of interest needs to be recommended to the user, the N pieces of first information may be first input to the target model. The ith piece of first information indicates the ith first item and the ith behavior. The ith behavior is a behavior of the user for the ith item. The N behaviors of the user correspond to M categories. i=1, . . . , N, N≥M, and M>1. Then, the N pieces of first information may be processed by using the target model based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Finally, the item recommendation result can be obtained by using the target model based on the N pieces of second information. The item recommendation result is used to determine, from the K second items, the target item recommended to the user, and K≥1. In the foregoing process, the N pieces of first information not only indicate N first items, but also indicate the N behaviors that can be classified into the M categories. Therefore, in a process in which the target model processes the N pieces of first information to correspondingly obtain the N pieces of second information, not only mutual impact of a plurality of behaviors belonging to a same category and mutual impact of a plurality of first items may be considered, but also mutual impact of a plurality of behaviors belonging to different categories may be considered. Factors that are considered are comprehensive. Therefore, the item recommendation result output by the target model based on the N pieces of second information can have high accuracy, thereby helping optimize user experience.
In a possible implementation, the to-be-trained model is configured to: perform linear processing on the ith piece of first information, to obtain an ith piece of Q information, an ith piece of K information, and an ith piece of V information; and perform an operation on the ith piece of Q information, N pieces of K information, N pieces of V information, and N pieces of weight information corresponding to the ith behavior, to obtain an ith piece of second information, where a jth piece of weight information corresponding to the ith behavior is determined based on the ith behavior and a jth behavior, and j=1, . . . , N.
In a possible implementation, the to-be-trained model is further configured to: obtain N pieces of third information, where an ith piece of third information indicates the ith behavior; and perform an operation on the ith piece of third information and the N pieces of third information to obtain N pieces of fourth information corresponding to the ith behavior, where a jth piece of fourth information corresponding to the ith behavior indicates a distance between the ith behavior and the jth behavior. The to-be-trained model is configured to perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the ith behavior, and the N pieces of fourth information corresponding to the ith behavior, to obtain the ith piece of second information.
In a possible implementation, the distance between the ith behavior and the jth behavior includes an interval between an order of the ith behavior and an order of the jth behavior.
In a possible implementation, the to-be-trained model is configured to: perform feature extraction on the N pieces of second information to obtain fifth information and sixth information, where the fifth information indicates a difference between the N behaviors, and the sixth information indicates a same point between the N behaviors; fuse the fifth information and the sixth information to obtain seventh information, where the seventh information indicates interest distribution of the user; and calculate matching degrees between the seventh information and K pieces of eighth information, where the matching degree is used as the item recommendation result, a tth piece of eighth information indicates a tth second item, and t=1, . . . , K.
In a possible implementation, the K second items include the N first items.
A fifth aspect of embodiments of the present disclosure provides an item recommendation apparatus. The apparatus includes a memory and a processor. The memory stores code, and the processor is configured to execute the code. When the code is executed, the item recommendation apparatus performs the method according to any one of the first aspect or the possible implementations of the first aspect.
A sixth aspect of embodiments of the present disclosure provides a model training apparatus. The apparatus includes a memory and a processor. The memory stores code, and the processor is configured to execute the code. When the code is executed, the model training apparatus performs the method according to any one of the second aspect or the possible implementations of the second aspect.
A seventh aspect of embodiments of the present disclosure provides a circuit system. The circuit system includes a processing circuit. The processing circuit is configured to perform the method according to any one of the first aspect, the possible implementations of the first aspect, the second aspect, or the possible implementations of the second aspect.
An eighth aspect of embodiments of the present disclosure provides a chip system. The chip system includes a processor configured to invoke a computer program or computer instructions stored in a memory, so that the processor performs the method according to any one of the first aspect, the possible implementations of the first aspect, the second aspect, or the possible implementations of the second aspect.
In a possible implementation, the processor is coupled to the memory through an interface.
In a possible implementation, the chip system further includes the memory. The memory stores the computer program or the computer instructions.
A ninth aspect of embodiments of the present disclosure provides a computer storage medium. The computer storage medium stores a computer program. When the program is executed by a computer, the computer is enabled to perform the method according to any one of the first aspect, the possible implementations of the first aspect, the second aspect, or the possible implementations of the second aspect.
A tenth aspect of embodiments of the present disclosure provides a computer program product. The computer program product stores instructions. When the instructions are executed by a computer, the computer is enabled to perform the method according to any one of the first aspect, the possible implementations of the first aspect, the second aspect, or the possible implementations of the second aspect.
In embodiments of the present disclosure, when the target item of interest needs to be recommended to the user, the N pieces of first information may be first input to the target model. The ith piece of first information indicates the ith first item and the ith behavior. The ith behavior is a behavior of the user for the ith item. The N behaviors of the user correspond to M categories. i=1, . . . , N, N≥M, and M>1. Then, the N pieces of first information may be processed by using the target model based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Finally, the item recommendation result can be obtained by using the target model based on the N pieces of second information. The item recommendation result is used to determine, from the K second items, the target item recommended to the user, and K≥1. In the foregoing process, the N pieces of first information not only indicate N first items, but also indicate the N behaviors that can be classified into the M categories. Therefore, in a process in which the target model processes the N pieces of first information to correspondingly obtain the N pieces of second information, not only mutual impact of a plurality of behaviors belonging to a same category and mutual impact of a plurality of first items may be considered, but also mutual impact of a plurality of behaviors belonging to different categories may be considered. Factors that are considered are comprehensive. Therefore, the item recommendation result output by the target model based on the N pieces of second information can have high accuracy, thereby helping optimize user experience.
Embodiments of the present disclosure provide an item recommendation method and a related device thereof. An item recommendation result output by a neural network model used by the method can have high accuracy, thereby helping optimize user experience.
With rapid development of computer technologies, to meet an Internet access requirement of a user, developers are increasingly inclined to display content of interest on a page of an application. In view of this, for an application, it is usually necessary to predict which items the user is likely to purchase, that is, to identify items of interest to the user, and present these items on the page of the application, thereby providing a service for the user. For example, for shopping software, it is required to predict commodities that the user is inclined to purchase when using the shopping software, that is, recommend commodities of interest to the user, and display these commodities on a page of the shopping software for the user to browse and purchase.
Currently, a neural network model of an AI technology can be used to predict an item that can be recommended to the user. Specifically, historical information of the user may be first collected, and the historical information indicates items that the user has interacted with and behaviors of the user for these items. Because there are a plurality of types of behaviors (for example, various types of behaviors such as a tap behavior, an add-to-favorites behavior, a search behavior, an add-to-cart behavior, and a purchase behavior) of the user for the item, the historical information of the user may be classified based on the types of behaviors, and various types of historical information are separately processed by using the neural network model, to obtain processing results of the various types of historical information. Finally, the processing results of the various types of historical information may be superimposed to obtain an item recommendation result, thereby determining a target item recommended to the user.
In the foregoing process, when the neural network model processes historical information, mutual impact of a plurality of behaviors belonging to a same category is mainly considered, and the factors that are considered are limited. As a result, accuracy of the item recommendation result finally output by the model is not high, affecting user experience.
Further, a plurality of behaviors of the user are usually sorted (for example, in a time sequence), and orders of the behaviors usually affect a purchase decision of the user. For example, a purchase behavior of the user for an item some time ago still has a large impact on a current interest of the user in another item (whether to perform a purchase behavior). The foregoing neural network model cannot capture this impact. As a result, an item recommendation result output by the neural network model cannot accurately match a real intention of the user, and accuracy of the item recommendation result is also reduced.
Further, in a training process of the neural network model, a function of training data indicating an auxiliary behavior (for example, a tap behavior, an add-to-favorites behavior, a search behavior, and an add-to-cart behavior) is usually ignored, and only training data indicating a main behavior (for example, a purchase behavior) is used to complete model training, leading to poor performance of a model obtained through training.
To resolve the foregoing problem, an embodiment of the present disclosure provides an item recommendation method. The method may be implemented with reference to AI technology. The AI technology is a technical discipline that simulates, extends, and expands human intelligence by using a digital computer or a machine controlled by a digital computer. The AI technology obtains an optimal result by perceiving an environment, obtaining knowledge, and using knowledge. In other words, the AI technology is a branch of computer science, and attempts to understand essence of intelligence and produce a new intelligent machine that can react in a similar manner to human intelligence. Using AI to process data is a common application manner of AI.
An overall working process of an AI system is first described.
The infrastructure provides computing capability support for the AI system, implements communication with an external world, and implements support by using a basic platform. The infrastructure communicates with the outside by using a sensor. A computing capability is provided by a smart chip, for example, a central processing unit (CPU), a neural processing unit (NPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another hardware acceleration chip. The basic platform includes related platforms such as a distributed computing framework and a network for assurance and support, including cloud storage and computing, an interconnection network, and the like. For example, the sensor communicates with the outside to obtain data, and the data is provided to a smart chip in a distributed computing system provided by the basic platform for computing.
Data at an upper layer of the infrastructure indicates a data source in the AI field. The data relates to a graph, an image, a speech, and text, further relates to Internet of things (IoT) data of a device, and includes service data of an existing system and perception data such as force, displacement, a liquid level, a temperature, and humidity.
Data processing usually includes data training, machine learning, deep learning, searching, inference, decision making, and the like.
Machine learning and deep learning may mean performing symbolic and formalized intelligent information modeling, extraction, preprocessing, training, and the like on data.
Inference is a process in which human intelligent inference is simulated in a computer or an intelligent system, and machine thinking and problem resolving are performed by using formal information based on an inference control policy. A typical function is searching and matching.
Decision making is a process of making a decision after intelligent information is inferred, and usually provides functions such as classification, ranking, and prediction.
After data processing mentioned above is performed on data, some general capabilities may further be formed based on a data processing result, for example, an algorithm or a general system such as translation, text analysis, computer vision processing, speech recognition, and image recognition.
The intelligent product and industry application are products and applications of the AI system in various fields. The intelligent product and industry application involve packaging overall AI solutions, to productize and apply intelligent information decision-making. Application fields of the intelligent information decision-making mainly include a smart terminal, smart transportation, smart health care, autonomous driving, a smart city, and the like.
The following describes several application scenarios of the present disclosure.
The data processing device may be a device or a server that has a data processing function, for example, a cloud server, a network server, an application server, or a management server. The data processing device receives the item recommendation request from the smart terminal through an interactive interface, and then performs item recommendation processing in a manner such as machine learning, deep learning, searching, inference, and decision making by using a memory storing data and a processor processing data. The memory in the data processing device may be a general name, and includes a local storage and a database storing historical data. The database may be in the data processing device, or may be in another network server.
In a process in which the execution device 110 preprocesses the input data, or in a process in which a calculation module 111 of the execution device 110 performs related processing such as calculation (for example, performs function implementation of a neural network in the present disclosure), the execution device 110 may invoke data, code, and the like in a data storage system 150 for corresponding processing, and may further store, into the data storage system 150, data, an instruction, and the like that are obtained through corresponding processing.
Finally, the I/O interface 112 returns a processing result to the client device 140, to provide the processing result for the user.
It should be noted that, for different objectives or different tasks, a training device 120 may generate corresponding target model/rules based on different training data. The corresponding target model/rules may be used to achieve the foregoing objectives or complete the foregoing tasks, thereby providing a required result for the user. The training data may be stored in a database 130, and is a training sample acquired by a data acquisition device 160.
An embodiment of the present disclosure further provides a chip. The chip includes a neural-network processing unit NPU. The chip may be disposed in the execution device 110 described above.
The neural-network processing unit NPU serves as a coprocessor, and may be disposed on a host CPU. The host CPU assigns a task. A core part of the NPU is an operation circuit, and a controller controls the operation circuit to extract data in a memory (a weight memory or an input memory) and perform an operation.
In some implementations, the operation circuit includes a plurality of processing engines (PE) inside. In some implementations, the operation circuit is a two-dimensional systolic array. The operation circuit may alternatively be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the operation circuit is a general-purpose matrix processor.
For example, it is assumed that there is an input matrix A, a weight matrix B, and an output matrix C. The operation circuit fetches, from a weight memory, data corresponding to the matrix B, and caches the data on each PE in the operation circuit. The operation circuit fetches data of the matrix A from an input memory, performs a matrix operation between the matrix A and the matrix B, and stores an obtained partial result or an obtained final result of the matrix in an accumulator.
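To make the role of the accumulator concrete, the following plain-Python sketch accumulates C = A x B from partial products over slices of the shared dimension, mirroring in software (not in hardware) how partial results build up before the final result is stored.

```python
import numpy as np

def tiled_matmul(A, B, tile=2):
    """Accumulate C = A @ B tile by tile along the shared dimension, the way the
    operation circuit stores partial results in the accumulator."""
    m, k = A.shape
    _, n = B.shape
    C = np.zeros((m, n))                      # the accumulator
    for s in range(0, k, tile):
        C += A[:, s:s+tile] @ B[s:s+tile, :]  # partial result accumulated
    return C

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
assert np.allclose(tiled_matmul(A, B), A @ B)
```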
A vector calculation unit may perform further processing on an output of the operation circuit, for example, vector multiplication, vector addition, an exponent operation, a logarithm operation, or a value comparison. For example, the vector calculation unit may be configured to perform network computation, such as pooling, batch normalization, or local response normalization at a non-convolutional/non-FC layer in the neural network.
In some implementations, the vector calculation unit can store a processed output vector in a unified buffer. For example, the vector calculation unit may apply a non-linear function to the output of the operation circuit, for example, a vector of an accumulated value, to generate an activation value. In some implementations, the vector calculation unit generates a normalized value, a combined value, or both a normalized value and a combined value. In some implementations, the processed output vector can be used as an activation input to the operation circuit, for example, used at a subsequent layer in the neural network.
A unified memory is configured to store input data and output data.
A direct memory access controller (DMAC) transfers input data in the external memory to the input memory and/or the unified memory, stores weight data in the external memory into the weight memory, and stores data in the unified memory into the external memory.
A bus interface unit (BIU) is configured to implement interaction between the host CPU, the DMAC, and an instruction fetch buffer by using a bus.
The instruction fetch buffer connected to the controller is configured to store instructions used by the controller.
The controller is configured to invoke the instructions buffered in the instruction fetch buffer, to control a working process of an operation accelerator.
Usually, the unified memory, the input memory, the weight memory, and the instruction fetch buffer each are an on-chip memory. The external memory is a memory outside the NPU. The external memory may be a double data rate synchronous dynamic random-access memory (DDR SDRAM), a high bandwidth memory (HBM), or another readable and writable memory.
Embodiments of the present disclosure relate to massive application of a neural network. Therefore, for ease of understanding, the following first describes terms and concepts related to the neural network in embodiments of the present disclosure.
The neural network may include a neuron. The neuron may be an operation unit that uses xs and an intercept of 1 as an input. An output of the operation unit may be as follows:

$$h_{W,b}(x) = f\left(W^{T}x\right) = f\left(\sum_{s=1}^{n} W_{s}x_{s} + b\right)$$
s=1, 2, . . . , or n. n is a natural number greater than 1. Ws is a weight of xs. b is a bias of the neuron. f is an activation function of the neuron, and is used to introduce a non-linear feature into the neural network, to convert an input signal of the neuron into an output signal. The output signal of the activation function may serve as an input of a next convolutional layer. The activation function may be a sigmoid function. The neural network is a network formed by connecting many single neurons together. To be specific, an output of a neuron may be an input of another neuron. An input of each neuron may be connected to a local receptive field of a previous layer to extract a feature of the local receptive field. The local receptive field may be a region including several neurons.
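As a concrete check of the formula above, the following minimal sketch computes a single neuron's output with the sigmoid chosen as the activation function; the numeric values are arbitrary.

```python
import numpy as np

def neuron_output(xs, Ws, b):
    """f(sum_s Ws[s] * xs[s] + b), with f chosen as the sigmoid function."""
    return 1.0 / (1.0 + np.exp(-(np.dot(Ws, xs) + b)))

# Arbitrary example values: two inputs, their weights, and a bias.
print(neuron_output(xs=np.array([0.5, -1.0]), Ws=np.array([0.8, 0.3]), b=0.1))
```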
Work at each layer of the neural network may be described by using a mathematical expression y=a(Wx+b). At a physical level, work at each layer of the neural network may be understood as completing transformation from input space to output space (namely, from row space to column space of a matrix) by performing five operations on the input space (a set of input vectors). The five operations include: 1. dimension increasing/dimension reduction; 2. scaling up/scaling down; 3. rotation; 4. translation; and 5. "bending". The operations 1, 2, and 3 are performed by Wx, the operation 4 is performed by +b, and the operation 5 is performed by a. The word "space" is used herein for expression because a classified object is not a single thing, but a type of things. Space is a collection of all individuals of this type of things. W is a weight vector, and each value in the vector represents a weight value of one neuron at this layer of the neural network. The vector W determines the space transformation from the input space to the output space described above. In other words, a weight W at each layer controls how to transform space. A purpose of training the neural network is to finally obtain a weight matrix (a weight matrix formed by vectors W at a plurality of layers) of all layers of a trained neural network. Therefore, a training process of the neural network is essentially a manner of learning control of space transformation, and more specifically, learning a weight matrix.
Because it is expected that an output of the neural network is as close as possible to a value that is actually expected to be predicted, a current predicted value of the network may be compared with an actually expected target value, and then a weight vector at each layer of the neural network is updated based on a difference between the current predicted value and the target value (certainly, there is usually an initialization process before a first update, that is, a parameter is preconfigured for each layer of the neural network). For example, if the predicted value of the network is large, the weight vector is adjusted to decrease the predicted value, and adjustment is continuously performed, until the neural network can predict the actually expected target value. Therefore, “how to obtain a difference between the predicted value and the target value through comparison” needs to be predefined. This is a loss function or an objective function. The loss function and the objective function are important equations that measure the difference between the predicted value and the target value. The loss function is used as an example. A higher output value (loss) of the loss function indicates a larger difference. Therefore, training of the neural network is a process of minimizing the loss as much as possible.
In a training process, a neural network may correct a value of a parameter in an initial neural network model by using an error back propagation (BP) algorithm, so that a reconstruction error loss of the neural network model becomes increasingly smaller. Specifically, an input signal is forward transferred until the error loss is generated at an output, and the parameter of the initial neural network model is updated through back propagation of information about the error loss, so that the error loss converges. The back propagation algorithm is an error-loss-centered back propagation process intended to obtain a parameter, such as a weight matrix, of an optimal neural network model.
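As an illustration of this training process, the following sketch minimizes a mean squared error loss for a one-layer network by gradient descent; the loss function, learning rate, and toy data are assumptions, and back propagation here reduces to the analytic gradient of this small model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: inputs and the actually expected target values.
X = np.random.randn(8, 3)
t = np.random.rand(8)

W = np.random.randn(3) * 0.1   # initialization: parameters preconfigured before the first update
b = 0.0
lr = 0.5

for step in range(200):
    y = sigmoid(X @ W + b)           # forward transfer: current predicted value
    loss = np.mean((y - t) ** 2)     # loss function: measures the predicted/target difference
    # Back propagation: gradient of the loss with respect to W and b.
    dy = 2 * (y - t) / len(t)
    dz = dy * y * (1 - y)            # propagate through the sigmoid
    W -= lr * (X.T @ dz)             # adjust the weight vector to decrease the loss
    b -= lr * dz.sum()
```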
The following describes the method provided in the present disclosure from a neural network training side and a neural network application side.
The model training method provided in this embodiment of the present disclosure relates to data sequence processing, and may be specifically applied to methods such as data training, machine learning, and deep learning, to perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training, and the like on training data (for example, N pieces of first information and N pieces of third information in the model training method provided in this embodiment of the present disclosure), and finally obtain a trained neural network (for example, a target model in the model training method provided in this embodiment of the present disclosure). In addition, in the item recommendation method provided in this embodiment of the present disclosure, input data (for example, the N pieces of first information and the N pieces of third information in the item recommendation method provided in this embodiment of the present disclosure) may be input into the foregoing trained neural network, to obtain output data (for example, the item recommendation result in the item recommendation method provided in this embodiment of the present disclosure). It should be noted that the model training method and the item recommendation method provided in embodiments of the present disclosure are based on a same idea, and may also be understood as two parts in a system, or two phases of an entire procedure, for example, a model training phase and a model application phase.
401: Obtain N pieces of first information, where an ith piece of first information indicates an ith first item and an ith behavior, the ith behavior is a behavior of a user for the ith item, N behaviors of the user correspond to M categories, i=1, . . . , N, N≥M, and M>1.
In this embodiment, when a user uses an application, to display an item of interest on a page of the application, some historical data of previously using the application by the user may be first acquired. The historical data may include attribute information (for example, a name of a first item, a price of the first item, a function of the first item, and a category of the first item) of N first items (which may also be referred to as historical items) that the user has operated on the application, and related information (for example, a type of a behavior) of the N behaviors performed by the user when the user operates the N first items. It should be noted that, based on the related information of the N behaviors, the N behaviors may be classified into M categories, and one category may include at least one behavior. For example, for shopping software, it is assumed that the user once taps three commodities, adds two commodities to favorites, and purchases two commodities on the software. It can be learned that the user has operated seven commodities, and therefore, the user correspondingly performs seven behaviors. The seven behaviors may be classified into three types. A first type includes three tap behaviors, a second type includes two add-to-favorites behaviors, and a third type includes two purchase behaviors. In this case, when historical data of the user for the software is acquired, the historical data includes attribute information of the seven commodities and related information of the seven behaviors.
In this case, the attribute information of the N first items may be separately mapped to latent space, to correspondingly obtain vector representations of the N first items, where a vector representation of the ith first item indicates the ith first item. Similarly, the related information of the N behaviors may be separately mapped to the latent space, to correspondingly obtain vector representations (that is, N pieces of third information) of the N behaviors, where a vector representation of the ith behavior (that is, an ith piece of third information) indicates a behavior of the user for the ith first item. For example, after the attribute information of the N first items is mapped, vector representations x=[x1, x2, . . . , xN] of the N first items may be obtained. A vector representation x1 of a 1st first item indicates the 1st first item, a vector representation x2 of a 2nd first item indicates the 2nd first item, . . . , and a vector representation xN of an Nth first item indicates the Nth first item. Similarly, after the related information of the N behaviors is mapped, vector representations b=[b1, b2, . . . , bN] of the N behaviors may be obtained. A vector representation b1 of a 1st behavior (that is, a 1st piece of third information) indicates a behavior of the user for the 1st first item, a vector representation b2 of a 2nd behavior (that is, a 2nd piece of third information) indicates a behavior of the user for the 2nd first item, . . . , and a vector representation bN of an Nth behavior (that is, an Nth piece of third information) indicates a behavior of the user for the Nth first item.
Based on this, the vector representation of the ith first item and the vector representation of the ith behavior may be spliced to obtain the ith piece of first information. The ith piece of first information indicates the ith first item and a behavior of the user for the ith first item. Similar splicing operations may also be performed on vector representations of remaining first items and vector representations of remaining behaviors, so that the N pieces of first information can be obtained. Still as in the foregoing example, the vector representation x1 of the 1st first item and the vector representation b1 of the 1st behavior may be spliced to obtain a 1st piece of first information h1, the vector representation of the 2nd first item and the vector representation of the 2nd behavior may be spliced to obtain a 2nd piece of first information h2, . . . , and the vector representation of the Nth first item and the vector representation of the Nth behavior are spliced to obtain an Nth piece of first information hN. In this way, the N pieces of first information H=[h1, h2, . . . , hN] are obtained.
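A minimal sketch of this construction is given below; the embedding tables and names such as item_emb and behavior_emb are illustrative assumptions, not from the original embodiment:

```python
import numpy as np

num_items, num_behavior_types, dim = 100, 3, 8
item_emb = np.random.randn(num_items, dim) * 0.1           # maps item attribute information to latent space
behavior_emb = np.random.randn(num_behavior_types, dim) * 0.1  # maps behavior information to latent space

item_ids = [12, 40, 7, 55, 55]   # N = 5 first items the user has operated
behavior_ids = [0, 1, 2, 0, 2]   # categories: 0 = tap, 1 = add-to-favorites, 2 = purchase

x = item_emb[item_ids]                # vector representations of the N first items
b = behavior_emb[behavior_ids]        # N pieces of third information (behavior representations)
H = np.concatenate([x, b], axis=-1)   # splice item and behavior vectors: N pieces of first information
# H[i] indicates the i-th first item and the behavior of the user for that item.
```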
After the N pieces of first information are obtained, the N pieces of first information may be input into a target model (a trained neural network model), to process the N pieces of first information by using the target model and obtain an item recommendation result.
402: Process the N pieces of first information by using the target model based on a multi-head self-attention mechanism, to obtain N pieces of second information.
The target model obtains the N pieces of first information, and the N pieces of first information may be processed based on the multi-head self-attention mechanism to obtain the N pieces of second information.
It should be noted that, as shown in the accompanying drawing, the target model may include a first module, a second module, and a third module.
Specifically, the first module of the target model may process the N pieces of first information based on the multi-head self-attention mechanism in the following manners, to obtain the N pieces of second information.
(1) After the N pieces of first information are received, for any one of the N pieces of first information, that is, the ith piece of first information, the first module may first perform linear processing on the ith piece of first information, to obtain an ith piece of Q information, an ith piece of K information, and an ith piece of V information. For the remaining first information other than the ith piece of first information, the first module may further perform an operation similar to that performed on the ith piece of first information. Therefore, a total of N pieces of Q information, N pieces of K information, and N pieces of V information can be obtained. To be specific, the first module may perform linear processing on the 1st piece of first information, to obtain a 1st piece of Q information, a 1st piece of K information, and a 1st piece of V information, may further perform linear processing on the 2nd piece of first information, to obtain a 2nd piece of Q information, a 2nd piece of K information, a 2nd piece of V information, . . . , and may further perform linear processing on the Nth piece of first information, to obtain an Nth piece of Q information, an Nth piece of K information, and an Nth piece of V information.
(2) For the ith piece of Q information, the first module may perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, and N pieces of weight information corresponding to the ith behavior, to obtain an ith piece of second information. A jth piece of weight information corresponding to the ith behavior is determined based on the ith behavior and a jth behavior, and j=1, . . . , N. For the remaining Q information other than the ith piece of Q information, the first module may further perform an operation similar to that performed on the ith piece of Q information. Therefore, the N pieces of second information can be obtained. To be specific, the first module may first perform an operation on the 1st piece of Q information, the N pieces of K information, the N pieces of V information, and N pieces of weight information corresponding to a 1st behavior, to obtain a 1st piece of second information, the first module may further perform an operation on the 2nd piece of Q information, the N pieces of K information, the N pieces of V information, and N pieces of weight information corresponding to a 2nd behavior, to obtain a 2nd piece of second information, . . . , and the first module may further perform an operation on the Nth piece of Q information, the N pieces of K information, the N pieces of V information, and N pieces of weight information corresponding to an Nth behavior, to obtain an Nth piece of second information. A 1st piece of weight information corresponding to the 1st behavior is determined based on the 1st behavior, a 2nd piece of weight information corresponding to the 1st behavior is determined based on the 1st behavior and the 2nd behavior, . . . , an Nth piece of weight information of the 1st behavior is determined based on the 1st behavior and the Nth behavior, . . . , a 1st piece of weight information corresponding to the Nth behavior is determined based on the Nth behavior and the 1st behavior, a 2nd piece of weight information corresponding to the Nth behavior is determined based on the Nth behavior and the 2nd behavior, . . . , and an Nth piece of weight information of the Nth behavior is determined based on the Nth behavior.
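The following single-head sketch illustrates manners (1) and (2); the realization of the weight information as one matrix per ordered pair of behavior categories, and the scaling by the square root of the dimension, are assumptions for illustration:

```python
import numpy as np

N, d, M = 5, 16, 3
H = np.random.randn(N, d)                 # N pieces of first information
Wq, Wk, Wv = (np.random.randn(d, d) * 0.1 for _ in range(3))
Q, K, V = H @ Wq, H @ Wk, H @ Wv          # linear processing: N pieces of Q/K/V information

cat = np.array([0, 1, 2, 0, 2])           # category of each of the N behaviors
# Assumed realization of the weight information: one d x d matrix per ordered
# pair of behavior categories, so that the pair (i-th behavior, j-th behavior)
# determines the j-th piece of weight information corresponding to the i-th behavior.
W_pair = np.random.randn(M, M, d, d) * 0.01

scores = np.empty((N, N))
for i in range(N):
    for j in range(N):
        scores[i, j] = Q[i] @ W_pair[cat[i], cat[j]] @ K[j] / np.sqrt(d)

A = np.exp(scores - scores.max(axis=1, keepdims=True))
A /= A.sum(axis=1, keepdims=True)         # row-wise normalization
second_info = A @ V                       # N pieces of second information (single head, no position term)
```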
Further, the second module of the target model may further cooperate with the first module of the target model to jointly obtain the N pieces of second information.
(1) After the N pieces of third information are received, for any one of the N pieces of third information, that is, the ith piece of third information, the second module may perform an operation on the ith piece of third information and the N pieces of third information, to obtain N pieces of fourth information corresponding to the ith behavior. A jth piece of fourth information corresponding to the ith behavior indicates a distance between the ith behavior and the jth behavior. For the remaining third information other than the ith piece of third information, the second module may also perform an operation similar to that performed on the ith piece of third information. Therefore, a total of N pieces of fourth information corresponding to the 1st behavior, N pieces of fourth information corresponding to the 2nd behavior, . . . , and N pieces of fourth information corresponding to the Nth behavior can be obtained. To be specific, the second module may perform an operation on the 1st piece of third information and the 1st piece of third information to obtain a 1st piece of fourth information (indicating a distance between 1st behaviors) corresponding to the 1st behavior, may further perform an operation on the 1st piece of third information and the 2nd piece of third information to obtain a 2nd piece of fourth information (indicating a distance between the 1st behavior and the 2nd behavior) corresponding to the 1st behavior, . . . , may further perform an operation on the 1st piece of third information and an Nth piece of third information to obtain an Nth piece of fourth information (indicating a distance between the 1st behavior and the Nth behavior) corresponding to the 1st behavior, . . . , may further perform an operation on the Nth piece of third information and the 1st piece of third information to obtain a 1st piece of fourth information (indicating a distance between the Nth behavior and the 1st behavior) corresponding to the Nth behavior, may further perform an operation on the Nth piece of third information and the 2nd piece of third information to obtain a 2nd piece of fourth information (indicating a distance between the Nth behavior and the 2nd behavior) corresponding to the Nth behavior, . . . , and may further perform an operation on the Nth piece of third information and the Nth piece of third information to obtain an Nth piece of fourth information (indicating a distance between Nth behaviors) corresponding to the Nth behavior.
(2) After the N pieces of fourth information corresponding to the 1st behavior, the N pieces of fourth information corresponding to the 2nd behavior, . . . , and the N pieces of fourth information corresponding to the Nth behavior from the second module are received, for the ith piece of Q information, the first module may perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the ith behavior, and the N pieces of fourth information corresponding to the ith behavior, to obtain the ith piece of second information. For the remaining Q information other than the ith piece of Q information, the first module may further perform an operation similar to that performed on the ith piece of Q information. Therefore, the N pieces of second information can be obtained. To be specific, the first module may first perform an operation on the 1st piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the 1st behavior, and the N pieces of fourth information corresponding to the 1st behavior to obtain the 1st piece of second information, the first module may further perform an operation on the 2nd piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the 2nd behavior, and the N pieces of fourth information corresponding to the 2nd behavior to obtain the 2nd piece of second information, . . . , and the first module may further perform an operation on the Nth piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the Nth behavior, and the N pieces of fourth information corresponding to the Nth behavior to obtain the Nth piece of second information.
Further, the distance between the ith behavior and the jth behavior includes an interval between an order of the ith behavior and an order of the jth behavior, for example, an interval between a time at which the user performs the ith behavior and a time at which the user performs the jth behavior.
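A sketch of this computation of the second module is given below, assuming the distance is the order interval, and assuming a learnable per-interval table (with clipping) that turns the raw interval into the fourth information:

```python
import numpy as np

N = 5
order = np.arange(N)                   # order in which the user performed the N behaviors
# j-th piece of fourth information corresponding to the i-th behavior: based on
# the interval between the order of the i-th behavior and that of the j-th behavior.
P = order[None, :] - order[:, None]    # P[i, j] = j - i

# Assumed parameterization: a learnable scalar per clipped interval produces
# the fourth information P1[i, j] later fused with the attention scores.
k = 4
w = np.random.randn(2 * k + 1) * 0.1
P1 = w[np.clip(P, -k, k) + k]
```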
Still as in the foregoing example, as shown in the accompanying drawing, the first module may include N groups of self-attention modules and a multilayer perceptron (MLP) module, where each group of self-attention modules includes N self-attention modules.
The following describes working processes of the N groups of self-attention modules and the MLP module. Because the working processes of the N groups of self-attention modules are similar, for ease of description, the following describes any one of the N groups of self-attention modules, that is, an ith group of self-attention modules.
For a jth self-attention module in the ith group of self-attention modules, the ith piece of first information hi and a jth piece of first information hj are input. The self-attention module may first perform linear processing on the ith piece of first information hi to obtain the ith piece of Q information qi, the ith piece of K information ki, and the ith piece of V information vi, and perform linear processing on the jth piece of first information hj to obtain a jth piece of Q information qj, a jth piece of K information kj, and a jth piece of V information vj.
Then, the module may multiply the ith piece of Q information qi, the jth piece of K information kj, and a jth first weight matrix W1(bi, bj) corresponding to the ith behavior, to obtain a jth piece of ninth information F[i, j] corresponding to the ith behavior, where the jth first weight matrix W1(bi, bj) is determined based on the ith behavior and the jth behavior.
At the same time, the module may further receive the jth piece of fourth information P1[i, j] corresponding to the ith behavior from the second module. The jth piece of fourth information P1[i, j] corresponding to the ith behavior may be obtained by the second module by performing an operation on the ith piece of third information bi and a jth piece of third information bj. In this operation, a term (j−i) may be determined based on the ith piece of third information bi and the jth piece of third information bj, and indicates an interval between an order of the ith behavior and an order of the jth behavior.
Then, the module may perform fusion (for example, addition) on the jth piece of ninth information F[i, j] corresponding to the ith behavior and the jth piece of fourth information P1[i, j] corresponding to the ith behavior, and then perform normalization (for example, softmax processing), to obtain a jth piece of tenth information A[i, j] corresponding to the ith behavior.
Subsequently, the module may multiply the jth piece of tenth information A[i, j] corresponding to the ith behavior, a jth second weight matrix W2(bi, bj) corresponding to the ith behavior, and the jth piece of V information vj, to obtain a jth piece of eleventh information R[i, j] corresponding to the ith behavior.
It should be noted that the jth first weight matrix W1(bi, bj) and the jth second weight matrix W2(bi, bj) are both determined based on the ith behavior and the jth behavior, that is, based on the categories of the two behaviors. In this way, mutual impact between behaviors of a same category and mutual impact between behaviors of different categories can both be captured by the weight matrices.
Similarly, the remaining self-attention modules other than the jth self-attention module in the ith group of self-attention modules may also perform an operation similar to that performed by the jth self-attention module. Therefore, the ith group of self-attention modules may obtain N pieces of eleventh information R[i]=[R[i, 1], R[i, 2], . . . , R[i, N]] corresponding to the ith behavior, and perform weighted summation on R[i, 1], R[i, 2], . . . , R[i, N], to obtain an ith piece of twelfth information gi.
Similarly, the remaining groups of self-attention modules other than the ith group of self-attention modules may also perform an operation similar to that performed by the ith group of self-attention modules. Therefore, the N groups of self-attention modules may obtain N pieces of twelfth information G=[g1, g2, . . . , gN] in total, that is, the 1st piece of twelfth information g1, the 2nd piece of twelfth information g2, . . . , and the Nth piece of twelfth information gN.
Finally, after the N pieces of twelfth information G=[g1, g2, . . . , gN] are received, the MLP module may process the N pieces of twelfth information G=[g1, g2, . . . , gN] with reference to the N pieces of third information b=[b1, b2, . . . , bN], to obtain N pieces of second information H′=[h′1, h′2, . . . , h′N]. To be specific, the 1st piece of third information b1 and the 1st piece of twelfth information g1 are processed (for example, through feature extraction and non-linear processing) to obtain the 1st piece of second information h′1, the 2nd piece of third information b2 and the 2nd piece of twelfth information g2 are processed to obtain the 2nd piece of second information h′2, . . . , and the Nth piece of third information bN and the Nth piece of twelfth information gN are processed to obtain the Nth piece of second information h′N.
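Putting the foregoing together, the working process of the ith group of self-attention modules and the MLP module may be sketched as follows; the shapes, the single head, the scaling, the per-interval position table, and the MLP parameterization are assumptions, while F, P1, A, R, and g correspond to the ninth, fourth, tenth, eleventh, and twelfth information described above:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

N, d, M, k = 5, 16, 3, 4
H = np.random.randn(N, d)                 # first information h_1..h_N
B = np.random.randn(N, d)                 # third information b_1..b_N
cat = np.array([0, 1, 2, 0, 2])           # behavior category of each of the N behaviors

Wq, Wk, Wv = (np.random.randn(d, d) * 0.1 for _ in range(3))
W1 = np.random.randn(M, M, d, d) * 0.01   # first weight matrices W1(b_i, b_j)
W2 = np.random.randn(M, M, d, d) * 0.01   # second weight matrices W2(b_i, b_j)
w_pos = np.random.randn(2 * k + 1) * 0.1  # assumed per-interval position table
W_mlp = np.random.randn(2 * d, d) * 0.1   # assumed MLP weights

Q, K, V = H @ Wq, H @ Wk, H @ Wv          # linear processing: Q/K/V information

H_out = np.empty_like(H)                  # second information h'_1..h'_N
for i in range(N):
    # Ninth information F[i, :]: q_i, W1(b_i, b_j), and k_j multiplied (scaling assumed).
    F = np.array([Q[i] @ W1[cat[i], cat[j]] @ K[j] for j in range(N)]) / np.sqrt(d)
    # Fourth information P1[i, :]: based on the order interval (j - i).
    P1 = np.array([w_pos[np.clip(j - i, -k, k) + k] for j in range(N)])
    A = softmax(F + P1)                   # tenth information: fusion (addition) + normalization
    # Eleventh information R[i, :]: A[i, j], W2(b_i, b_j), and v_j multiplied.
    R = np.array([A[j] * (W2[cat[i], cat[j]] @ V[j]) for j in range(N)])
    g = R.sum(axis=0)                     # weighted summation -> twelfth information g_i
    # MLP module: process g_i together with b_i to obtain the i-th piece of second information.
    H_out[i] = np.tanh(np.concatenate([g, B[i]]) @ W_mlp)
```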
403: Obtain an item recommendation result by using the target model based on the N pieces of second information, where the item recommendation result is used to determine, from K second items, a target item recommended to the user, and K≥1.
After the N pieces of second information are obtained, the target model may obtain the item recommendation result based on the N pieces of second information. The item recommendation result may be used to determine, from the K second items (which may also be understood as candidate items), the target item recommended to the user, and K≥1. Usually, the K second items include the N first items.
Specifically, the third module of the target model may obtain the item recommendation result in the following manners.
(1) After the N pieces of second information from the first module are received, a first expert network of the third module may perform feature extraction on the N pieces of second information to obtain fifth information. The fifth information includes an exclusive characteristic of each of the N behaviors. Therefore, the fifth information may be used to indicate a difference between the N behaviors. At the same time, a second expert network of the third module may perform feature extraction on the N pieces of second information to obtain sixth information. The sixth information includes a common characteristic of the N behaviors. Therefore, the sixth information may indicate a same point between the N behaviors.
(2) After the fifth information and the sixth information are obtained, the third module may perform weighted summation on the fifth information and the sixth information (a used weight may be determined by the third module based on the N pieces of third information), to obtain seventh information. The seventh information is a behavior representation of the user. Therefore, the seventh information may indicate interest distribution of the user.
(3) After the seventh information is obtained, the third module may further obtain K pieces of eighth information. A tth piece of eighth information indicates a tth second item, and t=1, . . . , K. For the tth piece of eighth information, the third module may calculate a matching degree between the seventh information and the tth piece of eighth information. For the eighth information other than the tth piece of eighth information, the third module may also perform an operation similar to that performed on the tth piece of eighth information. Therefore, matching degrees between the seventh information and the K pieces of eighth information can be obtained, and these matching degrees can be used as the final item recommendation result output by the target model.
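A sketch of manners (1) to (3) is given below, assuming two small feed-forward expert networks, a gate derived from the N pieces of third information, and dot products as the matching degrees:

```python
import numpy as np

def mlp(x, Wa, Wb):
    # Small two-layer expert network used for feature extraction.
    return np.maximum(0.0, x @ Wa) @ Wb

N, d, K = 5, 16, 10
H2 = np.random.randn(N, d)     # N pieces of second information
B = np.random.randn(N, d)      # N pieces of third information
items = np.random.randn(K, d)  # K pieces of eighth information (candidate second items)

Wa1, Wa2 = np.random.randn(N * d, d) * 0.1, np.random.randn(d, d) * 0.1
Wb1, Wb2 = np.random.randn(N * d, d) * 0.1, np.random.randn(d, d) * 0.1
Wg = np.random.randn(N * d, 2) * 0.1

h = H2.reshape(-1)
fifth = mlp(h, Wa1, Wa2)       # exclusive characteristics: differences between the N behaviors
sixth = mlp(h, Wb1, Wb2)       # common characteristics: same points between the N behaviors

gate = np.exp(B.reshape(-1) @ Wg)
gate /= gate.sum()             # weights determined based on the N pieces of third information
seventh = gate[0] * fifth + gate[1] * sixth   # behavior representation (interest distribution)

scores = items @ seventh       # matching degree between seventh and each eighth information
target = int(scores.argmax())  # second item with a high matching degree -> target item
```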
In this way, based on the item recommendation result, it may be determined that some second items with a high matching degree are target items recommended to the user.
In addition, the target model provided in this embodiment of the present disclosure may be further compared with a neural network model provided in a related technology, to compare performance of these models on different datasets. Comparison results are shown in Table 1.
It can be learned from Table 1 that, in terms of recommendation accuracy, the target model provided in embodiments of the present disclosure can obtain an optimal experimental result on both indicators, which proves effectiveness of the item recommendation manner provided in embodiments of the present disclosure.
In embodiments of the present disclosure, when the target item of interest needs to be recommended to the user, the N pieces of first information may be first input to the target model. The ith piece of first information indicates the ith first item and the ith behavior. The ith behavior is a behavior of the user for the ith item. The N behaviors of the user correspond to M categories. i=1, . . . , N, N≥M, and M>1. Then, the N pieces of first information may be processed by using the target model based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Finally, the item recommendation result can be obtained by using the target model based on the N pieces of second information. The item recommendation result is used to determine, from the K second items, the target item recommended to the user, and K≥1. In the foregoing process, the N pieces of first information not only indicate N first items, but also indicate the N behaviors that can be classified into the M categories. Therefore, in a process in which the target model processes the N pieces of first information to correspondingly obtain the N pieces of second information, not only mutual impact of a plurality of behaviors belonging to a same category and mutual impact of a plurality of first items may be considered, but also mutual impact of a plurality of behaviors belonging to different categories may be considered. Factors that are considered are comprehensive. Therefore, the item recommendation result output by the target model based on the N pieces of second information can have high accuracy, thereby helping optimize user experience.
Further, in a process of processing the N pieces of first information based on the multi-head self-attention mechanism, the target model further considers impact caused by an interval (for example, an interval between times at which the user performs different behaviors) between orders of different behaviors. Factors that are considered are more comprehensive in comparison with a related technology. The item recommendation result output by the target model may also accurately fit a real intention of the user, thereby further improving accuracy of the item recommendation result.
The foregoing describes in detail the item recommendation method provided in embodiments of the present disclosure. The following describes a model training method provided in embodiments of the present disclosure.
701: Input N pieces of first information into a to-be-trained model to obtain a predicted item recommendation result, where the to-be-trained model is configured to: obtain the N pieces of first information, where an ith piece of first information indicates an ith first item and an ith behavior, the ith behavior is a behavior of a user for the ith item, N behaviors of the user correspond to M categories, i=1, . . . , N, N≥M, and M>1; process the N pieces of first information based on a multi-head self-attention mechanism, to obtain N pieces of second information; and obtain the predicted item recommendation result based on the N pieces of second information, where the predicted item recommendation result is used to determine, from K second items, a target item recommended to the user, and K≥1.
In this embodiment, when the to-be-trained model (that is, a neural network model that needs to be trained) needs to be trained, a batch of training data may be first obtained. The batch of training data includes the N pieces of first information. The ith piece of first information indicates the ith first item (that is, a historical item) and the ith behavior, the ith behavior is the behavior of the user for the ith item, the N behaviors of the user correspond to the M categories, i=1, . . . , N, N≥M, and M>1. It should be noted that real item recommendation results corresponding to the N pieces of first information are known. Therefore, based on the real item recommendation results, a real item recommended to the user may be determined from the K second items (that is, candidate items).
In this case, after the N pieces of first information are obtained, the N pieces of first information may be input into the to-be-trained model. In this way, after the N pieces of first information are received, the to-be-trained model may process the N pieces of first information based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Then, the to-be-trained model may obtain the predicted item recommendation result based on the N pieces of second information, and the predicted item recommendation result is used to determine, from the K second items, the target item (a predicted item) recommended to the user.
In a possible implementation, the to-be-trained model is configured to: perform linear processing on the ith piece of first information, to obtain an ith piece of Q information, an ith piece of K information, and an ith piece of V information; and perform an operation on the ith piece of Q information, N pieces of K information, N pieces of V information, and N pieces of weight information corresponding to the ith behavior, to obtain an ith piece of second information, where a jth piece of weight information corresponding to the ith behavior is determined based on the ith behavior and a jth behavior, and j=1, . . . , N.
In a possible implementation, the to-be-trained model is further configured to: obtain N pieces of third information, where an ith piece of third information indicates the ith behavior; and perform an operation on the ith piece of third information and the N pieces of third information, to obtain N pieces of fourth information corresponding to the ith behavior, where a jth piece of fourth information corresponding to the ith behavior indicates a distance between the ith behavior and the jth behavior. The to-be-trained model is configured to perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the ith behavior, and the N pieces of fourth information corresponding to the ith behavior, to obtain the ith piece of second information.
In a possible implementation, the distance between the ith behavior and the jth behavior includes an interval between an order of the ith behavior and an order of the jth behavior.
In a possible implementation, the to-be-trained model is configured to: perform feature extraction on the N pieces of second information to obtain fifth information and sixth information, where the fifth information indicates a difference between the N behaviors, and the sixth information indicates a same point between the N behaviors; fuse the fifth information and the sixth information to obtain seventh information, where the seventh information indicates interest distribution of the user; and calculate matching degrees between the seventh information and K pieces of eighth information, where the matching degree is used as the item recommendation result, a tth piece of eighth information indicates a tth second item, and t=1, . . . , K.
In a possible implementation, the K second items include the N first items.
It should be noted that for descriptions of step 701, refer to the related descriptions of step 401 to step 403 in the foregoing embodiment.
702: Obtain a target loss based on the predicted item recommendation result and a real item recommendation result, where the target loss indicates a difference between the predicted item recommendation result and the real item recommendation result.
After the predicted item recommendation result output by the to-be-trained model is obtained, because the real item recommendation result is known, a preset target loss function may be used to process the predicted item recommendation result and the real item recommendation result, to obtain the target loss. The target loss indicates the difference between the predicted item recommendation result and the real item recommendation result.
703: Update a parameter of the to-be-trained model based on the target loss until a model training condition is met, to obtain a target model.
After the target loss is obtained, the parameter of the to-be-trained model may be updated based on the target loss, and the to-be-trained model obtained after the parameter is updated continues to be trained by using a next batch of training data until the model training condition is met (for example, the target loss converges), to obtain the foregoing target model.
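Steps 701 to 703 may be sketched as the following loop; the binary labels, the cross-entropy target loss function, and the simple linear scorer standing in for the to-be-trained model are assumptions for demonstration:

```python
import numpy as np

def bce(pred, label):
    # Target loss: difference between predicted and real item recommendation results.
    pred = np.clip(pred, 1e-7, 1 - 1e-7)
    return -(label * np.log(pred) + (1 - label) * np.log(1 - pred)).mean()

K, d = 10, 16
theta = np.random.randn(d) * 0.1      # stand-in for the parameters of the to-be-trained model
items = np.random.randn(K, d)         # K second items (candidate items)
label = np.zeros(K); label[3] = 1.0   # real item recommendation result: item 3 is the real item

lr, prev = 0.1, np.inf
for step in range(1000):
    pred = 1.0 / (1.0 + np.exp(-(items @ theta)))   # step 701: predicted recommendation result
    loss = bce(pred, label)                          # step 702: obtain the target loss
    if abs(prev - loss) < 1e-8:                      # model training condition: loss converges
        break
    prev = loss
    grad = items.T @ ((pred - label) / K)            # step 703: update the parameter
    theta -= lr * grad
```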
The target model obtained through training in this embodiment of the present disclosure has a function of recommending an item to the user. When the target item of interest needs to be recommended to the user, the N pieces of first information may be first input to the target model. The ith piece of first information indicates the ith first item and the ith behavior. The ith behavior is a behavior of the user for the ith item. The N behaviors of the user correspond to M categories. i=1, . . . , N, N≥M, and M>1. Then, the N pieces of first information may be processed by using the target model based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Finally, the item recommendation result can be obtained by using the target model based on the N pieces of second information. The item recommendation result is used to determine, from the K second items, the target item recommended to the user, and K≥1. In the foregoing process, the N pieces of first information not only indicate N first items, but also indicate the N behaviors that can be classified into the M categories. Therefore, in a process in which the target model processes the N pieces of first information to correspondingly obtain the N pieces of second information, not only mutual impact of a plurality of behaviors belonging to a same category and mutual impact of a plurality of first items may be considered, but also mutual impact of a plurality of behaviors belonging to different categories may be considered. Factors that are considered are comprehensive. Therefore, the item recommendation result output by the target model based on the N pieces of second information can have high accuracy, thereby helping optimize user experience.
Further, in a process of processing the N pieces of first information based on the multi-head self-attention mechanism, the target model further considers impact caused by an interval (for example, an interval between times at which the user performs different behaviors) between orders of different behaviors. Factors that are considered are more comprehensive in comparison with a related technology. The item recommendation result output by the target model may also accurately fit a real intention of the user, thereby further improving accuracy of the item recommendation result.
Further, in a training process of the target model, the used training data, that is, the N pieces of first information, indicates the N behaviors that can be classified into the M categories. The N behaviors may include behaviors such as a tap behavior, an add-to-favorites behavior, a search behavior, and an add-to-cart behavior, and may further include a purchase behavior. Therefore, in this embodiment of the present disclosure, not only a function of training data indicating a main behavior in model training is considered, but also a function of training data indicating an auxiliary behavior in model training is considered, so that the target model obtained through training may have good performance.
The foregoing describes in detail the model training method provided in embodiments of the present disclosure. The following describes an item recommendation apparatus and a model training apparatus provided in embodiments of the present disclosure.
In embodiments of the present disclosure, when the target item of interest needs to be recommended to the user, the N pieces of first information may be first input to the target model. The ith piece of first information indicates the ith first item and the ith behavior. The ith behavior is a behavior of the user for the ith item. The N behaviors of the user correspond to M categories. i=1, . . . , N, N≥M, and M>1. Then, the N pieces of first information may be processed by using the target model based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Finally, the item recommendation result can be obtained by using the target model based on the N pieces of second information. The item recommendation result is used to determine, from the K second items, the target item recommended to the user, and K≥1. In the foregoing process, the N pieces of first information not only indicate N first items, but also indicate the N behaviors that can be classified into the M categories. Therefore, in a process in which the target model processes the N pieces of first information to correspondingly obtain the N pieces of second information, not only mutual impact of a plurality of behaviors belonging to a same category and mutual impact of a plurality of first items may be considered, but also mutual impact of a plurality of behaviors belonging to different categories may be considered. Factors that are considered are comprehensive. Therefore, the item recommendation result output by the target model based on the N pieces of second information can have high accuracy, thereby helping optimize user experience.
In a possible implementation, the processing module 802 is configured to: perform linear processing on the ith piece of first information by using the target model, to obtain an ith piece of Q information, an ith piece of K information, and an ith piece of V information; and perform an operation on the ith piece of Q information, N pieces of K information, N pieces of V information, and N pieces of weight information corresponding to the ith behavior by using the target model, to obtain an ith piece of second information, where a jth piece of weight information corresponding to the ith behavior is determined based on the ith behavior and a jth behavior, and j=1, . . . , N.
In a possible implementation, the apparatus further includes: a third obtaining module configured to obtain N pieces of third information by using the target model, where an ith piece of third information indicates the ith behavior; and an operation module configured to perform an operation on the ith piece of third information and the N pieces of third information by using the target model to obtain N pieces of fourth information corresponding to the ith behavior, where a jth piece of fourth information corresponding to the ith behavior indicates a distance between the ith behavior and the jth behavior. The processing module 802 is further configured to perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the ith behavior, and the N pieces of fourth information corresponding to the ith behavior by using the target model, to obtain the ith piece of second information.
In a possible implementation, the distance between the ith behavior and the jth behavior includes an interval between an order of the ith behavior and an order of the jth behavior.
In a possible implementation, the second obtaining module 803 is configured to: perform feature extraction on the N pieces of second information by using the target model to obtain fifth information and sixth information, where the fifth information indicates a difference between the N behaviors, and the sixth information indicates a same point between the N behaviors; fuse the fifth information and the sixth information by using the target model to obtain seventh information, where the seventh information indicates interest distribution of the user; and calculate matching degrees between the seventh information and K pieces of eighth information by using the target model, where the matching degree is used as the item recommendation result, a tth piece of eighth information indicates a tth second item, and t=1, . . . , K.
In a possible implementation, the K second items include the N first items.
The target model obtained through training in this embodiment of the present disclosure has a function of recommending an item to the user. When the target item of interest needs to be recommended to the user, the N pieces of first information may be first input to the target model. The ith piece of first information indicates the ith first item and the ith behavior. The ith behavior is a behavior of the user for the ith item. The N behaviors of the user correspond to M categories. i=1, . . . , N, N≥M, and M>1. Then, the N pieces of first information may be processed by using the target model based on the multi-head self-attention mechanism, to obtain the N pieces of second information. Finally, the item recommendation result can be obtained by using the target model based on the N pieces of second information. The item recommendation result is used to determine, from the K second items, the target item recommended to the user, and K≥1. In the foregoing process, the N pieces of first information not only indicate N first items, but also indicate the N behaviors that can be classified into the M categories. Therefore, in a process in which the target model processes the N pieces of first information to correspondingly obtain the N pieces of second information, not only mutual impact of a plurality of behaviors belonging to a same category and mutual impact of a plurality of first items may be considered, but also mutual impact of a plurality of behaviors belonging to different categories may be considered. Factors that are considered are comprehensive. Therefore, the item recommendation result output by the target model based on the N pieces of second information can have high accuracy, thereby helping optimize user experience.
In a possible implementation, the to-be-trained model is configured to: perform linear processing on the ith piece of first information, to obtain an ith piece of Q information, an ith piece of K information, and an ith piece of V information; and perform an operation on the ith piece of Q information, N pieces of K information, N pieces of V information, and N pieces of weight information corresponding to the ith behavior, to obtain an ith piece of second information, where a jth piece of weight information corresponding to the ith behavior is determined based on the ith behavior and a jth behavior, and j=1, . . . , N.
In a possible implementation, the to-be-trained model is further configured to: obtain N pieces of third information, where an ith piece of third information indicates the ith behavior; and perform an operation on the ith piece of third information and the N pieces of third information, to obtain N pieces of fourth information corresponding to the ith behavior, where a jth piece of fourth information corresponding to the ith behavior indicates a distance between the ith behavior and the jth behavior. The to-be-trained model is configured to perform an operation on the ith piece of Q information, the N pieces of K information, the N pieces of V information, the N pieces of weight information corresponding to the ith behavior, and the N pieces of fourth information corresponding to the ith behavior, to obtain the ith piece of second information.
In a possible implementation, the distance between the ith behavior and the jth behavior includes an interval between an order of the ith behavior and an order of the jth behavior.

In a possible implementation, the to-be-trained model is configured to: perform feature extraction on the N pieces of second information to obtain fifth information and sixth information, where the fifth information indicates a difference between the N behaviors, and the sixth information indicates a same point between the N behaviors; fuse the fifth information and the sixth information to obtain seventh information, where the seventh information indicates interest distribution of the user; and calculate matching degrees between the seventh information and K pieces of eighth information, where the matching degree is used as the item recommendation result, a tth piece of eighth information indicates a tth second item, and t=1, . . . , K.
In a possible implementation, the K second items include the N first items.
It should be noted that content such as information exchange between the modules/units of the apparatuses and an execution process is based on the same concept as that of the method embodiments of the present disclosure, and produces the same technical effects as those of the method embodiments of the present disclosure. For specific content, refer to the foregoing descriptions in the method embodiments of the present disclosure.
An embodiment of the present disclosure further relates to an execution device.
The memory 1004 may include a read-only memory and a random access memory, and provide instructions and data for the processor 1003. A part of the memory 1004 may further include a non-volatile random-access memory (NVRAM). The memory 1004 stores a program and operation instructions, an executable module or a data structure, a subset thereof, or an extended set thereof. The operation instructions may include various operation instructions for implementing various operations.
The processor 1003 controls an operation of the execution device. During specific application, the components of the execution device are coupled together through a bus system. In addition to a data bus, the bus system may further include a power bus, a control bus, a status signal bus, and the like. However, for clear description, various types of buses in the figure are referred to as the bus system.
The method disclosed in embodiments of the present disclosure may be applied to the processor 1003, or may be implemented by the processor 1003. The processor 1003 may be an integrated circuit chip, and has a signal processing capability. In an implementation process, steps in the methods can be implemented by using a hardware integrated logic circuit in the processor 1003, or by using instructions in a form of software. The processor 1003 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller. The processor 1003 may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate, a transistor logic device, or a discrete hardware component. The processor 1003 may implement or perform the methods, the steps, and the logical block diagrams that are disclosed in embodiments of the present disclosure. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps in the methods disclosed with reference to embodiments of the present disclosure may be directly performed and completed by a hardware decoding processor, or may be performed and completed by using a combination of hardware in the decoding processor and a software module. The software module may be located in a mature storage medium in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1004, and the processor 1003 reads information in the memory 1004 and completes the steps in the foregoing methods in combination with hardware of the processor.
The receiver 1001 may be configured to receive input digital or character information, and generate a signal input related to setting and function control of the execution device. The transmitter 1002 may be configured to output the digital or character information through a first interface. The transmitter 1002 may be configured to send instructions to a disk group through the first interface, to modify data in the disk group. The transmitter 1002 may further include a display device such as a display.
In this embodiment of the present disclosure, in one case, the processor 1003 is configured to process the user-associated information by using the target model in the foregoing embodiment.
An embodiment of the present disclosure further relates to a training device.
The training device 1100 may further include one or more power supplies 1126, one or more wired or wireless network interfaces 1150, one or more input/output interfaces 1158, and one or more operating systems 1141, for example, Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.
Specifically, the training device may perform the model training method in the foregoing embodiment.
An embodiment of the present disclosure further relates to a computer-readable storage medium. The computer-readable storage medium stores a program used for signal processing. When the program runs on a computer, the computer is enabled to perform the steps performed by the foregoing execution device, or the computer is enabled to perform the steps performed by the foregoing training device.
An embodiment of the present disclosure further relates to a computer program product. The computer program product stores instructions. When the instructions are executed by a computer, the computer is enabled to perform the steps performed by the foregoing execution device, or the computer is enabled to perform the steps performed by the foregoing training device.
The execution device, the training device, or the terminal device provided in embodiments of the present disclosure may be specifically a chip. The chip includes a processing unit and a communication unit. The processing unit may be, for example, a processor. The communication unit may be, for example, an input/output interface, a pin, or a circuit. The processing unit may execute computer-executable instructions stored in a storage unit, so that a chip in the execution device performs the item recommendation method described in embodiments, or a chip in the training device performs the model training method described in embodiments. Optionally, the storage unit is a storage unit in the chip, for example, a register or a cache. Alternatively, the storage unit may be a storage unit in a wireless access device but outside the chip, for example, a read-only memory (ROM), another type of static storage device that can store static information and instructions, or a random-access memory (RAM).
Specifically, the chip may be a neural-network processing unit (NPU). A core part of the NPU is an operation circuit 1203, and a controller 1204 controls the operation circuit 1203 to extract matrix data from a memory and perform a multiplication operation.
In some implementations, the operation circuit 1203 includes a plurality of processing engines (PE) inside. In some implementations, the operation circuit 1203 is a two-dimensional systolic array. The operation circuit 1203 may alternatively be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the operation circuit 1203 is a general-purpose matrix processor.
For example, it is assumed that there is an input matrix A, a weight matrix B, and an output matrix C. The operation circuit fetches, from a weight memory 1202, data corresponding to the matrix B, and caches the data on each PE in the operation circuit. The operation circuit fetches data of the matrix A from an input memory 1201, performs a matrix operation with the matrix B, and stores an obtained partial result or an obtained final result of the matrix in an accumulator 1208.
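The role of the accumulator 1208 may be illustrated by a blocked matrix multiplication in which partial results are accumulated; this is a software analogy of the data flow, not the circuit itself:

```python
import numpy as np

def blocked_matmul(A, B, tile=4):
    # C = A @ B computed tile by tile: each pass over a tile of B produces a
    # partial result that is accumulated, as in the accumulator of the operation circuit.
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))                           # accumulator for partial results
    for t in range(0, k, tile):
        C += A[:, t:t + tile] @ B[t:t + tile, :]   # partial result of the matrix operation
    return C

A = np.random.randn(8, 8)
B = np.random.randn(8, 8)
assert np.allclose(blocked_matmul(A, B), A @ B)    # final result matches the full product
```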
A unified memory 1206 is configured to store input data and output data. Weight data is directly transferred to the weight memory 1202 by using a direct memory access controller (DMAC) 1205. The input data is also transferred to the unified memory 1206 by using the DMAC.
A bus interface unit (BIU) 1213 is configured to perform interaction between an advanced extensible interface (AXI) bus and the DMAC and between the AXI bus and an instruction fetch buffer (IFB) 1209.
The BIU 1213 is used by the instruction fetch buffer 1209 to obtain instructions from an external memory, and is further used by the direct memory access controller 1205 to obtain original data of the input matrix A or the weight matrix B from the external memory.
The DMAC is mainly configured to transfer input data in the external memory DDR to the unified memory 1206, transfer weight data to the weight memory 1202, or transfer input data to the input memory 1201.
A vector calculation unit 1207 includes a plurality of operation processing units. If required, further processing is performed on an output of the operation circuit 1203, for example, vector multiplication, vector addition, an exponential operation, a logarithmic operation, or a value comparison. The vector calculation unit 1207 is mainly configured to perform network computation at a non-convolutional/fully-connected layer of a neural network, for example, batch normalization, pixel-level summation, and upsampling of a predicted label plane.
In some implementations, the vector calculation unit 1207 can store a processed output vector in the unified memory 1206. For example, the vector calculation unit 1207 may apply a linear function or a non-linear function to the output of the operation circuit 1203. For example, linear interpolation is performed on a predicted label plane extracted at a convolutional layer. For another example, vectors whose values are accumulated are used to generate an activation value. In some implementations, the vector calculation unit 1207 generates a normalized value, a pixel-level summation value, or both a normalized value and a pixel-level summation value. In some implementations, the processed output vector can be used as an activation input to the operation circuit 1203, for example, used at a subsequent layer in the neural network.
The instruction fetch buffer 1209 connected to the controller 1204 is configured to store instructions used by the controller 1204.
The unified memory 1206, the input memory 1201, the weight memory 1202, and the instruction fetch buffer 1209 are all on-chip memories. The external memory is private to a hardware architecture of the NPU.
Any one of the processors mentioned above may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits for controlling program execution.
In addition, it should be noted that the described apparatus embodiment is merely an example. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all the modules may be selected according to actual needs to achieve the objectives of the solutions of embodiments. In addition, in the accompanying drawings of the apparatus embodiments provided in the present disclosure, a connection relationship between modules indicates that the modules have a communication connection with each other, and may be specifically implemented as one or more communication buses or signal cables.
Based on the description of the foregoing implementations, a person skilled in the art may clearly understand that the present disclosure may be implemented by software in addition to necessary universal hardware, or by dedicated hardware, including an ASIC, a dedicated CPU, a dedicated memory, a dedicated component, and the like. Usually, any function implemented by a computer program may be easily implemented by using corresponding hardware. In addition, specific hardware structures used to implement a same function may be various, for example, an analog circuit, a digital circuit, or a dedicated circuit. However, in the present disclosure, a software program implementation is a better implementation in most cases. Based on such an understanding, the technical solutions of the present disclosure may be implemented in a form of a software product. The computer software product is stored in a readable storage medium, such as a floppy disk, a Universal Serial Bus (USB) flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc of a computer, and includes several instructions for instructing a computer device (which may be a personal computer, a training device, a network device, or the like) to perform the methods in embodiments of the present disclosure.
All or some of the foregoing embodiments may be implemented by using software, hardware, firmware, or any combination thereof. When software is used to implement embodiments, the foregoing embodiments may be implemented completely or partially in a form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the procedures or functions according to embodiments of the present disclosure are completely or partially generated. The computer may be a general-purpose computer, a dedicated computer, a computer network, or other programmable apparatuses. The computer instructions may be stored in a computer-readable storage medium, or may be transmitted from a computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, a computer, a training device, or a data center to another website, computer, training device, or data center in a wired (for example, a coaxial cable, an optical fiber, or a digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium that can be stored by a computer, or a data storage device, for example, a training device or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital versatile disc (DVD)), a semiconductor medium (for example, a solid state drive (SSD)), or the like.
This is a continuation of International Patent Application No. PCT/CN2023/101248 filed on Jun. 20, 2023, which claims priority to Chinese Patent Application No. 202210705920.8 filed on Jun. 21, 2022. The disclosures of the aforementioned applications are hereby incorporated by reference in their entireties.
Parent application: PCT/CN2023/101248, filed Jun. 2023 (WO). Child application: U.S. Application No. 18989318.