The present invention relates to a ranking function generating apparatus, a ranking function generating method, and a program.
As a technology for improving ranking of items with respect to search queries in a search system, a technology of using training data on a plurality of domains to generate a ranking function is known (PTL 1).
However, since the technology described in PTL 1 mentioned above generates a plurality of ranking functions, parameters used when these ranking functions are integrated need to be determined using a method such as cross-validation.
An aspect of the present invention has been devised in view of the foregoing point, and an object thereof is to generate ranking functions for a plurality of domains.
To attain the object, a ranking function generating apparatus according to the aspect includes a training data production unit configured to produce training data including at least a first search log related to a first item included in a search result of a search query, a second search log related to a second item included in the search result, and respective domains of the first search log and the second search log; and a learning unit configured to learn, using the training data, parameters of a neural network that implements ranking functions for a plurality of domains through multi-task learning regarding each of the domains as a task.
Ranking functions for a plurality of domains can be generated.
In the following, an embodiment of the present invention will be described. In the present embodiment, a ranking function generating apparatus 10 capable of generating ranking functions for a plurality of domains will be described. More specifically, the ranking functions for the plurality of domains are implemented by a common neural network, and the ranking function generating apparatus 10 learns parameters of the neural network through multi-task learning to generate the ranking functions for the plurality of domains. Note that the ranking function is a function that receives as input a feature value (hereinafter referred to as the “feature value of an item”) of a combination of a search query and the item, and outputs a rank of the item with respect to the search query.
Hereinafter, a situation is assumed in which a plurality of types of search logs (i.e., search logs for a plurality of domains) can be acquired in a search system, and ranking functions corresponding to the types of these search logs are implemented by a common neural network. In addition, as the search system, an EC (Electronic Commerce) site or the like is assumed, and the types of the search logs (i.e., the domains of the search logs) are assumed to be categorized by behaviors of a user toward the items (e.g., a commercial product).
It is assumed that the behaviors of the user include three behaviors: a behavior (click) of selecting an item from a search result of the search query; a behavior (cart) of adding the item into a cart, either from the search result or on an item detail screen after the item is selected (i.e., adding an item included in the search result into the cart); and a behavior (conversion) of purchasing the item in the cart. Accordingly, it is assumed that there are three types of search logs: a search log related to the user behavior “click”, a search log related to the user behavior “cart”, and a search log related to the user behavior “conversion”.
However, the search system is not limited to the EC site. The items to be searched may be any items, and the present embodiment is applicable to any search system capable of acquiring search logs for a plurality of domains.
<Functional Configuration>
First, referring to
As illustrated in
The case production unit 101 uses search log data stored in the search log DB 201 and relational feature value data stored in the relational feature value DB 202 to produce case data to be stored in the case DB 203.
Here, referring to
As illustrated in
For example, a record of the search log data related to the user behavior “click” in the first row includes a query ID “1”, an item ID “5”, and the number of times “500”. This represents that, in a search result of the search query having the query ID “1”, the user behavior “click” was performed a total of 500 times with respect to the item having the item ID “5”. Note that the same applies to records of search log data related to the other user behaviors.
As such, each record of the search log data stored in the search log DB 201 is information representing, for each of the query IDs and the item IDs, the number of times the corresponding user behavior was performed with respect to the item having the item ID among the items included in the search result of the search query having the query ID.
Next, referring to
As illustrated in
As such, each record of the relational feature value data is information representing, for each of the query IDs and the item IDs, feature values related to a search query having the query ID and an item having the item ID. Note that the feature value includes at least a feature value representing a feature of the item and a feature value representing a feature of the item with respect to the search query (in other words, a feature value representing a relationship between the search query and the item).
As the feature value representing the feature of the item, one may consider, e.g., a feature value obtained by extracting term frequency (TF) vectors from a document including a name of the item, a descriptive text for the item, a release date of the item, a category into which the item is classified, and the like, and producing TF-IDF, a BM25 score, or the like from these term frequency vectors. Also, as the feature value representing the feature of the item with respect to the search query, one may consider, e.g., a feature value obtained by extracting term frequency vectors in the same manner also for the search query and producing the feature value described in, e.g., Reference Literature 1 “Wu, L., Hu, D., Hong, L., and Liu, H.: Turning clicks into purchases: Revenue optimization for product search in e-commerce, in The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 365-374 (2018)” or the like. However, these feature values are exemplary, and it is possible to use any feature value representing the feature of the item and any feature value representing the feature of the item with respect to the search query.
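As a rough illustration of the TF-IDF option mentioned above, the following minimal sketch builds term frequency vectors from already tokenized item documents and weights them by inverse document frequency. The IDF variant log(N/df), the tokenization, the function names, and the sample documents are all assumptions made for illustration; the sketch does not reproduce the feature value of Reference Literature 1.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(documents: Dict[int, List[str]]) -> Dict[int, Dict[str, float]]:
    """Return a TF-IDF vector per item ID (assumed variant: tf * log(N / df))."""
    n_docs = len(documents)
    # Document frequency: in how many item documents each term appears.
    doc_freq = Counter(term for tokens in documents.values() for term in set(tokens))
    vectors = {}
    for item_id, tokens in documents.items():
        term_freq = Counter(tokens)  # term frequency (TF) vector of this item
        vectors[item_id] = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
    return vectors


def query_item_score(query_tokens: List[str], item_vector: Dict[str, float]) -> float:
    """Hypothetical query-item feature: sum of the item's TF-IDF weights over query terms."""
    return sum(item_vector.get(term, 0.0) for term in query_tokens)


# Illustrative item documents keyed by item ID (made-up tokens).
documents = {5: ["red", "shoe", "shoe"], 7: ["red", "hat"]}
vectors = tfidf_vectors(documents)
score_5 = query_item_score(["red", "shoe"], vectors[5])
score_7 = query_item_score(["red", "shoe"], vectors[7])
```

Because “red” appears in every document, its weight is zero under this IDF variant, so only “shoe” contributes to the score of item 5.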
Next, case data stored in the case DB 203 will be described with reference to
As illustrated in
As such, a record of the case data is data in which a query ID, an item ID, a domain, the number of times, and a feature value are associated with each other. In other words, each record of the case data is information representing, for each of the query IDs and the item IDs, the number of times a corresponding user behavior (user behavior corresponding to the domain) was performed with respect to the item having the item ID among the items included in the search result of the search query having this query ID, and feature values related to the search query having the query ID and the item having the item ID. Such case data is produced by connecting the search log data and the relational feature value data by using the same query ID and the same item ID.
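The connection of search log data and relational feature value data on the same query ID and item ID can be sketched as follows. The in-memory dictionaries stand in for the search log DB 201 and the relational feature value DB 202, and the record layout, the function name, and all sample values other than the (query ID 1, item ID 5, 500 clicks) record are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed layouts: domain -> {(query_id, item_id): number of times}
SearchLogs = Dict[str, Dict[Tuple[int, int], int]]
# (query_id, item_id) -> list of feature values
Features = Dict[Tuple[int, int], List[float]]


def produce_case_data(search_logs: SearchLogs, features: Features) -> List[dict]:
    """Join search log records with feature records on (query ID, item ID)."""
    cases = []
    for domain, log in search_logs.items():
        for (query_id, item_id), count in log.items():
            key = (query_id, item_id)
            if key not in features:
                continue  # no matching feature record, so no case record is produced
            cases.append({
                "query_id": query_id,
                "item_id": item_id,
                "domain": domain,
                "count": count,
                "features": features[key],
            })
    return cases


search_logs = {
    "click": {(1, 5): 500, (1, 7): 120},  # (1, 7) and its count are made-up values
    "cart": {(1, 5): 40},                 # made-up value
}
features = {(1, 5): [0.3, 1.2], (1, 7): [0.1, 0.8]}  # made-up feature values
cases = produce_case_data(search_logs, features)
```

Skipping log records without a matching feature record corresponds to an inner join; whether unmatched records should instead be kept is a design choice the description leaves open.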
The training pair production unit 102 uses the case data stored in the case DB 203 to produce training pair data to be stored in the training pair DB 204.
The training pair data stored in the training pair DB 204 will be described with reference to
As illustrated in
Thus, a record of the training pair data is data in which a pair ID, a query ID, a domain, two item IDs, two numbers of times, and two feature values are associated with each other. Such a record of training pair data can be produced by connecting the two records of case data each having the same query ID and the same domain. For example, a record of training pair data having a pair ID “1” in
Note that records of the training pair data stored in the training pair DB 204 are used as training data when the parameters of the neural network that implements ranking functions for the plurality of domains are learned.
The parameter learning unit 103 uses the records of the training pair data stored in the training pair DB 204 to learn the parameters of the neural network that implements the ranking functions for the plurality of domains. The learned parameters are stored in the parameter DB 205.
<Ranking Function Generation Processing>
Next, processing of generating the ranking functions for the plurality of domains by using the ranking function generating apparatus 10 according to the present embodiment will be described with reference to
First, the case production unit 101 uses records of the search log data stored in the search log DB 201 and records of the relational feature value data stored in the relational feature value DB 202 to connect a record of the search log data and a record of the relational feature value data each having the same query ID and the same item ID to produce a record of the case data (Step S101). Then, the case production unit 101 stores the produced record of the case data in the case DB 203.
Next, the training pair production unit 102 connects the two records of case data each having the same query ID and the same domain among the records of the case data stored in the case DB 203, and assigns a number used as the pair ID to the connected case data to produce a record of training pair data (Step S102). Then, the training pair production unit 102 stores the produced record of the training pair data in the training pair DB 204.
Note that the training pair production unit 102 may produce a record of the training pair data for every pair of item IDs in records of the case data having the same query ID and the same domain, or may randomly select pairs of item IDs to produce records of the training pair data. Likewise, the training pair production unit 102 may produce records of the training pair data for all combinations of the query IDs and the domains, or for only part of the combinations of the query IDs and the domains.
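Step S102 and the two pairing options described above (every pair, or a random subset) can be sketched as follows. The record layout, the function name, the `max_pairs_per_group` parameter, and the sample values are hypothetical and used only for illustration.

```python
import random
from collections import defaultdict
from itertools import combinations
from typing import List, Optional


def produce_training_pairs(cases: List[dict],
                           max_pairs_per_group: Optional[int] = None,
                           seed: int = 0) -> List[dict]:
    """Pair case records that share the same query ID and the same domain."""
    groups = defaultdict(list)
    for case in cases:
        groups[(case["query_id"], case["domain"])].append(case)

    rng = random.Random(seed)
    pairs = []
    for (query_id, domain), members in groups.items():
        candidates = list(combinations(members, 2))  # every unordered pair of items
        if max_pairs_per_group is not None and len(candidates) > max_pairs_per_group:
            candidates = rng.sample(candidates, max_pairs_per_group)  # random subset
        for first, second in candidates:
            pairs.append({
                "pair_id": len(pairs) + 1,  # sequential number used as the pair ID
                "query_id": query_id,
                "domain": domain,
                "item_ids": (first["item_id"], second["item_id"]),
                "counts": (first["count"], second["count"]),
                "features": (first["features"], second["features"]),
            })
    return pairs


# Made-up case records for one query.
cases = [
    {"query_id": 1, "item_id": 5, "domain": "click", "count": 500, "features": [0.3, 1.2]},
    {"query_id": 1, "item_id": 7, "domain": "click", "count": 120, "features": [0.1, 0.8]},
    {"query_id": 1, "item_id": 9, "domain": "click", "count": 60, "features": [0.2, 0.5]},
    {"query_id": 1, "item_id": 5, "domain": "cart", "count": 40, "features": [0.3, 1.2]},
]
all_pairs = produce_training_pairs(cases)
sampled_pairs = produce_training_pairs(cases, max_pairs_per_group=2)
```

In this example the “cart” group contains a single item, so it yields no pair, while the three “click” items yield three pairs.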
Next, the parameter learning unit 103 initializes the parameters of the neural network (hereinafter referred to also as the “learning target neural network”) that implements the ranking functions for the plurality of domains (Step S103). Note that, as the method for initialization, a known method may be used appropriately and, e.g., a method that initializes the parameters to random numbers according to a predetermined probability distribution or the like may be considered.
Next, the parameter learning unit 103 uses the records of the training pair data stored in the training pair DB 204 to calculate a loss function value to be used to update the parameters and a gradient of the loss function with respect to the parameters (Step S104). Note that, as the method of calculating the gradient of the loss function with respect to the parameters, a known method may be used as appropriate; e.g., an error backpropagation method or the like may be used.
As the loss function value, L shown below is used.
$$L = \sum_{t \in T} w_t \sum_{(i,j) \in D_t} L_{ij}^{t}$$
where $t \in T$ represents a domain (i.e., a user behavior), $T$ = {click, cart, conversion} is assumed, and $w_t$ represents a weight of a training pair in the domain $t$ and takes a predetermined value. For example, $w_t$ may be set to the reciprocal of the number of records of the training pair data related to the domain $t$, so that the weights of the training pairs sum to the same value (i.e., 1) in every domain.
In addition, $D_t$ represents a set of records of the training pair data related to the domain $t$, and $i$ and $j$ represent the two item IDs included in each record of the training pair data. Furthermore, the following expression is satisfied.
$$L_{ij}^{t} = -\bar{P}_{ij}^{t} \log P_{ij}^{t} - \left(1 - \bar{P}_{ij}^{t}\right) \log\left(1 - P_{ij}^{t}\right)$$
Here, the following expression is satisfied.
Note that the above expression represents a difference between an output value of the domain $t$ obtained by inputting the feature value corresponding to an item ID “i” in a record of the training pair data to the learning target neural network, and an output value of the domain $t$ obtained by inputting the feature value corresponding to an item ID “j” in the record of the training pair data to the learning target neural network. In other words, $P_{ij}^{t}$ represents a probability that, in the domain $t$, the item having the item ID “i” is ranked higher than the item having the item ID “j”.
In addition, the following expression is satisfied.
Note that $y_{iq}^{t}$ represents the number of times corresponding to the item ID “i” in the record of training pair data, and $y_{jq}^{t}$ represents the number of times corresponding to the item ID “j” in the record of training pair data. In other words, denoting the query ID included in the record of training pair data as q, $y_{iq}^{t}$ represents the number of times that the user behavior corresponding to the domain $t$ was performed with respect to the item having the item ID “i” included in the search result of the search query having the query ID “q”.
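The loss calculation of Step S104 can be sketched as follows. Because the expressions for the pairwise probability and for its target are not reproduced in this text, the sketch assumes a RankNet-style form: the probability is the logistic sigmoid of the difference between the two output values, and the target is 1, 0.5, or 0 depending on which item has the larger number of times. The weight of each domain is assumed to be the reciprocal of its number of training pairs. The function names, the record layout, and the sample values are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def target_probability(count_i: int, count_j: int) -> float:
    """Assumed target: 1 if item i has more events than item j, 0 if fewer, 0.5 if tied."""
    if count_i > count_j:
        return 1.0
    if count_i < count_j:
        return 0.0
    return 0.5


def pair_loss(score_i: float, score_j: float, target: float) -> float:
    """Cross-entropy between the target and P = sigmoid(score_i - score_j)."""
    diff = score_i - score_j
    # -log P = softplus(-diff) and -log(1 - P) = softplus(diff)
    return target * _softplus(-diff) + (1.0 - target) * _softplus(diff)


def multi_task_loss(pairs: List[dict],
                    score: Callable[[str, Sequence[float]], float]) -> float:
    """Sum the pair losses over all domains, weighting each domain by 1 / (its pair count)."""
    pairs_per_domain = Counter(p["domain"] for p in pairs)
    total = 0.0
    for p in pairs:
        weight = 1.0 / pairs_per_domain[p["domain"]]
        x_i, x_j = p["features"]
        y_i, y_j = p["counts"]
        total += weight * pair_loss(score(p["domain"], x_i),
                                    score(p["domain"], x_j),
                                    target_probability(y_i, y_j))
    return total


def toy_score(domain: str, features: Sequence[float]) -> float:
    """Hypothetical stand-in for the output value of the network for a domain."""
    return sum(features)


pairs = [
    {"domain": "click", "features": ([1.0], [0.0]), "counts": (500, 120)},
    {"domain": "cart", "features": ([0.5], [0.5]), "counts": (10, 40)},
]
loss = multi_task_loss(pairs, toy_score)
```

With one pair per domain, each weight equals 1, so the total is simply the sum of the two pair losses.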
Next, the parameter learning unit 103 uses the loss function value L calculated in Step S104 described above and the gradient related to the parameters to update (learn) the parameters of the learning target neural network according to a known optimization method (Step S105). In other words, the parameter learning unit 103 updates the parameters according to the known optimization method so as to minimize the loss function value L. This means that the parameters are updated through multi-task learning by regarding the domain t as a task t.
Subsequently, the parameter learning unit 103 determines whether or not to end the learning of the parameters (Step S106). Note that the parameter learning unit 103 may appropriately determine that the learning of the parameters is to be ended when a predetermined ending condition is satisfied. Examples of the predetermined ending condition include that, e.g., Steps S104 to S105 described above have been repeated for a predetermined number of times or more, that the learning of the parameters has converged, and the like.
If it is not determined in Step S106 described above that the learning is to be ended, the parameter learning unit 103 returns to Step S104 described above. Thus, Steps S104 to S105 described above are repeatedly executed until the predetermined ending condition is satisfied.
On the other hand, if it is determined in Step S106 described above that the learning is to be ended, the parameter learning unit 103 stores the learned parameters in the parameter DB 205 (Step S107). Thus, the parameters of the learning target neural network are learned, and the neural network that implements the ranking functions for the plurality of domains is obtained. Accordingly, e.g., when it is desired to obtain the ranking function for the domain “conversion”, a neural network including the input layer, the hidden layer, and the third output layer may be used appropriately as the ranking function. Likewise, when it is desired to obtain the ranking function for the domain “click”, a neural network including the input layer, the hidden layer, and the first output layer may be used appropriately as the ranking function, and when it is desired to obtain the ranking function for the domain “cart”, a neural network including the input layer, the hidden layer, and the second output layer may be used appropriately as the ranking function.
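The selection of an output layer per domain can be sketched as follows: a single shared hidden layer feeds three output layers, and the ranking function for a domain uses only the output layer of that domain. The single hidden layer with ReLU activation, the scalar score output, the Gaussian initialization, and all names and values are assumptions made for illustration rather than the architecture of the embodiment.

```python
import random
from typing import Dict, List

DOMAINS = ("click", "cart", "conversion")  # first, second, and third output layers


def init_params(n_features: int, n_hidden: int, seed: int = 0) -> dict:
    """Initialize the parameters to random numbers (assumed: Gaussian, std 0.1)."""
    rng = random.Random(seed)
    return {
        "hidden_w": [[rng.gauss(0.0, 0.1) for _ in range(n_features)]
                     for _ in range(n_hidden)],
        "hidden_b": [0.0] * n_hidden,
        "out_w": {d: [rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for d in DOMAINS},
        "out_b": {d: 0.0 for d in DOMAINS},
    }


def rank_score(params: dict, features: List[float], domain: str) -> float:
    """Shared hidden layer (ReLU) followed by the output layer of the given domain."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(params["hidden_w"], params["hidden_b"])]
    return sum(w * h for w, h in zip(params["out_w"][domain], hidden)) + params["out_b"][domain]


def rank_items(params: dict, items: Dict[int, List[float]], domain: str) -> List[int]:
    """Return item IDs sorted from the highest to the lowest score for the domain."""
    return sorted(items, key=lambda item_id: rank_score(params, items[item_id], domain),
                  reverse=True)


# Hand-set parameters so that the example is easy to follow (made-up values).
toy_params = {
    "hidden_w": [[1.0, 0.0], [0.0, 1.0]],
    "hidden_b": [0.0, 0.0],
    "out_w": {"click": [1.0, 0.0], "cart": [0.0, 1.0], "conversion": [0.5, 1.0]},
    "out_b": {"click": 0.0, "cart": 0.0, "conversion": 0.0},
}
items = {5: [0.9, 0.1], 7: [0.2, 0.8]}
click_ranking = rank_items(toy_params, items, "click")
cart_ranking = rank_items(toy_params, items, "cart")
random_params = init_params(n_features=2, n_hidden=4)
```

The same items can be ordered differently per domain because only the output layer changes while the hidden representation is shared.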
<Evaluation Experiment>
Next, a result of an evaluation experiment for ranking functions generated by the ranking function generating apparatus 10 according to the present embodiment will be described. In this experiment, in the same manner as in the embodiment described above, the domains were assumed to be “click”, “cart”, and “conversion”, and the number of search queries was set to 100. The method of generating the ranking functions by using the ranking function generating apparatus 10 according to the present embodiment is referred to as “MULTI”, and the comparative methods are referred to as “TARGET”, “MIX”, “TFIDF”, and “BM25”. Note that TARGET is a method of performing learning using records of training pair data only in a target domain (conversion), MIX is a method of performing learning by mixing domains without distinguishing the domains from one another, and each of TFIDF and BM25 is a method of performing ranking on the basis only of a relation between the search query and the item.
As the evaluation indices, MAP (Mean Average Precision), MRR (Mean Reciprocal Rank), and NDCG (Normalized Discounted Cumulative Gain), which are typical evaluation indices in learning to rank, were used.
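For reference, the three indices can be computed as in the following minimal sketch. Each ranking is given as a list of relevance labels in ranked order for one search query. The binary relevance labels, the linear-gain form of DCG, the assumption that every relevant item appears in the ranked list, and the sample rankings are choices made for illustration; how relevance was defined in the experiment is not stated in this text.

```python
import math
from typing import List, Sequence


def average_precision(relevant: Sequence[int]) -> float:
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def reciprocal_rank(relevant: Sequence[int]) -> float:
    """1 / rank of the first relevant item, or 0 if there is none."""
    for rank, rel in enumerate(relevant, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain with linear gain and a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains: Sequence[float]) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


# Two made-up queries: relevance labels of the returned items in ranked order.
rankings = [[0, 1, 1], [1, 0, 0]]
map_value = mean([average_precision(r) for r in rankings])
mrr_value = mean([reciprocal_rank(r) for r in rankings])
ndcg_value = mean([ndcg(r) for r in rankings])
```

Each index is first computed per search query and then averaged over the queries.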
A result of this experiment is shown below in Table 1.
As shown above in Table 1, MULTI obtained a higher value than each of the comparative methods for all of MAP, MRR, and NDCG. Therefore, it can be stated that the ranking function generating apparatus 10 according to the present embodiment successfully generated a ranking function with higher performance than those generated by the comparative methods.
<Hardware Configuration>
Finally, a hardware configuration of the ranking function generating apparatus 10 according to the present embodiment will be described with reference to
As illustrated in
The input device 301 is, e.g., a keyboard, a mouse, a touch panel, or the like. The display device 302 is, e.g., a display or the like. Note that the ranking function generating apparatus 10 does not need to have at least one of the input device 301 and the display device 302.
The external I/F 303 is an interface with an external device, such as a recording medium 303a. The ranking function generating apparatus 10 can perform reading, writing, or the like on the recording medium 303a via the external I/F 303. In the recording medium 303a, one or more programs that implement, e.g., the individual functional units (the case production unit 101, the training pair production unit 102, and the parameter learning unit 103) included in the ranking function generating apparatus 10 may also be stored. Note that examples of the recording medium 303a include a CD (Compact Disc), a DVD (Digital Versatile Disc), an SD memory card (Secure Digital memory card), a USB (Universal Serial Bus) memory, and the like.
The communication I/F 304 is an interface for connecting the ranking function generating apparatus 10 to a communication network. Note that the one or more programs that implement the individual functional units included in the ranking function generating apparatus 10 may also be acquired (downloaded) from a predetermined server device or the like via the communication I/F 304.
The processor 305 is one of various arithmetic/logic devices such as, e.g., a CPU (Central Processing Unit) and a GPU (Graphics Processing Unit). The individual functional units included in the ranking function generating apparatus 10 are implemented by, e.g., processing that the one or more programs stored in the memory device 306 cause the processor 305 to execute.
The memory device 306 is one of various storage devices such as, e.g., an HDD (Hard Disk Drive), an SSD (Solid State Drive), a RAM (Random Access Memory), a ROM (Read-Only Memory), and a flash memory. Each of the DBs (the search log DB 201, the relational feature value DB 202, the case DB 203, the training pair DB 204, and the parameter DB 205) included in the ranking function generating apparatus 10 can be implemented by the memory device 306. However, at least one of the individual DBs included in the ranking function generating apparatus 10 may also be implemented by a storage device (such as, e.g., a database server) connected to the ranking function generating apparatus 10 via the communication network.
By having the hardware configuration illustrated in
The present invention is not limited to the specifically disclosed embodiment described above, and can be variously modified, changed, and combined with existing technologies without departing from the scope of claims.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2020/019630 | 5/18/2020 | WO |