The present application claims priority to and incorporates by reference the entire contents of Japanese Patent Application No. 2020-050226 filed in Japan on Mar. 19, 2020.
The present disclosure relates to a learning device, a learning method, and a learning program.
As the Internet has become common, various information analysis techniques have been proposed. For example, various query log analysis techniques have been proposed. The query log analysis techniques can be applied to search advertising. For example, in order to present search keyword options of advertisement, analyzing the search queries input by customers has been proposed.
However, the above described technique does not always analyze the search queries input by customers appropriately.
For example, the above described technique merely analyzes the search queries in accordance with the frequency with which the search queries are input. Therefore, the above described technique may fail to reveal how customers came to be related to various targets such as events and companies.
It is an object of the present invention to at least partially solve the problems in the conventional technology.
According to one aspect of the subject matter described in this disclosure, a learning device includes (i) an acquisition unit configured to acquire a search query(ies) input by a plurality of input customers who have input a reference query, the search queries having been input in mutually different periods, (ii) a specifying unit configured to specify a category of the search query input in each period for each input customer, and (iii) a learning unit configured to cause a model to learn a characteristic of a change in the category specified by the specifying unit.
The above and other objects, features, advantages and technical and industrial significance of this invention will be better understood by reading the following detailed description of presently preferred embodiments of the invention, when considered in connection with the accompanying drawings.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. Note that the present invention is not limited by the embodiments. Details of one or more of the embodiments are described in the following description and drawings. Also, the embodiments can be appropriately combined within a range that does not cause a conflict in processing contents. Also, in the following one or more embodiments, the same parts are denoted by the same reference signs, and redundant description will be omitted.
First, with reference to
1-1. Time-Series Data Providing Process
In the example of
In the example of
In the example of
In the example of
As illustrated in
Then, the log server 600 provides a search result to the user device 500 (step S2).
Then, the information providing device 100 acquires a search log from the log server 600 (step S3). For example, the information providing device 100 collects histories of search queries from the log server 600.
Then, the information providing device 100 acquires a reference query from the operator device 700 (step S4). As described above, the reference query is a predetermined search query input at a certain time and date. In the example of
Then, the information providing device 100 specifies the input customers who have input the acquired reference query (step S5). For example, the information providing device 100 specifies the input customers, who have input the reference query, from the acquired search logs. In the example of
Then, the information providing device 100 specifies search queries input by the customers in periods based on the reference time and date (step S6). The periods based on the reference time and date are the periods around the reference time and date. For example, the interval of the periods may be one month. For example, in a case in which the reference time and date is “2020/03/19”, the periods around the reference time and date are the periods after the reference time and date such as “2020/03/19 to 2020/04/19” and “2020/04/19 to 2020/05/19” or the periods before the reference time and date such as “2020/02/19 to 2020/03/19” and “2020/01/19 to 2020/02/19”. In the present specification, the period after the reference time and date and the period before the reference time and date may be referred to as a “positive period” and a “negative period”, respectively. In this manner, the information providing device 100 extracts the search queries of the users, who have input the reference query, from the acquired search logs in each period based on the reference time and date.
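As a non-limiting illustration, the one-month periods around the reference time and date described above can be sketched in Python as follows. The function names `add_months` and `periods_around` and the convention of indexing negative periods with negative integers are assumptions made for this sketch and do not appear in the embodiment.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by a whole number of months, clamping the day if needed."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def periods_around(reference: date, n_before: int, n_after: int):
    """Return (index, start, end) tuples around the reference date.

    Negative indexes denote negative periods (before the reference date);
    indexes of zero or more denote positive periods (after it).
    """
    return [
        (k, add_months(reference, k), add_months(reference, k + 1))
        for k in range(-n_before, n_after)
    ]


# Example with the reference time and date "2020/03/19".
for index, start, end in periods_around(date(2020, 3, 19), 2, 2):
    print(index, start, end)
```

With the reference time and date “2020/03/19”, this sketch yields the periods “2020/01/19 to 2020/02/19” and “2020/02/19 to 2020/03/19” as negative periods and “2020/03/19 to 2020/04/19” and “2020/04/19 to 2020/05/19” as positive periods.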
Then, the information providing device 100 generates a list of the search queries, which have high relativity with the reference query, for each period (step S7). The relativity between the reference query and other search queries can be determined based on the relevance degrees between the search queries. For example, the relevance degrees between the search queries can indicate how often a single user inputs these search queries at the same time. For example, if many users input a search query “newborn baby” and a search query “Shichi-go-san (seven-five-three festival)” at the same time, the relevance degree between the search query “newborn baby” and the search query “Shichi-go-san” may be high. In the example of
In some implementation modes, the information providing device 100 may determine the relativity between the search queries by using word embedding. The information providing device 100 may obtain embedding vectors by training a language model by using training data including word strings. For example, keywords corresponding to the search queries included in the search logs may be used as the training data including the word strings. The relativity between the search queries may be cosine similarity between the embedding vectors corresponding to the search queries.
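As a non-limiting illustration, the cosine similarity between embedding vectors can be sketched in Python as follows. The three-dimensional vectors are hypothetical toy values chosen for this sketch; in practice, the embedding vectors would be obtained from the trained language model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # similarity is undefined for a zero vector; treat as 0
    return dot / (norm_u * norm_v)


# Hypothetical embedding vectors (toy values, not from a real model).
embeddings = {
    "newborn baby": [0.9, 0.1, 0.0],
    "hospital stay": [0.8, 0.2, 0.1],
    "smartphone SP1": [0.0, 0.1, 0.9],
}

reference = embeddings["newborn baby"]
for query, vector in embeddings.items():
    print(query, round(cosine_similarity(reference, vector), 3))
```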
Then, the information providing device 100 provides the generated list to the operator device 700 (step S8). In the example of
Note that the vertical-direction order of the list may illustrate the degree of the relativity between the reference query and the other search queries. For example, the relevance degree between the reference query “newborn baby” and the search query “hospital stay” may be higher than the relevance degree between the reference query “newborn baby” and the search query “ritual visit”. The information providing device 100 may generate a table in which other search queries input in each period are arranged in the order of relevance degrees and provide the generated table to the operator device 700 as a list.
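As a non-limiting illustration, arranging the search queries of one period in the order of relevance degrees can be sketched in Python as follows. In this sketch, the relevance degree is approximated by the number of input customers who have input each search query in the period. This approximation, the function name `rank_by_input_customers`, and the sample data are assumptions made for this sketch.

```python
from collections import Counter


def rank_by_input_customers(queries_by_customer):
    """Rank the search queries of one period.

    queries_by_customer maps a customer ID to the set of search queries
    that the customer (who also input the reference query) entered in
    the period. Queries entered by more input customers are ranked higher;
    ties are broken alphabetically so that the order is deterministic.
    """
    counts = Counter()
    for queries in queries_by_customer.values():
        counts.update(set(queries))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data for one period.
period_queries = {
    "U1": {"hospital stay", "ritual visit"},
    "U2": {"hospital stay"},
}
print(rank_by_input_customers(period_queries))
```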
1-2. Grouping Evaluation Process
If the list of the search queries is generated simply based on the relativity between the reference query and other search queries, the context (also referred to as “relation”) of the reference query may be dispersed. The term “context of the reference query” represents, for example, the context or background in which the reference query was input, the circumstances of the user who has input the reference query, or the behavior pattern, interest, or concern of that user.
For example, in a case in which the reference query is a company C1 (exemplary company name), users who have input this reference query include, for example, users who like the company president of the company C1, users who like a mascot of the company C1, users who want to change the model of a mobile phone of the company C1, users who want to go to an amusement park run by the company C1, and users who use a comics (manga) browsing service provided by the company C1. In such a case, the context of the reference query differs depending on the user. Therefore, the search queries included in the generated list may vary. For example, the generated list may include search queries such as “company president P1 (exemplary name of a person)”, “that dog (exemplary mascot name)”, “smartphone SP1 (exemplary smartphone name)”, “country of dreams and magic (exemplary name of facilities)”, and “that pirate (exemplary comics name)”. If the context of the reference query is dispersed, it is sometimes difficult for the operator OP to extract useful information (for example, typical needs of users) from the generated list. In addition, the operator OP may not be able to appropriately find out the transition of the needs of the users from the time series of the search queries included in the generated list.
Therefore, the information providing device 100 groups the search queries, which have been input in each period by target users, in order to enable the operator OP to appropriately evaluate the list of the search queries. The information providing device 100 can group the search queries, which have been input in each period, based on the context of the reference query. For example, the information providing device 100 acquires the search logs of the users, who have input the reference query, for each time series (for example, period around the reference time and date). Then, the information providing device 100 groups the search queries for each time series. For example, if the number of groups or the number of the users who have input the search queries related to groups satisfies a threshold value, the information providing device 100 extracts the search queries related to the groups as the search queries included in the list. As a result, the grouped search queries included in the list can have coherence, thereby enabling the operator OP to appropriately evaluate the list of the search queries.
As illustrated in
Then, the information providing device 100 specifies input customers and specifies the search queries input by each of the input customers (step S12). As described above, the input customers are the users who have input the reference queries at the reference time and date. The information providing device 100 can specify the input customers from the search logs. Also, the information providing device 100 can specify the search queries, which have been input in the periods around the reference time and date by the input customers, from the search logs. In the example of
Then, the information providing device 100 specifies the search queries input in each period (step S13). As described above, each period is a period around the reference time and date. In the example of
Then, the information providing device 100 specifies an element group of the number of queries or the number of customers of each designated category in each period (step S14). The term “element group” may include an array of the values corresponding to the grouped search queries. For example, the element group of time-series search queries may be an array of the numbers of grouped search queries (in other words, the search queries belonging to particular designated categories) in particular time series (for example, periods). Alternatively, the element group of time-series search queries may be an array of the numbers of the users who have input the search queries belonging to each designated category in particular time series. In other words, the element group may be an array of the values of indexes generated by decomposing the search queries by categories. The term “element group” includes, for example, a data element group such as an array of data elements, a distribution of data elements, etc. This data element may include, for example, the number of the search queries belonging to a particular category or the number of the users who have input the search queries belonging to a particular category.
Regarding a display mode of the above described element group, the array of the values of the indexes (for example, the array of the numbers of the grouped search queries in particular time series) can be presented by using a bar graph. For example, the heights in the bar graph represent the numbers of queries or the numbers of users of each designated category in a certain period. For example, the height in the bar graph representing a designated category #1 may be 250, the height in the bar graph representing a designated category #2 may be 200, and the height in the bar graph representing a designated category #3 may be 150.
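As a non-limiting illustration, specifying the element group of the number of queries and the number of customers of each designated category in each period can be sketched in Python as follows. The record layout `(customer ID, period, search query)`, the mapping `category_of`, and the function name `element_group` are assumptions made for this sketch.

```python
from collections import defaultdict


def element_group(records, category_of):
    """Count queries and distinct customers per designated category and period.

    records is an iterable of (customer_id, period, query) tuples, and
    category_of maps a search query to its designated category. Queries
    that belong to no designated category are skipped.
    Returns (query_counts, customer_counts), each mapping
    period -> {category: count}.
    """
    query_counts = defaultdict(lambda: defaultdict(int))
    customers = defaultdict(lambda: defaultdict(set))
    for customer_id, period, query in records:
        category = category_of.get(query)
        if category is None:
            continue
        query_counts[period][category] += 1
        customers[period][category].add(customer_id)
    customer_counts = {
        period: {category: len(ids) for category, ids in by_category.items()}
        for period, by_category in customers.items()
    }
    return {p: dict(c) for p, c in query_counts.items()}, customer_counts


# Hypothetical sample data.
sample_records = [
    ("U1", "-1 to 0 month", "hospital stay"),
    ("U1", "-1 to 0 month", "hospital bag"),
    ("U2", "-1 to 0 month", "hospital stay"),
    ("U2", "-1 to 0 month", "smartphone SP1"),
]
sample_categories = {"hospital stay": "childbirth", "hospital bag": "childbirth"}
print(element_group(sample_records, sample_categories))
```

The values of each inner dictionary correspond to the heights of the bar graph for one period.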
In the example of
Note that, in this exemplary embodiment, the information providing device 100 specifies the element group of the number of queries or the number of customers of each designated category in each period, but is not limited thereto. The information providing device 100 may categorize the search queries into a plurality of categories for each period. The plurality of categories may include a designated category. In other words, the information providing device 100 may group the search queries for each time series in accordance with the categories other than the designated categories. If categories are not fixed in advance (for example, if the information providing device 100 has not acquired designated categories), the information providing device 100 may categorize the search queries in the search logs into a plurality of categories (for example, categories other than designated categories) based on the reference query and the search logs. If the reference query is fixed in advance, the information providing device 100 can search for a category appropriate for this reference query. In other words, the information providing device 100 may determine whether the reference query is appropriate or not based on the category designated in advance or may search for an appropriate category for this reference query based on the reference query designated in advance.
Then, the information providing device 100 determines whether the element group satisfies a predetermined condition or not and evaluates the reference query or the designated category from the determination result (step S15). The predetermined condition is, for example, a condition that “the number of the groups of search queries converges to a particular value (for example, a natural number n)”. The groups are, for example, a plurality of categories including a designated category or a category other than the designated category. For example, if the percentage of the search queries belonging to n designated categories (for example, n is “3”) with respect to all queries in each period (in other words, all search logs in each period) satisfies a threshold value (for example, 80%), the information providing device 100 can use the n designated categories as the groups for the list of the search queries. For simplicity, in the example of
Regarding the number of the groups, if the number of the groups is large, the information providing device 100 can determine that the original reference query is not an appropriate query. If the number of groups in a certain period is equal to the number of groups in another period, the information providing device 100 can determine that the original reference query is an appropriate query. The information providing device 100 can determine whether the list of the search queries is appropriate or not based on whether the element group of the number of queries or the number of users is appropriately expressed or not. If the category is fixed in advance (for example, if the category is a designated category), the information providing device 100 can evaluate the element group of the number of queries or the number of users based on the categories and the search logs and, therefore, can specify an appropriate reference query.
In the example of
As described above, the determination result is used for determining whether the reference query is appropriate or not. For example, if the reference query satisfies the predetermined condition in each period, the information providing device 100 determines that the reference query is appropriate. The information providing device 100 may determine that the reference query is appropriate if the reference query satisfies the condition in the majority of periods. If the reference query does not satisfy the predetermined condition in the periods, the reference query may not be appropriately expressing a desired target of the operator OP.
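As a non-limiting illustration, the determination described above can be sketched in Python as follows. The sketch checks, for each period, whether the n largest categories cover at least a threshold share of all queries, and then judges the reference query over all periods or over the majority of periods. The function names, the use of the n largest categories as the n designated categories, and the default values (n of 3, threshold of 80%) are assumptions made for this sketch.

```python
def coverage(category_counts, total_queries, n):
    """Share of all queries in a period covered by the n largest categories."""
    if total_queries <= 0:
        return 0.0
    top = sorted(category_counts.values(), reverse=True)[:n]
    return sum(top) / total_queries


def reference_query_is_appropriate(per_period, n=3, threshold=0.8,
                                   require_all=True):
    """Judge a reference query from per-period category counts.

    per_period is a list of (category_counts, total_queries) pairs, one
    per period. With require_all=True every period must satisfy the
    condition; otherwise a strict majority of periods is sufficient.
    """
    if not per_period:
        return False
    satisfied = [coverage(c, t, n) >= threshold for c, t in per_period]
    if require_all:
        return all(satisfied)
    return sum(satisfied) > len(satisfied) / 2


# Hypothetical counts: 250 + 200 + 150 = 600 of 700 queries are covered.
good_period = ({"#1": 250, "#2": 200, "#3": 150}, 700)
bad_period = ({"#1": 100, "#2": 50, "#3": 50}, 400)
print(reference_query_is_appropriate([good_period, good_period, bad_period]))
print(reference_query_is_appropriate([good_period, good_period, bad_period],
                                     require_all=False))
```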
Then, the information providing device 100 provides the evaluation result to the operator device 700 (step S16). For example, the information providing device 100 provides the information corresponding to the determination result to the operator OP. The information providing device 100 may automatically optimize categories (for example, the designated category or the categories other than the designated category) based on the determination result. The information providing device 100 may cause the operator device 700 to display a message indicating that the designated category is not appropriate or the reference query is not appropriate. Alternatively, the information providing device 100 may cause the operator device 700 to display a message indicating that the search behavior of users is appropriately expressed by decomposing the time-series search queries into the element group of the category. The information providing device 100 can provide a list that satisfies the conditions about the element groups to the operator device 700. The number (for example, “3”) of arrays in a certain period may be the same as the number of arrays in another period. In this case, the operator OP can read transition modes of designated categories from the list. Such designated categories in each period can be paraphrased and applied to analysis of time-series search behavior.
As described above, the information providing device 100 can group the search queries in each period so that the contents of the search queries in each period are not varied. Therefore, the information providing device 100 can provide, to the operator OP, a list of useful search queries which can be used for analysis of particular needs for company presidents, mascots, model change, etc.
Note that the above described designated categories and the categories other than the designated categories may belong to a higher-level category(ies) than these categories. The high-level categories are transition categories described later with reference to
1-3. Time-Interval Modifying Process
The operator OP sometimes does not know how the length of each period based on the above described reference time and date should be set. As described above, the periods based on the reference time and date are the periods around the reference time and date. For example, an appropriate length of the period for finding out search behavior may be one month. Alternatively, the appropriate length of the period may be one week. The operator OP may want to set an appropriate period for analyzing search behavior as each period in order to extract useful information (for example, relativity between search behaviors) from the list of search queries. If the length of each period is not appropriate (for example, if the length of each period is one year), the operator OP may not be able to appropriately find out the transition mode of categories of users. If the length of the period changes, the ranking of the relevance degrees between the reference query and the search queries also changes. For example, if the length of each period is too short, the list of search queries may include many buzzwords (in other words, words which have been popular topics in a particular period). In such a case, the operator OP may not be able to find out the change of the search queries related to the reference query from the list of search queries.
Therefore, the information providing device 100 specifies an appropriate length of the period by adjusting the length of the period. The information providing device 100 adjusts the density of search-query time series by using the above described element groups of the search queries.
As illustrated in
Then, the information providing device 100 determines whether the element group satisfies a predetermined condition or not and determines whether the setting of each period is appropriate or not from the determination result (step S25). In the example of
In the example of
If the type of the users who have input the search queries in each period satisfies a predetermined condition, the information providing device 100 may determine that the length of each period is appropriate. The type of the users is, for example, an attribute of the users such as a demographic attribute or a psychographic attribute. For example, the information providing device 100 may determine whether the length of each period is appropriate or not based on the male-to-female ratio of the input users who have input the reference query “newborn baby”. If the length of a certain period changes, the male-to-female ratio of the input users in this period also changes. For example, the information providing device 100 may determine the change of the male-to-female ratio of the input users by changing the length of the period. For example, if the male-to-female ratio of the input users is one to one, the information providing device 100 may determine that the length of the period is appropriate. Note that the predetermined condition may be different in each period. Therefore, a plurality of periods (for example, first period, second period) may have different lengths.
Then, the information providing device 100 modifies the length of each period based on the determination result and generates a list again (step S26). For example, the information providing device 100 modifies the length of each period so that the type of the users who have input search queries in each period satisfies the predetermined condition. The information providing device 100 may provide various information to the operator OP based on the density of adjusted periods. For example, if the length of the adjusted period is short, the information providing device 100 may cause the operator device 700 to display a message indicating that this period is important or that changes in the behavior of users are intense in this period. For example, if the reference query is a search query “newborn baby”, the length of the first period (negative period) (for example, period “−1 to 0 month”) may be one week. This is because the circumstances of users may change greatly around the time of childbirth. On the other hand, the length of a tenth period (negative period) (for example, period “−10 to −9 month”) may be one month. This is because, in some cases, the circumstances of users do not change around the time of pregnancy. In this manner, the length of each period can be determined based on whether the target shown by search queries is connected to important changes of the circumstances of the input users. A short period can be present at a hot point of search-query time series. The operator OP can find out important needs of users from the search queries at such a hot point.
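As a non-limiting illustration, modifying the length of a period so that the male-to-female ratio of the input users approaches one to one can be sketched in Python as follows. The event layout `(customer ID, gender, day offset)`, the candidate lengths, the tolerance, and the function names are assumptions made for this sketch.

```python
def gender_ratio(events, start_day, length_days):
    """Male-to-female ratio of distinct users active in a period.

    events is an iterable of (customer_id, gender, day) tuples, where
    gender is "M" or "F" and day is an offset in days from the reference
    date. Returns None if no female user is active in the period.
    """
    males, females = set(), set()
    for customer_id, gender, day in events:
        if start_day <= day < start_day + length_days:
            if gender == "M":
                males.add(customer_id)
            elif gender == "F":
                females.add(customer_id)
    if not females:
        return None
    return len(males) / len(females)


def choose_period_length(events, start_day, candidates=(7, 14, 30),
                         target=1.0, tolerance=0.2):
    """Return the first candidate length whose ratio is close to the target."""
    for length in candidates:
        ratio = gender_ratio(events, start_day, length)
        if ratio is not None and abs(ratio - target) <= tolerance:
            return length
    return None  # no candidate satisfies the condition


# Hypothetical events: only male users are active in the first week.
sample_events = [("U1", "M", 1), ("U2", "M", 2), ("U3", "F", 10), ("U4", "F", 12)]
print(choose_period_length(sample_events, start_day=0))
```

Because the candidates are tried in ascending order, the sketch prefers the shortest length that satisfies the condition; a different search order is equally possible.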
As described above, the information providing device 100 can adjust an increment/decrement length of each period based on the above described reference time and date. The information providing device 100 may collect the conditions (for example, the above described predetermined condition) for adjusting the increment/decrement length of each period from crowdsourcing workers. The crowdsourcing workers may be the input users who have input the reference query. For example, the predetermined condition collected from the crowdsourcing workers may be a condition that “the length of each period is one month”. As another example, the collected predetermined condition may be a condition that “the male-to-female ratio of input users is one to one”.
1-4. Search-Query Predicting Process
When future search queries are estimated from the history (for example, search logs) of search queries of users, buzzwords (in other words, words which have been popular topics in a particular period) may generate noise. Also, the search queries which have been input by users indicate specific targets such as the title of a comic in a certain magazine, an event name of an amusement park, etc. However, the targets that the operator OP wants to grasp may be more abstract contexts. In some cases, the operator OP wants to find out the behavior of users at the level of a more abstract context, regardless of details such as comics titles and event names.
For example, it is assumed that a user has input search queries, i.e., a search query “that devil (exemplary comics name)”, a search query “that pirate (exemplary comics name)”, and a search query “country of dreams and magic (exemplary name of facilities)” in this order. Furthermore, it is assumed that another user has input search queries, i.e., a search query “that adventure (exemplary comics name)”, a search query “that pirate (exemplary comics name)”, and a search query “country of dreams and magic (exemplary name of facilities)” in this order. In this example, that devil (exemplary comics name), that adventure (exemplary comics name), and that pirate (exemplary comics name) are published in the same magazine. The users who are readers of this magazine may have a tendency to input the search query “country of dreams and magic (exemplary name of facilities)”. For example, the user may have purchased this magazine to read that devil (exemplary comics name) and then come to like that pirate (exemplary comics name) in the magazine. This user may have been to the country of dreams and magic (exemplary name of facilities) where an event related to that pirate (exemplary comics name) is held. In such a case, a context leading from a comic in a magazine to an amusement park via another comic in the magazine can be found. For example, the operator OP can find out that users who have input the search queries corresponding to one of a plurality of comics in a magazine tend to go to the country of dreams and magic (exemplary name of facilities).
Therefore, the information providing device 100 predicts a future search query(ies) by carrying out machine learning using search queries. More specifically, the information providing device 100 causes a machine learning model to learn characteristics of the above described path to the reference query. For example, the information providing device 100 causes the machine learning model to learn a mode of change of categories (for example, designated categories) corresponding to the search queries. The information providing device 100 inputs the search queries of a user to a learned model to estimate a search query(ies) to be input by the user. For example, the learned model can output a vector representing a mode of change of categories. Alternatively, the learned model can output a vector representing a reference query. The information providing device 100 can estimate a search query, which is input by the user, based on such a vector.
As illustrated in
Then, the information providing device 100 categorizes the transition modes of the categories of input customers into a plurality of categories (step S32). In the example of
Then, the information providing device 100 trains a model so that, if a history of search queries of a customer having a similar transition category is input, a similar vector is generated and, if a history of search queries of a customer having a dissimilar transition category is input, a dissimilar vector is generated (step S33). For example, the information providing device 100 groups users based on the search queries input in the past by the users who have input the reference query. In the example of
In the example of
In the example of
In the example of
The information providing device 100 carries out machine learning so that, even in a case in which a plurality of users input the same reference query, the model outputs different vectors if changes of categories are different. Therefore, by using the search queries of a user or the categories of the search queries, the information providing device 100 can cause the model to accurately learn whether the user is a user who reaches the reference query, or the characteristics of future search queries input by the user. As described above, the information providing device 100 specifies the categories to which the search queries input by the user in the past belong and carries out learning of the model for each type of change in the specified categories. For example, if the user inputs a search query “country of dreams and magic (exemplary name of facilities)”, the information providing device 100 can cause the model to learn how the user reaches this search query. In this manner, the information providing device 100 can cause the model to learn high-level concepts (for example, transition modes) of the categories of search queries.
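As a non-limiting illustration, a training objective under which histories having the same transition category yield similar vectors and histories having different transition categories yield dissimilar vectors can be sketched in Python as follows. The embodiment does not name a particular loss function; the margin-based contrastive loss used here, the margin value, and the function names are assumptions made for this sketch. The vectors are assumed to be outputs of the model for the histories of search queries of individual users.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(vec_a, vec_b, same_transition, margin=1.0):
    """Margin-based contrastive loss for one pair of output vectors.

    Pairs with the same transition category are pulled together
    (loss = d^2); pairs with different transition categories are pushed
    apart until their distance reaches the margin
    (loss = max(0, margin - d)^2).
    """
    d = euclidean(vec_a, vec_b)
    if same_transition:
        return d ** 2
    return max(0.0, margin - d) ** 2


def batch_loss(vectors, transition_categories, margin=1.0):
    """Mean contrastive loss over all pairs of users in a batch."""
    total, pairs = 0.0, 0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            same = transition_categories[i] == transition_categories[j]
            total += contrastive_loss(vectors[i], vectors[j], same, margin)
            pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical output vectors for three users and their transition categories.
vectors = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
categories = ["TC1", "TC1", "TC2"]
print(batch_loss(vectors, categories))
```

Minimizing such a loss with respect to the parameters of the model is one possible way of realizing the learning described above; other metric-learning objectives could be substituted.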
Then, the information providing device 100 predicts a search query, which is input by a target user in the future, from the history of the search queries of the target user by using the model (step S34). In the example of
Hereinafter, the information providing device 100, which carries out such an information providing process, will be described in detail.
Next, a configuration example of a system including the information providing device 100 will be described with reference to
2-1. Constituent Elements of Information Providing System
In the information providing system 1, each of the information providing device 100, the user device 500, the log server 600, and the operator device 700 is connected to a network N by wire or wirelessly. The network N is, for example, a network such as the Internet, a WAN (Wide Area Network), or a LAN (Local Area Network). The constituent elements of the information providing system 1 can communicate with each other via the network N.
The information providing device 100 (corresponding to one example of the learning device) is an information processing device, which executes processing for evaluating time-series data of search queries. The information providing device 100 can find out users who have particular needs from the time-series data of search queries. Also, the information providing device 100 can predict needs of users from the time-series data of search queries. The information providing device 100 may be an information processing device of an arbitrary type including a server. A plurality of information providing devices 100 may respectively provide functions of various servers such as a web server, an application server, and a database server. A configuration example of the information providing device 100 will be described in detail in a following section.
The user device 500 is an information processing device used by a user. The user device 500 can transmit search queries via various services (for example, portal site, portal application) on the Internet. Also, the user device 500 can receive search results via these various services. The user device 500 may be an information processing device of an arbitrary type including a client device such as a smartphone, a desk-top PC, a laptop PC, or a tablet PC.
The log server 600 is an information processing device, which provides various services (for example, portal site, portal application) on the Internet. The log server 600 can receive search queries from the user device 500 via these various services. Also, the log server 600 can accumulate the received search queries as search logs. The log server 600 may be an information processing device of an arbitrary type including a server.
The operator device 700 is an information processing device used by an operator. The operator is, for example, a person who is related to a particular Internet company related to the information providing device 100 or the log server 600. The operator device 700 enables the operator to input information to the information providing device 100. For example, if the operator wants to analyze time-series data, the operator can set a keyword or a category of an analysis target with respect to the information providing device 100. As with the user device 500, the operator device 700 may be an information processing device of an arbitrary type including a client device.
2-2. Configuration of Information Providing Device
As illustrated in
Communication Unit 200
The communication unit 200 is realized, for example, by a Network Interface Card (NIC) or the like. The communication unit 200 is connected to a network by wire or wirelessly. The communication unit 200 may be communicably connected to the user device 500, the log server 600, and the operator device 700 via the network N. The communication unit 200 can carry out transmission/reception of information to/from the user device 500, the log server 600, and the operator device 700 via the network.
Storage Unit 300
The storage unit 300 is realized by, for example, a semiconductor memory element such as a Random Access Memory (RAM) or a flash memory, or a storage device such as a hard disk or an optical disk. As illustrated in
Search-Query Database 310
In the example of
The “customer ID” represents an identifier for identifying a user (also referred to as “customer”). The “search query” represents a search query input by the user. The “search time and date” represents time and date at which the search query is input.
For example,
Also, for example,
Control Unit 400
The control unit 400 is a controller and can be realized, for example, when a processor such as a Central Processing Unit (CPU) or a Micro Processing Unit (MPU) executes various programs (corresponding to an example of a learning program) stored in a storage device in the information providing device 100, using a RAM or the like as a work area. Alternatively, the control unit 400 may be realized, for example, by an integrated circuit such as an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), or a General Purpose Graphic Processing Unit (GPGPU).
As illustrated in
First Determination Processing Unit 410
As illustrated in
For example, the first determination processing unit 410 groups the search queries, which have been input in each period by target users, in order to enable the operator to appropriately evaluate the list of the search queries. The first determination processing unit 410 can group the search queries, which have been input in each period, based on the context of the reference query. For example, the first determination processing unit 410 acquires the search logs of the users, who have input the reference query, for each time series (for example, period around the reference time and date). Then, the first determination processing unit 410 groups the search queries for each time series. For example, if the number of groups or the number of the users who have input the search queries related to groups satisfies a threshold value, the first determination processing unit 410 extracts the search queries related to the groups as the search queries included in the list. As a result, the grouped search queries included in the list can have coherence, thereby enabling the operator to appropriately evaluate the list of the search queries.
Acquisition Unit 411
The acquisition unit 411 can acquire various information used for evaluating the time-series data of search queries. The acquisition unit 411 can receive such information from a predetermined information processing device (for example, a device of an entity related to the information providing device 100, such as a particular Internet company). For example, the acquisition unit 411 can also receive such information, via a user interface, from an administrator using the information providing device 100. The acquisition unit 411 may store the received information in the storage unit 300. For example, the acquisition unit 411 may store the received search queries in the search-query database 310. The acquisition unit 411 can also acquire various information from the storage unit 300. For example, the acquisition unit 411 can acquire search queries (for example, search logs such as search histories) from the search-query database 310.
At least in one embodiment, the acquisition unit 411 acquires the search queries, which have been input by a plurality of input customers who have input the reference query.
At least in one embodiment, the acquisition unit 411 acquires, as the search queries of a predetermined period, the search queries input in a period that precedes, by a predetermined length of time, the reference time and date at which the input customer input the reference query. For example, the acquisition unit 411 acquires the search queries that have been input respectively in a plurality of different periods based on the reference time and date.
For example, the acquisition unit 411 acquires search logs from the log server 600. For example, the acquisition unit 411 collects histories of search queries from the log server 600.
For example, the acquisition unit 411 acquires a reference query from the operator device 700. As described above, the reference query is a predetermined search query input at certain time and date. As described above with reference to
For example, the acquisition unit 411 acquires a reference query and designated categories from the operator device 700. As described above, the reference query is a predetermined search query (for example, a search query “company C1”) input at certain time and date. On the other hand, the designated categories are the categories for grouping the time-series search queries. The acquisition unit 411 groups the search queries for each time series (for example, the period around the reference time and date) in accordance with the designated categories. The designated categories are designated, for example, in advance by the operator. The designated categories are, for example, categories such as company presidents, mascots, model change, amusement parks, comics, and the like. In this manner, the acquisition unit 411 receives designated categories from the operator.
Categorization Unit 412
At least in one embodiment, the categorization unit 412 categorizes the search queries, which have been input in a predetermined period among search queries, into a plurality of categories.
At least in one embodiment, the categorization unit 412 categorizes the search queries into a plurality of categories for each period.
For example, the categorization unit 412 specifies the input customers who have input the reference query acquired by the acquisition unit 411. For example, the categorization unit 412 specifies the input customers, who have input the reference query, from the search logs acquired by the acquisition unit 411. As described above with reference to
Then, the categorization unit 412 specifies the search queries, which have been input by the customers in each period based on the reference time and date. The periods based on the reference time and date are the periods around the reference time and date. For example, the interval of the periods may be one month. For example, in a case in which the reference time and date is “2020/03/19”, the periods around the reference time and date are the periods after the reference time and date such as “2020/03/19 to 2020/04/19” and “2020/04/19 to 2020/05/19” or the periods before the reference time and date such as “2020/02/19 to 2020/03/19” and “2020/01/19 to 2020/02/19”. In this manner, the categorization unit 412 extracts the search queries of the users, who have input the reference query, from the search logs acquired by the acquisition unit 411 in each period based on the reference time and date.
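As a minimal sketch of how a search date might be assigned to one of the periods around the reference time and date, the following Python example computes a signed period index for one-month intervals. The function names add_months and period_index are hypothetical. Treating each period as a half-open interval (start inclusive, end exclusive) is an assumption, because the embodiment does not state which period a boundary date such as "2020/04/19" belongs to.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    day = min(d.day, calendar.monthrange(year, month0 + 1)[1])
    return date(year, month0 + 1, day)


def period_index(search_date, reference):
    """Return k such that reference + k months <= search_date < reference + (k + 1) months.

    k = 0 is the first period after the reference date, and k = -1 is the
    first period before it (half-open intervals are an assumption).
    """
    k = (search_date.year - reference.year) * 12 + (search_date.month - reference.month)
    # Correct the month-based estimate for the day of the month.
    while add_months(reference, k) > search_date:
        k -= 1
    while add_months(reference, k + 1) <= search_date:
        k += 1
    return k


reference = date(2020, 3, 19)
print(period_index(date(2020, 4, 18), reference))  # falls in "2020/03/19 to 2020/04/19"
print(period_index(date(2020, 2, 19), reference))  # falls in "2020/02/19 to 2020/03/19"
```

Under these assumptions, the search queries of each input customer can be grouped by the returned index before categorization.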
For example, the categorization unit 412 specifies input customers and specifies the search queries input by each input customer. As described above, the input customers are the users who have input the reference queries at the reference time and date. The categorization unit 412 can specify the input customers from search logs. Also, the categorization unit 412 can specify the search queries, which have been input in the periods around the reference time and date by the input customers, from the search logs. Then, the categorization unit 412 specifies the search queries input in each period. As described above, each period is a period around the reference time and date. Then, the categorization unit 412 specifies the element group of the number of queries or the number of customers of each designated category in each period. As described above with reference to
The categorization unit 412 may categorize the search queries into a plurality of categories for each period. The plurality of categories may include a designated category. More specifically, the categorization unit 412 may group the search queries for each time series in accordance with the categories other than the designated categories. If categories are not fixed in advance (for example, if the acquisition unit 411 has not acquired designated categories), the categorization unit 412 may categorize the search queries in the search logs into a plurality of categories (for example, categories other than designated categories) based on the reference query and the search logs. If the reference query is fixed in advance, the categorization unit 412 can search for a category appropriate for this reference query.
For example, the categorization unit 412 may automatically optimize categories (for example, designated category, the categories other than the designated category) based on the determination result of the determination unit 413.
Determination Unit 413
At least in one embodiment, the determination unit 413 determines whether the categorization result of the categorization unit 412 satisfies a predetermined determination condition or not.
At least in one embodiment, the determination unit 413 determines whether the number of the categories into which the search queries have been categorized satisfies a predetermined condition or not. For example, the determination unit 413 determines whether the number of the categories into which a predetermined percentage or more of the search queries are categorized is equal to or less than a predetermined threshold value. Also, for example, the determination unit 413 specifies the category into which a predetermined percentage or more of the search queries are categorized and determines whether the number of the customers who have input the search queries categorized into the specified category is equal to or higher than a predetermined threshold value or not.
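The two example conditions above can be sketched as follows. This is one possible reading, not a definitive implementation: the first function assumes that the percentage is accumulated over the largest categories, and the second assumes that a category qualifies when it alone holds the predetermined percentage of the queries. The function names and the sample data are hypothetical.

```python
from collections import Counter


def categories_needed(categories, min_share):
    """Count how many categories, largest first, are needed to cover min_share of the queries."""
    counts = Counter(categories)
    total = sum(counts.values())
    covered = 0
    for needed, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered >= min_share * total:
            return needed
    return len(counts)


def customers_in_major_categories(rows, min_share):
    """Count distinct customers with a query in a category holding at least min_share of the queries.

    rows is a list of (customer_id, category) pairs, one pair per search query.
    """
    counts = Counter(category for _, category in rows)
    major = {c for c, n in counts.items() if n >= min_share * len(rows)}
    return len({customer for customer, category in rows if category in major})


# Made-up sample data: one (customer_id, category) pair per search query in one period.
rows = [("U1", "a"), ("U1", "a"), ("U2", "a"), ("U3", "a"), ("U2", "b"),
        ("U4", "b"), ("U5", "c"), ("U5", "a"), ("U6", "a"), ("U6", "b")]
print(categories_needed([category for _, category in rows], 0.8) <= 3)
print(customers_in_major_categories(rows, 0.5) >= 5)
```

The comparison values 3 and 5 stand in for the predetermined threshold values, which the embodiment leaves open.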
At least in one embodiment, whether the categorization result satisfies the predetermined determination condition or not is determined for each period.
At least in one embodiment, the determination unit 413 determines whether the high-level categories into which the search queries are categorized satisfy a predetermined condition or not.
For example, the determination unit 413 determines whether the element group of search queries satisfies a predetermined condition or not and evaluates the reference query or the designated category from the determination result. The predetermined condition is, for example, a condition that “the number of the groups of search queries converges to a particular value (for example, a natural number n)”. The groups are, for example, a plurality of categories including a designated category or a category other than the designated category. For example, if the percentage of the search queries belonging to n designated categories (for example, n is “3”) with respect to all queries in each period (in other words, all search logs in each period) satisfies a threshold value (for example, 80%), the determination unit 413 can use the n designated categories as the groups for the list of the search queries. For example, if the number of designated categories is “3” and these three designated categories include 80% of the search queries or customers, the determination unit 413 determines that these three designated categories satisfy the predetermined condition.
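The 80% example above can be illustrated with a short sketch that computes, for one period, the share of search queries belonging to the designated categories. The function names are hypothetical, and the sketch assumes that every search query has already been assigned exactly one category.

```python
def designated_share(query_categories, designated):
    """Return the fraction of queries in one period whose category is a designated category."""
    if not query_categories:
        return 0.0
    hits = sum(1 for category in query_categories if category in designated)
    return hits / len(query_categories)


def period_satisfies_condition(query_categories, designated, threshold=0.8):
    """Check whether the designated categories cover at least `threshold` of the queries."""
    return designated_share(query_categories, designated) >= threshold


designated = {"company presidents", "mascots", "model change"}
period_queries = (["company presidents"] * 4 + ["mascots"] * 3
                  + ["model change"] * 1 + ["comics"] * 2)
print(designated_share(period_queries, designated))
print(period_satisfies_condition(period_queries, designated))
```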
Regarding the number of the groups, if the number of the groups is large, the determination unit 413 can determine that the original reference query is not an appropriate query. If the number of groups in a certain period is equal to the number of groups in another period, the determination unit 413 can determine that the original reference query is an appropriate query. The determination unit 413 can determine whether the list of the search queries is appropriate or not based on whether the element group of the number of queries or the number of users is appropriately expressed or not. If the category is fixed in advance (for example, if the category is a designated category), the determination unit 413 can evaluate the element group of the number of queries or the number of users based on the categories and the search logs and, therefore, can specify an appropriate reference query.
As described above with reference to
As described above, the determination result is used for determining whether the reference query is appropriate or not. For example, if the reference query satisfies the predetermined condition in each period, the determination unit 413 determines that the reference query is appropriate. The determination unit 413 may determine that the reference query is appropriate if the reference query satisfies the condition in the majority of periods. If the reference query does not satisfy the predetermined condition in the periods, the reference query may not appropriately express the target desired by the operator.
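A minimal sketch of combining the per-period determination results into a single determination about the reference query is shown below. The function name, the period keys, and the require_all switch between the "each period" rule and the "majority of periods" rule are hypothetical.

```python
def reference_query_is_appropriate(period_results, require_all=True):
    """Combine per-period determination results into one result for the reference query.

    period_results maps a period index to True when the predetermined
    condition was satisfied in that period.
    """
    if not period_results:
        return False
    satisfied = sum(1 for ok in period_results.values() if ok)
    if require_all:
        return satisfied == len(period_results)
    # Majority rule: strictly more than half of the periods must satisfy the condition.
    return satisfied * 2 > len(period_results)


results = {-2: True, -1: True, 0: False}
print(reference_query_is_appropriate(results))
print(reference_query_is_appropriate(results, require_all=False))
```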
Output Unit 414
At least in one embodiment, if it is determined by the determination unit 413 that the categorization result satisfies a predetermined determination condition, the output unit 414 (for example, the output unit 414 implemented as a first output unit) outputs the information that the reference query is appropriate.
At least in one embodiment, if it is determined by the determination unit 413 that the categorization result satisfies a predetermined determination condition, the output unit 414 (for example, the output unit 414 implemented as a second output unit) outputs the information that the plurality of categories that categorizes the search queries is appropriate.
For example, the output unit 414 provides an evaluation result to the operator device 700. For example, the output unit 414 provides the information corresponding to the determination result to the operator. The output unit 414 may cause the operator device 700 to display a message indicating that the designated category is not appropriate or that the reference query is not appropriate. Alternatively, the output unit 414 may cause the operator device 700 to display a message indicating that the search behavior of users is appropriately expressed by decomposing the time-series search queries into the element group of the category.
Extraction Unit 415
At least in one embodiment, if it is determined by the determination unit 413 that the categorization result satisfies a predetermined determination condition, the extraction unit 415 extracts the search queries categorized into a category that satisfies a predetermined categorization condition among a plurality of categories. For example, the extraction unit 415 specifies the category into which a predetermined percentage or more of the search queries are categorized and extracts the search queries categorized into the specified category. Also, for example, the extraction unit 415 extracts, from among the search queries, the search queries having high relevance to the reference query.
For example, the extraction unit 415 acquires, from the search-query database 310, a search query that satisfies a predetermined condition in each period and is similar to the reference query. The predetermined condition is, for example, a condition about the element group of search queries.
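The percentage-based extraction described above might be sketched as follows. The function name and the sample queries are hypothetical, and the sketch assumes that a category qualifies when it alone holds at least the predetermined percentage of the search queries in the period.

```python
from collections import Counter


def extract_queries(rows, min_share):
    """Keep the queries whose category holds at least min_share of all queries.

    rows is a list of (search_query, category) pairs for one period.
    """
    counts = Counter(category for _, category in rows)
    kept = {c for c, n in counts.items() if n >= min_share * len(rows)}
    return [query for query, category in rows if category in kept]


# Made-up sample data for one period.
rows = [("q1", "mascots"), ("q2", "mascots"), ("q3", "comics"),
        ("q4", "mascots"), ("q5", "amusement parks")]
print(extract_queries(rows, 0.5))
```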
Provision Unit 416
At least in one embodiment, the provision unit 416 provides a list of search queries extracted by the extraction unit 415.
For example, the provision unit 416 generates a list of search queries having high relevance to the reference query for each period. As described above with reference to
Then, the provision unit 416 provides the generated list to the operator device 700. As described above with reference to
For example, the provision unit 416 can provide a list that satisfies the conditions about the element groups of search queries to the operator device 700. The number (for example, “3”) of arrays in a certain period may be the same as the number of arrays in another period. In this case, the operator can read transition modes of designated categories from the list. Such designated categories in each period can be paraphrased and applied to analysis of time-series search behavior.
Second Determination Processing Unit 420
As illustrated in
For example, the second determination processing unit 420 specifies an appropriate length of the period by adjusting the length of the period. The second determination processing unit 420 adjusts the density of search-query time series by using the above described element groups of the search queries.
Acquisition Unit 421
The acquisition unit 421 can acquire various information used for evaluating the time-series data of search queries. The acquisition unit 421 can receive such information from a predetermined information processing device (for example, a device of an entity related to the information providing device 100, such as a particular Internet company). For example, the acquisition unit 421 can also receive such information, via a user interface, from an administrator using the information providing device 100. The acquisition unit 421 may store the received information in the storage unit 300. For example, the acquisition unit 421 may store the received search queries in the search-query database 310. The acquisition unit 421 can also acquire various information from the storage unit 300. For example, the acquisition unit 421 can acquire search queries (for example, search logs such as search histories) from the search-query database 310.
At least in one embodiment, the acquisition unit 421 acquires search queries that were input, within a predetermined period, by a plurality of input customers who have input the reference query.
At least in one embodiment, the acquisition unit 421 acquires search queries that have been input by input customers respectively in a plurality of different periods based on the reference time and date at which the reference query has been input.
For example, the acquisition unit 421 acquires search logs from the log server 600. For example, the acquisition unit 421 collects histories of search queries from the log server 600.
For example, the acquisition unit 421 acquires a reference query from the operator device 700. As described above, the reference query is a predetermined search query input at certain time and date. As described above with reference to
For example, the acquisition unit 421 acquires a reference query and designated categories from the operator device 700. As described above, the reference query is a predetermined search query (for example, a search query “company C1”) input at certain time and date. On the other hand, the designated categories are the categories for grouping the time-series search queries. The acquisition unit 421 groups the search queries for each time series (for example, the period around the reference time and date) in accordance with the designated categories. The designated categories are designated, for example, in advance by the operator. The designated categories are, for example, categories such as company presidents, mascots, model change, amusement parks, comics, and the like. In this manner, the acquisition unit 421 receives designated categories from the operator.
For example, the acquisition unit 421 may collect, from crowdsourcing workers, the conditions (for example, the above described predetermined condition) for adjusting the length of each period by increasing or decreasing it. The crowdsourcing workers may be the input users who have input the reference query. For example, the predetermined condition collected from the crowdsourcing workers may be a condition that “the length of each period is one month”. As another example, the collected predetermined condition may be a condition that “the male-to-female ratio of input users is one to one”.
Categorization Unit 422
At least in one embodiment, the categorization unit 422 (for example, the categorization unit 422 implemented as a first categorization unit) categorizes the input customers, who have input search queries in a predetermined period, into a plurality of categories in accordance with the attributes of the input customers.
At least in one embodiment, the categorization unit 422 (for example, the categorization unit 422 implemented as a second categorization unit) categorizes the search queries, which have been input in a predetermined period, into a plurality of categories.
For example, the categorization unit 422 specifies the input customers who have input the reference query acquired by the acquisition unit 421. For example, the categorization unit 422 specifies the input customers, who have input the reference query, from the search logs acquired by the acquisition unit 421. As described above with reference to
Then, the categorization unit 422 specifies the search queries, which have been input by the customers in each period based on the reference time and date. The periods based on the reference time and date are the periods around the reference time and date. For example, the interval of the periods may be one month. For example, in a case in which the reference time and date is “2020/03/19”, the periods around the reference time and date are the periods after the reference time and date such as “2020/03/19 to 2020/04/19” and “2020/04/19 to 2020/05/19” or the periods before the reference time and date such as “2020/02/19 to 2020/03/19” and “2020/01/19 to 2020/02/19”. In this manner, the categorization unit 422 extracts the search queries of the users, who have input the reference query, from the search logs acquired by the acquisition unit 421 in each period based on the reference time and date.
For example, the categorization unit 422 specifies input customers and specifies the search queries input by each input customer. As described above, the input customers are the users who have input the reference queries at the reference time and date. The categorization unit 422 can specify the input customers from search logs. Also, the categorization unit 422 can specify the search queries, which have been input in the periods around the reference time and date by the input customers, from the search logs. Then, the categorization unit 422 specifies the search queries input in each period. As described above, each period is a period around the reference time and date. Then, the categorization unit 422 specifies the element group of the number of queries or the number of customers of each designated category in each period. As described above with reference to
The categorization unit 422 may categorize the search queries into a plurality of categories for each period. The plurality of categories may include a designated category. More specifically, the categorization unit 422 may group the search queries for each time series in accordance with the categories other than the designated categories. If categories are not fixed in advance (for example, if the acquisition unit 421 has not acquired designated categories), the categorization unit 422 may categorize the search queries in the search logs into a plurality of categories (for example, categories other than designated categories) based on the reference query and the search logs. If the reference query is fixed in advance, the categorization unit 422 can search for a category appropriate for this reference query.
Determination Unit 423
At least in one embodiment, the determination unit 423 determines whether a predetermined period is appropriate or not based on the attributes of the input customers who have input search queries or based on whether these search queries satisfy predetermined conditions or not.
At least in one embodiment, the determination unit 423 determines whether the length of the predetermined period is appropriate or not.
At least in one embodiment, the determination unit 423 determines whether the predetermined period is appropriate or not based on whether the categorization result by the categorization unit 422 (for example, the categorization unit 422 implemented as a first categorization unit) satisfies a predetermined condition or not. For example, if the number of the categories to which the input customers are categorized is equal to or less than a predetermined threshold value, the determination unit 423 determines that the predetermined period is appropriate. Also, for example, if the percentage of the customers categorized into each category by the categorization unit 422 (for example, the categorization unit 422 implemented as the first categorization unit) satisfies a predetermined condition, the determination unit 423 determines that the predetermined period is appropriate.
At least in one embodiment, the determination unit 423 determines whether the predetermined period is appropriate or not based on whether the categorization result by the categorization unit 422 (for example, the categorization unit 422 implemented as a second categorization unit) satisfies a predetermined condition or not. For example, the determination unit 423 determines whether the number of the categories into which the search queries have been categorized satisfies a predetermined condition or not. For example, the determination unit 423 determines whether the number of the categories into which a predetermined percentage or more of the search queries are categorized is equal to or less than a predetermined threshold value. For example, the determination unit 423 specifies the category into which a predetermined percentage or more of the search queries are categorized and determines whether the number of the customers who have input the search queries categorized into the specified category is equal to or higher than a predetermined threshold value or not.
At least in one embodiment, whether this period is appropriate or not is determined for each period.
For example, the determination unit 423 determines whether the element group of search queries satisfies a predetermined condition or not and determines whether the setting of each period is appropriate or not from the determination result.
As described above with reference to
If the type of the users who have input the search queries in each period satisfies a predetermined condition, the determination unit 423 may determine that the length of each period is appropriate. The type of the users is, for example, an attribute of the users such as a demographic attribute or a psychographic attribute. For example, the determination unit 423 may determine whether the length of each period is appropriate or not based on the male-to-female ratio of the input users who have input the reference query “newborn baby”. If the length of a certain period changes, the male-to-female ratio of the input users in this period may also change. For example, the determination unit 423 may determine the change of the male-to-female ratio of the input users by changing the length of the period. For example, if the male-to-female ratio of the input users is one to one, the determination unit 423 may determine that the length of the period is appropriate. Note that the predetermined condition may be different in each period. Therefore, a plurality of periods (for example, first period, second period) may have different lengths.
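The male-to-female ratio example can be sketched as a search over candidate period lengths. This is a sketch under stated assumptions: the function names, the candidate lengths in days, and the tolerance that decides when a ratio counts as "one to one" are all hypothetical, and the period is assumed to end immediately before the reference time and date.

```python
from datetime import date, timedelta


def ratio_is_one_to_one(genders, tolerance=0.1):
    """Check whether the male-to-female ratio is within `tolerance` of one to one."""
    males = genders.count("male")
    females = genders.count("female")
    if males == 0 or females == 0:
        return False
    return abs(males / females - 1.0) <= tolerance


def first_appropriate_length(searches, reference, candidate_days, tolerance=0.1):
    """Return the first candidate length (in days) whose period has a balanced ratio.

    searches is a list of (customer_id, gender, search_date) tuples. The period
    is [reference - length, reference). Returns None if no candidate qualifies.
    """
    for days in candidate_days:
        start = reference - timedelta(days=days)
        # Count each customer once, regardless of how many queries they input.
        customers = {(cid, gender) for cid, gender, day in searches
                     if start <= day < reference}
        if ratio_is_one_to_one([gender for _, gender in customers], tolerance):
            return days
    return None


# Made-up sample data.
reference = date(2020, 3, 19)
searches = [("U1", "female", date(2020, 3, 15)), ("U2", "female", date(2020, 3, 16)),
            ("U3", "male", date(2020, 3, 17)), ("U4", "male", date(2020, 3, 1)),
            ("U1", "female", date(2020, 3, 2))]
print(first_appropriate_length(searches, reference, [7, 30]))
```

Trying the shortest candidate first is a design choice of this sketch; the embodiment only requires that the adjusted length satisfy the predetermined condition.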
Output Unit 424
If the determination unit 423 determines that the element group of search queries satisfies a predetermined condition, the output unit 424 can output the information that the setting of the period is appropriate.
For example, the output unit 424 modifies the length of each period based on the determination result and generates a list again. For example, the output unit 424 modifies the length of each period so that the type of the users who have input search queries in each period satisfies the predetermined condition. The output unit 424 may provide various information to the operator based on the density of adjusted periods. For example, if the length of the adjusted period is short, the output unit 424 may cause the operator device 700 to display a message indicating that this period is important or that changes in the behavior of users are intense in this period. For example, if the reference query is a search query “newborn baby”, the length of the first period (negative period) (for example, period “−1 to 0 month”) may be one week. This is because the circumstances of users may change greatly around the time of childbirth. On the other hand, the length of a tenth period (negative period) (for example, period “−10 to −9 month”) may be one month. This is because, in some cases, the circumstances of users do not change around the time of pregnancy. In this manner, the length of each period can be determined based on whether the target shown by the search queries is connected to important changes in the circumstances of the input users. A short period can be present at a hot point of search-query time series. The operator can find out important needs of users from the search queries at such a hot point.
Learning Processing Unit 430
The above described categories of the search queries may be the categorization of time-series search queries based on the determination by the above described first determination processing unit 410. For example, the time-series search queries may be categorized based on the determination, executed by the first determination processing unit 410, of whether the categorization of the time-series search queries satisfies a predetermined condition or not. The time-series search queries may be categorized so that the categorization of the time-series search queries satisfies a predetermined condition. As described above, such determination can be used for improving the categorization of time-series search queries.
The time interval of the above described transition may be the length of a period based on the determination by the above described second determination processing unit 420. For example, the length of the period in the time series of search queries may be determined based on the determination of whether the period in the time series of search queries is appropriate or not executed by the second determination processing unit 420. As described above, such determination can be used for improving the length of the period in the time series of search queries.
The above described transition of categories can be used for evaluating time-series data of search queries. As illustrated in
For example, the learning processing unit 430 predicts one or more future search queries by carrying out machine learning using search queries. More specifically, the learning processing unit 430 causes a machine learning model to learn characteristics of the above described path to the reference query. For example, the learning processing unit 430 causes the machine learning model to learn a mode of change of categories (for example, designated categories) corresponding to the search queries. The learning processing unit 430 inputs the search queries of a user to a learned model to estimate one or more search queries to be input by the user. For example, the learned model can output a vector representing a mode of change of categories. Alternatively, the learned model can output a vector representing a reference query. The learning processing unit 430 can estimate a search query to be input by the user based on such a vector.
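The embodiment does not specify the type of machine learning model. Purely as an illustration of learning a mode of change of categories, the following sketch substitutes a simple first-order transition-count model (a Markov-chain style stand-in, not the model of the embodiment). It counts how often one category follows another across the per-customer category sequences and predicts the most frequent follower. The function names and the sample sequences are hypothetical.

```python
from collections import Counter, defaultdict


def learn_transitions(sequences):
    """Count category-to-category transitions over per-customer category sequences.

    Each sequence lists the category specified for one customer in
    consecutive periods, ordered from oldest to newest.
    """
    transitions = defaultdict(Counter)
    for sequence in sequences:
        for current, following in zip(sequence, sequence[1:]):
            transitions[current][following] += 1
    return transitions


def predict_next(transitions, current):
    """Return the category that most often followed `current`, or None if it was never seen."""
    followers = transitions.get(current)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up sample sequences using the category names of the embodiment.
sequences = [["comics", "mascots", "company presidents"],
             ["comics", "mascots", "amusement parks"],
             ["amusement parks", "mascots", "company presidents"]]
model = learn_transitions(sequences)
print(predict_next(model, "mascots"))
```

A model that outputs a vector, as described above, would replace the count table with learned parameters, but the input (category sequences per customer) would have the same shape.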
Acquisition Unit 431
The acquisition unit 431 can acquire various information used for evaluating the time-series data of search queries. The acquisition unit 431 can receive such various information from a predetermined information processing device (for example, a device of an entity related to the information providing device 100 (for example, a particular Internet company)). Also, for example, the acquisition unit 431 can receive such information from an administrator using the information providing device 100 via a user interface. The acquisition unit 431 may store received various information in the storage unit 300. For example, the acquisition unit 431 may store the received search queries in the search-query database 310. The acquisition unit 431 can acquire various information from the storage unit 300. For example, the acquisition unit 431 can acquire search queries (for example, search logs of search histories or the like) from the search-query database 310.
At least in one embodiment, the acquisition unit 431 acquires search queries that were input, in mutually different periods, by a plurality of input customers who have input the reference query.
At least in one embodiment, the acquisition unit 431 acquires a search query, which is input after the input customer inputs the reference query, as an objective query. For example, the acquisition unit 431 acquires a plurality of search queries, which are input after the input customer inputs the reference query, as objective queries.
For example, the acquisition unit 431 acquires search logs from the log server 600. For example, the acquisition unit 431 collects histories of search queries from the log server 600.
For example, the acquisition unit 431 acquires a reference query from the operator device 700. As described above, the reference query is a predetermined search query input at a certain time and date. As described above with reference to
Specifying Unit 432
At least in one embodiment, the specifying unit 432 specifies, for each input customer, the categories to which the search queries input in each period belong.
For example, the specifying unit 432 specifies the input customers who have input the reference query acquired by the acquisition unit 431. For example, the specifying unit 432 specifies the input customers, who have input the reference query, from the search logs acquired by the acquisition unit 431. As described above with reference to
Then, the specifying unit 432 specifies the search queries, which have been input by the customers in each period based on the reference time and date. The periods based on the reference time and date are the periods around the reference time and date. For example, the interval of the periods may be one month. For example, in a case in which the reference time and date is “2020/03/19”, the periods around the reference time and date are the periods after the reference time and date such as “2020/03/19 to 2020/04/19” and “2020/04/19 to 2020/05/19” or the periods before the reference time and date such as “2020/02/19 to 2020/03/19” and “2020/01/19 to 2020/02/19”. In this manner, the specifying unit 432 extracts the search queries of the users, who have input the reference query, from the search logs acquired by the acquisition unit 431 in each period based on the reference time and date.
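The one-month periods in the example above can be enumerated as in the following sketch. It assumes calendar-month shifts with the day clamped to the length of the target month; the helper names and the default of two periods on each side are hypothetical and are not taken from the disclosure.

```python
import calendar
from datetime import date

def shift_months(d, months):
    """Shift a date by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def periods_around(reference, before=2, after=2):
    """Return (start, end) pairs of one-month periods around the reference date, oldest first."""
    return [
        (shift_months(reference, k), shift_months(reference, k + 1))
        for k in range(-before, after)
    ]

# Reference time and date from the example above.
windows = periods_around(date(2020, 3, 19))
```

With the reference date 2020/03/19, `windows` holds the four periods listed above, from "2020/01/19 to 2020/02/19" through "2020/04/19 to 2020/05/19".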
For example, the specifying unit 432 specifies the categories to which the search queries input by the input customer in each period belong. As described above with reference to
For example, the specifying unit 432 categorizes the transition modes of the categories of each input customer into a plurality of categories. As described above with reference to
Learning Unit 433
At least in one embodiment, the learning unit 433 causes a model to learn characteristics of changes in the categories specified by the specifying unit 432.
At least in one embodiment, the learning unit 433 carries out learning of a model so that objective queries are output when changes in the categories to which the search queries input by input customers belong are input. For example, the learning unit 433 carries out learning of a model so that objective queries are output in the input order when changes in the categories to which the search queries input by input customers belong are input.
At least in one embodiment, the learning unit 433 carries out learning of a model for each change of categories. For example, with respect to a model corresponding to the changes in the categories to which the search queries input by an input customer belong, the learning unit 433 carries out learning of this model so that, when the search queries input by this input customer are input in the input order, a search query input by this input customer after the reference query is output.
At least in one embodiment, the learning unit 433 carries out learning of a model so that a similar vector is output if a search query corresponding to a similar change in categories is input, and a dissimilar vector is output if a search query corresponding to a dissimilar change in categories is input.
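One common way to express such an objective is a contrastive loss over pairs of output vectors. The disclosure does not name a specific loss function, so the following is only a sketch under that assumption; the function names and the `margin` parameter are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(vec_a, vec_b, same_transition, margin=1.0):
    """Pairwise contrastive loss.

    Pairs with a similar change in categories are penalized by their squared
    distance; pairs with a dissimilar change are penalized only while they
    are closer than `margin`.
    """
    distance = euclidean(vec_a, vec_b)
    if same_transition:
        return distance ** 2
    return max(0.0, margin - distance) ** 2
```

Minimizing this loss over many pairs pulls the vectors of similar category changes together and pushes the vectors of dissimilar category changes at least `margin` apart.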
At least in one embodiment, the learning unit 433 causes a model to learn the characteristics of changes in the categories of search queries input by an input customer on the way to a target corresponding to the reference query.
For example, the learning unit 433 trains a model so that, if a history of search queries of a customer having a similar transition category is input, a similar vector is generated and, if a history of search queries of a customer having a dissimilar transition category is input, a dissimilar vector is generated. For example, the learning unit 433 groups users based on the search queries input in the past by the users who have input the reference query. As described above with reference to
As described above with reference to
The learning unit 433 carries out machine learning so that, even in a case in which a plurality of users input the same reference query, the model outputs different vectors if the changes of categories are different. Therefore, by using the search queries of a user or the categories of those search queries, the learning unit 433 can cause the model to accurately learn whether the user is a user who reaches the reference query, or the characteristics of future search queries input by the user. As described above, the learning unit 433 specifies the categories to which the search queries input by the user in the past belong and carries out learning of the model for each type of change in the specified categories. For example, if the user inputs a search query “country of dreams and magic” (exemplary name of facilities), the learning unit 433 can cause the model to learn how the user reached this search query. In this manner, the learning unit 433 can cause the model to learn high-level concepts (for example, transition modes) of the categories of search queries.
Estimation Unit 434
At least in one embodiment, the estimation unit 434 estimates a search query, which is to be input in the future by a customer, by using a model learned by the learning unit 433 from changes in the categories to which the search queries input by this customer belong.
For example, the estimation unit 434 predicts a search query, which is to be input by a target user in the future, from the history of the search queries of the target user by using a model. As described above with reference to
Then, with reference to
As illustrated in
Then, the first determination processing unit 410 (for example, the acquisition unit 411) specifies the input customers who have input the reference query from the acquired search history (step S102).
Then, the first determination processing unit 410 (for example, the categorization unit 412) categorizes the search queries input by the specified input customer (for example, the input customer specified by the acquisition unit 411) into categories for each period (step S103).
Then, the first determination processing unit 410 (for example, the determination unit 413) determines whether the element group of the categorization result in each period (for example, the categorization result by the categorization unit 412 in each period) satisfies a predetermined condition or not (step S104).
Then, the first determination processing unit 410 (for example, the determination unit 413) determines whether the reference query or the category is appropriate or not in accordance with the determination result (step S105).
Then, with reference to
As illustrated in
Then, the second determination processing unit 420 (for example, the acquisition unit 421) specifies the input customers who have input the reference query from the acquired search history (step S202).
Then, the second determination processing unit 420 (for example, the categorization unit 422) categorizes the search queries input by the specified input customer (for example, the input customer specified by the acquisition unit 421) into categories for each period (step S203).
Then, the second determination processing unit 420 (for example, the determination unit 423) determines whether the element group of the categorization result in each period (for example, the categorization result by the categorization unit 422 in each period) satisfies a predetermined condition or not (step S204).
Then, the second determination processing unit 420 (for example, the determination unit 423) determines whether the period is appropriate or not in accordance with the determination result (step S205).
Then, with reference to
As illustrated in
Then, the learning processing unit 430 (for example, the specifying unit 432) categorizes the acquired search queries (for example, the search queries acquired by the acquisition unit 431) into categories (step S302).
Then, the learning processing unit 430 (for example, the specifying unit 432) categorizes each input customer by each transition mode of categories (step S303). Each input customer is categorized into each transition category representing a transition mode of categories.
Then, the learning processing unit 430 (for example, the learning unit 433) learns a model so that a similar vector is output if a history of search queries of a similar transition category is input and that a dissimilar vector is output if a history of search queries of a dissimilar transition category is input (step S304).
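Steps S303 and S304 can be sketched as follows. This is a minimal illustration, assuming that a transition mode is represented by the ordered tuple of categories across periods and that exact equality of tuples stands in for "similar"; the customer identifiers, category names, and function names are hypothetical.

```python
from collections import defaultdict

def group_by_transition(categories_per_customer):
    """Group customers by the ordered tuple of their per-period categories (step S303)."""
    groups = defaultdict(list)
    for customer, categories in categories_per_customer.items():
        groups[tuple(categories)].append(customer)
    return dict(groups)

def labelled_pairs(groups):
    """Yield (customer_a, customer_b, same_transition) for every pair of customers.

    The boolean label can serve as the similar/dissimilar target in step S304.
    """
    members = [(c, key) for key, customers in groups.items() for c in customers]
    for i, (a, key_a) in enumerate(members):
        for b, key_b in members[i + 1:]:
            yield a, b, key_a == key_b

# Hypothetical per-period categories (oldest period first) for three customers.
categories = {
    "u1": ["comics", "comics", "amusement park"],
    "u2": ["comics", "comics", "amusement park"],
    "u3": ["travel", "hotel", "amusement park"],
}
groups = group_by_transition(categories)
pairs = list(labelled_pairs(groups))
```

Here `u1` and `u2` share a transition category and form a similar pair, while each of them forms a dissimilar pair with `u3`.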
The information providing device 100 according to the above described embodiment may be implemented in various modes other than the above described embodiments. Therefore, hereinafter, other embodiments of the above described information providing device 100 will be described.
6-1. Expression of Designated Categories
The above described designated categories may be the categories indicating a path to a target represented by the reference query. For example, it is assumed that a user inputs the title of first comics on a first magazine as a search query in a period “−3 to −2 month”, inputs the title of second comics on a second magazine as a search query in a period “−2 to −1 month”, and inputs the title of third comics on a third magazine as a search query in a period “−1 to 0 month”. Furthermore, it is assumed that another user inputs a first event name of a first amusement park as a search query in a period “−3 to −2 month”, inputs a second event name of a second amusement park as a search query in a period “−2 to −1 month”, and inputs a third event name of a third amusement park as a search query in a period “−1 to 0 month”. In this example, the search query “the title of the first comics”, the search query “the title of the second comics”, and the search query “the title of the third comics” belong to a designated category “comics”. The search query “the title of the first comics” does not have to belong to the designated category “first magazine”, the search query “the title of the second comics” does not have to belong to a designated category “second magazine”, and the search query “the title of the third comics” does not have to belong to a designated category “third magazine”. More specifically, the designated categories may be higher-level categories (for example, comics, amusement parks) of normal categories (for example, the first magazine, the second magazine, the third magazine, the first amusement park, the second amusement park, and the third amusement park) for categorizing search queries.
The above described designated category “comics” can coarse-grain the path from the search query “the title of the first comics” to the search query “the title of the third comics” via the search query “the title of the second comics”. Similarly, the above described designated category “amusement park” can coarse-grain the path from the search query “the first event name” to the search query “the third event name” via the search query “the second event name”. As described above, the first determination processing unit 410 (for example, the categorization unit 412) of the information providing device 100 can categorize search queries, for each period, into any of a plurality of categories respectively corresponding to a plurality of paths to a target represented by the reference query. Then, the first determination processing unit 410 (for example, the categorization unit 412) can determine whether the reference query or category (for example, designated category) is appropriate or not based on the categorization result.
6-2. Dispersion of Element Group of Search Queries
In the above described embodiments, if the element group of search queries converges, the first determination processing unit 410 (for example, the determination unit 413) of the information providing device 100 determines that the list of the search queries is appropriate, but the embodiment is not limited thereto. If the dispersion of the element group is high (for example, the element group is formed in a dispersed manner), it may be determined that the list of the search queries is appropriate. If dispersion of the element group is high, the completeness of the list of the search queries may be high. Such a list can be a list having a wide range of targets. For example, the list can include a keyword (for example, search query) in which many users are interested. On the other hand, if the dispersion of the element group is low, the list can include a keyword (for example, search query) in which particular target users are interested.
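The disclosure does not state how the dispersion of the element group is measured. As one possible illustration, the sketch below assumes that dispersion is scored by the Shannon entropy of the categories in the element group; the function name is hypothetical.

```python
import math
from collections import Counter

def category_entropy(categories):
    """Shannon entropy (in bits) of a list of categories.

    0.0 means every element falls into one category (converged); larger
    values mean the elements are spread over more categories (dispersed).
    """
    if not categories:
        return 0.0
    counts = Counter(categories)
    total = sum(counts.values())
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

converged = category_entropy(["comics"] * 4)
dispersed = category_entropy(["comics", "travel", "hotel", "music"])
```

Under this assumption, a high score would correspond to a list with a wide range of targets, and a low score to a list focused on particular target users.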
6-3. Prediction of Search Behavior by Machine Learning Model
In the above described embodiment, the learning processing unit 430 (for example, the learning unit 433) of the information providing device 100 carries out learning of a model so that the model outputs a vector representing a transition mode of categories, but the embodiment is not limited thereto. The learning processing unit 430 (for example, the learning unit 433) may predict a future search query from a particular time-series search history of a certain user by using a machine learning model, which has learned search histories of particular time series. Generally, a past behavior is a trigger of a future behavior. A future behavior may be caused by a plurality of past behaviors. In such machine learning, a search query input in a period before the reference time and date corresponds to an explanatory variable. On the other hand, a search query input in a period after the reference time and date corresponds to an objective variable. If a particular time-series search history of a user is input to a learned model, the learned model can estimate a future search query.
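The split into explanatory and objective variables can be sketched as follows. The record layout of `(timestamp, query)` tuples and the function name are assumptions for illustration, as is the choice to leave out records stamped exactly at the reference time and date (typically the reference query itself).

```python
from datetime import datetime

def split_history(history, reference_time):
    """Split (timestamp, query) records into queries before and after the reference time.

    Records are sorted chronologically first; records exactly at the
    reference time are excluded from both lists.
    """
    ordered = sorted(history)
    explanatory = [query for t, query in ordered if t < reference_time]
    objective = [query for t, query in ordered if t > reference_time]
    return explanatory, objective

# Hypothetical search history of one user.
history = [
    (datetime(2020, 2, 25), "comic title"),
    (datetime(2020, 3, 19), "reference query"),
    (datetime(2020, 4, 2), "park tickets"),
    (datetime(2020, 1, 30), "magazine name"),
]
explanatory, objective = split_history(history, datetime(2020, 3, 19))
```

The `explanatory` list would be the model input during training, and the `objective` list would be the label to be predicted.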
A plurality of search queries along time series may be converted into vectors in advance. As described above, for example, an embedding vector corresponding to a search query can be obtained by training various language expression models.
In some implementation modes, the learning processing unit 430 (for example, the learning unit 433) of the information providing device 100 may train a model architecture such as a sequence-to-sequence model by using training data including search logs. For example, the learning unit 433 can train a sequence-to-sequence model by minimizing a negative log-likelihood corresponding to the sequence-to-sequence model by using training data including search logs. Examples of the sequence-to-sequence model include a model having an attention mechanism, such as a Transformer model, and a Recurrent Neural Network (RNN) (for example, a gated RNN such as Long Short-Term Memory (LSTM)). Instances included in the training data are, for example, time-series search queries and time-series categories (in other words, transitions of categories). Labels related to the instances are, for example, search queries (for example, the reference query) and categories (for example, designated categories). The learning unit 433 can predict a future search query or a future category by inputting time-series search queries to a trained sequence-to-sequence model.
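For reference, the negative log-likelihood of one target sequence is the negated sum, over output steps, of the log-probability that the model assigns to the correct token at each step. The sketch below computes this quantity from already-predicted per-step distributions. The dictionary representation and the function name are hypothetical, and no particular model architecture is implied.

```python
import math

def sequence_nll(step_distributions, target_tokens):
    """Negative log-likelihood of a target sequence.

    `step_distributions[t]` maps each candidate token to the probability the
    model assigns to it at step t; `target_tokens[t]` is the correct token.
    """
    return sum(
        -math.log(distribution[token])
        for distribution, token in zip(step_distributions, target_tokens)
    )

# Hypothetical predictions for a two-step target sequence.
distributions = [
    {"hotel": 0.5, "ticket": 0.5},
    {"hotel": 0.25, "ticket": 0.75},
]
nll = sequence_nll(distributions, ["hotel", "ticket"])
```

Training would adjust the model parameters so that this value, summed over the training data, becomes smaller; a model that assigns probability 1 to every correct token reaches the minimum of 0.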
6-4. Targeting Based on Time-Series Search Queries
In some embodiments, the first determination processing unit 410 or the second determination processing unit 420 of the information providing device 100 may have a specifying unit (not illustrated) which specifies a second search query, which is input before a first search query is input and is related to the first search query, as a keyword used in targeting based on the first search query among search queries. Regarding the learning processing unit 430 of the information providing device 100, the above described specifying unit 432 may be implemented as a first specifying unit. In this case, the above described specifying unit may be implemented as a second specifying unit in the learning processing unit 430.
The specifying unit specifies, from time-series search queries, the second search query, which is related to the first search query input at a certain time and date and was input before that time and date. As described later, the specifying unit can use the specified second search query for targeting based on the first search query.
The time-series search queries are, for example, search queries, which have been acquired by the acquisition unit 411 or the acquisition unit 421 and input by a plurality of input customers who have input the reference query. The first search query may be the reference query or a search query other than the reference query.
For example, the second search query related to the first search query may be specified based on the above described relevance degrees between the search queries with reference to
As another example, the second search query related to the first search query may be specified based on the number of search queries or the number of users who have input search queries in a predetermined period before the first search query is input. For example, if the number of the search queries or the number of the users satisfies a threshold value, the specifying unit may specify these search queries as the second search query.
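The count-based approach can be sketched as follows. The sketch assumes a log of `(user, timestamp, query)` records, a 30-day window before each user's earliest input of the first search query, and that "satisfies a threshold value" means "at least `min_users` distinct users". All of these, as well as the function name and sample data, are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta

def related_earlier_queries(log, first_query, window_days=30, min_users=2):
    """Return queries that enough users input shortly before the first search query."""
    # Earliest time at which each user input the first search query.
    first_times = {}
    for user, t, query in log:
        if query == first_query:
            first_times[user] = min(t, first_times.get(user, t))
    # Distinct users per earlier query inside the window before that time.
    users_per_query = {}
    for user, t, query in log:
        first = first_times.get(user)
        if first is None or query == first_query:
            continue
        if first - timedelta(days=window_days) <= t < first:
            users_per_query.setdefault(query, set()).add(user)
    return sorted(q for q, users in users_per_query.items() if len(users) >= min_users)

log = [
    ("u1", datetime(2020, 3, 1), "battery exchange"),
    ("u1", datetime(2020, 3, 19), "model change"),
    ("u2", datetime(2020, 3, 10), "battery exchange"),
    ("u2", datetime(2020, 3, 20), "model change"),
    ("u3", datetime(2020, 3, 5), "weather"),
    ("u3", datetime(2020, 3, 21), "model change"),
]
keywords = related_earlier_queries(log, "model change")
```

In this sample, two users input "battery exchange" before "model change", so only that query meets the threshold.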
The targeting based on the first search query is the targeting of, for example, a trade target (for example, a commercial product or service) corresponding to the first search query. The specifying unit can target the information about the trade target (for example, advertisement contents) at users. In other words, the targets of the advertisement contents are narrowed down to particular users. In some implementation modes, the storage unit 300 of the information providing device 100 may have a user database (not illustrated) which stores user information of the user of the user device 500. The specifying unit may process Structured Query Language (SQL) queries and specify, from the user database, the users who have input the second search query. Then, the specifying unit may provide the information about the trade target to the specified users.
As an example for explanation, it is assumed that the first search query is “model change”. Furthermore, it is assumed that the second search query related to the first search query is “battery exchange”. In this example, the specifying unit can target the information about the trade target corresponding to the search query “model change” at the users who have input the search query “battery exchange”. The specifying unit can provide various information about the model change (for example, information about a model change campaign, advertisement contents) via electronic mail accounts of the users, push notifications to the users, personal pages of the users, etc.
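The user lookup in this example can be sketched with an in-memory SQLite database. The table name `search_log`, its columns, and the sample rows are hypothetical; the disclosure does not describe the schema of the user database.

```python
import sqlite3

# Build a small in-memory table standing in for the user database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE search_log (user_id TEXT, query TEXT)")
conn.executemany(
    "INSERT INTO search_log VALUES (?, ?)",
    [
        ("u1", "battery exchange"),
        ("u2", "battery exchange"),
        ("u2", "battery exchange"),
        ("u3", "weather"),
    ],
)

def users_for_query(connection, query):
    """Return the distinct user IDs that have input the given search query."""
    rows = connection.execute(
        "SELECT DISTINCT user_id FROM search_log WHERE query = ? ORDER BY user_id",
        (query,),
    )
    return [row[0] for row in rows]

# Users to whom information about "model change" could be provided.
target_users = users_for_query(conn, "battery exchange")
```

The resulting list of user IDs corresponds to the users at whom the information about the trade target would be targeted.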
The specifying unit may provide the specified second search query to the entities related to the information providing device 100 as advertisement targeting data (for example, a targeting keyword). Also, the specifying unit may specify, from the user database, the users who have input the specified second search query and categorize the specified users into a targeting group. The targeting group includes, for example, the information (for example, user ID) about the users who have input the second search query. The specifying unit may provide the targeting group, which is associated with the second search query, to the entities related to the information providing device 100 as advertisement targeting data.
Also, among the processes described in the above described embodiments, some of the processes described to be automatically carried out can be also manually carried out. Alternatively, all or some of the processes described to be manually carried out may also be automatically carried out by a publicly known method. Other than that, the processing procedures illustrated in the above described document or drawings, specific names, and information including various data and parameters can be arbitrarily changed unless otherwise specifically stated. For example, the various information illustrated in the drawings is not limited to the illustrated information.
Also, each of the constituent elements of each illustrated device is functionally conceptual and is not necessarily required to be physically configured as illustrated. More specifically, the specific mode of the distribution/integration of each device is not limited to that of the illustration, and all or part thereof can be distributed/integrated functionally or physically in arbitrary units in accordance with various loads, usage circumstances, etc.
For example, part or all of the storage unit 300 illustrated in
Also, the information providing device 100 according to the above described embodiment is realized, for example, by a computer 1000 having a configuration as illustrated in
The computation device 1030 operates based on, for example, a program(s) stored in the primary storage device 1040 and/or the secondary storage device 1050 and/or a program(s) read from the input device 1020 and executes various processes. The primary storage device 1040 is a memory device such as a RAM, which temporarily stores data used by the computation device 1030 in various computations. Also, the secondary storage device 1050 is a storage device in which data used by the computation device 1030 in various computations and various databases are registered, and is realized by a Read Only Memory (ROM), a Hard Disk Drive (HDD), a flash memory, or the like.
The output IF 1060 is an interface for transmitting information to be output to the output device 1010, such as a monitor or a printer, which outputs various information, and is realized, for example, by a connector conforming to a standard such as Universal Serial Bus (USB), Digital Visual Interface (DVI), or High-Definition Multimedia Interface (HDMI (registered trademark)). Also, the input IF 1070 is an interface for receiving information from the various input devices 1020 such as a mouse, a keyboard, and a scanner and is realized, for example, by a USB or the like.
Note that the input device 1020 may be a device which reads information from, for example, an optical recording medium such as a Compact Disc (CD), a Digital Versatile Disc (DVD), or a Phase change rewritable Disk (PD), a magnetooptical recording medium such as a Magneto-Optical disk (MO), a tape medium, a magnetic recording medium, or a semiconductor memory. Also, the input device 1020 may be an external storage medium such as a USB memory.
The network IF 1080 receives data from another device via a network N, transmits the data to the computation device 1030, and also transmits the data generated by the computation device 1030 to another device via the network N.
The computation device 1030 carries out control of the output device 1010 and the input device 1020 via the output IF 1060 and the input IF 1070. For example, the computation device 1030 loads the program from the input device 1020 or the secondary storage device 1050 to the primary storage device 1040 and executes the loaded program.
For example, if the computer 1000 functions as the information providing device 100, the computation device 1030 of the computer 1000 realizes the function of the control unit 400 by executing the program loaded to the primary storage device 1040.
As described above, the learning processing unit 430 of the information providing device 100 according to the embodiment has the acquisition unit 431, the specifying unit 432, and the learning unit 433.
In the information providing device 100 according to the embodiment, the acquisition unit 431 acquires search queries that were input, within mutually different periods, by a plurality of input customers who have input the reference query. Also, in the information providing device 100 according to the embodiment, the specifying unit 432 specifies, for each input customer, the categories to which the search queries input in each period belong. Also, in the information providing device 100 according to the embodiment, the learning unit 433 causes a model to learn characteristics of changes in the categories specified by the specifying unit 432.
Also, in the information providing device 100 according to the embodiment, the acquisition unit 431 acquires a plurality of search queries, which are input after the input customer inputs the reference query, as objective queries. Also, in the information providing device 100 according to the embodiment, the learning unit 433 carries out learning of a model so that objective queries are output in the input order when changes in the categories to which the search queries input by input customers belong are input.
Also, in the information providing device 100 according to the embodiment, the learning unit 433 causes a model to carry out learning for each change in the categories.
Also, in the information providing device 100 according to the embodiment, the learning unit 433 carries out learning of this model so that a search query input by an input customer after a reference query is output when search queries input by this input customer are input in the input order with respect to a model corresponding to changes in the categories to which the search queries input by this input customer belong.
Also, in the information providing device 100 according to the embodiment, the learning unit 433 carries out learning of a model so that a similar vector is output if a search query corresponding to a similar change in categories is input, and a dissimilar vector is output if a search query corresponding to a dissimilar change in categories is input.
Also, in the information providing device 100 according to the embodiment, the learning unit 433 causes a model to learn the characteristics of changes in the categories of search queries input by an input customer on the way to a target corresponding to the reference query.
Also, the learning processing unit 430 of the information providing device 100 according to the embodiment has the estimation unit 434 which estimates a search query, which is to be input in the future by a customer, by using a model learned by the learning unit 433 from changes in the categories to which the search queries input by this customer belong.
By the above described processes, the information providing device 100 can more appropriately analyze the relation between a customer and a target indicated by a predetermined search query.
Hereinabove, some of the embodiments of the present application have been described in detail based on the drawings. However, these are examples, and, including the aspects described in the section of disclosure of the invention, the present invention can be carried out in other modes with various modifications and/or improvements based on the knowledge of the persons skilled in the art.
Also, the above described information providing device 100 may be realized by a plurality of server computers. Also, the configuration can be flexibly changed depending on the functions, for example, by invoking an external platform or the like through an Application Programming Interface (API), network computing, or the like.
Also, the above described “part (section, module, unit)” can be replaced by “means”, “circuit”, or the like. For example, the acquisition unit can be replaced by an acquisition means or an acquisition circuit.
Although the invention has been described with respect to specific embodiments for a complete and clear disclosure, the appended claims are not to be thus limited but are to be construed as embodying all modifications and alternative constructions that may occur to one skilled in the art that fairly fall within the basic teaching herein set forth.
Number | Date | Country | Kind |
---|---|---|---|
2020-050226 | Mar 2020 | JP | national |