This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2015-126705, filed on Jun. 24, 2015, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to an information extraction method, an information processing device, and a computer-readable storage medium storing an information extraction program.
In a system (information recommendation system), an item that matches a user's taste is extracted from items corresponding to a search condition, from among a large amount of items (information) accumulated in a database (DB) or the like, and the extracted item is recommended to the user when the search condition has been accepted from the user.
A technique in a related art is discussed in Japanese Laid-open Patent Publication No. 2014-203442.
According to an aspect of the embodiments, an information extraction method includes: extracting, by a computer, a plurality of item candidates that are candidates of Pareto-optimal items from among items included in item information, based on a search condition; obtaining an order of the plurality of item candidates based on history information indicating previously-selected items; calculating, based on the order, a score for each of one or more first items that do not satisfy the search condition and are included in the item information; and outputting one or more second items having a score which satisfies a specific condition from among the one or more first items based on the score.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
For example, the method in which an item that matches the user's taste is extracted includes a user-based recommendation method, an item-based recommendation method, and a method in which the user-based recommendation method and the item-based recommendation method are mixed. In the user-based recommendation method, an item that the user has accessed previously is determined to correspond to “taste of the user”, and an item that a further user having a similar taste has accessed is recommended to the user. In the item-based recommendation method, an item similar to the item that the user has accessed previously is recommended to the user. In the method in which the user-based recommendation method and the item-based recommendation method are mixed, both of the user-based recommendation method and the item-based recommendation method are used.
For example, the user searches for a desirable item by inputting a search condition and checking a search result that has been extracted as an item that matches the user's taste by using the above-described recommendation method.
In the above-described recommendation methods, it may take time for the user to find the desirable item.
For example, in the above-described recommendation methods, an item that matches the user's taste is recommended from among the items corresponding to the input search condition. Therefore, a further desirable item may be recommended by changing the search condition. The user finds a desirable item by repeatedly inputting a different search condition, checking the search result, and considering an item included in the search result.
An identical symbol is assigned to configurations having the same or similar function, and the duplicated description may be omitted herein.
A single device such as a personal computer may recommend an item, and a system including a terminal device and a server device may recommend an item. For example, the system may include a terminal device operated by the user and a server device that searches for an item in response to a search request from the terminal device and outputs a search result to the terminal device.
An item as a search target may be an electronic component (hereinafter may be referred to as a component). An item as a search target may be other than an electronic component, and may be, for example, a product, a property, or the like.
The item as the search target includes a plurality of elements and each of the elements has a value that characterizes the item. For example, the component may include elements such as life, cost, temperature resistance, and an error in addition to elements indicating electrical characteristics such as a resistance value and rated power. A certain element has a range of values, and the direction (desirability) such as “good” or “bad” may be determined depending on the value. For example, it may be determined that it is bad (less profitable) when the cost is high, and may be determined that it is good (more profitable) when the cost is low. It may be determined that it is bad (less profitable) when the life is short, and may be determined that it is good (more profitable) when the life is long.
The input unit 10 accepts inputs of information such as a user previous item selection history 110, user information 111, a user search condition 112, and item information 113, from a storage device that stores data in advance and the input device such as the keyboard and the mouse.
The user previous item selection history 110 is information in which a history of items that have been previously selected by the user is described. The user previous item selection history 110 is managed in the storage device or the like, and information indicating a selected item is added to the user previous item selection history 110 as a history when a selection instruction of the item is accepted from the user. For example, the information indicating the selected item (item name or the like), identification information indicating the user who has selected the item (user ID or the like), and information such as the selected date and time are described in the user previous item selection history 110.
The user information 111 is information indicating the user who has input the user search condition 112. For example, the user ID of the user who has been authenticated through login authentication or the like at the time of item search is input as the user information 111.
The user search condition 112 is a condition used to search for the item, which has been received from the user through the input device such as the keyboard or the mouse, and is a condition of a value in each of the elements of the item. For example, the user search condition 112 may include electrical characteristic values (resistance value and the like) and condition values of temperature resistance, a cost, and an error of the component that is the search target.
The item information 113 is information on various items that are search targets, and is managed in the storage device or the like. For example, the item information 113 may correspond to a DB in which information of a single item is described for each record, and values in elements are described for each of the items in the item information 113. For example, in the item information 113, the values of the elements such as the resistance value, the temperature resistance, the cost, and the error are described for each of the components.
The flag assignment unit 20 generates, for each of the users, a user previous item selection history table 120 to which a flag indicating the selection order of the previously selected items (hereinafter may be referred to as a selection flag) has been assigned, based on the user previous item selection history 110 that has been input through the input unit 10. For example, the flag assignment unit 20 divides the history of the previously selected items of each user into certain intervals, assigns a selection flag of “Refer” to an item that has been selected within an interval, and assigns a selection flag of “Select” to the item that has been finally selected in the interval.
For example, when the user repeatedly searches for a desirable item, an item that has been selected later may be more desirable for the user than an item that has been selected earlier. An item that has been finally selected within a certain time period may be the most desirable item for the user. Thus, by assigning the selection flag of “Refer” or “Select” to each item that the user has selected previously, the flag assignment unit 20 gives the items a meaning that corresponds to the selection order.
For example, the flag assignment unit 20 determines whether an item has been selected in a time that is less than the threshold value (T_th) (t<T_th), based on the information on the date and time at which the item has been selected (S12). When the time is less than the threshold value (T_th) (S12: YES), the flag assignment unit 20 records the item in the user previous item selection history table 120 by using serial numbers for each of the users (S13). At that time, the flag assignment unit 20 assigns the selection flag of “Refer” to the item. The flag assignment unit 20 increments “t” (S14), and the processing returns to S12.
When the time is equal to or more than the threshold value (T_th) (S12: NO), the flag assignment unit 20 assigns the selection flag of “Select” to the item for each of the users, and records the item and the flag to the user previous item selection history table 120 (S15). After that, the flag assignment unit 20 initializes “t” (t=0) (S16), and the processing returns to S12.
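As a non-limiting illustration, the flag assignment of S12 to S16 may be sketched in Python as follows. The sketch assumes one possible reading of the flow, in which the history of a single user is divided into intervals of length T_th measured from the first selection of each interval, and the last item of each interval receives “Select”. The function name assign_selection_flags and the tuple layout are hypothetical and do not appear in the embodiments.

```python
from typing import List, Tuple


def assign_selection_flags(
    history: List[Tuple[float, str]], t_th: float
) -> List[Tuple[str, str]]:
    """Assign "Refer"/"Select" flags to one user's selections.

    history: (timestamp in seconds, item name) pairs sorted by time.
    t_th:    interval length (assumed reading of the threshold T_th).
    """
    flagged = []
    window_start = None  # timestamp at which the current interval began
    for idx, (ts, item) in enumerate(history):
        if window_start is None:
            window_start = ts  # corresponds to initializing t (S16)
        next_ts = history[idx + 1][0] if idx + 1 < len(history) else None
        # The item is the final one of its interval when no later selection
        # exists or the next selection falls outside the interval.
        is_last = next_ts is None or (next_ts - window_start) >= t_th
        flagged.append((item, "Select" if is_last else "Refer"))
        if is_last:
            window_start = None
    return flagged


if __name__ == "__main__":
    demo = [(0, "A"), (10, "B"), (20, "C"), (100, "D"), (105, "E")]
    for row in assign_selection_flags(demo, t_th=60):
        print(row)
```

Other interval definitions, for example an interval closed by a period of inactivity, would fit the description equally well; only the boundary test would change.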
For example, the similar user extraction unit 30 identifies the user who has input the user search condition 112, based on the user information 111. The similar user extraction unit 30 obtains items that the identified user has selected previously and items that a further user has selected previously, with reference to the user previous item selection history 110. The similar user extraction unit 30 obtains the similarity between the items of the users, and identifies, as a similar user, the further user who has selected items having a similarity of a certain value or more. The similar user extraction unit 30 outputs the similar user table 121 in which identification information indicating the identified similar user (user ID or the like) is described.
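As a non-limiting illustration, the extraction of similar users may be sketched in Python as follows. The embodiments do not specify the similarity measure, so the sketch assumes the Jaccard similarity between the sets of previously selected items. The function names jaccard and similar_users and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two item sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_users(
    history: Dict[str, Set[str]], target_user: str, threshold: float
) -> List[str]:
    """Return IDs of users whose selected items are similar to the target's."""
    target_items = history.get(target_user, set())
    return sorted(
        user
        for user, items in history.items()
        if user != target_user and jaccard(target_items, items) >= threshold
    )


if __name__ == "__main__":
    selections = {
        "u1": {"R1", "R2", "R3"},
        "u2": {"R2", "R3", "R4"},
        "u3": {"R9"},
    }
    print(similar_users(selections, "u1", threshold=0.5))
```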
The Pareto generation unit 40 generates a Pareto solution that is Pareto-optimal under a constraint in which the user search condition 112 is satisfied (referred to as “Pareto generation”), with reference to the user search condition 112 and the item information 113. The Pareto solution is, for example, a solution for which, under the constraint in which the user search condition 112 is satisfied, no other solution is at least as desirable in both the life and the cost and more desirable in one of them, so that superiority or inferiority among the Pareto solutions is not determined. In the Pareto generation, a method by which the value of an optimization objective function is minimized may be used. For example, a known method may be used. The Pareto generation unit 40 extracts an item corresponding to the Pareto solution from among a large amount of items described in the item information 113, and outputs information on the extracted item as an item candidate table 123.
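As a non-limiting illustration, the Pareto generation for two elements may be sketched in Python as follows. The sketch assumes that a longer life and a lower cost are more desirable and uses a simple pairwise dominance check instead of minimizing an objective function. The names dominates and pareto_candidates and the dictionary keys are hypothetical.

```python
from typing import Callable, Dict, List

Item = Dict[str, object]


def dominates(a: Item, b: Item) -> bool:
    """True when a is no worse than b in both elements and better in one."""
    no_worse = a["life"] >= b["life"] and a["cost"] <= b["cost"]
    better = a["life"] > b["life"] or a["cost"] < b["cost"]
    return no_worse and better


def pareto_candidates(
    items: List[Item], satisfies: Callable[[Item], bool]
) -> List[Item]:
    """Keep items that satisfy the search condition and are not dominated."""
    feasible = [item for item in items if satisfies(item)]
    return [
        item
        for item in feasible
        if not any(dominates(other, item) for other in feasible)
    ]


if __name__ == "__main__":
    catalog = [
        {"name": "A", "life": 10, "cost": 5.0},
        {"name": "B", "life": 8, "cost": 3.0},
        {"name": "C", "life": 7, "cost": 4.0},   # dominated by B
        {"name": "D", "life": 12, "cost": 9.0},  # outside the condition
    ]
    condition = lambda it: it["life"] >= 7 and it["cost"] <= 5.0
    print([it["name"] for it in pareto_candidates(catalog, condition)])
```

The pairwise check is quadratic in the number of feasible items, which may be acceptable for a sketch but not necessarily for a large item DB.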
For example, the previous history adjustment unit 50 obtains a similar user previous item selection history table 122 that has been obtained by refining the user previous item selection history table 120 of the items (components) that the user has selected previously, by using the users described in the similar user table 121. The previous history adjustment unit 50 substitutes the components that have been selected previously with the components on the Pareto curve 140, for the users of the similar user previous item selection history table 122, with reference to the item candidate table 123 indicating the components on the Pareto curve 140. The score calculation unit 60 performs score calculation by using the components on the Pareto curve 140 as a reference, so that the previous history adjustment unit 50 associates all of the components that have been previously selected by the user with the components on the Pareto curve 140 respectively.
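As a non-limiting illustration, the substitution of previously selected components with components on the Pareto curve 140 may be sketched in Python as follows. The embodiments do not state how the corresponding component is chosen, so the sketch assumes the nearest component in Euclidean distance over the element values. The function name adjust_history and the data layout are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def adjust_history(
    history: List[Tuple[Sequence[float], str]],
    pareto: Dict[str, Sequence[float]],
) -> List[Tuple[str, str]]:
    """Map each selected component to its nearest Pareto component.

    history: (element vector, selection flag) pairs in selection order.
    pareto:  component name -> element vector of a component on the curve.
    """
    adjusted = []
    for vector, flag in history:
        # Assumption: "closest" means smallest Euclidean distance.
        nearest = min(pareto, key=lambda name: math.dist(vector, pareto[name]))
        adjusted.append((nearest, flag))
    return adjusted


if __name__ == "__main__":
    curve = {"A": (10.0, 5.0), "B": (8.0, 3.0)}
    past = [((9.0, 5.5), "Refer"), ((8.2, 3.1), "Select")]
    print(adjust_history(past, curve))
```

In practice the element values may need to be normalized first, since life and cost are expressed in different units.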
For example, the score calculation unit 60 obtains a score of each of the components of the item information 113 as the height from a hyperplane that passes through the components on the Pareto curve 140 on which the previous selection order of the user has been reflected, in a multidimensional space including the elements of the components of the item information 113. The score therefore reflects the previous selection order of the user.
For example, the score calculation unit 60 obtains a hyperplane that passes through a component in which the selection flag is “Select” and the selection order is last (component that has been selected as the most desirable item) and a component in which the selection flag is “Refer”, with reference to the Pareto history table 124. A plurality of hyperplanes is obtained by changing a combination with a component in which the selection flag is “Refer”. For example, the component in which the selection flag is “Select” may be included in all of the hyperplanes, and there may exist a sufficient number of other components to constitute the hyperplanes (components in each of which the selection flag is “Refer”).
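As a non-limiting illustration, in a two-dimensional space such as (life, cost) each hyperplane reduces to a line through the “Select” component and one “Refer” component, which may be sketched in Python as follows. The function names line_through and hyperplanes_for and the representation of a hyperplane as a pair (a, d) are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Plane = Tuple[Tuple[float, float], float]  # (coefficients a, constant d)


def line_through(p: Point, q: Point) -> Plane:
    """Return (a, d) such that a[0]*x + a[1]*y + d == 0 on the line p-q."""
    (x1, y1), (x2, y2) = p, q
    if (x1, y1) == (x2, y2):
        raise ValueError("two distinct points are required")
    a = (y2 - y1, x1 - x2)            # normal vector of the line
    d = -(a[0] * x1 + a[1] * y1)      # forces the line through p
    return a, d


def hyperplanes_for(select: Point, refers: List[Point]) -> List[Plane]:
    """One line per "Refer" component, each passing through "Select"."""
    return [line_through(select, refer) for refer in refers]


if __name__ == "__main__":
    planes = hyperplanes_for((8.0, 3.0), [(10.0, 5.0), (7.0, 2.5)])
    for a, d in planes:
        print(a, d)
```

In a space of n elements, n points would be needed to determine each hyperplane, which is why a sufficient number of “Refer” components is assumed to exist.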
The score calculation unit 60 calculates a score by obtaining a distance to each of the plurality of hyperplanes based on the following formula (1), for each of the components each of which does not satisfy the user search condition 112 in the items (components) included in the item information 113.
$$\mathrm{Score} = sg \sum_{j} W_j \, \frac{\sum_{i} a_{ij} x_i + d_j}{\sqrt{\sum_{i} (a_{ij})^2}} \qquad (1)$$
In the formula (1), “i” is a subscript indicating a dimension, and “j” is a subscript indicating a hyperplane. Here, “a” is a coefficient, “x” is a value of an element of the item, and “d” is a constant. Here, “sg” is 1 or −1, and the sign that causes the score on the “good” side of the hyperplane to be positive is selected. Here, “Wj” is a weight satisfying “0≦Wj≦1”, and the sum of “Wj” over all of the hyperplanes is assumed to be “1”.
In the formula (1), “Σi aij xi + dj” is the expression that defines the hyperplane j, that is, the hyperplane is the set of points at which this expression is zero. Dividing this expression by the square root of “Σi (aij)^2” gives the height from the hyperplane (general formula of a distance between a point and a plane). In the formula (1), the score is obtained as the weighted linear sum of the heights from the respective hyperplanes.
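As a non-limiting illustration, the formula (1) may be evaluated in Python as follows. The function name score and the representation of each hyperplane as a pair of a coefficient vector and a constant are hypothetical. The caller is assumed to supply weights that sum to 1 and the sign “sg”.

```python
import math
from typing import Sequence, Tuple

Plane = Tuple[Sequence[float], float]  # (coefficients a_ij over i, constant d_j)


def score(
    x: Sequence[float],
    hyperplanes: Sequence[Plane],
    weights: Sequence[float],
    sg: int = 1,
) -> float:
    """Weighted sum of signed heights of point x from each hyperplane."""
    total = 0.0
    for (a, d), w in zip(hyperplanes, weights):
        norm = math.sqrt(sum(c * c for c in a))             # sqrt(sum_i a_ij^2)
        height = (sum(c * xi for c, xi in zip(a, x)) + d) / norm
        total += w * height
    return sg * total


if __name__ == "__main__":
    planes = [((3.0, 4.0), -5.0), ((1.0, 0.0), 0.0)]
    # Heights are (9 + 16 - 5) / 5 = 4 and 3 / 1 = 3, so the score is 3.5.
    print(score((3.0, 4.0), planes, weights=(0.5, 0.5)))
```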
When there are remaining similar users (S21: YES), the score calculation unit 60 extracts a user from the remaining similar users (S22), and extracts a component selection history that is a list of components that have been previously selected by the extracted user (S23). The score calculation unit 60 extracts a combination of a component in which the selection flag is “Select” and a component in which the selection flag is “Refer”, from the component selection history (S24), and the processing returns to S21. When there is a plurality of components in each of which the selection flag is “Refer”, a plurality of combinations is obtained by changing the “Refer” component that is combined with the component in which the selection flag is “Select”.
When there is no remaining similar user (S21: NO), combinations of components used to obtain a plurality of hyperplanes have been extracted for the similar user, so that the processing proceeds to S25.
In S25, the score calculation unit 60 calculates the score (desirability) of an item (component) on each of the planes (hyperplanes) by the combinations that have been obtained in S24, based on the formula (1) (S25). The score calculation unit 60 generates an item-score table 125 that stores the score that has been obtained in S25, for each of the items (S26), and the processing ends.
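As a non-limiting illustration, the loop of S21 to S24 may be sketched in Python as follows. The sketch assumes that, for each similar user, the last component flagged “Select” is paired with every component flagged “Refer”. The function name extract_combinations and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def extract_combinations(
    similar_user_histories: Dict[str, List[Tuple[str, str]]]
) -> List[Tuple[str, str]]:
    """Collect (Select component, Refer component) pairs over similar users.

    similar_user_histories: user ID -> (component, flag) pairs in order.
    """
    combinations = []
    for _user, history in similar_user_histories.items():      # S21, S22
        selects = [comp for comp, flag in history if flag == "Select"]
        refers = [comp for comp, flag in history if flag == "Refer"]
        if not selects:
            continue  # no finally selected component for this user
        final = selects[-1]                                     # S23
        combinations.extend((final, refer) for refer in refers)  # S24
    return combinations


if __name__ == "__main__":
    histories = {
        "u2": [("R1", "Refer"), ("R2", "Refer"), ("R3", "Select")],
        "u5": [("R4", "Select")],
    }
    print(extract_combinations(histories))
```

Each resulting pair would then determine one hyperplane that is used for the score calculation in S25.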
Formulas of the planes P1 and P2 are respectively obtained as “α0×Life+β0×Cost+γ0” and “α1×Life+β1×Cost+γ1”. The score based on the planes P1 and P2 is obtained as “W0×(α0×Life+β0×Cost+γ0)+(1−W0)×(α1×Life+β1×Cost+γ1)” (0≦W0≦1) in accordance with the formula (1).
A target item on which the selection tendency of the user has been reflected exists between the above-described planes P1 and P2.
For example, the condition setting unit 70 accepts setting of an allowable range outside the range of the user search condition 112, from the user as the mitigation condition. For example, the condition setting unit 70 accepts setting of an allowable range (%) by an increase (+) or decrease (−) for an initial condition (default) of the user search condition 112. The content of the mitigation condition may be a specific numeric value in addition to a proportion for the initial condition, and the content is not particularly limited.
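As a non-limiting illustration, a mitigation condition given as a percentage of the initial condition may be applied in Python as follows. The function names relax_condition and satisfies, the operator strings, and the dictionary layout are hypothetical. A positive percentage is assumed to raise a bound and a negative percentage to lower it.

```python
from typing import Dict, Tuple

Condition = Dict[str, Tuple[str, float]]  # element -> (">=" or "<=", bound)


def relax_condition(
    condition: Condition, mitigation_percent: Dict[str, float]
) -> Condition:
    """Shift each bound by the allowable range (%) set for its element."""
    relaxed = {}
    for element, (op, bound) in condition.items():
        percent = mitigation_percent.get(element, 0.0)
        relaxed[element] = (op, bound * (1.0 + percent / 100.0))
    return relaxed


def satisfies(item: Dict[str, float], condition: Condition) -> bool:
    """Check every element of the item against its bound."""
    for element, (op, bound) in condition.items():
        value = item[element]
        if op == ">=" and value < bound:
            return False
        if op == "<=" and value > bound:
            return False
    return True


if __name__ == "__main__":
    initial = {"life": (">=", 7.0), "cost": ("<=", 5.0)}
    relaxed = relax_condition(initial, {"life": -10.0, "cost": 10.0})
    candidate = {"life": 6.5, "cost": 5.4}
    print(satisfies(candidate, initial), satisfies(candidate, relaxed))
```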
The recommendation unit 80 describes an item corresponding to the user search condition 112 from among the item information 113 in a recommendation item candidate table 130, and outputs the data. For example, the recommendation unit 80 describes an item described in the item candidate table 123, in the recommendation item candidate table 130 as the item corresponding to the user search condition 112. The recommendation unit 80 describes an item in which the score that has been calculated by the score calculation unit 60 satisfies a certain condition, from among the items each of which does not satisfy the user search condition 112, in the recommendation item candidate table 130, with reference to the item-score table 125 and the mitigation condition from the condition setting unit 70. The recommendation unit 80 stores the item corresponding to the user search condition 112 and the item in which the score satisfies the certain condition from among the items each of which does not satisfy the user search condition 112 with flags or the like assigned, so that the two kinds of items can be distinguished from each other.
The output unit 90 generates display data used to display the item described in the recommendation item candidate table 130 as an item recommended to the user, and outputs the display data to an output device such as a display. The item recommended to the user is displayed on a display screen of the output device based on the display data.
The output unit 90 generates, for the items that have been ordered in the recommendation item candidate table 130, display data used to perform display such as display corresponding to the order, for example, display for listing the items in the order. Therefore, the user may easily recognize, for example, the recommendation order or the like of the items.
The output unit 90 generates display data used to display the item corresponding to the user search condition 112 and the item in which the score satisfies the certain condition from among the items each of which does not satisfy the user search condition 112 so as to section the items into different display modes, for example, to divide the items into different display areas. For example, the user may easily recognize the divisions between the item corresponding to the user search condition 112 and the item in which the score satisfies the certain condition from among the items each of which does not satisfy the user search condition 112.
The recommendation unit 80 extracts an item that matches the mitigation condition from the items (components) between the hyperplanes, which are described in the item-score table 125, with reference to the item information 113 (S32). Therefore, an item that matches the mitigation condition is selected from among the items outside the user search condition 112.
The recommendation unit 80 sorts the items that have been extracted in S32 in order from the highest score (S33). The recommendation unit 80 generates a recommendation item candidate table 130 including the item corresponding to the user search condition 112 and the items that have been obtained by sorting, in order from the highest score, the items each of which does not satisfy the user search condition 112 (S34).
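As a non-limiting illustration, S32 to S34 may be sketched in Python as follows. The function name build_recommendation_table, the row keys, and the use of a boolean flag to distinguish the two kinds of items are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def build_recommendation_table(
    matching_items: List[str],
    outside_scores: Dict[str, float],
    within_mitigation: Callable[[str], bool],
) -> List[Dict[str, Optional[object]]]:
    """Build rows for items inside and outside the search condition.

    matching_items:    items that satisfy the search condition.
    outside_scores:    item -> score for items outside the condition.
    within_mitigation: predicate for the mitigation condition.
    """
    # S32: keep only outside items that match the mitigation condition.
    extracted = [
        (name, value)
        for name, value in outside_scores.items()
        if within_mitigation(name)
    ]
    # S33: sort in order from the highest score.
    extracted.sort(key=lambda pair: pair[1], reverse=True)
    # S34: flag each row so that the two kinds of items can be told apart.
    table = [
        {"item": name, "score": None, "in_condition": True}
        for name in matching_items
    ]
    table += [
        {"item": name, "score": value, "in_condition": False}
        for name, value in extracted
    ]
    return table


if __name__ == "__main__":
    rows = build_recommendation_table(
        ["A", "B"], {"X": 0.2, "Y": 0.9, "Z": 0.5}, lambda name: name != "Z"
    )
    for row in rows:
        print(row)
```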
The output unit 90 generates display data used to display an item described in the generated recommendation item candidate table 130 as an item recommended to the user, and performs output of the display data (S35).
The extraction results G4a and G4b are extraction results of items recommended to the user, which are displayed based on the recommendation item candidate table 130. For example, the extraction result G4a is an extraction result of the items that have been extracted in order from the highest score from the items each of which does not satisfy the user search condition 112. The extraction result G4b is an extraction result of the items corresponding to the user search condition 112. The user may easily recognize a recommendation item corresponding to the input user search condition 112, by referring to the extraction results G4a and G4b. The user may easily find an item having high score from among the items each of which does not satisfy the input user search condition 112, with reference to the extraction result G4a. Therefore, a desirable item may be found without repeatedly searching for an item with a different search condition.
For example, an item that is a search target may be an electronic component (resistor). As the search condition, a condition (fixed condition) in which it is assumed that the temperature resistance is 100° C. or more, and the resistance error is 5% or less and a condition in which it is assumed that the life is seven years or more, and the cost is 5.0 yen or less (optimization condition) may be input.
The score calculation unit 60 obtains a plurality of hyperplanes each of which passes through a component in which the selection flag is “Select” and a component in which the selection flag is “Refer”, based on the components on the Pareto curve 140 on which the previous selection order of the user has been reflected by the history adjustment. The score calculation unit 60 obtains a score for each of the components each of which does not satisfy the user search condition 112 by each of the obtained hyperplanes. The information extraction device 1 obtains an item candidate recommended to the user from the components each of which does not satisfy the user search condition 112, based on the scores.
For example, in the information extraction device 1, an item (R0) within the range 142 corresponding to the outside of the search condition 150, on the plane P3 may be an item candidate recommended to the user. An item (R1) in the range 142 corresponding to the outside of the search condition 150, on the plane P4 may be an item candidate recommended to the user. An item (R2) that exists at “W0=0.5” between the planes P3 and P4 may be an item candidate recommended to the user.
The score calculation unit 60 may sort the items in order of the calculated scores. For example, it is sufficient that the sorting of the items is performed so that the user easily finds a desirable item. In the sorting of the items, the degree to which the item does not satisfy the user search condition 112 may be considered.
In the information extraction device 1, the extraction results that have been obtained by sorting the items in order from the largest score difference with the item in which the selection flag is “Select” in the items each of which does not satisfy the user search condition 112 are output. Therefore, for example, the user may easily find an item far from the user search condition 112.
The above-described various processing may be achieved by causing a computer such as a personal computer or a workstation to execute a program that has been prepared in advance.
The information extraction program 351 may be stored in the HDD 350. For example, the information extraction program 351 may be stored in a “portable physical medium” such as a flexible disk (FD), a compact disc-read-only memory (CD-ROM), a magneto optical (MO) disk, a digital versatile disc (DVD), or an integrated circuit (IC) card inserted into the medium reading device 340 of the computer 300. The information extraction program 351 may be stored in a “fixed physical medium” such as an HDD provided inside or outside the computer 300. The information extraction program 351 may be stored in a “further computer system” coupled to the computer 300 through a public line, the Internet, a local area network (LAN), or a wide area network (WAN). The computer 300 may read the program from the medium or the system and execute the program.
For example, the information extraction program 351 may be stored in a computer-readable manner in a recording medium such as the “portable physical medium”, the “fixed physical medium”, or a “communication medium”. The computer 300 achieves the above-described functions by reading the information extraction program 351 from such a recording medium and executing the information extraction program 351. The program may be executed by the computer 300, may be executed by a further computer system, a server, or the like, or may be executed through a combination of the computer 300 and the further computer system, the server, or the like.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2015-126705 | Jun 2015 | JP | national |