This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2014-262321, filed Dec. 25, 2014, the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to a system, a server, and an electronic device.
In recent years, net shopping has become widespread. With this spread, searching for merchandise using speech recognition technology has been proposed so that users who are not familiar with computers can also enjoy net shopping.
A general architecture that implements the various features of the embodiments will now be described with reference to the drawings. The drawings and the associated descriptions are provided to illustrate the embodiments and not to limit the scope of the invention.
Various embodiments will be described hereinafter with reference to the accompanying drawings.
In general, according to one embodiment, a system includes a first server, a second server, and an electronic device communicably connected to one another. The first server includes a first storage storing a database containing at least a plurality of names, and a second storage storing a first list comprising a plurality of notations of words, each of which is associated with at least one additional notation. The second server includes a third storage storing a second list generated based on the database and the first list, the second list associating the plurality of notations of words of the first list with a corresponding pronunciation. The electronic device includes one or more processors. The processor is configured to receive voice data. The processor is configured to identify a notation in the second list associated with a pronunciation obtained as a result of recognition processing applied to the received voice data. The processor is configured to present a user with the identified notation as a search word. The processor is configured to search the database for a first name including the presented search word. The processor is configured to present the user with the search result.
The net shopping system includes a net shopping server 10, a word-pronunciation connecting list distribution server 20, an electronic device 30, a display 40, etc., as illustrated in
The net shopping server 10 is a server having a function of holding a merchandise database, which keeps a list of merchandise, and an alias list, which is consulted at the time of merchandise search processing, and a function of distributing the database and the list to the electronic device 30.
The word-pronunciation connecting list distribution server 20 is a server having a function of holding a word-pronunciation connecting list, which is consulted at the time of speech recognition processing, and a function of distributing the word-pronunciation connecting list to the electronic device 30.
The electronic device 30 is a device having a box-shaped case, as illustrated in
The display 40 is a television set or a display monitor, for example, and is a device which displays on a screen the variety of information output from the computer 30.
Now, a merchandise database will be explained with reference to ” (Simple and convenient meat and vegetable dumplings, in which/gyōza/is written in Hiragana), and the “unit price” per “pack” of these “simple and convenient meat and vegetable dumplings” is “X yen.” Merchandise information A1 has been explained here by way of example. The same thing can be said of the remaining items of merchandise information A2 and A3. Therefore, their detailed explanation will be omitted here. Moreover, the case where the merchandise information stored in a merchandise database has such a data structure as illustrated in
Next, an alias list will be explained with reference to ” (/gyōza/, meaning “meat and vegetable dumplings”, in Kanji) can be differently written as “
” (/gyōza/, meaning “meat and vegetable dumplings”, in Hiragana) or “
” (/gyōza/, meaning “meat and vegetable dumplings”, in Katakana). It should be noted that “
” (in Hiragana) and “
” (in Katakana) are individually equivalent in both pronunciation and meaning to “
” (in Kanji) with which they are connected, and all three of these notations represent the same merchandise. Let us cite another example; B4 and B5 illustrated in
” (/supagetti/, meaning “spaghetti”, in Katakana) can be differently written as “
” (/supageti/, meaning “spaghetti”, in Katakana) and “
” (/supageti/, meaning “spaghetti”, in Katakana). It should be noted that “
” (/supageti/, in Katakana) and “
” (/supageti/, in Katakana) are individually equivalent in meaning and similar in pronunciation to “
” (/supagetti/, in Katakana) with which they are connected, and all three of these notations represent the same merchandise “spaghetti.” The same thing may be said of B3 illustrated in
By consulting the above-mentioned alias list while the search for merchandise is being executed with the use of a typical notation displayed on the display 40 as a search word, an extra search using different notations connected with the typical notation can also be executed in addition to the search using the typical notation at a single merchandise search operation. For example, “” (/sūpu gyōza/, meaning “meat and vegetable dumpling soup”, in Katakana and Kanji) may be found with the use of the typical notation “
” (in Kanji), and “
” (/otegaru gyōza/, meaning “simple and convenient meat and vegetable dumplings”, in Hiragana and Kanji) and “
” (/umai gyōza/, meaning “tasty meat and vegetable dumplings”, in Hiragana and Katakana) may be found with the use of the additional notations “
” (in Hiragana) and “
,” (in Katakana) both being related to the typical notation.
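The alias-expanded search described above can be sketched as follows. This is a minimal illustration, not the embodiment's actual implementation: the dictionary layout, the function name `search_merchandise`, and the Japanese spellings (assumed stand-ins for the notations shown in the figures) are all hypothetical.

```python
# Assumed layout: typical notation -> list of additional notations.
# The spellings are illustrative stand-ins for the notations in the figures.
ALIAS_LIST = {
    "餃子": ["ぎょうざ", "ギョーザ"],
}

# Assumed layout: one dict per item of merchandise information.
MERCHANDISE_DB = [
    {"product_name": "スープ餃子"},
    {"product_name": "お手軽ぎょうざ"},
    {"product_name": "うまいギョーザ"},
    {"product_name": "スパゲッティ"},
]


def search_merchandise(search_word, alias_list, database):
    """Return items whose product name contains the typical notation
    or any additional notation connected with it in the alias list."""
    notations = [search_word] + alias_list.get(search_word, [])
    return [
        item for item in database
        if any(notation in item["product_name"] for notation in notations)
    ]


results = search_merchandise("餃子", ALIAS_LIST, MERCHANDISE_DB)
names = [item["product_name"] for item in results]
print(names)
```

A single call with the typical notation therefore also finds the product names written with the Hiragana and Katakana notations, while unrelated merchandise is left out.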
Next, a word-pronunciation connecting list will be explained with reference to ” (in Kanji) will be displayed on the display 40 as a search word (a typical notation). Similarly, C2 illustrated in
” (Japanese ginger in Hiragana) will be displayed on the display 40 as a search word (a typical notation). Moreover, C4-C6 illustrated in
” (spaghetti, in Katakana) will be displayed on the display 40 as a search word (a typical notation). It should be noted that C1, C2, and C4-C6 have been explained by way of example, but that the same thing can be said of C3. Therefore, the detailed explanation of C3 will be omitted here.
The above-mentioned word-pronunciation connecting list makes it possible to reduce useless presentation to a user of various search words having been obtained as a result of speech recognition processing. For example, when the pronunciation “/gyōza/” is obtained as a result of speech recognition processing, the word (or the typical notation) “” (in Kanji) alone will be presented without presenting useless words “
” (in Hiragana) and “
” (in Katakana) as additional search words.
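The lookup described above can be sketched as follows, assuming each entry of the word-pronunciation connecting list holds a notation, its pronunciation, and the typical notation to present. The tuple layout, the romanized pronunciation keys, the Japanese spellings, and the function name `typical_notations` are assumptions made for illustration only.

```python
# Assumed entry layout: (notation, pronunciation, typical notation).
WORD_PRONUNCIATION_LIST = [
    ("餃子", "gyōza", "餃子"),
    ("ぎょうざ", "gyōza", "餃子"),
    ("ギョーザ", "gyōza", "餃子"),
    ("スパゲッティ", "supagetti", "スパゲッティ"),
    ("スパゲティ", "supageti", "スパゲッティ"),
]


def typical_notations(pronunciation, entries):
    """Return each typical notation connected with the pronunciation once,
    so that the additional notations are never presented as search words."""
    found = []
    for _notation, entry_pronunciation, typical in entries:
        if entry_pronunciation == pronunciation and typical not in found:
            found.append(typical)
    return found


print(typical_notations("gyōza", WORD_PRONUNCIATION_LIST))
```

Although three notations share the pronunciation /gyōza/, only the single typical notation is returned as a search word.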
The computer 30 includes a processor 100, a storage device 111, a radio communication module 112, a power supply management IC 113, an HDMI (registered trademark) interface module 114, etc., as illustrated in
The storage device 111 is a recording device which has a nonvolatile memory, a flash memory, a magnetoresistive memory, a hard disk drive, etc.
The radio communication module 112 communicates with servers connected to a network, including the net shopping server 10 and the word-pronunciation connecting list distribution server 20.
The power supply management IC 113 is a single-chip microcomputer for power supply management. Moreover, the power supply management IC 113 generates operating electric power, which should be supplied to each component, using the electric power supplied from an AC adaptor 120.
The HDMI interface module 114 converts a signal suitable for low-voltage differential signaling (LVDS), which is described later, into a signal suitable for High-Definition Multimedia Interface (HDMI).
The processor 100 includes a main processor 101, a main memory 102, a graphics processor 103, an LVDS interface module 104, a receiver 105, etc.
The main processor 101 controls the operation of various modules in the computer 30. The computer 30 executes various programs loaded from the storage device 111 into the main memory 102. The programs which the processor executes include an operating system (OS) 201 and various application programs, in which a net shopping application 202 is included. The net shopping application 202 is a program for enjoying net shopping.
The graphics processor 103 is a display controller which controls the display 40 used as a display monitor. The graphics processor 103 generates picture image data for displaying an image on the display 40. The LVDS interface module 104 changes the picture image data into a signal suitable for low voltage differential signaling (LVDS).
The receiver 105 has functions of receiving voice data, which is input from a microphone 131 in the controller 130, and outputting the received voice data to the main processor 101. Moreover, the receiver 105 has functions of receiving an input signal, which is input from and corresponds to any one of predetermined input keys arranged at an input section 132 in the controller 130, and outputting the received input signal to the main processor 101.
The net shopping application 202 includes a controller 301, a merchandise database acquisition module 302, an alias list acquisition module 303, a word-pronunciation connecting list acquisition renewal module 304, a speech recognition processor 305, a product name search processor 306, etc., as illustrated in
The controller 301 controls the operation of the net shopping application 202.
The merchandise database acquisition module 302 acquires from the net shopping server 10 using the radio communication module 112 a merchandise database which shows a list of merchandise currently dealt with in the net shopping server 10, as illustrated in
The alias list acquisition module 303 acquires an alias list, such as illustrated in
The word-pronunciation connecting list acquisition renewal module 304 acquires a word-pronunciation connecting list, such as illustrated in
The speech recognition processor 305 performs speech recognition processing on voice data which is input from the microphone 131 arranged in the controller 130 and is received by the receiver 105. Specifically, the speech recognition processor 305 analyzes the voice data and generates a text from the voice data. Moreover, the speech recognition processor 305 looks up, in the word-pronunciation connecting list stored in the storage device 111, a typical notation of the word (pronunciation) obtained by generating the text from the voice data, and displays the typical notation thus found as a search word on the display 40.
When one search word is selected by the user from one or more search words displayed on the display 40, the product name search processor 306 executes merchandise search processing using the selected search word and the alias list stored in the storage device 111 and searches the merchandise database stored in the storage device 111 for merchandise information. The merchandise information acquired as a result of this search is displayed on the display 40.
Next, the processing procedure, which the net shopping application 202 configured as mentioned above executes at the time of net shopping, will be explained below with reference to the flow chart illustrated in
First of all, the net shopping application 202 is made to start by a user's operation. Then, the net shopping application 202 causes the display 40 to display an initial screen G1 illustrated in
Then, the net shopping application 202 causes the display 40 to display a voice input screen G2 illustrated in
When the net shopping application 202 receives voice data having been input through the microphone 131 in the controller 130, it performs speech recognition processing on the voice data (Block 1003). Let us suppose that a user says “” (/gyōza/, meaning “meat and vegetable dumplings”, in Hiragana). That is, the following explanation will be given on the assumption that the net shopping application 202 obtains the word (pronunciation) “
” (/gyōza/, meaning “meat and vegetable dumplings”, in Hiragana) as a result of the above-mentioned speech recognition processing.
Then, the net shopping application 202 reads from the word-pronunciation connecting list, which is illustrated in ” (meaning “meat and vegetable dumplings”) as a typical notation (in Kanji) related to the word (pronunciation) “
” (meaning “meat and vegetable dumplings”, in Hiragana) obtained by the processing of Block 1003 and causes the display 40 to display the word “
” (meaning “meat and vegetable dumplings”, in Kanji) as a search word. It is assumed here that “
” (/myōga/, meaning “Japanese ginger”, in Hiragana) and “
” (/yōkan/, meaning “sweet bean paste”, in Hiragana) are also read from a word-pronunciation connecting list as other typical notations that are similar in pronunciation to the word (pronunciation) “
” (/gyōza/, meaning “meat and vegetable dumplings”, in Hiragana), and are displayed on the display 40 as further search words. Namely, the net shopping application 202 causes the display 40 to display a search-word display screen G3 such as illustrated in
Next, the net shopping application 202 performs merchandise search processing using the word “” (meaning “meat and vegetable dumplings”, in Kanji) as a search word, when it receives an input signal from an input key which is indicative of “2” and is selected from the input keys arranged at the input section 132 in the controller 130 (Block 1005). Specifically, the net shopping application 202 first reads the words “
” (meaning “meat and vegetable dumplings”, in Hiragana) and “
” (meaning “meat and vegetable dumplings”, in Katakana) as the additional notations related to the word (typical notation) “
” (meaning “meat and vegetable dumplings”, in Kanji) from the alias list illustrated in
,” (in Kanji) “
,” (in Hiragana) and “
,” (in Katakana) from the merchandise database illustrated in
It should be noted that, when an input signal corresponding to an input key indicative of “1” is input by the user while the search-word display screen G3 is being displayed (in other words, when a search word desirable for the user is not displayed on the search-word display screen G3), the processing returns to Block 1002 and the voice input screen G2 is displayed on the display 40 again.
After that, the net shopping application 202 causes the display 40 to display as a result of merchandise search processing the items of merchandise information A1-A3 acquired by the process of the Block 1005. Namely, the net shopping application 202 causes the display 40 to display the search-results screen G4 such as illustrated in
When desired merchandise is chosen by the user, a settlement screen for purchasing the merchandise will be displayed on the display 40. When the settlement of an account is completed using the settlement screen, a series of actions required for net-shopping by the net shopping application 202 will be terminated.
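The key handling on the search-word display screen G3 can be sketched as follows. The description only states that key “1” returns to the voice input screen and that key “2” selects the first search word in the example; numbering the remaining search words consecutively from “2”, as well as the names `RETRY`, `build_menu`, and `handle_key` and the Hiragana spellings, are assumptions made for illustration.

```python
RETRY = object()  # sentinel: go back to the voice input screen (Block 1002)


def build_menu(search_words):
    """Map input keys to actions: key 1 retries voice input, and keys
    from 2 onward select the displayed search words in order (assumed)."""
    menu = {1: RETRY}
    for offset, word in enumerate(search_words):
        menu[2 + offset] = word
    return menu


def handle_key(menu, key):
    """Return RETRY, the selected search word, or None for an unassigned key."""
    return menu.get(key)


# Search words assumed to be displayed on screen G3 in the example.
menu = build_menu(["餃子", "みょうが", "ようかん"])
print(handle_key(menu, 2))
```

When `handle_key` returns a search word, merchandise search processing (Block 1005) would follow; when it returns `RETRY`, the flow would go back to voice input.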
The first embodiment having been explained above is configured to perform speech recognition processing using a word-pronunciation connecting list, and thus makes it possible to present a user with only a typical notation of a word (pronunciation) obtained by converting voice data into text. Moreover, even if only a single typical notation of a predetermined word (pronunciation) is presented to a user as a search word, it will be possible to perform a comprehensive search using both the typical notation and additional notations related to the typical notation, so long as a word-pronunciation connecting list is connected with an alias list.
Although a case in which the computer 30 performs merchandise search processing has been presented to explain the present embodiment, it is possible that the net shopping server 10 performs merchandise search processing. In this case, it is necessary for the computer 30 to output to the net shopping server 10 an item of information indicative of a typical notation related to a word (pronunciation) obtained as a result of speech recognition processing, but the processing load imposed on the computer 30 will be greatly reduced, since the computer 30 is not required to perform merchandise search processing. Moreover, since the net shopping server 10 performs merchandise search processing, the storage device 111 of the computer 30 is only required to store at most a word pronunciation connecting list, but the storage device 111 is not required to further keep a merchandise database and an alias list.
In the present embodiment, the speech recognition processor 305 converts voice data into text with reference to a word-pronunciation connecting list, and outputs a typical notation of a word (pronunciation) obtained by the conversion of the voice data into the text. Instead, however, it is possible that the speech recognition processor 305 is configured to perform collation of voice data using a word-pronunciation connecting list, with which a pronunciation of each word is registered in advance, and to output as a result of the collation both a pronunciation and a typical notation of a word registered in the list.
It should be remembered that the computer 30 has been supposed to perform speech recognition processing in the above explanation of the present embodiment. However, it is possible that the net shopping server 10 or any other server, which is not illustrated in the drawings, performs speech recognition processing. In such cases, the computer 30 needs to send voice data to a server and to acquire from the server a speech recognition result, but the computer 30 does not need to perform speech recognition processing. Therefore, the processing load imposed on the computer 30 will be greatly reduced. Moreover, when the server performs speech recognition processing, it is the server that must have a word-pronunciation connecting list. Therefore, there is no need to distribute the word-pronunciation connecting list to the computer 30 from the word-pronunciation connecting list distribution server 20. That is, the word-pronunciation connecting list may not be stored in the storage device 111 of the computer 30.
Moreover, in the above explanation of the present embodiment, the product name search processor 306 is supposed to perform merchandise search processing after one search word has been chosen by the user from those displayed on the display 40. Instead, however, it is possible that the display 40 displays one search word alone and the product name search processor 306 performs merchandise search processing without requiring selection by a user.
In the above explanation of the present embodiment, a merchandise database, an alias list, and a word-pronunciation connecting list are supposed to be prepared in Japanese, but it is possible to prepare them in English, for instance. An exemplary data structure for an alias list and a word-pronunciation connecting list, both being prepared in English, will be explained below.
rmel
n” is obtained as a result of speech recognition processing. Moreover, C′2 illustrated in
rmel
nz” is obtained as a result of speech recognition processing. It should be noted that C′1 and C′2 have been explained by way of example, but that the same thing can be said of C′3 and C′4. Therefore, the detailed explanation of C′3 and C′4 will be omitted here.
As has been explained above, even if the alias list and the word-pronunciation connecting list are prepared in English, the effect similar to the above-mentioned effect can be obtained.
Now, a second embodiment will be explained below. What follows is an explanation of a series of acts which the second embodiment executes when the net shopping server 10 does not hold an alias list or when an alias list cannot be obtained from the net shopping server 10. In such a case the following inconvenience may occur: the word-pronunciation connecting list distribution server 20 cannot create a word-pronunciation connecting list, and by extension a series of acts which the first embodiment does at the time of the net shopping cannot be executed in the second embodiment. For this reason, the word-pronunciation connecting list distribution server 20 performs alias list generation processing to generate an alias list. The procedure for generating the alias list will be specifically explained below with reference to the flow chart of
First of all, the word-pronunciation connecting list distribution server 20 acquires a merchandise database from the net shopping server 10 (Block 2001). Then, the word-pronunciation connecting list distribution server 20 performs the merchandise search processing with reference to a search word list prepared beforehand (Block 2002). The search word list is a list of numerous words, each of which can be a search word. Let us suppose that the process of Block 2002 is performed using a word “” (meaning “meat and vegetable dumplings”, in Kanji) as a search word by way of example.
Now, the word-pronunciation connecting list distribution server 20 determines whether or not merchandise information acquired as a result of merchandise search processing includes any items of merchandise information that do not contain a word “” (meaning “meat and vegetable dumplings”, in Kanji) in a product name (Block 2003). In a case where the result of the determination executed at Block 2003 indicates that there are no items of merchandise information that do not contain the word “
” (meaning “meat and vegetable dumplings”, in Kanji) in the product name (NO of Block 2003), then the word-pronunciation connecting list distribution server 20 determines that a possibility that the word “
” (meaning “meat and vegetable dumplings”, in Kanji) is a typical notation is low, returns to processing of Block 2002, extracts from the search word list a word similar in sound or pronunciation to the word “
” (/gyōza/, meaning “meat and vegetable dumplings”, in Kanji), and performs merchandise search processing again using the extracted word as a search word. When, for every search word, merchandise search processing acquires at least one item of merchandise information and none of the acquired items has a product name that lacks that search word, each search word is registered in the word-pronunciation connecting list as a typical notation.
In contrast, when there is an item of merchandise information that does not contain the word “” (meaning “meat and vegetable dumplings”, in Kanji) in the product name as a result of the determination of Block 2003 (YES of Block 2003), the word-pronunciation connecting list distribution server 20 extracts from the merchandise information a word that is identical to “
” (/gyōza/, meaning “meat and vegetable dumplings”, in Kanji) in sound or similar in pronunciation (Block 2004). For example, when the merchandise information A1-A3 illustrated in
” (/gyōza/, meaning “meat and vegetable dumplings”, in Hiragana) and “
” (/gyōza/, meaning “meat and vegetable dumplings”, in Katakana) each as a word that is identical to “
” (/gyōza/, meaning “meat and vegetable dumplings”, in Kanji) in sound or similar in pronunciation.
Then, the word-pronunciation connecting list distribution server 20 performs merchandise search processing using as a search word each of the words “” (meaning “meat and vegetable dumplings”, in Hiragana) and “
” (meaning “meat and vegetable dumplings”, in Katakana) having been extracted by the process of Block 2004. And so, it determines whether or not the merchandise information acquired as a result of the merchandise search processing includes any items of merchandise information that do not contain the search word in the respective product names. Namely, when merchandise search processing is performed using the word “
” (meaning “meat and vegetable dumplings”, in Hiragana) as a search word, it determines whether there is an item of merchandise information that does not contain the word “
” (in Hiragana) in a product name, and when merchandise search processing is performed using the word “
” (meaning “meat and vegetable dumplings”, in Katakana) as a search word, it determines whether or not there is an item of merchandise information that does not contain the word “
” (in Katakana) in a product name (Block 2005).
When it is determined as a result of the determination process executed at Block 2005 that there is no item of merchandise information that does not contain a search word in a product name (NO of Block 2005), the word-pronunciation connecting list distribution server 20 generates an alias list by registering the word used as a search word at the time of processing at Block 2002 as the typical notation, and the word extracted at the time of processing at Block 2004 as an additional notation related to the typical notation (Block 2006). Specifically, the word-pronunciation connecting list distribution server 20 performs merchandise search processing using the word “” (meaning “meat and vegetable dumplings”, in Kanji) as a search word. In such case, it obtains not only items of merchandise information, each containing the word “
” (in Kanji) in its product name, but also further items of merchandise information containing in their individual product names the word “
” (meaning “meat and vegetable dumplings”, in Hiragana) or “
” (meaning “meat and vegetable dumplings”, in Katakana) as a result of merchandise search processing. In contrast, when merchandise search processing is performed using the word “
” (meaning “meat and vegetable dumplings”, in Hiragana) or “
” (meaning “meat and vegetable dumplings”, in Katakana) as a search word, any item of merchandise information which contains neither of the search words in its product name but contains another word such as “
(meaning “meat and vegetable dumplings”, in Kanji),” for example, will not be obtained as a result of merchandise search processing. Therefore, the word-pronunciation connecting list distribution server 20 generates an alias list by registering the word “
” (in Kanji) as a typical notation and the remaining words “
” (in Hiragana) and “
” (in Katakana) as additional notations.
On the other hand, when it is determined as a result of the determination executed at Block 2005 that there is an item of merchandise information that does not contain a search word in a product name (YES of Block 2005), the word-pronunciation connecting list distribution server 20 compares the number of items of merchandise information acquired as a result of the merchandise search processing performed at Block 2002 with the number of items of the merchandise information acquired as a result of the merchandise search processing performed at Block 2004, and registers as a typical notation the search word which acquires the largest number of merchandise information items and the other search words as additional notations (Block 2007). Specifically, the word-pronunciation connecting list distribution server 20 compares the number of merchandise information items acquired as a result of merchandise search processing using the words “” (in Kanji) “
,” (in Hiragana) and “
,” (in Katakana) (meaning “meat and vegetable dumplings”) as search words, and generates an alias list or an additional notation list by registering as a typical notation the search word which acquires the largest number of merchandise information items and the other search words as additional notations.
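The alias list generation of Blocks 2002 to 2007 can be sketched as follows. This is a simplified reading of the flow chart under stated assumptions: the search backend and the extraction of same-sounding words are passed in as callables, and the toy versions below (`toy_search`, `toy_same_sound_words`), the product names, and the spellings are hypothetical stand-ins rather than the server's actual processing.

```python
def generate_alias_entry(search_word, search, same_sound_words):
    """Return {"typical": ..., "additional": [...]} for one search word,
    or None when every hit already contains the search word."""
    hits = search(search_word)                                # Block 2002
    if all(search_word in name for name in hits):             # NO of Block 2003
        return None  # the caller moves on to another search word
    extracted = same_sound_words(hits, search_word)           # Block 2004
    extracted_hits = {word: search(word) for word in extracted}
    lacks_word = any(                                         # Block 2005
        word not in name
        for word, names in extracted_hits.items()
        for name in names
    )
    if not lacks_word:                                        # Block 2006
        return {"typical": search_word, "additional": extracted}
    counts = {search_word: len(hits)}                         # Block 2007
    counts.update({word: len(names) for word, names in extracted_hits.items()})
    typical = max(counts, key=counts.get)
    return {"typical": typical,
            "additional": [word for word in counts if word != typical]}


# Toy data and backends, assumed for illustration only.
PRODUCT_NAMES = ["スープ餃子", "お手軽ぎょうざ", "うまいギョーザ"]
SAME_SOUND = ["餃子", "ぎょうざ", "ギョーザ"]


def toy_search(word):
    # Assumption: a Kanji query also returns same-sounding kana products,
    # whereas a kana query matches only by substring.
    if word == "餃子":
        return list(PRODUCT_NAMES)
    return [name for name in PRODUCT_NAMES if word in name]


def toy_same_sound_words(names, word):
    return [w for w in SAME_SOUND
            if w != word and any(w in name for name in names)]


entry = generate_alias_entry("餃子", toy_search, toy_same_sound_words)
print(entry)
```

With this toy data the flow takes the Block 2006 branch, so the Kanji notation becomes the typical notation and the two kana notations become its additional notations.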
As has been explained above, the second embodiment has such a construction as to generate an alias list even when the net shopping server 10 does not have an alias list or an alias list cannot be obtained from the net shopping server 10. Therefore, it is possible that the second embodiment achieves the same effect as the first embodiment does.
It should be noted that the operational procedures of each of the embodiments can be implemented as a computer program. The same effects as those of each of the embodiments can therefore be achieved easily, simply by installing the computer program in a computer through a computer-readable storage medium storing the computer program and causing the computer to execute the installed computer program.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind |
---|---|---|---|
2014-262321 | Dec 2014 | JP | national |