This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2016-148756, filed on Jul. 28, 2016, and the prior Japanese Patent Application No. 2017-097903, filed on May 17, 2017, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to a technology of evaluating language description.
A sentence correction tool detects grammatical errors included in a sentence. A sentence analysis tool charts dependency relations within sentences. Such tools are useful for evaluating sentences that express logical content, such as business documents or articles, because the logical content of a grammatically correct and easily interpretable sentence is easily conveyed.
A technology relating to this is disclosed in, for example, Japanese Laid-open Patent Publication No. 2005-136810 and Japanese Laid-open Patent Publication No. 2008-46425.
According to an aspect of the invention, an evaluation device includes a memory and a processor coupled to the memory and the processor configured to acquire a first character string, to specify features of first meaning information and features of first sound information, the first meaning information corresponding to meanings denoted by the first character string, the first sound information corresponding to sounds denoted by the first character string, to determine a first evaluation value based on the features of the first meaning information, to determine a second evaluation value based on the features of the first sound information, and to output an evaluation value of the first character string based on the first evaluation value and the second evaluation value.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
The interpretation of, for example, literary works, or of text written for private communication or advertisement, depends upon the sensitivity of a reader, and thus such text does not have to be logical in many cases. Hence, in a case where a character string relating to such text is to be evaluated, a sentence correction tool or a sentence analysis tool does not help much. Moreover, the sensitivity of a reader has various aspects.
In one aspect, the technology disclosed herein estimates the sensitive evaluation of a reader with respect to a character string.
The first reception unit 103 receives teacher data. The learning unit 105 performs learning processing. In the learning processing, a first model relating to meaning and a second model relating to voice are generated by machine learning. The second reception unit 107 receives a character string to be evaluated. The evaluation unit 109 performs evaluation processing. In the evaluation processing, a first evaluation value relating to meaning is calculated by using the first model, and a second evaluation value relating to voice is calculated by using the second model. Furthermore, in the evaluation processing, an overall third evaluation value is calculated based on the first evaluation value and the second evaluation value. The output unit 111 outputs the third evaluation value.
The teacher data storage unit 121 stores the teacher data. The teacher data will be described below.
The first reception unit 103, the learning unit 105, the second reception unit 107, the evaluation unit 109, and the output unit 111 are realized by using hardware resources.
The teacher data storage unit 121, the meaning vector database 123, the first model data storage unit 125, the voice notation database 127, and the second model data storage unit 129 are realized by using hardware resources.
The sample ID identifies the sample. It is assumed that a set of the character string and the evaluation value is prepared as a sample in advance. The evaluation value is a value which is obtained by evaluating the character string.
For example, the illustrated first record indicates that the evaluation value with respect to the character string “” which is identified by a sample ID “S001” is “0.9”.
It is assumed that the meaning vector database 123 is obtained by analyzing a sentence corpus (for example, sentences registered in a dictionary site or a social network service (SNS) site) by using a word vectoring tool (for example, Word2Vec). Each word appearing in the sentence corpus is associated with a meaning vector that indicates the semantic features of the word.
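As an illustration, the following is a minimal sketch of how such a meaning vector database could be built with the Word2Vec implementation of the gensim library; the toy corpus, the vector size, and the meaning_vector helper are assumptions for illustration only, since the embodiment requires merely that each word map to a fixed-length meaning vector.

```python
# Minimal sketch: building a meaning vector database with gensim's Word2Vec.
# The toy corpus and parameters below are illustrative assumptions.
from gensim.models import Word2Vec

corpus = [
    ["sakura", "chiru"],             # tokenized sentences from, e.g., a dictionary site
    ["midori", "mebuku", "sakura"],  # or an SNS site
]
model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, seed=0)

def meaning_vector(word):
    """Return the meaning vector registered for a word, or None if unknown."""
    return model.wv[word] if word in model.wv else None

print(meaning_vector("sakura")[:5])  # first elements of the word's meaning vector
```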
In the Hiragana notation, the reading of a word is represented in Hiragana. Although Hiragana notation is used in this example, Katakana notation may be used instead. In addition, a Roman notation or a phonetic symbol notation may be used. Furthermore, data indicating a sound waveform may be stored in addition to the voice notation.
For example, an illustrated first record indicates that words “” are written as “” in Hiragana.
Subsequently, a procedure of calculating parameters relating to meaning will be described. These parameters are used as input values of the neural network.
Next, each word is converted into a meaning vector corresponding to the features of the word. For example, the word “” is converted into the meaning vector (0.3, 0.2, . . . , 0.9). As described above, the meaning vector of each word is registered in the meaning vector database 123, which is prepared in advance.
Parameters relating to meaning are then obtained from the meaning vectors. In this example, sets of three consecutive meaning vectors are specified sequentially, and the maximum value among the elements of the meaning vectors in each set is used as a parameter relating to meaning. For the set consisting of the first to third meaning vectors, the maximum element value “0.9” becomes the first parameter relating to meaning. Similarly, for the set consisting of the second to fourth meaning vectors, the maximum element value “0.7” becomes the second parameter relating to meaning. The same applies thereafter.
Here, an example in which the number of meaning vectors included in a set is three is described, but the number of meaning vectors included in a set may be four or more, or two or less. In a case where the number of meaning vectors included in a set is one, a parameter relating to meaning is obtained for each word.
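The following is a minimal sketch of this sliding-window maximum, assuming that meaning vectors are plain lists of floats; the window size of three follows the example above, and the function name is a hypothetical helper.

```python
# Minimal sketch of deriving parameters relating to meaning: slide a window of
# `window` consecutive meaning vectors and take the maximum element in each set.
def meaning_parameters(meaning_vectors, window=3):
    params = []
    for i in range(len(meaning_vectors) - window + 1):
        current_set = meaning_vectors[i:i + window]
        params.append(max(max(vector) for vector in current_set))
    return params

# With four meaning vectors and a window of three, two parameters result,
# matching "the number of words minus two" noted later in the text.
vectors = [[0.3, 0.2, 0.9], [0.1, 0.4, 0.5], [0.2, 0.7, 0.1], [0.6, 0.3, 0.2]]
print(meaning_parameters(vectors))  # [0.9, 0.7]
```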
Subsequently, a configuration of the neural network which is used for machine learning relating to meaning will be described.
The input values of the neural network are the parameters (X1 to XM) relating to meaning. Accordingly, the input layer includes units corresponding to the parameters relating to meaning. For example, the input layer includes units whose number corresponds to the maximum number (M) of sets specified among the character strings of the samples. Meanwhile, in a case where the number of sets specified in a certain sample is smaller than the number of units, a predetermined value may be set for the remaining units, for example.
In this example, the intermediate layer includes the same number of units as the input layer. However, the number of units in the intermediate layer does not have to be the same as in the input layer. The output layer includes one unit corresponding to the evaluation value (S), which is the output value of the neural network. Here, an example of a neural network with three layers is described, but a neural network with four or more layers may be used. In addition, a learning machine other than a neural network may be used.
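As a minimal sketch of such a three-layer network, the following forward pass uses NumPy; the sigmoid activation and random weights are assumptions, and only the shape (M input units, an intermediate layer of the same width, and one output unit for S) follows the description above.

```python
# Minimal sketch of the three-layer network: M inputs, M intermediate units,
# and one output unit producing the evaluation value S. Weights are untrained.
import numpy as np

M = 8                            # assumed maximum number of meaning parameters
rng = np.random.default_rng(0)
W1 = rng.normal(size=(M, M))     # input layer -> intermediate layer
W2 = rng.normal(size=(M, 1))     # intermediate layer -> output unit

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(params):
    """params: M meaning parameters (X1..XM); returns evaluation value S."""
    hidden = sigmoid(params @ W1)
    return float(sigmoid(hidden @ W2)[0])

x = np.zeros(M)                  # unused units padded with a predetermined value (0)
x[:2] = [0.9, 0.7]               # the two parameters from the example above
print(forward(x))
```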
Subsequently, a procedure of calculating parameters relating to voice will be described. These parameters are used as input values of the neural network.
Next, the Hiragana notation is converted into a Roman notation. In this example, the Hiragana notation “” is converted into a Roman notation “Sakuratiri midorinomebuku yuuhodou”. Meanwhile, the character string of the sample may be directly converted into the Roman notation.
Next, syllables are extracted from the Roman notation. In this example, syllables “Sa”, “ku”, and the like are extracted. Meanwhile, one Hiragana character may be extracted from the Hiragana notation, and the syllables may be specified by the Roman notation corresponding to the Hiragana.
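A minimal sketch of this syllable extraction, assuming the kunrei-style Roman notation of the example; the regular expression and helper name are illustrative assumptions.

```python
# Minimal sketch: split a Roman notation into syllables of the form
# (optional consonant + vowel), treating a standalone "n" as its own syllable.
import re

SYLLABLE = re.compile(r"[kstnhmyrwgzdbpx]?[aiueo]|n(?![aiueo])", re.IGNORECASE)

def extract_syllables(roman):
    return SYLLABLE.findall(roman.replace(" ", ""))

print(extract_syllables("Sakuratiri midorinomebuku yuuhodou"))
# ['Sa', 'ku', 'ra', 'ti', 'ri', 'mi', 'do', 'ri', 'no', 'me', 'bu', 'ku',
#  'yu', 'u', 'ho', 'do', 'u']
```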
Next, each extracted syllable is converted into features relating to voice. In this example, the features relating to voice are represented by 22-dimensional voice vectors. Each element of the voice vector corresponds to a vowel or a consonant. In a case where a syllable includes the vowel or consonant corresponding to an element, the value of that element is “1”; the values of the other elements are “0”. In this example, the elements correspond, in order from the first, to the vowel “a”, the vowel “i”, the vowel “u”, the vowel “e”, the vowel “o”, the consonant “k”, the consonant “s”, the consonant “t”, the consonant “n”, the consonant “h”, the consonant “m”, the consonant “y”, the consonant “r”, the consonant “w”, the consonant “nn”, the consonant “g”, the consonant “z”, the consonant “d”, the consonant “b”, the consonant “p”, the consonant “x”, and no consonant. Meanwhile, a syllabic nasal “n” is regarded as one syllable by itself, and the value of the element corresponding to the consonant “nn” is set to “1”. In addition, in a case where a syllable does not include a consonant, the value of the element corresponding to “no consonant” is set to “1”. For example, since the first syllable “Sa” includes the consonant “s” and the vowel “a”, it is converted into the voice vector (1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0). In the same manner, since the second syllable “ku” includes the consonant “k” and the vowel “u”, it is converted into the voice vector (0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0). Meanwhile, an example in which the voice vector is specified based on the Roman alphabet is described here, but the voice vector may be specified based on phonetic symbols.
Next, each voice vector is converted into a parameter relating to voice. In this example, the 22 elements included in the voice vector correspond to the digits of a 22-digit binary number, and the value of the binary number is converted into a hexadecimal number. For example, in the case of the voice vector (1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), which is the first feature relating to voice, the binary number “1000001000000000000000” is converted into the hexadecimal number “0x208000”. Hence, the hexadecimal number “0x208000” becomes the first parameter relating to voice. In the same manner, in the case of the voice vector (0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), which is the second feature relating to voice, the binary number “0010010000000000000000” is converted into the hexadecimal number “0x90000”. Hence, the hexadecimal number “0x90000” becomes the second parameter relating to voice.
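The two conversions above can be sketched as follows; the element order follows the 22-element layout just described, while the helper names are assumptions. The printed values reproduce the worked examples.

```python
# Minimal sketch: syllable -> 22-element voice vector -> hexadecimal parameter.
ELEMENTS = ["a", "i", "u", "e", "o",                        # vowels
            "k", "s", "t", "n", "h", "m", "y", "r", "w",
            "nn", "g", "z", "d", "b", "p", "x",             # consonants
            "none"]                                         # no consonant

def voice_vector(syllable):
    vec = [0] * 22
    s = syllable.lower()
    if s == "n":                        # syllabic nasal, counted as consonant "nn"
        vec[ELEMENTS.index("nn")] = 1
        return vec
    vowel, consonant = s[-1], s[:-1]
    vec[ELEMENTS.index(vowel)] = 1
    vec[ELEMENTS.index(consonant if consonant else "none")] = 1
    return vec

def voice_parameter(vec):
    """Read the 22 elements as one binary number and express it in hexadecimal."""
    return hex(int("".join(map(str, vec)), 2))

print(voice_parameter(voice_vector("Sa")))  # 0x208000
print(voice_parameter(voice_vector("ku")))  # 0x90000
```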
Subsequently, a configuration of the neural network which is used for machine learning relating to voice will be described.
The input values of the neural network are the parameters (Y1 to YN) relating to voice. Accordingly, the input layer includes units corresponding to the parameters relating to voice. For example, the input layer includes units whose number corresponds to the maximum number (N) of syllables among the character strings of the samples. Meanwhile, in a case where the number of syllables included in a character string is smaller than the number of units, a predetermined value may be set for the remaining units, for example.
In this example, the intermediate layer includes the same number of units as the input layer. However, the number of units in the intermediate layer does not have to be the same as in the input layer. The output layer includes one unit corresponding to the evaluation value (S), which is the output value of the neural network. Here, an example of a neural network with three layers is described, but a neural network with four or more layers may be used. In addition, a learning machine other than a neural network may be used.
Subsequently, an operation of the learning unit 105 will be described.
The first generation unit 901 performs first generation processing. In the first generation processing, a first feature table and a first parameter table are generated. The second generation unit 903 performs second generation processing. In the second generation processing, the first model relating to meaning is generated by machine learning which uses the neural network. The third generation unit 905 performs third generation processing. In the third generation processing, a first notation table, a second notation table, a second feature table, and a second parameter table are generated. The fourth generation unit 907 performs fourth generation processing. In the fourth generation processing, the second model relating to voice is generated by machine learning which uses the neural network.
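As a minimal sketch of the second and fourth generation processing, the following trains a three-layer regressor on toy teacher data; the choice of scikit-learn's MLPRegressor and the sample values are assumptions, since the text specifies only machine learning with a neural network.

```python
# Minimal sketch: generating a model from teacher data with a small neural
# network. Rows of X are parameter strings (padded to a fixed width); y holds
# the evaluation values of the samples. All values here are toy assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

X = np.array([[0.9, 0.7, 0.0],     # meaning parameters of sample S001 (padded)
              [0.4, 0.5, 0.6]])    # meaning parameters of another sample
y = np.array([0.9, 0.3])           # evaluation values from the teacher data

first_model = MLPRegressor(hidden_layer_sizes=(3,), max_iter=5000, random_state=0)
first_model.fit(X, y)              # corresponds to generating the first model

print(first_model.predict(X[:1]))  # evaluation value estimated for sample S001
```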
The first feature storage unit 921 stores the first feature table. The first feature table will be described below.
The first generation unit 901, the second generation unit 903, the third generation unit 905, and the fourth generation unit 907 are realized by using hardware resources.
The first feature storage unit 921, the first parameter storage unit 923, the first notation storage unit 925, the second notation storage unit 927, the second feature storage unit 929, and the second parameter storage unit 931 are realized by using hardware resources.
Hereinafter, learning processing of the learning unit 105 will be described.
The first generation unit 901 performs first generation processing (S1003). Before the first generation processing is described, the first feature table and the first parameter table which are generated in the first generation processing will be described.
The sample ID identifies the sample. The features relating to meaning are the features relating to the meaning of each word included in the character string of the sample. Hence, fields corresponding to the number of words included in the character string of the sample are included. In this example, the features relating to meaning are meaning vectors. However, the features relating to meaning may be vectors other than meaning vectors.
For example, the illustrated first record indicates that the features of the word that appears first in the character string identified by the sample ID “S001” are represented by the meaning vector (0.3, 0.2, . . . , 0.9), and in the same manner, the features of the word that appears second are represented by the meaning vector (0.1, 0.4, . . . , 0.5).
The sample ID identifies the sample. The parameters relating to meaning are parameters based on the features relating to the meaning of the words included in the character string of the sample, as described above.
For example, the illustrated first record indicates that the first parameter based on the features relating to meaning of the words included in the character string which is identified by the sample ID “S001” is “0.9” and in the same manner, the second parameter is “0.7”. Meanwhile, in a case where the feature relating to meaning is a one-dimensional variable, the feature relating to meaning may be used as the parameter relating to meaning as it is.
Subsequently, the first generation processing will be described.
The first generation unit 901 adds a new record to the first feature table (S1303). The sample ID specified in S1301 is stored in the new record.
The first generation unit 901 extracts the words included in the character string of the sample (S1305). For example, the first generation unit 901 performs morphological analysis, thereby dividing the character string into a plurality of words.
The first generation unit 901 specifies one extracted word (S1307). Specifically, the first generation unit 901 sequentially specifies the divided words from the head.
The first generation unit 901 acquires the meaning vector of the word from the meaning vector database 123 (S1309). Then, the first generation unit 901 sequentially stores the meaning vector in the field of features relating to meaning in the record provided in S1303 (S1311).
The first generation unit 901 determines whether or not there is an unprocessed word among the words extracted in S1305 (S1313). In a case where it is determined that there is an unprocessed word, the flow returns to the processing illustrated in S1307 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed word, the first generation unit 901 determines whether or not there is an unprocessed sample (S1315). In a case where it is determined that there is an unprocessed sample, the flow returns to the processing illustrated in S1301 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed sample, the flow proceeds to the processing of S1321.
The first generation unit 901 adds a new record to the first parameter table (S1323). The sample ID specified in S1321 is stored in the new record.
The first generation unit 901 specifies a set of three consecutive meaning vectors (S1325). First, the meaning vectors stored in the first to third fields, among the plurality of fields in which features relating to meaning are stored, are specified in the record of the first feature table. Next, the meaning vectors stored in the second to fourth fields are specified. Thereafter, the fields are shifted sequentially to specify each set of three meaning vectors.
The first generation unit 901 specifies the maximum value among the elements of the meaning vectors included in the set (S1327). Then, the first generation unit 901 stores the maximum value in the field of parameters relating to meaning (S1329).
The first generation unit 901 determines whether or not there is an unprocessed set (S1331). In a case where it is determined that there is an unprocessed set, the flow returns to the processing illustrated in S1325 and repeats the aforementioned processing. Meanwhile, the number of sets specified by the repeated processing in this example is the number of words included in the character string minus two.
Meanwhile, in a case where it is determined that there is no unprocessed set, the first generation unit 901 determines whether or not there is an unprocessed sample (S1333). In a case where it is determined that there is an unprocessed sample, the flow returns to the processing illustrated in S1321 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed sample, the first generation processing ends. At the point in time when it is determined that there is no unprocessed sample, generation of the first parameter table is complete. When the first generation processing ends, the flow returns to the learning processing of the caller.
The third generation unit 905 performs the third generation processing (S1007). Before the third generation processing is described, the first notation table, the second notation table, the second feature table, and the second parameter table which are generated by the third generation processing will be described.
The sample ID identifies the sample. The Hiragana notation of the character string represents the reading of the character string of the sample in Hiragana.
For example, the illustrated first record indicates the reading of the character string of the first sample in Hiragana.
The sample ID identifies the sample. The Roman notation of the character string represents the reading of the character string of the sample in Roman letters.
For example, the illustrated first record indicates that the reading of the character string of the first sample is represented by the Roman notation “Sakuratiri midorinomebuku yuuhodou”.
The sample ID identifies a sample. The features relating to voice are features relating to voice of each syllable included in a character string of the sample. Hence, fields corresponding to the number of syllables included in the character string of the sample are provided. In this example, the features relating to voice are voice vectors. However, the features relating to voice may be vectors other than the voice vectors.
For example, the illustrated first record indicates that a first syllable of the first sample is represented by a voice vector (1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), and in the same manner the second syllable is represented by a voice vector (0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0).
The sample ID identifies a sample. The parameters relating to voice are obtained for each syllable included in the character string of the sample, as described above.
For example, the illustrated first record indicates that the parameter based on the feature of the first syllable included in the character string which is identified by a sample ID “S001” is a hexadecimal number “0x208000” and in the same manner the parameter based on the feature of the second syllable is a hexadecimal number “0x90000”. Meanwhile, in a case where the feature relating to voice is a one-dimensional variable, the feature relating to voice may be used as the parameter relating to voice as it is.
Subsequently, the third generation processing will be described.
The third generation unit 905 adds a new record to the first notation table (S1803). The sample ID specified in S1801 is stored in the new record.
The third generation unit 905 extracts the words included in the character string of the sample (S1805). For example, the third generation unit 905 performs morphological analysis, thereby dividing the character string into a plurality of words.
The third generation unit 905 specifies one extracted word (S1807). Specifically, the third generation unit 905 sequentially specifies the divided words from the head.
The third generation unit 905 acquires the Hiragana notation of the word from the voice notation database 127 (S1809). Then, the third generation unit 905 adds the Hiragana notation of the word to the field of the Hiragana notation of the character string in the record provided in S1803 (S1811).
The third generation unit 905 determines whether or not there is an unprocessed word among the words extracted in S1805 (S1813). In a case where it is determined that there is an unprocessed word, the flow returns to the processing illustrated in S1807, and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed word, the third generation unit 905 determines whether or not there is an unprocessed sample (S1815). In a case where it is determined that there is an unprocessed sample, the flow returns to the processing illustrated in S1801, and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed sample, the flow proceeds to the processing of S1821.
The third generation unit 905 adds a new record to the second notation table (S1823). The sample ID specified in S1821 is stored in the new record.
The third generation unit 905 converts the Hiragana notation of the character string of the sample into the Roman notation of the character string (S1825). The third generation unit 905 stores the Roman notation of the character string in the record provided in S1823 (S1827).
The third generation unit 905 determines whether or not there is an unprocessed sample (S1829). In a case where it is determined that there is an unprocessed sample, the flow returns to the processing illustrated in S1821 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed sample, the flow proceeds to the processing of S1841.
The third generation unit 905 adds a new record to the second feature table (S1843). The sample ID specified in S1841 is stored in the new record.
The third generation unit 905 specifies one syllable included in the Roman notation of the character string (S1845). In this example, the third generation unit 905 sequentially specifies the syllables from front to back. A syllable is either a single vowel or a combination of one consonant and one vowel.
The third generation unit 905 converts the syllable into a voice vector (S1847). The value of the element corresponding to the vowel included in the syllable is set to “1”. In addition, in a case where the syllable includes a consonant, the value of the element corresponding to the consonant is set to “1”. The values of the other elements are set to “0”. Meanwhile, a syllabic nasal “n” is treated as one unit by itself, and the value of the element corresponding to the consonant “nn” is set to “1”.
The third generation unit 905 sequentially stores the voice vectors in the field of the feature relating to voice in the record provided in S1843 (S1849).
The third generation unit 905 determines whether or not there is an unprocessed syllable (S1851). In a case where it is determined that there is an unprocessed syllable, the flow returns to the processing illustrated in S1845 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed syllable, the third generation unit 905 determines whether or not there is an unprocessed sample (S1853). In a case where it is determined that there is an unprocessed sample, the flow returns to the processing illustrated in S1841 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed sample, the flow proceeds to the processing of S1861.
The third generation unit 905 adds a new record to the second parameter table (S1863). The sample ID specified in S1861 is stored in the new record.
The third generation unit 905 specifies one voice vector stored in the field of the feature relating to voice in the record of the sample in the second feature table (S1865). Specifically, the third generation unit 905 sequentially specifies the fields of the feature relating to voice from front to back, and reads the vector stored in the field.
The third generation unit 905 converts the voice vector into a numerical value (S1867). As described above, the third generation unit 905 associates the 22 elements included in the voice vector with the digits of a 22-digit binary number and converts the resulting binary value into a hexadecimal number.
The third generation unit 905 sequentially stores the numerical values in the field of parameters relating to voice in the record provided in S1863 (S1869).
The third generation unit 905 determines whether or not there is an unprocessed voice vector (S1871). In a case where it is determined that there is an unprocessed voice vector, the flow returns to the processing illustrated in S1865 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed voice vector, the third generation unit 905 determines whether or not there is an unprocessed sample (S1873). In a case where it is determined that there is an unprocessed sample, the flow returns to the processing illustrated in S1861 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed sample, the third generation processing ends. At the point in time when it is determined that there is no unprocessed sample, generation of the second parameter table is complete. When the third generation processing ends, the flow returns to the learning processing of the caller.
Subsequently, an operation of the evaluation unit 109 will be described.
The fifth generation unit 1901 performs fifth generation processing. In the fifth generation processing, parameters relating to meaning are generated based on the features relating to the meaning of each word included in the character string to be evaluated. The first application unit 1903 performs first application processing. In the first application processing, the parameters relating to meaning are applied to the first model, and a first evaluation value relating to meaning is estimated. The sixth generation unit 1905 performs sixth generation processing. In the sixth generation processing, parameters relating to voice are generated based on the features relating to the voice of each syllable included in the character string to be evaluated. The second application unit 1907 performs second application processing. In the second application processing, the parameters relating to voice are applied to the second model, and a second evaluation value relating to voice is estimated. The calculation unit 1909 calculates an overall third evaluation value based on the first evaluation value and the second evaluation value.
The fifth generation unit 1901, the first application unit 1903, the sixth generation unit 1905, the second application unit 1907, and the calculation unit 1909 are realized by using hardware resources.
The fifth generation unit 1901 performs the fifth generation processing (S2003). In the fifth generation processing, parameters based on the character string to be evaluated are generated by the same procedure as in the first generation processing of the learning processing.
The fifth generation unit 1901 specifies one word (S2103). For example, the fifth generation unit 1901 sequentially specifies the divided words from the head.
The fifth generation unit 1901 acquires the meaning vector of the word from the meaning vector database 123 (S2105). Then, the fifth generation unit 1901 adds the acquired meaning vector to the string of features relating to meaning (S2107). The string of features relating to meaning, specifically a string of meaning vectors, is retained as an internal parameter.
The fifth generation unit 1901 determines whether or not there is an unprocessed word (S2109). In a case where it is determined that there is an unprocessed word, the flow returns to the processing illustrated in S2103 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed word, the fifth generation unit 1901 specifies a set of three consecutive meaning vectors in the string of features relating to meaning (S2111). First, the first to third meaning vectors are specified. Next, the second to fourth meaning vectors are specified. Thereafter, sequential shifting is performed to specify each set of three meaning vectors.
The fifth generation unit 1901 specifies the maximum value among the elements of the meaning vectors included in the set (S2113). Then, the fifth generation unit 1901 adds the maximum value to the string of parameters relating to meaning (S2115). The string of parameters relating to meaning is retained as an internal parameter.
The fifth generation unit 1901 determines whether or not there is an unprocessed set (S2117). In a case where it is determined that there is an unprocessed set, the flow returns to the processing illustrated in S2111 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed set, the fifth generation processing ends. When the fifth generation processing ends, the flow returns to the evaluation processing of the caller.
The sixth generation unit 1905 performs the sixth generation processing (S2007). In the sixth generation processing, parameters relating to voice are generated based on the features relating to the voice of each syllable included in the character string to be evaluated. The parameters are generated by the same procedure as in the third generation processing of the learning processing.
The sixth generation unit 1905 specifies one word (S2203). For example, the sixth generation unit 1905 sequentially specifies the divided words from the head.
The sixth generation unit 1905 acquires the Hiragana notation of the word from the voice notation database 127 (S2205). The sixth generation unit 1905 adds the Hiragana notation of the word to the Hiragana notation of the character string (S2207). The Hiragana notation of the character string is retained as an internal parameter.
The sixth generation unit 1905 determines whether or not there is an unprocessed word (S2209). In a case where it is determined that there is an unprocessed word, the flow returns to the processing illustrated in S2203 and repeats the aforementioned processing.
In a case where it is determined that there is no unprocessed word, the sixth generation unit 1905 converts the Hiragana notation of the character string into the Roman notation of the character string (S2211). The Roman notation of the character string is retained as an internal parameter. Then, the flow proceeds to the processing of S2213.
The sixth generation unit 1905 converts the syllable into a voice vector in the same manner as in the third generation processing (S2215). The sixth generation unit 1905 adds the voice vector to the string of features relating to voice (S2217). The string of features relating to voice, specifically a string of voice vectors, is retained as an internal parameter.
The sixth generation unit 1905 determines whether or not there is an unprocessed syllable (S2219). In a case where it is determined that there is an unprocessed syllable, the flow returns to the processing illustrated in S2213 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed syllable, the sixth generation unit 1905 specifies one voice vector in the string of features relating to voice (S2221). For example, the voice vectors are sequentially specified from the head of the string.
The sixth generation unit 1905 converts the voice vector into a numerical value in the same manner as in the third generation processing (S2223). Then, the sixth generation unit 1905 adds the numerical value to the string of parameters relating to voice (S2225). The string of parameters relating to voice is retained as an internal parameter.
The sixth generation unit 1905 determines whether or not there is an unprocessed voice vector (S2227). In a case where it is determined that there is an unprocessed voice vector, the flow returns to the processing illustrated in S2221 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed voice vector, the sixth generation processing ends. When the sixth generation processing ends, the flow returns to the evaluation processing of the caller.
The calculation unit 1909 performs calculation processing (S2011). In the calculation processing, an overall third evaluation value is calculated based on the first evaluation value and the second evaluation value. Hereinafter, two examples of the calculation processing will be described.
First, the calculation unit 1909 determines whether or not the first evaluation value exceeds a threshold (S2401). In a case where it is determined that the first evaluation value does not exceed the threshold, the calculation unit 1909 sets the third evaluation value to “0” (S2403).
Meanwhile, in a case where the first evaluation value exceeds the threshold, the calculation unit 1909 determines whether or not the second evaluation value exceeds the threshold (S2405). In a case where it is determined that the second evaluation value does not exceed the threshold, the calculation unit 1909 sets the third evaluation value to “0” (S2403).
Meanwhile, in a case where it is determined that the second evaluation value exceeds the threshold, the third evaluation value is set to “1” (S2407).
In this example, the third evaluation value is set to “1” in a case where both the first evaluation value and the second evaluation value exceed the threshold. However, the third evaluation value may instead be set to “1” in a case where at least one of the first evaluation value and the second evaluation value exceeds the threshold. When the calculation processing ends, the flow returns to the evaluation processing of the caller.
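A minimal sketch of this calculation processing follows; the threshold value is an assumption, since the text does not fix it, and the require_both flag covers the variant just mentioned.

```python
# Minimal sketch of the calculation processing (S2401 to S2407).
THRESHOLD = 0.5   # assumed value; the text does not specify the threshold

def third_evaluation_value(first, second, require_both=True):
    """Return 1 or 0 from the first and second evaluation values."""
    if require_both:   # S2401/S2405: both values must exceed the threshold
        return 1 if (first > THRESHOLD and second > THRESHOLD) else 0
    return 1 if (first > THRESHOLD or second > THRESHOLD) else 0

print(third_evaluation_value(0.8, 0.6))                      # 1
print(third_evaluation_value(0.8, 0.3))                      # 0
print(third_evaluation_value(0.8, 0.3, require_both=False))  # 1
```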
According to the present embodiment, it is possible to estimate the sensitive evaluation of a reader with respect to a character string. For example, it is expected that rules which are hard to formulate, such as implicit associations and linguistic rhythms, are reflected.
In addition, it is possible to reflect, in the first model, the semantic impression that a reader receives from a word.
In addition, it is possible to reflect, in the second model, the voice impression that a reader receives from a syllable.
One embodiment has been described above, but the embodiments are not limited thereto. For example, the aforementioned functional block configuration may not match an actual program module configuration.
In addition, the configurations of the aforementioned storage regions are examples, and each storage region does not have to be configured as described above. Furthermore, in the processing flows, the order of processing may be changed, or a plurality of processing steps may be performed in parallel, as long as the processing results do not change.
The present embodiment describes an application example relating to a character string in English.
The module configuration of the evaluation apparatus 101 according to the second embodiment is the same as in the case of the first embodiment.
In the phonetic symbol notation, the reading of the word is represented by phonetic symbols. The voice notation database 127 may store data of waveforms of the voice in addition to a voice notation.
The procedure of calculating parameters relating to meaning in the second embodiment is the same as in the case of the first embodiment.
Next, each word is converted into a meaning vector corresponding to features relating to meaning. For example, a word “cherry” is converted into the meaning vector (0.3, 0.2, . . . , 0.9). As described above, the meaning vector of each word is registered in the meaning vector database 123 which is prepared in advance.
The configuration of the neural network used for machine learning relating to meaning in the second embodiment is the same as in the case of the first embodiment.
Next, words are extracted from the phonetic symbol notation, and the extracted phonetic symbol notation of each word is converted into features relating to voice.
The voice vectors which are features relating to the voice are converted into parameters relating to the voice. In this example, each of the 45 elements included in a voice vector corresponds to one digit of a 45-digit binary number. Binary values are converted into hexadecimal values. For example, in a case of the voice vector (0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0) which is a first feature relating to the voice, the binary number “000000001000100000000000001000000001000000000” is converted to the hexadecimal number “0x001100040200”. The hexadecimal number “0x001100040200” is a first parameter relating to the voice.
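The conversion can be sketched in the same way as in the first embodiment; the vector below is the worked example above, written out in full because the complete element-to-phonetic-symbol assignment is not reproduced here.

```python
# Minimal sketch: a 45-element voice vector read as a 45-digit binary number
# and expressed in hexadecimal, reproducing the worked example above.
vec = [0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
       0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0]

param = hex(int("".join(map(str, vec)), 2))
print(param)  # 0x1100040200, i.e., "0x001100040200" with leading zeros kept
```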
The configuration of the neural network used for machine learning relating to voice in the second embodiment is the same as in the case of the first embodiment.
The third notation storage unit 3101 is realized by using hardware resources.
The third generation unit 905 performs third generation processing (B) instead of the third generation processing (A).
The first generation unit 901 performs the first generation processing in the same manner as in the case of the first embodiment (S1003).
The first generation processing is the same as in the case of the first embodiment.
Second generation processing is also the same as in a case of the first embodiment.
The third generation unit 905 performs the third generation processing (B) (S3201). Before the third generation processing (B) is described, the third notation table, the second feature table, and the second parameter table will be described.
The sample ID identifies the sample. In the phonetic symbol notation of the character string, the reading of the character string of the sample is represented by phonetic symbols.
The configuration of the second feature table according to the second embodiment is the same as in the case of the first embodiment.
For example, the illustrated first record indicates that a first word of the first sample is represented by the voice vector (0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0).
The configuration of the second parameter table according to the second embodiment is the same as in the case of the first embodiment.
For example, the illustrated first record indicates that the parameter based on the feature of the first word included in a character string identified by a sample ID “S001” is a hexadecimal number “0x001100040200”.
Next, the third generation processing (B) will be described.
The third generation unit 905 provides a new record to the third notation table (S3601). The sample ID specified in S1801 is stored in the new record.
The third generation unit 905 extracts the words included in the character string of the sample (S1805). For example, the third generation unit 905 performs morphological analysis, thereby dividing the character string into a plurality of words.
The third generation unit 905 specifies one extracted word (S1807) in the same manner as in the case of the first embodiment. Specifically, the third generation unit 905 sequentially specifies the divided words from the head.
The third generation unit 905 acquires the phonetic symbol notation of the word from the voice notation database 127 (S3603). Then, the third generation unit 905 adds the phonetic symbol notation of the word to the field of the phonetic symbol notation of the character string in the record provided in S3601 (S3605).
The third generation unit 905 determines whether or not there is an unprocessed word among the words extracted in S1805 (S1813). In a case where it is determined that there is an unprocessed word, the flow returns to the processing illustrated in S1807, and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed word, the third generation unit 905 determines whether or not there is an unprocessed sample (S1815). In a case where it is determined that there is an unprocessed sample, the flow returns to the processing illustrated in S1801, and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed sample, the flow proceeds to the processing of S1841.
The third generation unit 905 provides a new record to the second feature table (S1843) in the same manner as in a case of the first embodiment. The sample ID specified in S1841 is stored in the new record.
The third generation unit 905 specifies one word denoted by the phonetic symbol notation of the character string (S3607). In this example, the third generation unit 905 sequentially specifies words from front to back.
The third generation unit 905 converts the phonetic symbol notation of the word into a voice vector (S3609). A value of an element corresponding to a vowel included in the phonetic symbol notation of the word is set to “1”. In a case where the phonetic symbol notation of the word includes a consonant, the value of the element corresponding to the consonant is set to “1”. Values of other elements are set to “0”.
The third generation unit 905 sequentially stores the voice vector in the field of features relating to the voice in the record provided in S1843 (S1849).
The third generation unit 905 determines whether or not there is an unprocessed word (S3611). In a case where it is determined that there is an unprocessed word, the flow returns to the processing illustrated in S3607, and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed word, the third generation unit 905 determines whether or not there is an unprocessed sample (S1853). In a case where it is determined that there is an unprocessed sample, the flow returns to the processing illustrated in S1841 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed sample, the flow proceeds to the processing of S1861.
The third generation unit 905 provides a new record in the second parameter table (S1863) in the same manner as in a case of the first embodiment. The sample ID specified in S1861 is stored in the new record.
The third generation unit 905 specifies one voice vector stored in the field of the feature relating to voice in the record of the sample in the second feature table (S1865) in the same manner as in a case of the first embodiment. Specifically, the third generation unit 905 sequentially specifies the field of the feature relating to voice from front to back and reads the vector stored in the field.
The third generation unit 905 converts the voice vector into a numerical value (S1867). As described above, the third generation unit 905 associates each of the 45 elements included in the voice vector with one digit of a 45-digit binary number and converts the resulting 45-digit binary value into a hexadecimal value.
The third generation unit 905 sequentially stores the numeral values in the field of the parameter relating to the voice provided in S1863 (S1869).
The third generation unit 905 determines whether or not there is an unprocessed voice vector (S1871). In a case where it is determined that there is an unprocessed voice vector, the flow returns to the processing illustrated in S1865 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed voice vector, the third generation unit 905 determines whether or not there is an unprocessed sample (S1873). In a case where it is determined that there is an unprocessed sample, the flow returns to the processing illustrated in S1861 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed sample, the third generation processing (B) ends. At the point in time when it is determined that there is no unprocessed sample, generation of the second parameter table is complete. When the third generation processing (B) ends, the flow returns to the learning processing of the caller.
Next, an operation of the evaluation unit 109 will be described. The module configuration of the evaluation unit 109 according to the second embodiment is the same as in the case of the first embodiment.
The sixth generation unit 1905 performs sixth generation processing (B). In the sixth generation processing (B), parameters relating to voice are generated based on the features relating to the voice of each word included in the character string to be evaluated.
The fifth generation unit 1901 performs fifth generation processing (S2003) in the same manner as in a case of the first embodiment.
The first application unit 1903 performs first application processing (S2005) in the same manner as in a case of the first embodiment.
The sixth generation unit 1905 performs the sixth generation processing (B) (S3701). In the sixth generation processing (B), parameters relating to voice are generated based on the features relating to the voice of each word included in the character string to be evaluated. The parameters are generated by the same procedure as in the third generation processing (B) of the learning processing.
The sixth generation unit 1905 specifies one word (S2203). For example, the sixth generation unit 1905 sequentially specifies the divided words from the head.
The sixth generation unit 1905 acquires the phonetic symbol notation of the word from the voice notation database 127 (S3801). The sixth generation unit 1905 adds the phonetic symbol notation of the word to the phonetic symbol notation of the character string (S3803). The phonetic symbol notation of the character string is retained as an internal parameter.
The sixth generation unit 1905 determines whether or not there is an unprocessed word (S2209). In a case where it is determined that there is an unprocessed word, the flow returns to the processing illustrated in S2203 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed word, the flow proceeds to the processing of S3805.
The sixth generation unit 1905 converts the phonetic symbol notation of the word into a voice vector in the same manner as in the third generation processing (B) (S3807). The sixth generation unit 1905 adds the voice vector to the string of features relating to voice (S2217). The string of features relating to voice, specifically a string of voice vectors, is retained as an internal parameter.
The sixth generation unit 1905 determines whether or not there is an unprocessed word (S3809). In a case where it is determined that there is an unprocessed word, the flow returns to the processing illustrated in S3805 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed word, the sixth generation unit 1905 specifies one voice vector in the string of the feature relating to the voice (S2221). For example, the voice vectors are sequentially specified from the head of the string.
The sixth generation unit 1905 converts the voice vector into a numerical value in the same manner as in the third generation processing (B) (S2223). Then, the sixth generation unit 1905 adds the numerical value to the string of parameters relating to voice (S2225). The string of parameters relating to voice is retained as an internal parameter.
The sixth generation unit 1905 determines whether or not there is an unprocessed voice vector (S2227). In a case where it is determined that there is an unprocessed voice vector, the flow returns to the processing illustrated in S2221 and repeats the aforementioned processing.
Meanwhile, in a case where it is determined that there is no unprocessed voice vector, the sixth generation processing (B) ends. When the sixth generation processing (B) ends, the flow returns to the evaluation processing of the caller.
The calculation unit 1909 performs the calculation processing (S2011) in the same manner as in the case of the first embodiment. In the calculation processing, an overall third evaluation value is calculated based on the first evaluation value and the second evaluation value. In the second embodiment, the calculation unit 1909 may perform the calculation processing (A) described in the first embodiment.
The output unit 111 outputs a third evaluation value (S2013) in the same manner as in a case of the first embodiment.
According to the present embodiment, it is possible to estimate the sensitive evaluation of a reader with respect to a character string. For example, it is expected that rules which are hard to formulate, such as implicit associations and linguistic rhythms, are reflected.
In addition, it is possible to reflect, in the first model, the semantic impression that a reader receives from a word.
In addition, it is possible to reflect, in the second model, the voice impression that a reader receives from a word.
Furthermore, the feature of the voice of a character string may be based on the voice of each of a plurality of words included in the character string.
By doing so, it is possible to reflect, in the second model, the voice impression that a reader receives from a word.
The evaluation apparatus 101 is a computer device including hardware resources such as the memory and the processor described above.
The aforementioned embodiments may be summarized as follows.
An evaluation method according to the present embodiment includes (A) generating a first model which derives an evaluation value based on feature of meaning of a character string by using teacher data including a plurality of samples in which a character string and an evaluation value are associated with each other, (B) generating a second model which derives an evaluation value based on feature of voice of a character string by using the teacher data, and (C) calculating a third evaluation value, based on a first evaluation value derived by applying the feature of meaning of a character string which is evaluated to the first model, and a second evaluation value derived by applying the feature of voice of the character string which is evaluated to the second model.
By doing so, it is possible to estimate sensitive evaluation of a reader with respect to a character string.
Furthermore, the feature of the meaning of a character string may be based on the meaning of each of a plurality of language units included in the character string.
By doing so, it is possible to reflect, in the first model, the semantic impression that a reader receives from a language unit.
Furthermore, the feature of the voice of a character string may be based on the voice of each of a plurality of syllables included in the character string.
By doing so, it is possible to reflect, in the second model, the voice impression that a reader receives from a syllable.
A program that causes a computer to perform the processing of the aforementioned method may be created, and the program may be stored in a computer-readable storage medium, such as a flexible disk, a CD-ROM, a magneto-optical disk, a semiconductor memory, or a hard disk, or in a storage device. Meanwhile, intermediate processing results are generally stored temporarily in a storage device such as a main memory.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---
2016-148756 | Jul 2016 | JP | national |
2017-097903 | May 2017 | JP | national |