This application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2005-100032, filed on Mar. 30, 2005; the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to a communication support apparatus, a communication support method, and a computer program product for supporting communication by performing translation between a plurality of languages.
2. Description of the Related Art
In recent years, with the development of natural language processing technology, machine translation systems that translate, for example, a text written in Japanese into a text in another language such as English have been put into practical use and have become widespread.
With the development of speech processing technology, there have also come into use a speech dictation system, which enables input of a natural language string by speech by converting sentences spoken by a user into text, and a speech synthesis system, which converts sentences obtained as electronic data, or a natural language string output from a system, into speech output.
With the development of image processing technology, there has been realized a character recognition system in which a sentence in an image is converted into machine-readable character data by analyzing a character image photographed by a camera or the like. Moreover, with the development of handwritten-character recognition technology, there has been realized a technique of converting a sentence input by a user through handwriting using a pen-based input device into machine-readable character data.
With the globalization of culture and the economy, opportunities for communication between persons who are native speakers of different languages have increased. Consequently, expectations have risen for a technique applied to a communication support apparatus that supports communication between persons who are native speakers of different languages by integrating the natural language processing technology, speech processing technology, image processing technology, and handwritten-character recognition technology.
As such a device, for example, the following communication support apparatus can be considered. First, a Japanese sentence spoken or input with a pen by a Japanese speaker is converted into machine-readable, Japanese character data, utilizing the speech recognition technology or handwritten-character recognition technology. Next, using the machine translation technology, the data is translated into a semantically equivalent English sentence and the result is presented as an English string. Alternatively, the result is presented to an English speaker in the form of English speech, utilizing the speech synthesis technology. On the other hand, an English sentence spoken or input with a pen by an English speaker is subjected to the reverse processing to thereby present a translated Japanese sentence to a Japanese speaker. By such a method, the realization of the communication support apparatus that enables two-way communication between persons who are native speakers of different languages has been attempted.
Furthermore, as another example, the following communication support apparatus can be considered. First, a string of a local sign, cautionary statement or the like, expressed in English is photographed with a camera. Next, the photographed string is converted into machine-readable, English string data utilizing the image processing technology and character recognition technology. Further, the data is translated into a semantically equivalent Japanese sentence, using the machine translation technology, and the result is presented to a user as a Japanese string. Alternatively, the result is presented to the user in a form of Japanese speech, utilizing the speech synthesis technology. By such a method, the realization of a communication support apparatus by which a traveler who is a native speaker of Japanese and does not understand English and who travels in an English-speaking area can understand the sign and cautionary statement expressed in English has been attempted.
In such a communication support apparatus, when the input sentence in a source language, which is input by the user, is recognized by the speech recognition processing, handwritten-character recognition processing or image character recognition processing and converted into machine-readable character data, it is very difficult to obtain the proper candidate without fail, and generally, ambiguity arises because a plurality of interpretation candidates are obtained.
In the machine translation processing, ambiguity also arises when a source language sentence is converted into a semantically equivalent target language sentence, so that a plurality of candidates for the target language sentence exist. Consequently, in many cases, the semantically equivalent target language sentence cannot be uniquely selected and the ambiguity cannot be eliminated.
Possible causes include, for example, a case where the source language sentence itself is an ambiguous expression for which a plurality of interpretations exist, a case where a plurality of interpretations arise because the source language sentence is an expression having high context dependency, and a case where a plurality of translation candidates arise because the linguistic and cultural backgrounds, conceptual systems and the like differ between the source language and the target language.
In order to eliminate such ambiguity, when a plurality of candidates are obtained, there are proposed a method of selecting a candidate obtained first and a method of presenting the plurality of candidates to a user so that the user makes a selection among them. Also, there is proposed a method in which, when a plurality of candidates are obtained, the respective candidates are scored according to some criterion to select a candidate with a high score. For example, in Japanese Patent Application Laid-Open (JP-A) No. H07-334506, there is proposed a technique in which a translated word in which the similarity of a concept recalled from the word is high is selected from a plurality of translated words resulting from the translation to thereby improve the quality of a translated sentence.
However, the method of selecting the candidate obtained first, in spite of having an effect of shortening processing time, has a problem in that there is no assurance of selecting an optimal candidate and that there is a high possibility that a target language sentence not matching the intention of the source language sentence is output.
The method in which the user makes a selection from a plurality of candidates has a problem in that the burden on the user is increased, and a problem in that, when a large number of interpretation candidates are obtained, they cannot be presented to the user efficiently. Moreover, even if the user can properly select an interpretation candidate for the source language, ambiguity arising in the subsequent translation processing cannot be eliminated. Even if the translation processing result is also presented for selection by the user in order to eliminate this ambiguity, this is not an effective method because the user normally does not understand the target language.
In the method in JP-A No. H07-334506, since the user does not select the translated sentence candidate, but the translated sentence candidate is selected based on values calculated according to the criterion of the conceptual similarity, the burden on the user is reduced. However, there is a problem in that, since it is difficult to set the criterion used as the basis of scoring, there is no assurance of selecting an optimal candidate and there is a possibility that a target language sentence not matching the intention of the source language sentence is selected.
According to one aspect of the present invention, a communication support apparatus includes an analyzing unit that analyzes a source language sentence to be translated into a target language, and outputs at least one source language interpretation candidate which is a candidate for interpretation of the source language sentence; a detecting unit that, when there are a plurality of the source language interpretation candidates, detects an ambiguous part which is a different part between the respective candidates in the plurality of source language interpretation candidates; and a translation unit that translates the source language interpretation candidate except the ambiguous part into the target language.
According to another aspect of the present invention, a communication support apparatus includes an analyzing unit that analyzes a source language sentence to be translated into a target language, and outputs at least one source language interpretation candidate which is a candidate for interpretation of the source language sentence; a translation unit that translates the source language interpretation candidate into a target language, and outputs at least one target language interpretation candidate which is a candidate for the interpretation in the target language; a detecting unit that, when there are a plurality of the target language interpretation candidates, detects an ambiguous part which is a different part between the respective candidates in the plurality of target language interpretation candidates; and a generating unit that generates a target language sentence which is a sentence described in the target language, based on the target language interpretation candidate except the ambiguous part, and outputs at least one target language sentence candidate which is a candidate for the target language sentence.
According to still another aspect of the present invention, a communication support apparatus includes an analyzing unit that analyzes a source language sentence to be translated into a target language, and outputs at least one source language interpretation candidate which is a candidate for interpretation of the source language sentence; a translation unit that translates the source language interpretation candidate into a target language, and outputs at least one target language interpretation candidate which is a candidate for the interpretation in the target language; a generating unit that generates a target language sentence which is a sentence described in the target language, based on the target language interpretation candidate, and outputs at least one target language sentence candidate which is a candidate for the target language sentence; a detecting unit that, when there are a plurality of the target language sentence candidates, detects an ambiguous part which is a different part between the respective candidates in the plurality of target language sentence candidates; and a deleting unit that deletes the ambiguous part.
According to still another aspect of the present invention, a communication support apparatus includes an analyzing unit that analyzes a source language sentence to be translated into a target language, and outputs at least one source language interpretation candidate which is a candidate for interpretation of the source language sentence; a detecting unit that, when there are a plurality of the source language interpretation candidates, detects an ambiguous part which is a different part between the respective candidates in the plurality of source language interpretation candidates; a parallel translation pair storing unit that stores a parallel translation pair of the source language interpretation candidate and a target language sentence candidate semantically equivalent to each other; and a selecting unit that selects the target language sentence candidate, based on the source language interpretation candidate except the ambiguous part and the parallel translation pair stored in the parallel translation pair storing unit.
According to still another aspect of the present invention, a communication support method includes analyzing a source language sentence to be translated into a target language; outputting at least one source language interpretation candidate which is a candidate for interpretation of the source language sentence; when there are a plurality of the source language interpretation candidates, detecting an ambiguous part which is a different part between the respective candidates in the plurality of source language interpretation candidates; and translating the source language interpretation candidate except the ambiguous part into the target language.
According to still another aspect of the present invention, a communication support method includes analyzing a source language sentence to be translated into a target language; outputting at least one source language interpretation candidate which is a candidate for interpretation of the source language sentence; translating the source language interpretation candidate into a target language; outputting at least one target language interpretation candidate which is a candidate for the interpretation in the target language; when there are a plurality of the target language interpretation candidates, detecting an ambiguous part which is a different part between the respective candidates in the plurality of target language interpretation candidates; generating a target language sentence which is a sentence described in the target language, based on the target language interpretation candidate except the ambiguous part; and outputting at least one target language sentence candidate which is a candidate for the target language sentence.
According to still another aspect of the present invention, a communication support method includes analyzing a source language sentence to be translated into a target language; outputting at least one source language interpretation candidate which is a candidate for interpretation of the source language sentence; translating the source language interpretation candidate into a target language; outputting at least one target language interpretation candidate which is a candidate for the interpretation in the target language; generating a target language sentence which is a sentence described in the target language, based on the target language interpretation candidate; outputting at least one target language sentence candidate which is a candidate for the target language sentence; when there are a plurality of the target language sentence candidates, detecting an ambiguous part which is a different part between the respective candidates in the plurality of target language sentence candidates; and deleting the ambiguous part.
Exemplary embodiments of a communication support apparatus, a communication support method, and a computer program product according to this invention are described in detail below with reference to the accompanying drawings.
A communication support apparatus according to a first embodiment interprets the semantic content of a source language sentence recognized from a speech, translates the interpreted semantic content in the source language into the semantic content in a target language, generates a target language sentence from the translated semantic content in the target language, and synthesizes and outputs a speech in the target language from the generated target language sentence. At this time, when a plurality of candidates are obtained as processing results in the speech recognition processing, source language analysis processing, translation processing or target language generation processing, a difference between the respective candidates is detected and deleted as an ambiguous part to thereby eliminate the ambiguity of the target language sentence that is finally output.
Here, the source language sentence indicates a string expressed in a source language, which is the language to be translated from, and the target language sentence indicates a string expressed in a target language, which is the language to be translated into. Neither the source language sentence nor the target language sentence is limited to a sentence ending with a period; each may be a plurality of sentences, a paragraph, a phrase, a word or the like.
Furthermore, while in the first embodiment, a communication support apparatus in which Japanese input by a user's speech is translated into English and is output as a speech is explained as an example, the combination of the source language and the target language is not limited to this, but the present invention can be applied to any combination as long as a source language is translated into a different language.
The source language speech recognizing unit 101 receives a speech in a source language which is uttered by a user, and performs speech recognition to thereby output a source language sentence candidate in which the speech content is transcribed. To the speech recognition processing performed by the source language speech recognizing unit 101 can be applied any commonly used speech recognition method using Linear Predictive Coding analysis, Hidden Markov Model (HMM), dynamic programming, a neural network, an n-gram language model or the like.
The source language analyzing unit 102 receives the source language sentence recognized by the source language speech recognizing unit 101, and performs natural language analysis processing such as morphological analysis, syntactic analysis, dependency parsing, semantic analysis and context analysis referring to vocabulary information and grammar rules of the source language to thereby output a source language interpretation candidate which is a candidate for interpretation of the semantic content indicated by the source language sentence. Further, the source language analyzing unit 102 outputs a correspondence relation between the source language sentence and the source language interpretation candidate as interpretation correspondence information.
Each source language interpretation candidate obtained by the natural language analysis processing is a tree-structure graph which expresses a syntax structure and a dependency relation between concepts in the source language sentence, with the concepts corresponding to the source language vocabulary expressed as nodes. Accordingly, the interpretation correspondence information stores information in which a partial string included in the source language sentence is associated one-to-one with a number for identifying each node (node identification number) in the tree-structure graph.
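The relation between an interpretation candidate and the interpretation correspondence information can be illustrated with a minimal sketch. The class name Node, the helper functions, the concept labels, and the romanized partial strings are all hypothetical and chosen only for illustration; the specification does not prescribe any particular data layout.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One concept of an interpretation candidate (hypothetical structure)."""
    concept: str                                   # concept label
    node_id: int                                   # node identification number
    children: list = field(default_factory=list)   # dependent nodes


def walk(node):
    """Yield every node of the tree, root first."""
    yield node
    for child in node.children:
        yield from walk(child)


def is_one_to_one(correspondence, root):
    """Check that partial strings and node ids pair up one-to-one."""
    ids_in_tree = sorted(n.node_id for n in walk(root))
    return sorted(correspondence.values()) == ids_in_tree


# Illustrative candidate: "wait" with a dependent "tomorrow".
root = Node("WAIT", 2, [Node("TOMORROW", 1)])

# Interpretation correspondence information: partial string -> node id.
# The romanized partial strings are placeholders, not taken from the figures.
interpretation_correspondence = {"ASU": 1, "MATSU": 2}
```

Keeping the correspondence as a separate mapping, rather than storing the partial string inside each node, mirrors the description in which the correspondence information is output and stored apart from the candidate itself.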
To the natural language analysis processing performed by the source language analyzing unit 102 can be applied any commonly used method such as morphological analysis by the CYK method and syntactic analysis by Earley's method, Chart method, or generalized left to right (LR) parsing. Furthermore, a dictionary for natural language processing including the morphological information, syntax information, semantic information and the like is stored in a commonly used storage such as an HDD (Hard Disk Drive), an optical disk and a memory card, and is referred to in the natural language analysis processing.
The translation unit 103 receives the source language interpretation candidate output by the source language analyzing unit 102 and outputs a target language interpretation candidate with reference to vocabulary information of the source language and the target language, structure conversion rules for absorbing structural differences between both languages, and a parallel translation dictionary indicating correspondence relations between the vocabularies of both languages. Furthermore, the translation unit 103 outputs a correspondence relation between the source language interpretation candidate and the target language interpretation candidate as translation correspondence information.
The target language interpretation candidate obtained by the translation processing is a candidate for an internal expression in English which is the target language. The target language interpretation candidate, similar to the source language interpretation candidate, is a tree-structure graph which expresses a syntax structure and a dependency relation between concepts of the target language sentence to be translated, with the concepts corresponding to the source language vocabulary expressed as nodes. Accordingly, the translation correspondence information stores information in which the node identification numbers of the tree-structure graph representing the source language interpretation candidate are associated one-to-one with node identification numbers in the tree-structure graph representing the target language interpretation candidate. To the translation processing by the translation unit 103 can be applied any method utilized in a general transfer method.
The target language generating unit 104 receives the target language interpretation candidate output by the translation unit 103 and outputs a target language sentence candidate with reference to the vocabulary information, the grammar rules defining the syntax structure of English which is the target language, and the like. Furthermore, the target language generating unit 104 outputs a correspondence relation between the target language interpretation candidate and the target language sentence candidate as generation correspondence information. The generation correspondence information stores information in which a node identification number of the tree-structure graph representing the target language interpretation candidate is associated one-to-one with a partial string included in the target language sentence candidate. To the target language generation processing performed here can be applied any commonly used language generation method.
The target language speech synthesizing unit 105 receives the target language sentence output by the target language generating unit 104 and outputs the content as a synthesized speech of English that is the target language. To the speech synthesis processing performed here can be applied any commonly used method such as a text-to-speech system using speech segment edition and speech synthesis, formant speech synthesis or the like.
When there exist a plurality of source language sentence candidates output by the source language speech recognizing unit 101, a plurality of source language interpretation candidates output by the source language analyzing unit 102, a plurality of target language interpretation candidates output by the translation unit 103, or a plurality of target language sentence candidates output by the target language generating unit 104, the ambiguous part detecting unit 106 detects and outputs a different part between the plurality of candidates as an ambiguous part.
The ambiguous part deleting unit 107 deletes the ambiguous part output by the ambiguous part detecting unit 106 from the source language sentence candidates or the source language interpretation candidates or the target language interpretation candidates or the target language sentence candidates. As a result, the plurality of candidates can be integrated into one candidate including no ambiguous part.
The translated-part presenting unit 108 identifies the partial string in the source language sentence corresponding to the finally translated target language sentence (hereinafter, translated part) by sequentially referring to the interpretation correspondence information output by the source language analyzing unit 102, the translation correspondence information output by the translation unit 103, and the generation correspondence information output by the target language generating unit 104, and feeds the translated part back to the user by screen display or the like.
The correspondence information storing unit 110 is a storage that stores the interpretation correspondence information, the translation correspondence information, and the generation correspondence information, and can be composed of any commonly used storage such as an HDD, an optical disk and a memory card. The interpretation correspondence information, the translation correspondence information, and the generation correspondence information stored in the correspondence information storing unit 110 are referred to when the translated-part presenting unit 108 identifies the translated part.
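How the three kinds of correspondence information can be chained to identify the translated part is sketched below. This is only one possible reading, assuming each kind of correspondence information is held as a simple mapping; the function name, the node identification numbers, and the romanized partial strings are hypothetical.

```python
def identify_translated_part(interpretation, translation, generation):
    """Return the source partial strings that reach the target sentence.

    interpretation: source partial string -> source node id
    translation:    source node id -> target node id
    generation:     target node id -> target partial string
    """
    # Target nodes that were actually realized in the target language sentence.
    realized_target_ids = set(generation)
    # Source nodes whose translation was realized.
    translated_source_ids = {
        source_id
        for source_id, target_id in translation.items()
        if target_id in realized_target_ids
    }
    # Source partial strings attached to those source nodes, in input order.
    return [
        text
        for text, source_id in interpretation.items()
        if source_id in translated_source_ids
    ]


# Placeholder data: node 2 stands for a part that was deleted as ambiguous
# before translation, so it has no entry in the later mappings.
interpretation = {"ASU": 1, "KURUMADE": 2, "MATSU": 3}
translation = {1: 11, 3: 13}
generation = {11: "tomorrow", 13: "wait"}

translated = identify_translated_part(interpretation, translation, generation)
```

In this sketch the partial string whose node has no counterpart downstream is simply absent from the result, which is the information the screen display needs in order to show the user what was and was not translated.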
Next, the communication support processing by the communication support apparatus 100 according to the first embodiment configured as described above is explained.
The source language speech recognizing unit 101 first receives the input of a speech in the source language uttered by a user (step S201), performs the speech recognition processing for the received speech in the source language, and outputs a source language sentence (step S202).
Next, the source language analyzing unit 102 analyzes the source language sentence output by the source language speech recognizing unit 101, and outputs a source language interpretation candidate, and at the same time, outputs the interpretation correspondence information to the correspondence information storing unit 110 (step S203). More specifically, general natural language analysis processing such as morphological analysis, syntactic analysis, semantic analysis, context analysis and the like is executed and a source language interpretation candidate with relations between the respective morphemes represented by a tree-structure graph is output.
Suppose, for example, a speech in Japanese which is pronounced as “ASUKURUMADEMATSU” and which, when translated into English, is interpreted in two ways of “I will wait until you come tomorrow” and “I will wait in the car tomorrow” is recognized, and as a result, a Japanese sentence 401 shown in
Here, each node is expressed in a format of “<concept label>@<node identification number>.” The concept label includes a label indicating an “object” or an “event” mainly corresponding to a noun such as, for example, “tomorrow” or “car,” a label indicating an “action” or a “phenomenon” mainly corresponding to a verb such as, for example, “wait” and “buy,” and a label indicating an “intention” or a “state” mainly corresponding to an auxiliary verb such as, for example, “ask,” “hope,” and “impracticable.” Furthermore, the node identification number is a number for uniquely identifying each node.
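The node notation can be read mechanically. The following is a minimal sketch, assuming the node identification number is written as a decimal integer and the concept label contains no "@"; the function name parse_node is hypothetical.

```python
import re

# "<concept label>@<node identification number>"
NODE_PATTERN = re.compile(r"^(?P<concept>[^@]+)@(?P<node_id>\d+)$")


def parse_node(text):
    """Split a node written as 'LABEL@N' into (concept label, node id)."""
    match = NODE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a node label: {text!r}")
    return match.group("concept"), int(match.group("node_id"))


concept, node_id = parse_node("WAIT@3")
```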
After the source language interpretation candidates are output at step S203, the ambiguous part exclusion processing is executed in which an ambiguous part is deleted from the plurality of source language interpretation candidates to output one source language interpretation candidate (step S204). Hereinafter, the detail of the ambiguous part exclusion processing is described.
When a plurality of candidates exist (step S301: Yes), the ambiguous part detecting unit 106 detects a difference between the plurality of candidates as an ambiguous part (step S302). For example, in the example (Japanese sentence 401), the Japanese language 408 is detected as the ambiguous part.
Next, the ambiguous part deleting unit 107 deletes the ambiguous part detected by the ambiguous part detecting unit 106 to thereby integrate the plurality of candidates into one candidate and output it (step S303), and the ambiguous part exclusion processing is finished. For example, in the example (the Japanese sentence 401), a candidate having a Japanese language 411 and a Japanese language 412 as two nodes of a tree-structure graph, with the Japanese language 408 deleted is output as the source language interpretation candidate.
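Steps S301 to S303 can be sketched as follows. This is a deliberately simplified illustration: each candidate is reduced to an ordered list of concept labels and the arcs of the tree-structure graph are ignored. The function name and the concept labels standing in for the nodes of the two interpretations are assumptions made for this example.

```python
def exclude_ambiguous_part(candidates):
    """Return (integrated candidate, ambiguous part) for lists of node labels."""
    if len(candidates) < 2:
        # Step S301: No. A single candidate has nothing to exclude.
        return (list(candidates[0]) if candidates else []), []

    # Nodes shared by every candidate.
    shared = set(candidates[0]).intersection(*candidates[1:])

    # Step S302: every node that is not shared is part of the ambiguous part.
    ambiguous = []
    for candidate in candidates:
        for node in candidate:
            if node not in shared and node not in ambiguous:
                ambiguous.append(node)

    # Step S303: delete the ambiguous part, leaving one integrated candidate.
    integrated = [node for node in candidates[0] if node in shared]
    return integrated, ambiguous


# Hypothetical node labels for the two readings of the example utterance.
until_you_come = ["TOMORROW", "COME", "WAIT"]
in_the_car = ["TOMORROW", "CAR", "WAIT"]

integrated, ambiguous = exclude_ambiguous_part([until_you_come, in_the_car])
```

A fuller version would compare the tree-structure graphs themselves, since two candidates can contain the same nodes connected by different arcs.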
After the ambiguous part exclusion processing for the source language interpretation candidate at step S204 is finished, the translation unit 103 translates the source language interpretation candidate with the ambiguous part excluded and outputs a target language interpretation candidate, and at the same time, outputs the translation correspondence information to the correspondence information storing unit 110 (step S205). For example, for the source language interpretation candidate having the Japanese language 411 and the Japanese language 412 as two nodes of the tree-structure graph, the target language interpretation candidate having “TOMORROW” and “WAIT” as two nodes of a tree-structure graph is output.
Next, the ambiguous part exclusion processing for the target language interpretation candidate is executed (step S206). The only difference from the processing described above is that the ambiguous part exclusion processing is executed for the target language interpretation candidate instead of the source language interpretation candidate; the processing content is otherwise the same, and the description thereof is therefore not repeated. In the example, there exists no ambiguity in the target language interpretation candidate, so that the deletion processing of the ambiguous part is not executed and the ambiguous part exclusion processing is finished (Step S301: No).
After the ambiguous part exclusion processing for the target language interpretation candidate is executed (step S206), the target language generating unit 104 generates a target language sentence from the target language interpretation candidate, and at the same time, outputs the generation correspondence information to the correspondence information storing unit 110 (step S207). For example, a target language sentence “I will wait, tomorrow” is generated from the target language interpretation candidate having “TOMORROW” and “WAIT” as the two nodes of the tree-structure graph.
In this manner, in reference to knowledge of grammar and vocabulary of English which is the target language, the target language generating unit 104 arranges the style as English and complements a subject and the like which are omitted in the original Japanese text as the source language as necessary to thereby output an English surface text presenting the content of the target language interpretation candidate as the target language sentence.
Next, the translated-part presenting unit 108 acquires a translated part corresponding to the target language sentence generated by the target language generating unit 104 by sequentially referring to the interpretation correspondence information, the translation correspondence information, and the generation correspondence information which are stored in the correspondence information storing unit 110, and presents it to the user by screen display (step S208). It is intended to allow the user to easily understand which part of the partial strings included in the source language sentence has been translated and output as the target language sentence. The configuration in this manner allows the user to understand which part has been deleted by the translation, and to complement it in the conversation after that, etc. so that the support for the communication can be effectively executed. An example of the screen display for the presentation of the translated part (translated-part display screen) is described later.
Next, the target language speech synthesizing unit 105 synthesizes a speech in the target language from the target language sentence and outputs it (step S209), and the communication support processing is finished. When the user, having viewed the screen display, determines not to execute the speech output, the speech synthesis processing by the target language speech synthesizing unit 105 may be skipped and the processing may return to the speech recognition processing to receive the input again.
Furthermore, while the ambiguous part exclusion processing is executed here only for the source language interpretation candidate and the target language interpretation candidate, a configuration may be employed in which, when a plurality of source language sentences output by the source language speech recognizing unit 101 exist and when a plurality of target language sentence candidates output by the target language generating unit 104 exist, the ambiguous part exclusion processing is executed in a manner similar to the foregoing. In this case, a configuration in which the ambiguous part exclusion processing is executed with the output results by the source language speech recognizing unit 101 expressed in lattice or the like may be employed. That is, the ambiguous part exclusion processing can be applied to any processing, as long as a plurality of processing results are output in a processing course and a different part between them can be detected as an ambiguous part.
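For sentence candidates, which are strings and not tree-structure graphs, the same idea can be sketched with a sequence comparison. The example below assumes each candidate is already split into a list of words and uses Python's standard difflib module to keep only the words common to all candidates. It does not model a recognition lattice, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from functools import reduce


def common_tokens(first, second):
    """Keep the tokens of `first` that lie in blocks matching `second`."""
    matcher = SequenceMatcher(a=first, b=second, autojunk=False)
    kept = []
    for block in matcher.get_matching_blocks():
        kept.extend(first[block.a:block.a + block.size])
    return kept


def integrate_sentence_candidates(candidates):
    """Fold all candidates into one token list with the differing part removed."""
    if not candidates:
        return []
    return reduce(common_tokens, candidates)


candidates = [
    "I will wait until you come tomorrow".split(),
    "I will wait in the car tomorrow".split(),
]
integrated_sentence = integrate_sentence_candidates(candidates)
```

Because the comparison is purely positional, the result is only a rough stand-in for a difference detected on a lattice; it cannot tell whether the remaining words still form a grammatical sentence.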
Next, specific examples of the communication support processing in the communication support apparatus 100 according to the first embodiment are described.
The source language interpretation candidates are represented by tree-structure graphs as described above, and each node of the tree-structure graphs is represented in a format of <concept label>@<identification number>. Furthermore, an arc connecting the respective nodes of the tree-structure graph of the interpretation candidate indicates a semantic relation between the respective nodes, and is represented in a format of “$<relation label>$.” The relation label includes semantic relations such as $TIME$ (time), $LOCATION$ (location), $UNTIL$ (temporally sequential relation), $BACKGROUND$ (background), $OBJECT$ (object), $ACTION$ (action), $REASON$ (reason), $TYPE$ (type) and the like, for example. The relation label is not limited to these, and any relation that indicates a semantic relation between the nodes can be included.
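The node and arc notation described above can be illustrated with a small parser. This is a minimal sketch, assuming that identification numbers are decimal digits and that relation labels are upper-case letters; the type names `Node` and `Arc` and the functions `parse_node` and `parse_relation` are hypothetical and not part of the apparatus.

```python
import re
from typing import NamedTuple


class Node(NamedTuple):
    concept: str  # concept label, e.g. "COFFEE"
    ident: int    # node identification number


class Arc(NamedTuple):
    parent: Node
    relation: str  # relation label without the surrounding "$"
    child: Node


NODE_RE = re.compile(r"^(?P<concept>[^@]+)@(?P<ident>\d+)$")
RELATION_RE = re.compile(r"^\$(?P<relation>[A-Z]+)\$$")


def parse_node(text: str) -> Node:
    """Parse "<concept label>@<identification number>" into a Node."""
    match = NODE_RE.match(text)
    if match is None:
        raise ValueError(f"not a node label: {text!r}")
    return Node(match["concept"], int(match["ident"]))


def parse_relation(text: str) -> str:
    """Parse "$<relation label>$" and return the bare relation label."""
    match = RELATION_RE.match(text)
    if match is None:
        raise ValueError(f"not a relation label: {text!r}")
    return match["relation"]


# Hypothetical arc: the identification numbers are invented for illustration.
arc = Arc(parse_node("BUY@2"), parse_relation("$OBJECT$"), parse_node("COFFEE@3"))
```

A label with a missing closing "$", such as "$REASON", is rejected by `parse_relation` with a `ValueError`.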
In
The example of T2a and T2b is an example in which a plurality of interpretations arise in semantic analysis or context analysis which analyzes the semantic relation between the nodes and the speech intention.
The example of T3a and T3b is an example in which a plurality of interpretations arise in semantic analysis.
Each of the target language interpretation candidates is a tree-structure graph, similar to the source language interpretation candidate, and each node indicates a concept in the target language, being represented in the form of “<concept label>@<node identification number>.” The notation and meaning of each arc of the target language interpretation candidate are similar to the notation and meaning of each arc in the source language interpretation candidate.
In the examples shown in
Furthermore, U2a indicates that there exists a background of “WANT” ($BACKGROUND$) to an action of “BUY” ($ACTION$) to an object of “COFFEE” ($OBJECT$), and that an action of “EXCHANGE” ($ACTION$) is in an impracticable state (CANNOT). On the other hand, U2b indicates having the intention of “REQUEST” to the action of “EXCHANGE” ($ACTION$) for the reason of “WANT” ($REASON$) to the action of “BUY” ($ACTION$) to the object of “COFFEE” ($OBJECT$).
Furthermore, U3a indicates having the intention of “REQUEST” to the object of “ROOM” as a target, whose price is “EXPENSIVE” ($PRICE$) and whose type is “OCEANVIEW” ($TYPE$). On the other hand, U3b indicates having the intention of “REQUEST” for the object of “ROOM” ($OBJECT$) as a target, whose location is “UPPERFLOOR” ($LOCATION$) and whose type is “OCEANVIEW” ($TYPE$).
The respective nodes of each of the target language interpretation candidates are translations of the concepts of the source language of the nodes corresponding to the relevant source language interpretation candidate into concepts of the target language. In the examples shown in
For example, when the source language sentence S1 is input, the source language interpretation candidates T1a and T1b are output by the source language analyzing unit 102, and through the detection of the ambiguous part by the ambiguous part detecting unit 106 and the deletion of the ambiguous part by the ambiguous part deleting unit 107, the source language interpretation candidate X1 with ambiguous part excluded is output.
Furthermore, the translation unit 103 executes the translation processing for the source language interpretation candidate X1 with the ambiguous part excluded, and outputs the target language interpretation candidate U1 with the ambiguous part excluded. Finally, the target language generating unit 104 executes the target language generation processing for the target language interpretation candidate U1 with the ambiguous part excluded, and outputs the target language sentence Z1 with the ambiguous part excluded.
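The detection and deletion of the ambiguous part can be sketched with set operations. The sketch below assumes, purely for illustration, that each interpretation candidate is flattened into a set of (parent, relation, child) triples and that node identification numbers are omitted; the sample triples are modeled on the description of U3a and U3b above, and the function names are hypothetical. It does not check that the remaining arcs still form a connected tree.

```python
def detect_ambiguous_part(candidates):
    """Split candidate arcs into the part shared by all candidates and the differing part."""
    common = set.intersection(*candidates)
    ambiguous = set.union(*candidates) - common
    return common, ambiguous


def delete_ambiguous_part(candidates):
    """Integrate the candidates into one by keeping only the arcs all of them share."""
    common, _ = detect_ambiguous_part(candidates)
    return common


# Candidates modeled on U3a (price is EXPENSIVE) and U3b (location is UPPERFLOOR).
u3a = {
    ("REQUEST", "OBJECT", "ROOM"),
    ("ROOM", "PRICE", "EXPENSIVE"),
    ("ROOM", "TYPE", "OCEANVIEW"),
}
u3b = {
    ("REQUEST", "OBJECT", "ROOM"),
    ("ROOM", "LOCATION", "UPPERFLOOR"),
    ("ROOM", "TYPE", "OCEANVIEW"),
}

common, ambiguous = detect_ambiguous_part([u3a, u3b])
```

Here the two arcs that differ between the candidates form the ambiguous part, and the integrated candidate keeps only the request for an ocean-view room.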
Since the correspondence information storing unit 110 stores the correspondence information between the respective pieces of data as shown by arrow in
For example, the screen example in
Similarly, in the screen example in
In this manner, the translated-part presenting unit 108 displays the translated part on the screen, which allows the user to confirm, in Japanese which is the source language, what translation result has finally been communicated to the other partner.
In the related art, for example, when it is ambiguous whether the price is high or the floor is high as shown in the screen example of
While the first embodiment describes, as a method for machine translation, the commonly used transfer method composed of three courses, namely the analysis of the source language sentence, the conversion (translation) into the target language, and the generation of the target language sentence, the present invention can be applied to any method for machine translation, such as example-based machine translation, statistics-based machine translation, and interlingua-based machine translation, as long as ambiguity arises in the results output in the respective processing courses.
Furthermore, while in the first embodiment, the example in which the input of the source language sentence by the speech recognition and the output of the target language by the speech synthesis processing are executed is shown, a configuration in which the input of the source language sentence by pen-based input and the output of the target language by the screen display are executed may be employed. The input of the source language sentence and the output of the target language sentence are not limited to these, but any commonly used method can be applied.
As described above, in the communication support apparatus according to the first embodiment, when a plurality of processing result candidates are obtained in the speech recognition processing, the source language analysis processing, the translation processing, or the target language generation processing, by detecting and deleting a different part between the respective candidates as the ambiguous part, the ambiguity of the target language sentence output finally is deleted without the user's special operation, so that a proper target language sentence including no error can be obtained.
In a communication support apparatus according to a second embodiment, when a plurality of processing result candidates are obtained in the speech recognition processing, the source language analysis processing, the translation processing, or the target language generation processing, a different part between the respective candidates is detected as the ambiguous part and when there exists a superordinate concept of the semantic content of the ambiguous part, the ambiguous part is replaced by the superordinate concept to thereby exclude the ambiguity of the target language sentence output finally.
In the second embodiment, the addition of the concept replacing unit 1209 and the concept hierarchy storing unit 1220 is different from the first embodiment. Since the other configurations and functions are similar to those of
The concept replacing unit 1209 retrieves a superordinate concept of the semantic content of an ambiguous part detected by the ambiguous part detecting unit 106 and, when the superordinate concept can be retrieved, replaces the ambiguous part with the retrieved superordinate concept.
The concept hierarchy storing unit 1220 is a storing unit in which a hierarchy relation between the concepts is stored in advance, and can be composed of any commonly used storage such as an HDD, an optical disk and a memory card. The concept hierarchy storing unit 1220 is utilized for searching for the superordinate concept of the semantic content indicated by the ambiguous part.
For example, in
Next, the communication support processing by the communication support apparatus 1200 configured as described above, according to the second embodiment is explained. In the second embodiment, although the detail of the ambiguous part exclusion processing is different from that of the first embodiment, the other processing is similar to that of the communication support processing shown in
After the ambiguous part detecting unit 106 detects an ambiguous part (step S1402), the concept replacing unit 1209 retrieves a superordinate concept of the ambiguous part from the concept hierarchy storing unit 1220 (step S1403). More specifically, the concept replacing unit 1209 detects a superordinate concept in the lowest tier containing a plurality of concepts included in the ambiguous part, referring to the concept hierarchy storing unit 1220.
For example, on the premise of the data example of the concept hierarchy storing unit 1220 shown in
In order to avoid excessive abstraction, a configuration in which the limitation is imposed on the superordinate concept to be retrieved may be employed. For example, the configuration may be such that, when the number of arcs between the nodes representing the respective concepts is larger than the preset number, the superordinate concept is not retrieved. Furthermore, the configuration may be such that points are added according to a difference in hierarchy from the superordinate concept, and that when the points become larger than a preset value, the superordinate concept is not retrieved.
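The retrieval of the lowest-tier superordinate concept, together with a limit on the number of arcs, can be sketched as follows. This is an illustrative sketch only: the hierarchy is assumed to be stored as a child-to-parent dictionary, the limit is interpreted as the number of arcs from each ambiguous concept up to the superordinate concept, and the concept names and the function names are hypothetical rather than taken from the concept hierarchy storing unit 1220.

```python
def ancestors_with_distance(concept, parent_of):
    """Return {concept or ancestor: number of arcs from `concept`}, starting with itself at 0."""
    distances = {}
    current, steps = concept, 0
    while current is not None and current not in distances:
        distances[current] = steps
        current = parent_of.get(current)
        steps += 1
    return distances


def lowest_superordinate(concepts, parent_of, max_arcs=None):
    """Find the lowest concept that contains every given concept, or None.

    When `max_arcs` is given, None is returned if any concept lies more than
    `max_arcs` arcs below the shared concept (a guard against excessive abstraction).
    """
    tables = [ancestors_with_distance(c, parent_of) for c in concepts]
    shared = set(tables[0]).intersection(*tables[1:])
    if not shared:
        return None
    best = min(shared, key=lambda c: sum(t[c] for t in tables))
    if max_arcs is not None and any(t[best] > max_arcs for t in tables):
        return None
    return best


# Hypothetical hierarchy, invented for illustration.
PARENT_OF = {"APPLE": "FRUIT", "ORANGE": "FRUIT", "FRUIT": "FOOD", "BREAD": "FOOD"}
```

With this toy hierarchy, `lowest_superordinate(["APPLE", "ORANGE"], PARENT_OF)` yields "FRUIT", while `lowest_superordinate(["APPLE", "BREAD"], PARENT_OF, max_arcs=1)` yields `None` because "APPLE" lies two arcs below "FOOD".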
Next, the concept replacing unit 1209 determines whether or not the superordinate concept is retrieved (step S1404). When it is retrieved (step S1404: YES), the concept replacing unit 1209 replaces the ambiguous part by the retrieved superordinate concept to thereby integrate the plurality of candidates into one candidate (step S1405), and the ambiguous part exclusion processing is finished.
When the superordinate concept is not retrieved (step S1404: NO), the ambiguous part deleting unit 107 deletes the ambiguous part to thereby integrate the plurality of candidates into one candidate (step S1406) and the ambiguous part exclusion processing is finished.
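The branch between replacement (step S1405) and deletion (step S1406) can be sketched as below. As a simplifying assumption, each candidate is modeled as a list of concept labels of equal length, so that candidates are compared position by position; the lookup function stands in for the retrieval from the concept hierarchy storing unit 1220, and all names and sample concepts are hypothetical.

```python
def exclude_ambiguity(candidates, find_superordinate):
    """Integrate candidates into one, replacing or deleting the positions that differ."""
    merged = []
    for concepts in zip(*candidates):
        distinct = set(concepts)
        if len(distinct) == 1:
            # Not ambiguous: all candidates agree at this position.
            merged.append(concepts[0])
            continue
        superordinate = find_superordinate(distinct)
        if superordinate is not None:
            # Corresponds to step S1405: replace by the superordinate concept.
            merged.append(superordinate)
        # Otherwise corresponds to step S1406: the ambiguous part is deleted.
    return merged


# Hypothetical lookup table, invented for illustration.
TOY_SUPERORDINATES = {frozenset({"APPLE", "ORANGE"}): "FRUIT"}


def toy_lookup(concepts):
    """Return a superordinate concept for the given set of concepts, or None."""
    return TOY_SUPERORDINATES.get(frozenset(concepts))


replaced = exclude_ambiguity([["WANT", "APPLE"], ["WANT", "ORANGE"]], toy_lookup)
deleted = exclude_ambiguity([["WANT", "APPLE"], ["WANT", "CAR"]], toy_lookup)
```

In the first call the differing concepts share a superordinate concept and are replaced by it; in the second call no superordinate concept is found, so the differing position is dropped, as in the first embodiment.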
In this manner, in the communication support apparatus 1200 according to the second embodiment, when the ambiguous part exists and when the superordinate concept of the ambiguous part exists, the ambiguous part can be replaced by the superordinate concept instead of simply being deleted. This reduces the possibility that, because of the deletion of the ambiguous part, the intention of the user is not sufficiently communicated.
Next, specific examples of the communication support processing in the communication support apparatus 1200 according to the second embodiment are described.
In the example shown in
In this example, the plurality of target language interpretation candidates U4a and U4b are output from the one source language interpretation candidate T4. This is because for the node to be identified with the node identification number 627 in T4, a plurality of nodes “BARRIER@727” and “GATE@730” are obtained as the translation candidates.
For example, when the source language sentence S4 is input, the source language interpretation candidate T4 is output by the source language analyzing unit 102. In this example, since no ambiguity exists in the source language interpretation candidate, T4 corresponds to the source language interpretation candidate with the ambiguous part excluded.
Furthermore, the translation unit 103 executes the translation processing for the source language interpretation candidate T4 with the ambiguous part excluded, and outputs the target language interpretation candidates U4a and U4b. For these candidates, the detection of the ambiguous part by the ambiguous part detecting unit 106 and the replacement by the superordinate concept by the concept replacing unit 1209 are performed and the target language interpretation candidate Y4 with the ambiguous part excluded is output. Finally, the target language generating unit 104 executes the target language generation processing for the target language interpretation candidate Y4 with the ambiguous part excluded and outputs the target language sentence Z4 with the ambiguous part excluded.
In this manner, in the second embodiment, since the ambiguous part can be replaced by the superordinate concept without deleting the ambiguous part, the translation result including no ambiguous part and matching the intention of the user can be communicated to the other partner.
As described above, in the communication support apparatus according to the second embodiment, when a plurality of the processing result candidates are obtained in the speech recognition processing, the source language analysis processing, the translation processing or the target language generation processing, a different part between the respective candidates is detected as the ambiguous part and, when a superordinate concept of the detected ambiguous part exists, the ambiguous part can be replaced by the superordinate concept. Furthermore, when no superordinate concept exists, the ambiguous part is deleted as in the first embodiment. This allows the ambiguity of the target language sentence output finally to be excluded, so that a proper target language sentence including no error can be obtained.
While the present invention has been described in the first and second embodiments using communication devices that utilize the source language analysis, the language translation and the target language generation, the technique of the present proposal can also be applied when, for example, pairs of source language and target language sentences semantically equivalent to each other are stored in a storage (parallel translation pair storage) as parallel translation pairs, and the communication support is realized by selecting a target language sentence candidate from the parallel translation pairs.
A communication support program executed in the communication support apparatus according to the first or second embodiment is provided by being incorporated into a ROM (Read Only Memory) or the like in advance.
A configuration may be employed in which the communication support program executed in the communication support apparatus according to the first or second embodiment is provided by being recorded as a file in an installable format or executable format on a computer-readable recording medium such as a CD-ROM (Compact Disk Read Only Memory), a flexible disk (FD), a CD-R (Compact Disk Recordable), and a DVD (Digital Versatile Disk).
Furthermore, a configuration may be employed in which the communication support program executed in the communication support apparatus according to the first or second embodiment is provided by being stored on a computer connected to a network such as the Internet, and being downloaded via the network. Furthermore, a configuration may be employed in which the communication support program executed in the communication support apparatus according to the first or second embodiment is provided or delivered via a network such as the Internet.
The communication support program executed in the communication support apparatus according to the first or second embodiment has a module configuration including the units (the source language speech recognizing unit, the source language analyzing unit, the translation unit, the target language generating unit, the target language speech synthesizing unit, the ambiguous part detecting unit, the ambiguous part deleting unit, the translated-part presenting unit and the concept replacing unit), and as actual hardware, a CPU (Central Processing Unit) reads the communication support program from the ROM to execute, and thereby the units are loaded on a main storage and generated on the main storage.
Additional advantages and modifications will readily occur to those skilled in the art. Therefore, the invention in its broader aspects is not limited to the specific details and representative embodiments shown and described herein. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
2005-100032 | Mar 2005 | JP | national |