This application is related to and claims priority to Japanese Patent Application No. 2010-209936 filed on Sep. 17, 2010, the entire contents of which are herein incorporated by reference.
1. Field
The present invention relates to a terminal apparatus and a speech processing program.
2. Description of the Related Art
In one example of the related art, when a telephone call is conducted using a terminal apparatus such as a mobile phone or a smartphone, a user is informed of the direction of the person with whom he/she is conducting the telephone call: the direction of that person is calculated, and speech during the telephone call is processed in accordance with the calculated direction. This related art will be described hereinafter with reference to the drawings.
In the related art, first, the positional information of a terminal apparatus used by “U2” is obtained, and then the positional relationship between a terminal apparatus used by “U1” and the terminal apparatus used by “U2” is obtained. It is to be noted that the positional information is obtained by, for example, using a Global Positioning System (GPS) or the like.
Next, in the related art, the relative angle θ between the terminal direction of the terminal apparatus used by “U1” and the direction from the position of the terminal apparatus used by “U1” to the position of the terminal apparatus used by “U2” is calculated.
Next, in the related art, output speech is generated in accordance with the calculated relative angle θ, so that the user “U1” can perceive the direction of “U2”, the person with whom he/she is conducting the telephone call.
Japanese Unexamined Patent Application Publication No. 2008-184621 and Japanese Unexamined Patent Application Publication No. 2005-341092 are documents relating to the above description.
It is an aspect of the embodiments discussed herein to provide a terminal apparatus which obtains positional information indicating a position of another apparatus, and which obtains positional information indicating a position of the terminal apparatus.
The terminal apparatus is operable to obtain a first direction, which is a direction toward the obtained position of the other apparatus, calculated using the obtained position of the terminal apparatus as a reference point.
The terminal apparatus is operable to obtain a second direction, which is a direction in which the terminal apparatus is oriented.
The terminal apparatus is operable to obtain, using a sensor that detects a direction in which the terminal apparatus is inclined, inclination information indicating whether the terminal apparatus is inclined to the right or to the left.
The terminal apparatus is operable to switch an amount of correction for a relative angle between the first direction and the second direction in accordance with whether the obtained inclination information indicates an inclination to the right or an inclination to the left.
The terminal apparatus is operable to determine an attribute of speech output from a speech output unit in accordance with the relative angle corrected by the amount of correction.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
In the above-described related art, the direction of the person with whom the user is conducting a telephone call is expressed by a difference between the speech output for the left ear and the speech output for the right ear. Therefore, when the user is conducting a telephone call while using either of his/her ears, it is difficult for the user to perceive the direction of the person with whom he/she is conducting the telephone call.
In addition, in the above-described related art, the relative angle θ is obtained on the assumption that the front direction of a user and the terminal direction are the same, and then output speech is generated in accordance with the relative angle θ. Therefore, in a situation in which the front direction of a user and the terminal direction are not the same, it is impossible for the user to exactly perceive the direction of a person with whom he/she is conducting a telephone call. It is to be noted that the front direction of a user and the terminal direction are not the same in most cases when the user conducts a telephone call while using either of his/her ears. Therefore, unless the above-mentioned relative angle θ is accurately calculated even in a situation in which the front direction of a user and the terminal direction are not the same, it is impossible for the user who is conducting a telephone call while using either of his/her ears to exactly perceive the direction of a person with whom he/she is conducting the telephone call.
In an embodiment of the technology that will be described hereinafter, therefore, a terminal apparatus and a speech processing program are provided that are capable of allowing a user who is conducting a telephone call with a person while using either of his/her ears to accurately perceive the direction of the person with whom he/she is conducting the telephone call.
An embodiment of a terminal apparatus that will be disclosed hereinafter includes an another-terminal position obtaining unit, an own-terminal position obtaining unit, a first direction obtaining unit, a second direction obtaining unit, an inclination direction obtaining unit, a correction unit, and a determination unit. The another-terminal position obtaining unit obtains positional information indicating a position of another apparatus. The own-terminal position obtaining unit obtains positional information indicating a position of the terminal apparatus. The first direction obtaining unit obtains a first direction, which is a direction toward the obtained position of the other apparatus, calculated using the obtained position of the terminal apparatus as a reference point. The second direction obtaining unit obtains a second direction, which is a direction in which the terminal apparatus is oriented. The inclination direction obtaining unit obtains, using a sensor that detects a direction in which the terminal apparatus is inclined, inclination information indicating whether the terminal apparatus is inclined to the right or to the left. The correction unit switches an amount of correction for a relative angle between the first direction and the second direction in accordance with whether the obtained inclination information indicates an inclination to the right or an inclination to the left. The determination unit determines an attribute of speech output from a speech output unit in accordance with the relative angle corrected by the amount of correction.
According to an embodiment that will be described hereinafter, it is possible to allow a user who is conducting a telephone call with a person while using either of his/her ears to exactly perceive the direction of the person with whom he/she is conducting the telephone call.
An embodiment of the terminal apparatus and the speech processing program disclosed herein will be described hereinafter in detail with reference to the drawings. It is to be understood that the technology disclosed herein is not limited by this embodiment.
The terminal apparatus 100 according to a first embodiment will be described first.
The position obtaining unit 120 obtains the positional information of the terminal apparatus 100. The position obtaining unit 120 obtains the position of the terminal apparatus 100 on a plane rectangular coordinate system on the basis of, for example, the latitude and the longitude obtained by using a GPS or the like. The position of the terminal apparatus 100 will be represented as, for example, “sender_pos (x_sender, y_sender)” hereinafter. It is to be noted that the position of the terminal apparatus 100 on a plane rectangular coordinate system can be obtained by using existing technologies on the basis of the latitude and the longitude. An example of the existing technologies is disclosed in, for example, B. R. Bowring “TOTAL INVERSE SOLUTIONS FOR THE GEODESIC AND GREAT ELLIPTIC” (Survey Review 33, 261 (July, 1996) 461-476, URL “http://vldb.gsi.go.jp/sokuchi/surveycalc/algorithm/” (searched on Sep. 1, 2010)). In addition, another example of the existing technologies is disclosed in a URL “http://vldb.gsi.go.jp/sokuchi/surveycalc/algorithm/bl2xy/bl2xy.htm” (searched on Sep. 1, 2010). The position transmission unit 130 transmits, to the terminal apparatus 200, the positional information of the terminal apparatus 100 obtained by the position obtaining unit 120.
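For illustration, a rough sketch of such a conversion is given below in Python. It uses a simple local-plane (equirectangular) approximation rather than the geodesic formulas of the documents cited above, and the function name latlon_to_plane, the chosen origin, and the sample coordinates are hypothetical.

    import math

    EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius, in meters

    def latlon_to_plane(lat_deg, lon_deg, origin_lat_deg, origin_lon_deg):
        # Project a latitude/longitude onto a local plane rectangular
        # coordinate system centered on the origin (x: east, y: north).
        # This is accurate only over short distances; a geodesic method
        # such as the ones cited above would be used in practice.
        lat0 = math.radians(origin_lat_deg)
        x = EARTH_RADIUS_M * math.radians(lon_deg - origin_lon_deg) * math.cos(lat0)
        y = EARTH_RADIUS_M * math.radians(lat_deg - origin_lat_deg)
        return (x, y)

    # sender_pos (x_sender, y_sender) for a hypothetical position
    x_sender, y_sender = latlon_to_plane(35.0005, 135.0007, 35.0, 135.0)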
The terminal apparatus 200 includes a microphone 201, a speaker 202, an encoder 210, a position obtaining unit 220, a position transmission unit 230, a position reception unit 240, a direction obtaining unit 250, a calculation unit 260, a decoder 270, a detection unit 280A, a judgment unit 280B, a correction unit 280C, a generation unit 280D, a mixing unit 280E, and a processing unit 290.
The microphone 201, the encoder 210, the position obtaining unit 220, and the position transmission unit 230 have the same functions as the corresponding units of the terminal apparatus 100 described above, and detailed description thereof is therefore omitted.
The position reception unit 240 receives the positional information of the terminal apparatus 100 transmitted from the position transmission unit 130.
The direction obtaining unit 250 obtains the terminal direction D3, which is the direction in which the terminal apparatus 200 is oriented, and the angle between the terminal direction D3 and the north direction.
The calculation unit 260 calculates the relative angle between the terminal direction of the terminal apparatus 200 and the direction from the position of the terminal apparatus 200 to the position of the terminal apparatus 100.
First, the calculation unit 260 obtains the positional information (x_receiver, y_receiver) of the terminal apparatus 200 from the position obtaining unit 220 and the positional information (x_sender, y_sender) of the terminal apparatus 100 from the position reception unit 240. The calculation unit 260 then calculates the angle “ang2 (sender_angle)” of the direction from the position of the terminal apparatus 200 to the position of the terminal apparatus 100.
The calculation unit 260 then obtains the angle “ang1 (receiver_angle)” between the terminal direction D3 and the north direction from the direction obtaining unit 250, and calculates the relative angle “ang3 (relative_angle1)” by using the following expression (2).
relative_angle1=receiver_angle+sender_angle (2)
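How the calculation unit 260 might compute expression (2) is sketched below. The sign conventions (angles measured clockwise from north, that is, from the positive y-axis) and the normalization of the result are assumptions, since the figures that define them are not reproduced here.

    import math

    def calc_relative_angle1(x_sender, y_sender, x_receiver, y_receiver,
                             receiver_angle):
        # Angle of the direction from the terminal apparatus 200 to the
        # terminal apparatus 100, measured from north (the y-axis).
        sender_angle = math.degrees(
            math.atan2(x_sender - x_receiver, y_sender - y_receiver))
        # Expression (2): relative_angle1 = receiver_angle + sender_angle.
        angle = receiver_angle + sender_angle
        # Wrap into [-180, 180) so that 0 means straight ahead.
        return (angle + 180.0) % 360.0 - 180.0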
The decoder 270 decodes speech received from the terminal apparatus 100.
The detection unit 280A detects the telephone call state of the user on the basis of the number of channels of output speech. For example, if the number of channels is 1, the detection unit 280A judges that the user is in a telephone call state in which he/she is using either of his/her ears, and sets, for a certain flag, a certain value (receiver_flag=1) indicating that the user is in a telephone call state in which he/she is using either of his/her ears.
In addition, if the number of channels is 2, the detection unit 280A judges that the user is in a telephone call state other than a telephone call state in which he/she is using either of his/her ears, that is, a telephone call state in which, for example, he/she is using headphones, earphones, or the like. When the detection unit 280A has judged that, for example, the user is in a telephone call state in which he/she is using headphones, earphones, or the like, the detection unit 280A then sets, for a certain flag, a certain value (receiver_flag=2) indicating that the user is in a telephone call state in which he/she is using headphones, earphones, or the like. It is to be noted that the detection unit 280A is an example of a telephone call state judgment unit.
It is to be noted that the detection unit 280A may be configured to judge whether or not a user who is beginning to conduct a telephone call is in a telephone call state in which he/she is using either of his/her ears by referring to, for example, a register in which information regarding the number of output signals of speech or the output state of speech such as monaural, stereo, or the like is stored.
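A minimal sketch of this judgment, assuming the number of output channels is available as an integer:

    def detect_call_state(num_channels):
        # receiver_flag = 1: one-ear (monaural) telephone call state.
        # receiver_flag = 2: headphones, earphones, or the like (stereo).
        return 1 if num_channels == 1 else 2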
When the user is in a telephone call state in which he/she is using either of his/her ears, the judgment unit 280B judges whether the user is conducting the telephone call while using his/her right ear or his/her left ear.
For example, the judgment unit 280B obtains the value of a flag set by the detection unit 280A and judges whether or not the obtained value of a flag is a value (receiver_flag=1) indicating that the user is in a telephone call state in which he/she is using either of his/her ears. If it has been judged that the user is in a telephone call state in which he/she is using either of his/her ears, the judgment unit 280B obtains the acceleration of the terminal apparatus 200 from an acceleration sensor.
The judgment unit 280B then judges whether the user is conducting a telephone call while using his/her right or left ear on the basis of the obtained acceleration. For example, if the acceleration along the x-axis of the terminal apparatus 200 has a positive value, the judgment unit 280B judges that the user is conducting the telephone call while using his/her left ear, and sets, for a certain flag, a certain value (hold_flag=1) indicating a telephone call in which the left ear is used.
On the other hand, if the acceleration along the x-axis has a negative value, the judgment unit 280B judges that the user is conducting the telephone call while using his/her right ear, and sets, for a certain flag, a certain value (hold_flag=0) indicating a telephone call in which the right ear is used.
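A corresponding sketch of the hold type judgment follows. Which sign of the x-axis acceleration corresponds to which ear depends on the axis orientation of the terminal, so the mapping below simply mirrors the description above.

    def judge_hold_type(accel_x):
        # hold_flag = 1: telephone call in which the left ear is used.
        # hold_flag = 0: telephone call in which the right ear is used.
        return 1 if accel_x > 0.0 else 0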
The correction unit 280C corrects the relative angle calculated by the calculation unit 260 by an amount of correction that is switched in accordance with whether the user is conducting the telephone call while using his/her right ear or his/her left ear.
For example, the correction unit 280C obtains the value of a flag set by the detection unit 280A and judges whether or not the obtained value of a flag is a value (receiver_flag=1) indicating that the user is in a telephone call state in which he/she is using either of his/her ears.
If it has been judged that the user is in a telephone call state in which he/she is using either of his/her ears, the correction unit 280C obtains the value of a flag set by the judgment unit 280B and judges whether the obtained value of a flag is a value indicating a telephone call in which the user's left ear is used or a value indicating a telephone call in which the user's right ear is used. If it has been judged that the obtained value of a flag is a value (hold_flag=1) indicating a telephone call in which the user's left ear is used, the correction unit 280C obtains a correction value “ang4 (delta_angle_L)” for a telephone call in which the left ear is used, and corrects the relative angle “ang3 (relative_angle1)” calculated by the calculation unit 260 by using the correction value in order to obtain a corrected relative angle “ang6 (relative_angle2)”.
In the case of a telephone call in which the right ear is used, too, the corrected relative angle can be obtained in the same manner. For example, if the value of a flag set by the judgment unit 280B is a value (hold_flag=0) indicating a telephone call in which the user's right ear is used, the correction unit 280C obtains a correction value “ang5 (delta_angle_R)” for a telephone call in which the right ear is used, and corrects the relative angle by using the correction value.
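The switching of the amount of correction might look as follows. The actual predetermined correction values delta_angle_L (ang4) and delta_angle_R (ang5) are not given in this excerpt, so they are left as parameters rather than constants.

    def correct_relative_angle(relative_angle1, hold_flag,
                               delta_angle_L, delta_angle_R):
        # Switch the amount of correction according to the hold type
        # (hold_flag = 1: left ear, hold_flag = 0: right ear).
        delta = delta_angle_L if hold_flag == 1 else delta_angle_R
        relative_angle2 = relative_angle1 + delta
        return (relative_angle2 + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)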
The generation unit 280D generates, in accordance with the relative angle corrected by the correction unit 280C, a characteristic sound to be mixed with the speech received from the terminal apparatus 100.
The generation unit 280D uses the following expression (3) to generate the characteristic sound “artSig(n)” by multiplying a pattern sound “pattern_sig(n)” by a gain “gain(relative_angle2)” that depends on the corrected relative angle.
artSig(n)=gain(relative_angle2)×pattern_sig(n): n=0, . . . , N−1 (3)
pattern_sig(n)(n=0, . . . , N−1): Pattern sound
gain(relative_angle2): Gain for adjusting volume
artSig(n): Characteristic sound
N: Frame length for speech processing
For example, the gain “gain(relative_angle2)” is set so as to become larger as the corrected relative angle “relative_angle2” becomes smaller, that is, so that the volume of the characteristic sound becomes larger as the direction of the person with whom the user is conducting the telephone call approaches the front of the user.
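A sketch of expression (3) under these assumptions follows. The linear gain curve and the 1 kHz tone used as the pattern sound are illustrative stand-ins, since the actual pattern sound and gain characteristic are not specified in this excerpt.

    import numpy as np

    def gain(relative_angle2):
        # Assumed curve: maximal (1.0) straight ahead, falling linearly
        # to 0.0 at +/-180 degrees.
        return 1.0 - abs(relative_angle2) / 180.0

    def generate_characteristic_sound(relative_angle2, n_frame=160, fs=8000):
        n = np.arange(n_frame)                        # frame length N
        pattern_sig = 0.1 * np.sin(2.0 * np.pi * 1000.0 * n / fs)
        return gain(relative_angle2) * pattern_sig    # artSig(n), expression (3)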
When the user is in a telephone call state in which he/she is using either of his/her ears, the mixing unit 280E mixes a characteristic sound generated by the generation unit 280D with speech input from the processing unit 290, which will be described later. The mixing unit 280E will be described hereinafter.
For example, the mixing unit 280E obtains the value of a flag set by the detection unit 280A and judges whether or not the obtained value of a flag is a value (receiver_flag=1) indicating that the user is in a telephone call state in which he/she is using either of his/her ears. If it has been judged that the user is in a telephone call state in which he/she is using either of his/her ears (if receiver_flag=1), the mixing unit 280E turns on the switch SW. The mixing unit 280E then mixes the characteristic sound “artSig(n)” generated by the generation unit 280D with the speech “SpOut(n)” input from the processing unit 290 in order to generate “SigOut(n)”. The mixing unit 280E then plays back “SigOut(n)” and outputs “SigOut(n)” from the speaker 202 in monaural output, where a single output system is used.
In addition, if a certain value indicating that the user is in a telephone call state in which he/she is using headphones, earphones, or the like is set for a flag input from the above-described detection unit 280A (if receiver_flag=2), the mixing unit 280E turns off the switch SW. In this case, the mixing unit 280E plays back the speech “SpOut(n)” input from the processing unit 290 and outputs the speech “SpOut(n)” from the speaker 202 in stereo output, where two output systems that differ between left and right are used. It is to be noted that the generation unit 280D and the mixing unit 280E are examples of a determination unit.
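The switch SW and the two output paths can be summarized as below; speech frames are represented as NumPy arrays, with sp_out being a single monaural array when receiver_flag is 1 and a (left, right) pair of arrays when receiver_flag is 2.

    import numpy as np

    def mix_and_output(sp_out, art_sig, receiver_flag):
        # receiver_flag = 1: SW on; mix the characteristic sound and
        # play back the result in monaural (a single output system).
        if receiver_flag == 1:
            sig_out = sp_out + art_sig    # SigOut(n)
            return (sig_out,)
        # receiver_flag = 2: SW off; pass the left/right speech through
        # unchanged for stereo playback.
        left, right = sp_out
        return (left, right)

    # One-ear call: mix a silent frame with a (here, silent) characteristic sound.
    mono = mix_and_output(np.zeros(160), np.zeros(160), receiver_flag=1)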
The processing unit 290 processes, in accordance with the content of a flag set by the detection unit 280A with respect to the telephone call state of the user, speech decoded by the decoder 270. The processing unit 290 will be described hereinafter.
For example, if a certain value indicating that the user is in a telephone call state in which he/she is using either of his/her ears is set for a flag input from the above-described detection unit 280A (if receiver_flag=1), the processing unit 290 performs processing in the following manner. That is, the processing unit 290 transmits speech decoded by the decoder 270 to the mixing unit 280E as it is.
On the other hand, if a certain value indicating that the user is in a telephone call state in which he/she is using headphones, earphones, or the like is set for a flag input from the above-described detection unit 280A (if receiver_flag=2), the processing unit 290 performs processing in the following manner. That is, the processing unit 290 substitutes the relative angle calculated by the calculation unit 260 for “θ” and uses the following expressions (4-1) and (4-2) to generate speech for the left ear and the right ear, respectively. It is to be noted that the expressions (4-1) and (4-2) are convolution calculations between a head-related transfer function (impulse response) and a speech signal S from the speech source, and, for example, a finite impulse response (FIR) filter is used therefor.
sigL(n)=Σ(m=0, . . . , M−1)hrtfL(θ,m)×sig(n−m): n=0, . . . , N−1 (4-1)
sigR(n)=Σ(m=0, . . . , M−1)hrtfR(θ,m)×sig(n−m): n=0, . . . , N−1 (4-2)
sig(n): Speech signal S
hrtfL(θ,m)(m=0, . . . , M−1): Impulse response of HL(θ)
hrtfR(θ,m)(m=0, . . . , M−1): Impulse response of HR(θ)
M: Length of impulse response
The processing unit 290 then transmits the speech for the right ear and the left ear generated by using the above expressions (4-1) and (4-2) to the mixing unit 280E. It is to be noted that, as described above, if the user is in a telephone call state in which he/she is using headphones, earphones, or the like, the mixing unit 280E does not mix a characteristic sound with the speech for the right ear and the left ear, and the speech for the right ear and the left ear are output from the mixing unit 280E as they are.
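A sketch of the FIR filtering of expressions (4-1) and (4-2) follows; measured head-related impulse responses for the relative angle θ are assumed to be available as arrays hrtf_l and hrtf_r of length M, which is outside the scope of this excerpt.

    import numpy as np

    def render_for_both_ears(sig, hrtf_l, hrtf_r):
        # Convolve the received speech sig(n) with the left and right
        # head-related impulse responses for the relative angle theta
        # (an FIR filter of length M), truncated to the frame length.
        out_l = np.convolve(sig, hrtf_l)[:len(sig)]  # expression (4-1)
        out_r = np.convolve(sig, hrtf_r)[:len(sig)]  # expression (4-2)
        return out_l, out_r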
The above-described terminal apparatus 100 and terminal apparatus 200 include, for example, semiconductor memory devices such as random-access memories (RAMs) or flash memories, which are used for various processes. In addition, the above-described terminal apparatus 100 and terminal apparatus 200 include electronic circuits such as central processing units (CPUs) or micro processing units (MPUs) and use the RAMs or the flash memories to execute various processes. It is to be noted that the above-described terminal apparatus 100 and terminal apparatus 200 may include integrated circuits such as application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs) instead of the CPUs or the MPUs.
Processing Performed by Terminal Apparatus (First Embodiment)
The flow of processing performed by the above-described terminal apparatus 200 will be described with reference to the drawings.
First, the general processing flow of the terminal apparatus 200 will be described. When a telephone call is started (step S101), the position obtaining unit 220 obtains the positional information of the terminal apparatus 200, and the position reception unit 240 receives the positional information of the terminal apparatus 100 (step S102).
Next, the direction obtaining unit 250 obtains information regarding the terminal direction of the terminal apparatus 200 (step S103). Next, the calculation unit 260 calculates the relative angle of the direction of the person, that is, a direction from the position of the terminal apparatus 200 to the position of the terminal apparatus 100, in relation to the terminal direction of the terminal apparatus 200 (step S104).
Next, the detection unit 280A executes the telephone call state detection process (step S105). Next, the detection unit 280A judges, as a result of the telephone call state detection process in step S105, whether or not the user is in a telephone call state in which he/she is using either of his/her ears (step S106). If it has been judged by the detection unit 280A, for example, that the user is in a telephone call state in which he/she is using either of his/her ears (YES in step S106), the judgment unit 280B executes the hold type judgment process, where whether the user is conducting the telephone call while using his/her right ear or left ear is judged (step S107).
Next, the correction unit 280C uses the correction values predetermined for a telephone call in which the right ear is used and a telephone call in which the left ear is used in order to execute the direction correction process, where the direction of the person calculated in step S104, that is, the relative angle, is corrected (step S108). Next, the generation unit 280D generates, in accordance with the direction of the person corrected in step S108, a characteristic sound to be mixed with the speech received from the terminal apparatus 100 (step S109).
Next, if the user is in a telephone call state in which he/she is using either of his/her ears, the mixing unit 280E mixes the characteristic sound generated in step S109 with the speech received from the terminal apparatus 100 (step S110). The mixing unit 280E then outputs the speech and the characteristic sound mixed in step S110 in monaural (step S111), where a single output system is used, and the processing returns to the above-described process in step S102.
Now, description returns to step S106. If it has been judged by the detection unit 280A that the user is in a telephone call state other than the telephone call state in which he/she is using either of his/her ears, that is, a telephone call state in which, for example, he/she is using headphones, earphones, or the like (NO in step S106), the processing unit 290 executes the following process. That is, the processing unit 290 generates speech for the right ear and the left ear from the speech received from the terminal apparatus 100 on the basis of the relative angle calculated in step S104 (step S112). Next, the mixing unit 280E outputs the speech for the right ear and the left ear generated in step S112 as they are in stereo (step S113), where output systems that differ between left and right are used. The processing returns to the above-described process in step S102.
It is to be noted that if a telephone call in which the terminal apparatus 200 is used can be assumed to be invariably conducted in a situation in which either of the user's ears is used, the processing need not necessarily be executed in accordance with the above-described flow.
Next, the flow of the telephone call state detection process will be described. The detection unit 280A obtains the number of channels of output speech (step S201) and judges whether or not the number of channels is one (step S202). If it has been judged that the number of channels is one (YES in step S202), the detection unit 280A judges that the user is in a telephone call state in which he/she is using either of his/her ears. The detection unit 280A then sets, for a certain flag, a certain value indicating that the user is in a telephone call state in which he/she is using either of his/her ears (receiver_flag=1, step S203).
Description returns to step S202. If it has been judged that the number of channels is not one (NO in step S202), the detection unit 280A judges that the user is in a telephone call state other than a telephone call state in which he/she is using either of his/her ears, that is, a telephone call state in which, for example, he/she is using headphones, earphones, or the like. The detection unit 280A then sets, for a certain flag, a certain value indicating that the user is in a telephone call state in which, for example, he/she is using headphones, earphones, or the like (receiver_flag=2, step S204).
Next, the flow of the hold type judgment process will be described.
The judgment unit 280B obtains the acceleration of the terminal apparatus 200 from the acceleration sensor (step S301). The judgment unit 280B then judges whether or not the acceleration along the x-axis has a positive value (step S302).
If it has been judged that the acceleration along the x-axis has a positive value (YES in step S302), the judgment unit 280B judges that the user is conducting the telephone call while using his/her left ear. The judgment unit 280B then sets, for a certain flag, a certain value indicating a telephone call in which the left ear is used (hold_flag=1, step S303).
Description returns to step S302. If it has been judged that the acceleration along the x-axis does not have a positive value, that is, the acceleration along the x-axis has a negative value (NO in step S302), the judgment unit 280B judges that the user is conducting the telephone call while using his/her right ear. The judgment unit 280B then sets, for a certain flag, a certain value indicating a telephone call in which the right ear is used (hold_flag=0, step S304).
Next, the flow of the direction correction process will be described.
The correction unit 280C obtains the value of a flag set by the detection unit 280A (step S401). The correction unit 280C then judges whether or not the obtained value of a flag is a value (receiver_flag=1) indicating that the user is in a telephone call state in which he/she is using either of his/her ears (step S402).
If it has been judged that the obtained value of a flag is a value indicating that the user is in a telephone call state in which he/she is using either of his/her ears (YES in step S402), the correction unit 280C obtains the value of a flag set by the judgment unit 280B. The correction unit 280C then judges whether or not the obtained value of a flag is a value (hold_flag=1) indicating a telephone call in which the left ear is used (step S403). If it has been judged that the obtained value of a flag is a value indicating a telephone call in which the left ear is used (YES in step S403), the correction unit 280C obtains the correction value “ang4 (delta_angle_L)” for a telephone call in which the left ear is used. The correction unit 280C then uses the correction value for a telephone call in which the left ear is used in order to correct the relative angle “ang3 (relative_angle1)” calculated by the calculation unit 260 in a way that suits a telephone call in which the left ear is used (step S404). By this correction, the corrected relative angle “ang6 (relative_angle2)” is obtained.
Description returns to step S403. If it has been judged that the obtained value of a flag is not a value indicating a telephone call in which the left ear is used, that is, the obtained value of a flag is a value (hold_flag=0) indicating a telephone call in which the right ear is used (NO in step S403), the correction unit 280C performs the following process. That is, the correction unit 280C obtains the correction value “ang5 (delta_angle_R)” for a telephone call in which the right ear is used. The correction unit 280C then uses the correction value for a telephone call in which the right ear is used in order to correct the relative angle “ang3 (relative_angle1)” calculated by the calculation unit 260 in a way that suits a telephone call in which the right ear is used (step S405). By this correction, the corrected relative angle “ang6 (relative_angle2)” is obtained.
Description returns to step S402. If it has been judged that the value of a flag set by the detection unit 280A is not a value indicating that the user is in a telephone call state in which he/she is using either of his/her ears (NO in step S402), the correction unit 280C immediately ends the direction correction process.
As described above, when a telephone call state in which the user is using either of his/her ears has been detected, the terminal apparatus 200 according to the first embodiment corrects, by a certain angle, the relative angle between the direction from the terminal apparatus 200 to the terminal apparatus 100, which is used by a person with whom the user is conducting a telephone call, and the terminal direction of the terminal apparatus 200. The terminal apparatus 200 then determines the attribute of output speech in accordance with the corrected relative angle. Therefore, according to the first embodiment, it is possible to allow a user who is conducting a telephone call while using either of his/her ears to exactly perceive the direction of a person with whom he/she is conducting the telephone call.
In addition, according to the first embodiment, the relative angle between the direction from the terminal apparatus 200 to the terminal apparatus 100, which is used by a person with whom the user is conducting a telephone call, and the terminal direction of the terminal apparatus 200 is corrected by using correction values predetermined for a telephone call in which the right ear is used and for a telephone call in which the left ear is used. Therefore, even when the user is conducting a telephone call while using either of his/her ears, it is possible to accurately calculate, for each of the two hold types, the relative angle that would be obtained if the terminal direction of the terminal apparatus 200 and the front direction of the user matched. As a result, it is possible to improve the accuracy of the direction of the person perceived by the user. It is to be noted that the correction of the relative angle is not limited to a case in which the correction values predetermined for a telephone call in which the right ear is used and for a telephone call in which the left ear is used are used. For example, in view of the fact that the angle between the terminal direction and the front direction of the user is frequently about 180°, the relative angle may instead be corrected by 180° regardless of the hold type.
In addition, according to the first embodiment, a characteristic sound is generated whose volume becomes larger as the corrected relative angle becomes smaller, and the generated characteristic sound is mixed with speech during a telephone call. Since the direction of the person with whom the user is conducting the telephone call is therefore not expressed by a difference between speech for the left ear and speech for the right ear, it is possible to allow a user who is conducting a telephone call while using either of his/her ears to exactly perceive the direction of the person with whom he/she is conducting the telephone call. It is to be noted that even if the person with whom the user is conducting a telephone call remains silent, it is possible to allow the user to perceive the direction of that person by mixing a characteristic sound with the silence during the telephone call received from the terminal apparatus 100.
In addition, in the above-described first embodiment, for example, when mixing a characteristic sound generated by the generation unit 280D with speech input from the processing unit 290, the mixing unit 280E may perform acoustic processing using a head-related transfer function. For example, the mixing unit 280E performs acoustic processing in such a way that the speech input from the processing unit 290 and the characteristic sound generated by the generation unit 280D are transmitted from virtual speech sources whose positions are different from each other. The mixing unit 280E then superimposes the characteristic sound upon the speech and outputs the resulting sound. In doing so, the speech of the person (output speech) and the characteristic sound can be played back from different directions (for example, from upper and lower directions), thereby making it possible for the user to easily distinguish between the speech and the characteristic sound. That is, even when a characteristic sound has been mixed with the speech of the person in a telephone call state in which the user is using either of his/her ears, the speech and the characteristic sound can be prevented from becoming difficult to distinguish.
A terminal apparatus and a speech processing program according to a second embodiment disclosed herein will be described hereinafter.
(1) Configuration of Apparatus etc.
For example, the above-described configuration of the terminal apparatus 200 is a functional one and need not necessarily be physically configured as described. That is, all or part of the units of the terminal apparatus 200 may be functionally or physically distributed or integrated in arbitrary units in accordance with various loads, use conditions, and the like.
(2) Hardware Configuration of Terminal Apparatus
Next, an example of the hardware configuration of a terminal apparatus according to the second embodiment will be described. The terminal apparatus according to the second embodiment includes a wireless communication unit 310, a display unit 320, a speech input/output unit 330, an input unit 340, a storage unit 350, and a processor 360.
The wireless communication unit 310, the display unit 320, the speech input/output unit 330, the input unit 340, and the storage unit 350 are connected to the processor 360. In addition, the antenna 311 is connected to the wireless communication unit 310. In addition, the microphone 331 and the speaker 332 are connected to the speech input/output unit 330.
The wireless communication unit 310 corresponds to, for example, a communication control unit, which is not illustrated.
The storage unit 350 and the processor 360 realize, for example, the functions of the detection unit 280A, the judgment unit 280B, the correction unit 280C, the generation unit 280D, the mixing unit 280E, and the like described above. For example, the storage unit 350 stores a speech processing program that provides the functions of these units, and the processor 360 reads the speech processing program from the storage unit 350 and executes it.
All examples and conditional language recited herein are intended for pedagogical objects to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Although the embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.