This invention relates to a communication terminal that has a communication function and installs a function common to a function that an associated communication terminal installs, and to a communication method of the communication terminal.
Hitherto, a video telephone provided with a function of sending a character called avatar to an associated communication terminal instead of a photograph image of the user has been developed (for example, refer to patent document 1).
Patent document 1: JP-A-003-109036 (page 3, page 4)
However, in the related-art video telephone, not all video telephones necessarily have the same processing capability. When communications are conducted between video telephones that differ in processing capability, the communications proceed in accordance with the processing capability of the video telephone having the lower processing capability, so that smooth processing cannot be accomplished between the video telephones; this is a problem.
It is therefore an object of the invention to provide a communication terminal capable of causing an associated communication terminal to execute the function at the level required by the home terminal and a communication method of the communication terminal.
The communication terminal of the invention is a communication terminal that has a communication function and installs a function common to a function that an associated communication terminal installs, the communication terminal including data generation means for generating data to execute the function that the home terminal installs and data to execute the function that the associated communication terminal installs, and transmission means for transmitting the data to execute the function that the associated communication terminal installs.
According to the configuration, the data generation means for generating the data to execute the function that the home terminal installs and the data to execute the function that the associated communication terminal installs is provided, whereby if the terminal capability of the associated communication terminal is lower than that of the home terminal, the associated communication terminal can be caused to execute the function at the level required by the home terminal.
The communication terminal of the invention has a video telephone function; input data analysis means for analyzing input data; and data matching means for outputting data provided by matching the data of the home terminal and the data of the associated communication terminal based on the analysis result to the input data analysis means. The communication terminal of the invention includes input means for inputting at least one data selected from among image data, voice data, and key input data to the input data analysis means as the input data. According to the configuration, the input data analysis means for analyzing the input data is provided, whereby data on which the input data is reflected can be generated.
The communication method of the invention is a communication method of a communication terminal that installs a function common to a function that an associated communication terminal installs, and includes the steps of generating data to execute the function that the home terminal installs and data to execute the function that the associated communication terminal installs, and transmitting the data to execute the function that the associated communication terminal installs.
According to the invention, the data to execute the function that the home terminal installs and the data to execute the function that the associated communication terminal installs are generated, whereby if the terminal capability of the associated communication terminal is lower than that of the home terminal, the associated communication terminal can be caused to execute the function at the level required by the home terminal.
The video telephones 1 and 2 have input data sections 10A and 10B, data transmission sections 11A and 11B, data reception sections 12A and 12B, display image generation sections 13A and 13B, and video telephone display sections 14A and 14B as common parts. The video telephone 1 further has a character data storage section 15, an expression and emotion analysis section 16, an action data generation section 17, and an action matching section 18. The display image generation section 13A of the video telephone 1 generates data to execute the function that the video telephone 1 (home terminal) installs and data to execute the function that the video telephone 2 (associated communication terminal) installs, and the data transmission section 11A transmits the data to execute the function that the video telephone 2 installs. The expression and emotion analysis section 16 of the video telephone 1 analyzes the input data, and the action data generation section 17 outputs the data provided by matching the data of the video telephone 1 and the data of the video telephone 2 based on the analysis result to the display image generation section 13A. The input data section 10A of the video telephone 1 inputs any one selected from among image data, voice data, and key input data as input data into the expression and emotion analysis section 16.
The input data sections 10A and 10B are connected to various input means such as a camera, a microphone, and a key input section (not shown), and are used to acquire information representing user's expression, emotion, and action (user information). The input data section 10B of the video telephone 2 inputs any one selected from among image data, voice data, and key input data as input data into the expression and emotion analysis section 16 through the data transmission section 11B and the data reception section 12A. The data transmission section 11A transmits the image data to be displayed on the video telephone 2. The data transmission section 11B transmits information representing the expression and emotion of the user of the video telephone 2 to the video telephone 1. The data reception section 12A receives the information representing the expression and emotion of the user of the video telephone 2 transmitted from the video telephone 2. The data reception section 12B receives the image data transmitted from the video telephone 1.
The display image generation section 13A generates an image to be displayed on the video telephone display section 14A and an image to be displayed on the video telephone display section 14B based on the input data from the input data section 10A and the input data from the input data section 10B. The display image generation section 13A passes the generated image data to be displayed on the video telephone display section 14B to the data transmission section 11A.
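As an illustrative sketch only (none of the class or method names below come from the specification), this asymmetric division of work can be pictured as the home terminal rendering both images and transmitting the one intended for the associated terminal:

```python
# Minimal sketch of the asymmetric rendering described above.
# All class and method names are illustrative assumptions, not part of the specification.

class DisplayImageGenerator:
    """Stands in roughly for the display image generation section 13A."""

    def __init__(self, transmitter, local_display):
        self.transmitter = transmitter        # stands in for data transmission section 11A
        self.local_display = local_display    # stands in for video telephone display section 14A

    def update(self, input_10a, input_10b):
        # The home terminal (video telephone 1) renders both images itself ...
        local_image = self._render(input_10a, input_10b, for_terminal="home")
        remote_image = self._render(input_10a, input_10b, for_terminal="associated")
        # ... shows its own image locally, and hands the other image to the
        # transmission section so the associated terminal only has to display it.
        self.local_display.show(local_image)
        self.transmitter.send(remote_image)

    def _render(self, input_10a, input_10b, for_terminal):
        # Image composition (character selection, layout) is detailed later in the embodiment.
        return {"for": for_terminal, "inputs": (input_10a, input_10b)}
```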
The display image generation section 13B generates a display image from the image data generated by the display image generation section 13A and acquired through the data reception section 12B. The display image generation section 13B may display the acquired image data intact on the video telephone display section 14B without processing the image data. The video telephone display section 14A has a liquid crystal display and displays the image generated by the display image generation section 13A. The video telephone display section 14B has a liquid crystal display and displays the image generated by the display image generation section 13B. Data to create a character image is stored in the character data storage section 15. The character data is image data to display a character on the video telephones 1 and 2, and a plurality of pieces of the character data are provided corresponding to pieces of action data generated by the action data generation section 17. In the embodiment, two types of characters can be displayed.
The expression and emotion analysis section 16 analyzes the expression and emotion of the user of the video telephone 1 based on the image data, the voice data, or the key input data from the input data section 10A. The expression and emotion analysis section 16 also analyzes the expression and emotion of the user of the video telephone 2 based on the image data, the voice data, or the key input data from the video telephone 2. If the facial image of the user is input, the expression and emotion analysis section 16 analyzes the facial image and detects an expression and emotion such as laughing or being angry.
As a method of detecting the expression and emotion, for example, face recognition processing is performed on image input data acquired periodically, and the average values of the feature point coordinates of the detected face parts (eyebrows, eyes, mouth, etc.) are found as average expression feature point coordinates. The feature point coordinates of the face parts obtained by face recognition processing of the image input data acquired this time are then compared with the average expression feature point coordinates, and if the change in each face part satisfies a specific condition, an expression and emotion such as “laughing,” “being surprised,” or “being grieved” is detected.
In the case of “laughing,” three conditions are all satisfied: both ends of the eyebrows change upward by a threshold value W3 or more, the lower end of the eyes changes upward by a threshold value W2 or more, and both ends of the mouth change upward by a threshold value W1 or more. In the case of “being surprised,” three conditions are all satisfied: both ends of the eyebrows change upward by a threshold value O1 or more, the top-to-bottom width of the eyes increases by a threshold value N2 or more, and the top-to-bottom width of the mouth increases by a threshold value N3 or more. In the case of “being grieved,” three conditions are all satisfied: both ends of the eyebrows change downward by a threshold value N1 or more, the top-to-bottom width of the eyes decreases by a threshold value N2 or more, and both ends of the mouth change downward by a threshold value N3 or more.
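The threshold comparisons above can be summarized in the following illustrative sketch, which assumes that face recognition and feature point extraction are performed elsewhere and that upward movement is measured as a positive change; the function and field names are assumptions, while the threshold symbols follow the text:

```python
# Illustrative sketch of the threshold comparison described above; the feature
# extraction itself (face recognition, landmark detection) is assumed to exist elsewhere.

def detect_expression(current, average, W1, W2, W3, O1, N1, N2, N3):
    """current/average: dicts of face-part measurements; positive y-change means upward."""
    d_brow_ends = current["brow_ends_y"] - average["brow_ends_y"]
    d_eye_lower = current["eye_lower_y"] - average["eye_lower_y"]
    d_eye_height = current["eye_height"] - average["eye_height"]
    d_mouth_ends = current["mouth_ends_y"] - average["mouth_ends_y"]
    d_mouth_height = current["mouth_height"] - average["mouth_height"]

    if d_brow_ends >= W3 and d_eye_lower >= W2 and d_mouth_ends >= W1:
        return "laughing"
    if d_brow_ends >= O1 and d_eye_height >= N2 and d_mouth_height >= N3:
        return "being surprised"
    if -d_brow_ends >= N1 and -d_eye_height >= N2 and -d_mouth_ends >= N3:
        return "being grieved"
    return None  # no expression change satisfying a condition was detected
```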
The expression and emotion analysis section 16 detects face motion for a given time, thereby detecting action of “head shaking,” “nodding,” etc.
The expression and emotion analysis section 16 analyzes the key input data and detects the expression and emotion associated with each key. Here, various expressions and emotions are associated with the keys of a key operation section (not shown) and as the user operates (presses) the key matching his or her expression and emotion during telephone conversation, the expression and emotion analysis section 16 detects the expression and emotion and determines the action corresponding to the expression and emotion. For example, the expression and emotion of “getting angry” are associated with a key of “1” and the user presses the key, whereby the action of “getting angry” is confirmed. The expression and emotion of “laughing” are associated with a key of “2” and the user presses the key, whereby the action of “laughing” is confirmed. The expression and emotion of “being surprised” are associated with a key of “3” and the user presses the key, whereby the action of “being surprised” is confirmed. The expression and emotion of “being scared” are associated with a key of “4” and the user presses the key, whereby the action of “being scared” is confirmed.
The action of “hand raising” is associated with a key of “5” and the user presses the key, whereby the action of “hand raising” is confirmed. The action of “thrusting away” is associated with a key of “6” and the user presses the key, whereby the action of “thrusting away” is confirmed. The action of “attacking” is associated with a key of “7” and the user presses the key, whereby the action of “attacking” is confirmed. The action of “hand joining” is associated with a key of “8” and the user presses the key, whereby the action of “hand joining” is confirmed. The action of “embracing” is associated with a key of “9” and the user presses the key, whereby the action of “embracing” is confirmed.
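The key assignments above can be pictured as a simple lookup table; the dictionary form and the helper function are assumptions, while the key-to-action pairs follow the text:

```python
# Sketch of the key-to-expression/action assignments described above.

KEY_ACTION_TABLE = {
    "1": "getting angry",
    "2": "laughing",
    "3": "being surprised",
    "4": "being scared",
    "5": "hand raising",
    "6": "thrusting away",
    "7": "attacking",
    "8": "hand joining",
    "9": "embracing",
}

def action_from_key(key):
    # Returns the action confirmed by the pressed key, or None if the key is unassigned.
    return KEY_ACTION_TABLE.get(key)
```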
From the expression and emotion detected by the face recognition processing described above, the action is associated with a sole action table or a mutual action table by performing expression and emotion conversion processing, and the action of “laughing,” “being surprised,” “head shaking,” “nodding,” “hand joining,” or “embracing” of the character is confirmed.
The expression and emotion analysis section 16 analyzes voice data and detects the emotion of yelling, etc., of the user. As a method of detecting the emotion, the user's emotion is detected from changes in the rhythm and loudness of the voice: for example, if the rhythm of the voice input data becomes fast and the sound becomes loud, “laughing” is confirmed; if the rhythm is unchanged and the sound becomes loud, “being surprised” is confirmed; and if the rhythm becomes slow and the sound becomes soft, “being grieved” is confirmed. From the detected emotion, the action is associated with the sole action table or the mutual action table by performing expression and emotion conversion processing, and the action of “laughing,” “being surprised,” “being grieved,” “hand joining,” or “embracing” of the character is confirmed.
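A minimal sketch of this voice-based detection, assuming simple rhythm and loudness estimates are already available, might look as follows; the threshold values and function names are assumptions:

```python
# Sketch of the voice-based detection described above: only the relative change in
# speaking rhythm and loudness is compared with the previous values.

def detect_emotion_from_voice(rhythm, loudness, prev_rhythm, prev_loudness,
                              rhythm_eps=0.1, loudness_eps=0.1):
    faster = rhythm > prev_rhythm * (1 + rhythm_eps)
    slower = rhythm < prev_rhythm * (1 - rhythm_eps)
    louder = loudness > prev_loudness * (1 + loudness_eps)
    softer = loudness < prev_loudness * (1 - loudness_eps)

    if faster and louder:
        return "laughing"
    if not faster and not slower and louder:
        return "being surprised"
    if slower and softer:
        return "being grieved"
    return None
```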
Thus, the expression and emotion analysis section 16 analyzes the expression and emotion of the user based on the image data, the voice data, and the key input data, and inputs the analysis result to the action data generation section 17. All of the image data, the voice data, and the key input data are not required and any one of them may be used.
The action data generation section 17 generates action data DA from the sole action table TA if input data IA of the video telephone 1 indicates sole action; generates action data DB from the sole action table TB if input data IB of the video telephone 2 indicates sole action; generates action data DA from the mutual action table TC if input data IA of the video telephone 1 indicates mutual action; and generates action data DB from the mutual action table TC if input data IB of the video telephone 2 indicates mutual action.
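As an illustrative sketch of this table lookup (the dictionary form and the entries shown are assumptions drawn from the examples named elsewhere in the text):

```python
# Sketch of the table selection described above. TA and TB are the sole action tables
# of video telephones 1 and 2; TC is the shared mutual action table.

SOLE_ACTION_TABLE_TA = {"laughing": "laughing", "crying": "crying",
                        "being surprised": "being surprised"}
SOLE_ACTION_TABLE_TB = dict(SOLE_ACTION_TABLE_TA)          # same structure on the other side
MUTUAL_ACTION_TABLE_TC = {"thrusting away": "blowing off",  # active action -> passive reaction
                          "attacking": "falling",
                          "hand joining": "hand joining",
                          "embracing": "being embraced"}

def generate_action_data(analysis_result, sole_table):
    # Mutual actions are taken from the shared table TC; anything else falls back to the
    # terminal's own sole action table, or to "default action" when nothing matches.
    if analysis_result in MUTUAL_ACTION_TABLE_TC:
        return analysis_result            # the requesting side keeps the active action
    return sole_table.get(analysis_result, "default action")

action_DA = generate_action_data("thrusting away", SOLE_ACTION_TABLE_TA)  # from input data IA
action_DB = generate_action_data("laughing", SOLE_ACTION_TABLE_TB)        # from input data IB
```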
Although the example described above applies to the video telephone 1, a similar description also applies to the video telephone 2 regardless of whether the input data IB is image or voice; the input data IA of the video telephone 1 is simply replaced with the input data IB, and the action data DA with the action data DB. Of course, the sole action table TB is used in place of the sole action table TA.
The action data generation section 17 inputs the action data DA, DB generated as described above to the display image generation section 13A and the action matching section 18. The action matching section 18 matches the action data DA and DB as follows:
When input data from the expression and emotion analysis section 16 does not exist (none of image data, voice data, and key input data are input), the action data generation section 17 generates action data of “default action” in the sole action table TA, TB as shown in
The display image generation section 13A acquires the character data corresponding to the action data DA generated by the action data generation section 17 or the action data DA′ provided by matching the action data DA by the action matching section 18 from the character data storage section 15 and displays the image on the video telephone display section 14A. It also acquires the character data corresponding to the action data DB for the video telephone 2 generated by the action data generation section 17 or the action data DB′ provided by matching the action data DB by the action matching section 18 from the character data storage section 15 and transmits the character data through the data transmission section 11A to the video telephone 2.
For example, if the action data DA of mutual action of “thrusting away” and the action data DB of sole action of “laughing,” “crying,” “being surprised,” or “being scared” are generated, display based on the action data DA is produced on the video telephone display section 14A, namely, a character image where the character Ca of the video telephone 1 thrusts the character Cb of the video telephone 2 away is displayed as shown in
If the action data DB is action data of mutual action and occurs later than the action data DA, the character images displayed on the video telephone display section 14A and the video telephone display section 14B in
After the expression and emotion are analyzed from the input data IA, reception of input data IB from the video telephone 2 is started (ST13). When the input data IB transmitted from the video telephone 2 is received, the expression and emotion of the user of the video telephone 2 are analyzed from the input data IB (ST14). For example, if a crying face of the user of the video telephone 2 is fetched, the analysis result of “crying” is produced. Action data DA is generated from the analysis result of the input data IA (ST15) and subsequently action data DB is generated from the analysis result of the input data IB (ST16).
After the action data DA and DB are generated, if one of them is data of mutual action, matching is performed (ST17). If both are data of mutual action, matching is performed so that the action data based on the input data occurring earlier becomes active action. After the action data DA and DB are matched, the display images of the characters to be displayed on the video telephone display sections 14A and 14B are generated (ST18). The display image data of the character for the video telephone 2 is transmitted to the video telephone 2 (ST19). After the display image data of the character is transmitted to the video telephone 2, the display image of the character for the video telephone 1 is displayed on the video telephone display section 14A (ST20). During the telephone conversation (NO at ST21), steps ST11 to ST20 are repeated. When the telephone conversation terminates (YES at ST21), the processing is terminated.
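The sequence of steps above can be condensed into the following illustrative loop; every method on the hypothetical terminal object stands in for one of the sections already introduced and is an assumption, not an interface from the specification:

```python
# Condensed sketch of the first embodiment's loop (the acquisition and analysis steps
# followed by ST13-ST21), in which the home terminal renders the images for both sides.

def conversation_loop_home_renders_both(terminal):
    while not terminal.call_finished():                                   # ST21
        input_IA = terminal.acquire_local_input()                         # image, voice, or key input
        analysis_A = terminal.analyze_expression(input_IA)
        input_IB = terminal.receive_remote_input()                        # ST13
        analysis_B = terminal.analyze_expression(input_IB)                # ST14
        action_DA = terminal.generate_action_data(analysis_A)             # ST15
        action_DB = terminal.generate_action_data(analysis_B)             # ST16
        action_DA, action_DB = terminal.match_actions(action_DA, action_DB)        # ST17
        image_A, image_B = terminal.generate_display_images(action_DA, action_DB)  # ST18
        terminal.transmit_image(image_B)                                  # ST19
        terminal.display(image_A)                                         # ST20
```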
In contrast, if action data DB exists (YES at ST24), the combination priority of the action data DA and DB is determined (ST26). In this case, mutual action takes precedence over sole action, and when both are mutual actions, for example, the mutual action corresponding to the earlier acquired input data is selected. After the combination priority of the action data DA and DB is determined, the action data DA, DB is changed according to the priority (ST27). That is, as described above, if the action data DA is “laughing” and the action data DB is “thrusting away,” the action data DB of mutual action takes precedence over the action data DA and accordingly, the action data DA of “laughing” is changed to action data DA′ of “blowing off.” After the action data DA, DB is changed, they are output (ST28).
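A sketch of this priority rule, using the active-to-passive pairs named in the text (anything beyond those pairs is an assumption):

```python
# Sketch of the matching rule (ST24-ST28): mutual action takes precedence over sole
# action, and the other side's data is replaced by the passive counterpart of the
# winning mutual action.

PASSIVE_COUNTERPART = {
    "thrusting away": "blowing off",
    "attacking": "falling",
    "hand joining": "hand joining",
    "embracing": "being embraced",
}

def match_actions(action_DA, action_DB, DA_is_earlier=True):
    DA_mutual = action_DA in PASSIVE_COUNTERPART
    DB_mutual = action_DB in PASSIVE_COUNTERPART

    if DA_mutual and DB_mutual:
        # Both sides requested mutual action: the earlier side becomes the active one.
        if DA_is_earlier:
            return action_DA, PASSIVE_COUNTERPART[action_DA]
        return PASSIVE_COUNTERPART[action_DB], action_DB
    if DB_mutual:
        # e.g. DA "laughing" + DB "thrusting away" -> DA' becomes "blowing off".
        return PASSIVE_COUNTERPART[action_DB], action_DB
    if DA_mutual:
        return action_DA, PASSIVE_COUNTERPART[action_DA]
    return action_DA, action_DB   # two sole actions are left unchanged

print(match_actions("laughing", "thrusting away"))   # -> ("blowing off", "thrusting away")
```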
Thus, according to the video telephone system described above, the video telephone 1 generates the image data to be displayed on the associated communication terminal (video telephone 2) in addition to the image data displayed on the home terminal and transmits the image data to be displayed on the video telephone 2 to the video telephone 2, whereby if the terminal capability of the associated communication terminal is lower than that of the home terminal, the associated communication terminal can be caused to execute the function at the level required by the home terminal.
In the description given above, the video telephone 1 has the character data to be displayed on the video telephones 1 and 2, but the character data may instead be transmitted from the video telephone 2 to the video telephone 1 at the telephone conversation start time. In the description given above, the image data corresponding to the action is acquired from the character data storage section 15 and is transmitted to the video telephone 2, but the character data on which the displayed image is based may be transmitted at the telephone conversation start time and only the difference data corresponding to the character action may be transmitted during the telephone conversation. Accordingly, the data communication amount can be decreased as compared with the case where all image data is transmitted during the telephone conversation as in the related art.
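A minimal sketch of this data-saving variant, assuming a simple message format that is not part of the specification:

```python
# Sketch of the variant described above: the character data needed to draw the character
# is sent once when the call starts, and only small difference data (here, just an
# action identifier) is sent while the call is in progress.

def start_call(link, character_data):
    # Sent once at the telephone conversation start time.
    link.send({"type": "character_data", "payload": character_data})

def on_action_confirmed(link, action_id):
    # Sent during the conversation instead of full image data, reducing the
    # data communication amount compared with transmitting every image.
    link.send({"type": "character_diff", "action": action_id})
```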
In the embodiment described above, as the sole actions, “laughing,” “crying,” “being surprised,” “being scared,” “getting angry,” and “shouting” are taken as examples, and as the mutual actions, “thrusting away” → “blowing off,” “attacking” → “falling,” “hand joining” → “hand joining,” and “embracing” → “being embraced” are taken as examples, but the invention is not limited to them and various other examples can be named. The sole action data can also be used as the mutual action data. For example, “being surprised” can be set to mutual action with “shouting.”
In the embodiment described above, to confirm the action by key operation, if the user simply operates (presses) a key, the action assigned to the key is confirmed, but a new action may be able to be confirmed depending on a key operation manner (of continuing to press the key, intermittently pressing the key, accentually pressing the key, etc., for example).
After the character data CB is received and retained, input data IA is acquired (ST54). That is, at least one of image data, voice data, and key input data is acquired from an input data section 10A of the home terminal. When the input data IA is acquired, the expression and emotion of the user of the home terminal are analyzed from the input data IA (ST55). For example, if a laughing face of the user is photographed, the analysis result of “laughing” is produced. After the expression and emotion of the user of the home terminal are analyzed, action data DA responsive to the expression and emotion of the user of the home terminal is generated from the analysis result (ST56). The generated action data DA is transmitted to the associated video telephone 5 (ST57). After the action data DA is transmitted, reception of the action data DB from the associated video telephone 5 is started (ST58).
When the action data DB of the video telephone 5 is acquired, if one of the action data DB and the action data DA of the home terminal is data of mutual action, matching is performed (ST59). If both are data of mutual action, matching is performed so that the action data obtained earlier becomes active action, for example. The details of the matching processing are described later. After the action data DA and DB are matched, a character display image is generated based on the action data DA (ST60) and is displayed on a video telephone display section 14A (ST61). During the telephone conversation (NO at ST62), steps ST54 to ST62 are repeated. When the telephone conversation terminates (YES at ST62), the processing is terminated.
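Compared with the loop of the first embodiment, each terminal here exchanges only action data and renders its own display; the following illustrative sketch (with hypothetical method names) condenses steps ST54 to ST62:

```python
# Condensed sketch of steps ST54-ST62: the terminal sends its own action data, receives
# the other side's action data, matches the two, and renders its display locally.

def conversation_loop_action_exchange(terminal):
    while not terminal.call_finished():                              # ST62
        local_input = terminal.acquire_local_input()                 # ST54
        analysis = terminal.analyze_expression(local_input)          # ST55
        local_action = terminal.generate_action_data(analysis)       # ST56
        terminal.transmit_action(local_action)                       # ST57
        remote_action = terminal.receive_action()                    # ST58
        local_action, remote_action = terminal.match_actions(local_action, remote_action)  # ST59
        image = terminal.generate_display_image(local_action, remote_action)               # ST60
        terminal.display(image)                                      # ST61
```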
In contrast, if action data DB is obtained (YES at ST75), the combination priority of the action data DA and DB is determined (ST77). In this case, mutual action takes precedence over sole action, and when both are mutual actions, for example, the mutual action corresponding to the earlier obtained action data is selected. However, if the determination is based on time, the video telephones 4 and 5 are synchronized with each other when communications are first started.
After the combination priority of the action data DA and DB is thus determined, the action data DA, DB is changed according to the priority (ST78). That is, as described above, if the action data DA is “laughing” and the action data DB is “thrusting away,” the action data DB of mutual action takes precedence over the action data DA and accordingly, the action data DA of “laughing” is changed to action data DA′ of “blowing off.” After the action data DA, DB is changed, they are output (ST79).
After the character data CA is received and is retained, input data IB is acquired (ST94). That is, at least one of image data, voice data, and key input data is acquired from an input data section 10B of the home terminal. When the input data IB is acquired, then the expression and emotion of the user of the home terminal are analyzed from the input data IB (ST95). For example, if a crying face of the user is photographed, the analysis result of “crying” is produced. After the expression and emotion of the user of the home terminal are analyzed, action data DB responsive to the expression and emotion of the user of the home terminal is generated from the analysis result (ST96). The generated action data DB is transmitted to the associated video telephone 4 (ST97). After the action data DB is transmitted, reception of the action data DA from the associated video telephone 4 is started (ST98).
When the action data DA of the video telephone 4 is acquired, if one of the action data DA and the action data DB of the home terminal is data of mutual action, matching is performed (ST99). If both are data of mutual action, matching is performed so that the action data obtained earlier becomes active action, for example. The details of the matching processing are described later. After the action data DB and DA are matched, a character display image is generated based on the action data DB (ST100) and is displayed on a video telephone display section 14B (ST101). During the telephone conversation (NO at ST102), steps ST94 to ST102 are repeated. When the telephone conversation terminates (YES at ST102), the processing is terminated.
In contrast, if action data DA is obtained (YES at ST115), the combination priority of the action data DB and DA is determined (ST117). In this case, mutual action takes precedence over sole action, and when both are mutual actions, for example, the mutual action corresponding to the earlier obtained action data is selected. However, if the determination is based on time, the video telephones 5 and 4 are synchronized with each other when communications are first started.
After the combination priority of the action data DB and DA is thus determined, the action data DB, DA is changed according to the priority (ST118). That is, if the action data DB is “crying” and the action data DA is “thrusting away,” the action data DA of mutual action takes precedence over the action data DB and accordingly, the action data DB of “crying” is changed to action data DB′ of “blowing off.” After the action data DB, DA is changed, they are output (ST119).
In the embodiment, the display image and the transmission image to be created are processed images based on camera input images rather than character images. The video telephone image is made up of images of both the video telephones 6 and 7, and only the video telephone 6 performs all display data combining processing. Alternatively, only the video telephone 7 may perform all display data combining processing.
The image process determination section 21 generates image process data DPA from the sole process table TD if input data IA of the video telephone 6 indicates a sole process; generates image process data DPB from the sole process table TE if input data IB of the video telephone 7 indicates a sole process; generates image process data DPA from the mutual process table TF if input data IA of the video telephone 6 indicates a mutual process; and generates image process data DPB from the mutual process table TF if input data IB of the video telephone 7 indicates a mutual process.
The following image process data is generated by way of example:
The image process determination section 21 stores the generated image process data DPA and DPB in the image process data storage section 20.
The image process matching section 22 matches the image processing methods based on the image process data DPA of the video telephone 6, which is determined by the image process determination section 21 and stored in the image process data storage section 20, and the image process data DPB of the video telephone 7. For example, when the image process data of the video telephone 6 is “scale up,” the image process data of the video telephone 7 becomes “scale down.”
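A sketch of this complementary matching; only the “scale up”/“scale down” and “hammer”/“lump” pairs come from the text, and the rest is an assumption:

```python
# Sketch of the complementary image-process matching: when one side's mutual process
# wins, the other side's process is replaced by its counterpart.

COMPLEMENTARY_PROCESS = {
    "scale up": "scale down",   # pair named in the text for video telephones 6 and 7
    "hammer": "lump",           # counterpart pair used later in the embodiment
}

def match_image_process(process_DPA, process_DPB, DPA_has_priority=True):
    if DPA_has_priority and process_DPA in COMPLEMENTARY_PROCESS:
        return process_DPA, COMPLEMENTARY_PROCESS[process_DPA]
    if process_DPB in COMPLEMENTARY_PROCESS:
        return COMPLEMENTARY_PROCESS[process_DPB], process_DPB
    return process_DPA, process_DPB   # no mutual process involved: leave both unchanged

print(match_image_process("scale up", "default"))   # -> ("scale up", "scale down")
```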
The image process matching section 22 operates in any of the following four manners depending on the image process data combination:
If no image process data is selected because there is no effective input (no image, voice, or key input), “default” in the sole process table TD, TE is output.
After the expression and emotion are analyzed from the input data IA, reception of input data IB from the video telephone 7 is started (ST133). When the input data IB transmitted from the video telephone 7 is received, the expression and emotion of the user of the video telephone 7 are analyzed from the input data IB (ST134). For example, if a crying face of the user of the video telephone 7 is fetched, the analysis result of “crying” is produced. Image process data DPA is determined from the analysis result of the input data IA (ST135) and subsequently image process data DPB is determined from the analysis result of the input data IB (ST136).
After the image process data DPA and DPB are generated, if one of them is data of mutual action, matching is performed (ST137). If both are data of mutual action, matching is performed so that the action data based on the input data occurring earlier becomes active action. After the image process data DPA and DPB are matched, the display images of the characters to be displayed on the video telephone display sections 14A and 14B are generated (ST138). The display image data of the character for the video telephone 7 is transmitted to the video telephone 7 (ST139). After the display image data of the character is transmitted to the video telephone 7, the display image of the character for the video telephone 6 is displayed on the video telephone display section 14A (ST140). During the telephone conversation (NO at ST141), steps ST131 to ST140 are repeated. When the telephone conversation terminates (YES at ST141), the processing is terminated.
In contrast, if image process data DPB exists (YES at ST154), the combination priority of the image process data DPA and DPB is determined (ST156). In this case, mutual process takes precedence over sole process, and when both are mutual processes, for example, the mutual process corresponding to the earlier acquired input data is selected. After the combination priority of the image process data DPA and DPB is determined, the image process data DPA, DPB is changed according to the priority (ST157). That is, as described above, if the image process data DPA is “heart” and the image process data DPB is “hammer,” the image process data DPB of mutual process takes precedence over the image process data DPA and accordingly, the image process data DPA of “heart” is changed to image process data DPA′ of “lump.” After the image process data DPA, DPB is changed, they are output (ST158).
While the invention has been described in detail with reference to the specific embodiments, it will be obvious to those skilled in the art that various changes and modifications can be made without departing from the spirit and the scope of the invention.
The present application is based on Japanese Patent Application No. 2004-112854 filed on Apr. 7, 2004 and Japanese Patent Application No. 2005-086335 filed on Mar. 24, 2005, which are incorporated herein by reference.
The invention has the advantage that the data to execute the function that the home terminal installs and the data to execute the function that the associated communication terminal installs are generated, whereby if the terminal capability of the associated communication terminal is lower than that of the home terminal, the associated communication terminal can be caused to execute the function at the level required by the home terminal. The invention is therefore useful for a communication terminal that has a communication function and installs a function common to a function that an associated communication terminal installs, for a communication method of such a communication terminal, and the like.
Number | Date | Country | Kind
---|---|---|---
2005-086335 | Mar 2005 | JP | national
2004-112854 | Apr 2004 | JP | national
Filing Document | Filing Date | Country | Kind | 371c Date
---|---|---|---|---
PCT/JP05/06313 | 3/31/2005 | WO | | 6/27/2006