The present application claims priority to the corresponding Japanese Application No. 2003-196216, filed on Jul. 14, 2003, the entire contents of which are hereby incorporated by reference.
1. Field of the Invention
The present invention relates to an image processing apparatus that processes image data according to a result of evaluating an audio signal. The present invention also relates to a program that is run on a computer to execute such a process and a storage medium storing such a program.
2. Description of the Related Art
Prior art techniques for scoring the singing ability of a user singing a song in a karaoke system, and reflecting the resulting score on an image being displayed are disclosed in Japanese Laid-Open Patent Application No. 9-81165 and Japanese Laid-Open Patent Application No. 9-160574, for example.
In Japanese Laid-Open Patent Application No. 9-81165, a technique is disclosed in which a song playtime is subdivided into blocks, and a story represented by images being displayed is modified according to the singing ability of the user.
In Japanese Laid-Open Patent Application No. 9-160574, a technique is disclosed in which the respective singing abilities of two singers singing the same song simultaneously are scored, and the image area for the singer with the higher score is increased on the monitor screen.
However, in Japanese Laid-Open Patent Application No. 9-81165, the different stories to be represented according to the singing ability evaluation result are prepared beforehand, so that the user is likely to grow weary and lose interest after repeated use of the system.
Also, since Japanese Laid-Open Patent Application No. 9-160574 is concerned with scoring the respective singing abilities of two singers singing the same song simultaneously and increasing the image area on the monitor screen of the singer with the higher score, this system cannot be used by a single user.
Further, Japanese Laid-Open Patent Application No. 9-160574 only discloses a technique for changing the image size based on the difference in scores between the two singers.
An image processing apparatus, image display system, program, and storage medium are described. In one embodiment, the image processing apparatus comprises a processing unit to process image data encoded through wavelet transform according to a result of evaluating a single audio signal.
In one embodiment of the present invention, a user's singing ability is evaluated, and an image to be displayed is processed according to the evaluation in a manner that can attract the interest of the user. In another embodiment, a user is enabled to use such technology alone.
One embodiment of the present invention comprises an image processing apparatus that processes image data according to a result of evaluating a single audio signal. Based on a result of evaluating an audio signal characteristic such as the singing ability of a user inputting his/her singing voice, image data may be processed in various ways that can attract the interest of the user.
According to one embodiment of the present invention, the processing of the image data may involve changing the image size of the image data according to the evaluation result. By processing the image data through changing the image size of the image data according to an evaluation result pertaining to the singing ability of a user, for example, the interest of the user may be maintained.
According to another embodiment of the present invention, the image processing may involve degrading the image quality of the image data according to the evaluation result. By processing the image data in order to degrade the image quality according to an evaluation result pertaining to the singing ability of a user, for example, the interest of the user may be maintained.
According to another embodiment of the present invention, the image processing may involve reducing image color of the image data according to the evaluation result. By processing the image data in order to reduce the color of the image based on an evaluation result pertaining to the singing ability of a user, for example, the interest of the user may be maintained.
According to another embodiment of the present invention, the image processing may involve discarding a portion of the image data according to the evaluation result. By processing the image data through discarding a portion of the image data according to an evaluation result pertaining to the singing ability of a user, for example, the interest of the user may be maintained.
In another embodiment, an image processing apparatus processes image data by degrading image quality of the image data according to a result of evaluating an audio signal. By degrading the image quality of image data according to the evaluation, the interest of the user may be maintained.
In another embodiment, an image processing apparatus processes image data by reducing image color of the image data according to a result of evaluating an audio signal. By reducing the color of the image according to the evaluation, the interest of the user may be maintained.
In another embodiment, an image processing apparatus processes image data by discarding a portion of the image data according to a result of evaluating an audio signal. By discarding a portion of the image data according to the evaluation, the interest of the user may be maintained.
According to another embodiment of the present invention, the evaluation result is obtained by comparing the waveform of an audio signal with the waveform of comparison data provided beforehand. By comparing the waveform of the audio signal with the waveform of the comparison data, for example, the singing ability of a user may be evaluated and image processing may be conducted accordingly.
According to another embodiment of the present invention, the evaluation result is obtained by comparing a volume (amplitude) of the audio signal with a volume of comparison data provided beforehand. In this way, image processing may be conducted based on an evaluation of the volume of an audio signal.
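The two evaluation criteria above (waveform comparison and volume comparison) can be sketched as follows. This is an illustrative Python sketch only; the function names and the normalized scoring formulas are hypothetical and are not part of the disclosed apparatus, which leaves the concrete evaluation criterion implementation-dependent.

```python
def evaluate_waveform(signal, reference):
    """Score an input waveform against comparison data provided beforehand.

    Both arguments are equal-length sequences of samples. Returns a score
    in [0, 1], where 1 indicates a perfect match. (Hypothetical scoring
    formula, for illustration only.)
    """
    if len(signal) != len(reference):
        raise ValueError("signal and reference must be the same length")
    # Mean absolute difference between corresponding samples.
    diff = sum(abs(s - r) for s, r in zip(signal, reference)) / len(signal)
    # Normalize by the reference's mean absolute amplitude.
    scale = sum(abs(r) for r in reference) / len(reference) or 1.0
    return max(0.0, 1.0 - diff / scale)


def evaluate_volume(signal, reference):
    """Score how closely the input's volume (mean absolute amplitude)
    matches the volume of the comparison data."""
    vol_s = sum(abs(s) for s in signal) / len(signal)
    vol_r = sum(abs(r) for r in reference) / len(reference)
    return max(0.0, 1.0 - abs(vol_s - vol_r) / max(vol_r, 1e-9))
```

Either score can then drive the image processing described in the embodiments above.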
According to another embodiment of the present invention, the image data corresponds to code data encoded by the JPEG 2000 algorithm, and the image processing involves discarding a portion of codes of this code data. In this way, image processing is performed on image data in an encoded state.
According to another preferred embodiment, the image processing apparatus successively executes a procedure to process predetermined image data during a predetermined time period according to the evaluation result obtained during this predetermined time period, which predetermined image data are used to form an image to be displayed during a next predetermined time period. In this way, for example, in a karaoke system, input audio is progressively evaluated, and an image to be displayed is progressively processed accordingly.
In another embodiment, an image display system includes an image processing apparatus, an evaluation unit that evaluates the audio signal, and a display apparatus that displays an image based on the processed image data. Based on a result of evaluating an audio signal characteristic such as the singing ability of a user inputting his/her singing voice, image data may be processed in various ways that can attract the interest of the user.
In another embodiment, an image display apparatus includes an image processing apparatus that processes image data through successively executing the image processing procedure at predetermined time periods, an evaluation unit that evaluates the audio signal, and a display apparatus that displays an image corresponding to the processed image data in sync with the successive execution of the image processing procedure. In this way, for example, the voice of the user being input may be evaluated, and the evaluation of the singing ability of the user may be immediately reflected in the image being displayed in sync with the song being replayed, by processing image data in various ways that may attract the interest of the user.
In another embodiment, a computer readable program that is run on a computer includes a procedure for processing image data according to a result of evaluating a single audio signal. In this way, based on a result of evaluating the voice of a user being input, for example, image data may be processed in various ways that may attract the interest of the user. Since the evaluation is made with respect to a single audio signal, one embodiment of the present invention may be used by a single user.
In another embodiment, a computer readable program that is run on a computer includes a procedure for processing image data by degrading the image quality of the image data according to a result of evaluating an audio signal. By processing the image in order to degrade the image quality, the interest of the user may be maintained.
In another embodiment, a computer readable program that is run on a computer includes a procedure for processing image data by reducing image color of the image data according to a result of evaluating an audio signal. By processing the image data in order to reduce the image color, the interest of the user may be maintained.
In another embodiment, a computer readable program that is run on a computer includes a procedure for processing image data by discarding a portion of the image data according to a result of evaluating an audio signal. By processing the image data through discarding a portion of the image data, the interest of the user may be maintained.
The present invention according to another embodiment provides a storage medium that stores a program of the present invention.
JPEG 2000
First, quantization, code discarding and image quality control processes according to JPEG 2000 are described.
A precinct corresponds to a rectangular division unit (having a size that may be determined by the user) of a sub-band. More specifically, a precinct may correspond to a collection of three corresponding rectangular division units of the three sub-bands HL, LH, and HH, or one rectangular division unit of the LL sub-band. A precinct roughly represents a position within an image. It is noted that a precinct may have the same size as the sub-band, and a precinct may be further divided into a rectangular division unit (of a size that may be determined by the user) to generate a code block.
The code block is used as a unit for conducting bit-plane encoding on the coefficients of a quantized sub-band (one bit plane is decomposed into three sub-bit planes and encoded). Packets correspond to a portion of codes that are extracted from all the code blocks included in a precinct (e.g., a collection of codes corresponding to the three most significant bit planes of all the code blocks). It is noted that the term “portion” of codes may also refer to an “empty” state in which the packet contains no codes.
When the packets of all the precincts (i.e., all the code blocks, and all the sub-bands) are collected, a portion of the codes of the overall image (e.g., codes corresponding to the three most significant bit planes of the wavelet coefficients of the overall image) may be obtained, and this is referred to as a layer. Since a layer roughly represents a portion of the codes of bit planes of the overall image, the image quality may be improved as the number of layers to be decoded increases. In other words, a layer may be regarded as a unit for determining the image quality.
When all the layers are collected, codes of all the bit planes of the overall image may be obtained.
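As a simplified illustration of why decoding more layers improves image quality, the following Python sketch models one hypothetical layer per bit plane of a quantized coefficient: keeping fewer layers zeroes the less significant bit planes, leaving a coarser approximation of the coefficient. The function name and the one-layer-per-bit-plane assumption are illustrative only; JPEG 2000 itself allows flexible layer formation.

```python
def truncate_to_layers(coefficient, total_bits, layers_kept):
    """Keep only the `layers_kept` most significant bit planes of a
    quantized coefficient, zeroing the rest (hypothetically treating
    each bit plane as one layer, for illustration)."""
    if layers_kept >= total_bits:
        return coefficient
    drop = total_bits - layers_kept
    # Zero the least significant `drop` bit planes, preserving the sign.
    sign = -1 if coefficient < 0 else 1
    return sign * ((abs(coefficient) >> drop) << drop)
```

For example, an 8-bit coefficient of 181 (binary 10110101) truncated to its three most significant bit planes becomes 160 (binary 10100000): the coarse shape survives, while the fine detail carried by the lower layers is lost.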
It is noted that packets correspond to a portion of codes from one or more code blocks that are extracted and collected, and the rest of the unnecessary codes do not have to be generated as packets. For example, in
In this way, image quality control through code discarding may be conducted for each code block (and for each sub-bit plane). That is, the code block is used as the unit for conducting image quality control through code discarding. It is noted that the arrangement of packets is referred to as progression order.
In the following, embodiments of the present invention are described.
The server 104 transmits moving image or still image code data 111 accumulated therein to the client 103. The moving image or still image code data 111 used in this example are encoded according to a compression scheme, such as JPEG 2000 or Motion JPEG 2000, that allows code data to be edited in their encoded state without being decoded.
The client 103 includes a microphone 121 for inputting an audio signal, an amplifier 122 for amplifying this audio signal, and a speaker 123 for outputting the amplified audio signal.
Also, the client 103 includes an evaluation unit 124 for evaluating an audio signal, such as a user's voice or the sound of an instrument input to the microphone 121, based on a predetermined criterion. For example, the evaluation unit 124 may compare the waveform of the input audio signal with the waveform of comparison data that are stored beforehand, and evaluate the input audio signal based on the absolute value difference between the waveforms. In another example, to augment the amusement factor, the audio volume may be used in evaluating the singing ability (e.g., evaluation using comparison data pertaining to audio volume). The evaluation result obtained from the evaluation unit 124 is input to an inter-code transform unit 125. The inter-code transform unit 125 conducts image processing by discarding a portion of codes according to the evaluation result pertaining to the singing ability of the user, for example. After the unnecessary codes are discarded, the remaining code data are decoded at a decoder 126 and used by the display unit 127 to produce a moving image, for example.
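The operation of the inter-code transform unit described above can be sketched as follows: an evaluation score is mapped to a number of JPEG 2000 layers to retain, and the remaining layers are discarded without decoding the code data. All names and the linear score-to-layer mapping are hypothetical stand-ins for illustration; the actual mapping may be chosen freely.

```python
def layers_for_score(score, max_layers):
    """Map an evaluation score in [0, 1] to the number of layers to
    retain (hypothetical linear mapping: a better score keeps more
    layers, so the displayed image is of higher quality)."""
    score = min(max(score, 0.0), 1.0)
    return max(1, round(score * max_layers))


def discard_codes(packets_by_layer, score):
    """Given code packets grouped by layer (most significant layer
    first), discard the layers beyond what the score allows. The
    code data are edited in their encoded state, without decoding."""
    keep = layers_for_score(score, len(packets_by_layer))
    return packets_by_layer[:keep]
```

For instance, with four layers and a middling score of 0.5, only the two most significant layers would be retained and passed on to the decoder.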
In a case where the image display system 101 is used as a communications karaoke system, the moving image code data in the server 104 may be accompanied by audio data corresponding to the accompaniment of a song (and this audio data may also be compressed, encoded, and transmitted). In this case, the audio data (in a decoded state if initially encoded) are mixed with the voice input to the microphone 121 by the user, and output from the speaker 123 in sync with the moving image displayed at the display unit 127.
The bus 313 is connected, via predetermined interfaces, to a magnetic storage device 314 such as a hard disk, an input device 315 such as a keyboard and/or a mouse, a display apparatus 316, and a storage medium reading device 318 that reads a storage medium 317 such as an optical disk. Also, the bus 313 is connected to a predetermined communication interface 319 that establishes communication with the network 102. It is noted that various types of media may be used as the storage medium 317, which may correspond to an optical disk such as a CD or a DVD, a magneto-optical disk, or a flexible disk, for example. The storage medium reading device 318 may correspond to an optical disk device, a magneto-optical disk device, or a flexible disk device, for example, according to the type of storage medium 317 being used.
The client 103 and the server 104 are adapted to read one or more programs 320 from the storage medium 317 and install the programs 320 in the magnetic storage device 314. Programs may also be downloaded via the network 102 such as the Internet and installed. By installing these programs, the client 103 and the server 104 may be able to execute the various procedures including the image processing procedure described above. It is noted that the programs 320 may correspond to programs that are operated on a predetermined OS.
It is noted that in the client 103, the bus 313 is also connected to the microphone 121 and the amplifier 122 via predetermined interfaces.
By executing the processes according to the installed programs 320, the functions of the respective component parts such as the evaluation unit 124, the decoder 126, the inter-code transform unit 125, the editing unit 201, and the display unit 127 may be realized and an image may be displayed on the display apparatus 316 by the display unit 127.
As can be appreciated from the above descriptions, the image display system may be arranged to have various system configurations. In the example of
Accordingly, a moving image obtained from evaluation and image processing during time period t is displayed during the next time period t. In a case of displaying one still image, when obtaining an image from evaluation and image processing during time period t for display during the next time period t, the image being subjected to the image processing corresponds to the same image for each time period t, but the images being displayed at the respective time periods t may differ depending on their corresponding evaluations affecting the image processing. In other words, in this example, a procedure of making an evaluation and processing image data at time period t and displaying the resulting image during the next time period t is repeatedly performed with respect to the same image in cycles of time period t. Alternatively, in a case of successively displaying plural still images like a slide show, a procedure of making an evaluation and processing a still image during time period t, and displaying the processed image during the next time period t is performed for each still image in cycles of time period t. In any case, a procedure of evaluating a singing ability and processing image data during a certain time period t for displaying the processed image during the next time period t is successively performed. The image corresponding to the processed image data is displayed on the display apparatus 316 in sync with the successive execution of the above procedure.
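The successive procedure described above — evaluating during one time period t and displaying the resulting processed image during the next time period t — can be sketched as the following pipeline loop. The callables `evaluate`, `process`, and `display` are hypothetical stand-ins for the evaluation unit, the inter-code transform unit, and the display unit; this is an illustrative sketch, not the disclosed implementation.

```python
import time


def run_display_loop(frames, evaluate, process, display, period=1.0):
    """Each period t: display the frame processed during the previous
    period, evaluate the audio captured during the current period,
    and process the next frame according to that evaluation so it can
    be displayed during the following period."""
    processed = None
    for frame in frames:
        if processed is not None:
            display(processed)             # show the frame prepared last period
        score = evaluate()                 # evaluate audio input for this period
        processed = process(frame, score)  # prepare a frame for the next period
        time.sleep(period)                 # wait out the remainder of period t
    if processed is not None:
        display(processed)                 # show the final processed frame
```

With a single still image, `frames` would simply repeat the same image each cycle; with a slide show, each still image passes through the loop once.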
In the following, exemplary techniques for processing image data are described.
Also, it is noted that in the example of
In this case, a background image may be changed according to the singing ability of the user, or an image of the user may be captured (e.g., a still image or a moving image) and this image may be displayed as a portion of the background image, wherein the size of the image of the user changes according to the singing ability of the user. In the latter example, the code data corresponding to the captured image of the user are encoded using a resolution progressive scheme.
Also, it is noted that in the image display system of
In this case, a background image may be changed according to the singing ability of the user, or an image of the user may be captured (e.g., a still image or a moving image) and this image may be displayed as a portion of the background image, wherein the definition of the image of the user changes according to the singing ability of the user. In the latter example, the code data corresponding to the captured image of the user are encoded using an image quality progressive scheme.
Also, it is noted that in the image display system of
In this case, a background image may be changed according to the singing ability of the user, or an image of the user may be captured (e.g., a still image or a moving image) and this image may be displayed as a portion of the background image, wherein the color of the image of the user is reduced according to the singing ability of the user. In the latter example, the code data corresponding to the captured image of the user are encoded using a component progressive scheme.
Also, it is noted that in the image display system of
In this case, a background image may be changed according to the singing ability of the user, or an image of the user may be captured (e.g., a still image or a moving image) and this image may be displayed as a portion of the background image, wherein a portion of the code data of the image of the user may be discarded in response to a poor singing ability. In the latter example, the code data corresponding to the captured image of the user are encoded using a position progressive scheme.
In the examples of FIGS. 10˜13, the image processing is illustrated by alternating between two levels; however, a singing ability may be evaluated and categorized into three or more levels and the image processing may alternate between the three or more levels.
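Categorizing a singing ability into three or more levels, as mentioned above, can be sketched as a simple quantization of the evaluation score. The function name and the uniform level boundaries are hypothetical, for illustration only.

```python
def score_to_level(score, num_levels=3):
    """Categorize an evaluation score in [0, 1] into one of
    `num_levels` discrete processing levels (0 = worst,
    num_levels - 1 = best), using uniform level boundaries."""
    score = min(max(score, 0.0), 1.0)
    return min(int(score * num_levels), num_levels - 1)
```

Each level can then select a different degree of image processing (image size, image quality, color reduction, or amount of code data discarded).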
Also, the above examples are illustrated using the voice of a user as an example of the audio signal being input; however, the present invention is not limited to such an embodiment. For example, the input audio signal may correspond to the sound of an instrument. In such a case, the ability to play the instrument is evaluated, and the practice efforts (degree of progress) may be reflected in the image being displayed rather than as a numerical value, for example, so that the user takes greater interest in the output and thereby finds motivation to practice further.
Number | Date | Country | Kind |
---|---|---|---|
2003-196216 | Jul 2003 | JP | national |