Electronic messaging is now commonplace. Electronic mail (e-mail), for example, is a ubiquitous form of communication between users of computer systems or other devices linked together via a wired or wireless switched network data link such as the Internet or an intranet. E-mails typically include or attach typewritten text. Another example of current electronic messaging is video e-mail (v-mail). Typically, a v-mail includes or attaches a message including audio data and corresponding video data sent by a user. Often, the audio data is a digital recording of the user's voice, and the video data represents a series of images of the user captured as the voice is recorded. A computer system or other device receiving such a v-mail may play back the message attached or included therein by displaying a sequence of images from the video data and generating audio from the corresponding audio data. Typically, the images are displayed at 30 frames per second, and corresponding audio is generated at the same rate (e.g., a normal rate) at which the user's voice was originally recorded.
Video data, including that of v-mail messages, if not compressed, requires a large amount of data transfer bandwidth for its transmission between source and destination computer systems or other similar devices. Likewise, audio data, if not compressed, also requires a large amount of data transfer bandwidth. Various well-known video and audio compression algorithms are applied to video and audio data, respectively, to accommodate the limited transfer bandwidth between computer systems. In general, different video compression algorithms exist for still images and for moving images (a sequential display of images). Intraframe compression algorithms compress data within a still image or single frame by exploiting spatial redundancies within the frame. Interframe compression algorithms compress multiple frames, i.e., motion video, by exploiting the temporal redundancy between the frames. Interframe compression methods are used exclusively for motion video, either alone or in conjunction with intraframe compression methods.
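The distinction between intraframe and interframe compression can be illustrated with a minimal sketch. It is not any particular codec: run-length encoding stands in for exploiting spatial redundancy within one frame, and a list of changed pixels stands in for exploiting temporal redundancy between frames. Frames are assumed to be flat lists of pixel values, and all function names are hypothetical.

```python
def encode_intraframe(frame):
    """Run-length encode one frame: spatial redundancy within the frame."""
    runs = []
    for value in frame:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(value, count) for value, count in runs]


def decode_intraframe(runs):
    """Expand (value, count) runs back into a flat frame."""
    frame = []
    for value, count in runs:
        frame.extend([value] * count)
    return frame


def encode_interframe(prev_frame, frame):
    """Keep only pixels that differ from the previous frame: temporal redundancy."""
    return [(i, new) for i, (old, new) in enumerate(zip(prev_frame, frame)) if old != new]


def decode_interframe(prev_frame, delta):
    """Rebuild a frame from the previous frame plus the changed pixels."""
    frame = list(prev_frame)
    for i, new in delta:
        frame[i] = new
    return frame
```

When consecutive frames are nearly identical, the interframe delta is much smaller than either frame, which is why such methods suit motion video.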
The present invention relates to the playback of previously recorded audio and video data. More particularly, the present invention relates to a computer system, method, and computer readable medium storing instructions executable by a computer system for varying the playback rate of audio data as corresponding motion video data is displayed. In accordance with the present invention, the playback rate can be increased above or decreased below normal playback rates while maintaining the quality or tone of audio speech.
The present invention finds application with respect to audio data and corresponding video data received from a switched network such as the Internet. Additionally, the present invention finds application with respect to digitally recorded audio data and corresponding video data of movie clips, v-mail, self-study tapes, etc. Often, audio data and corresponding video data are received over the Internet in a compressed format. Before playback, the audio data and corresponding video data are decompressed. After decompression, first audio corresponding to a first portion of the decompressed audio data is generated at a first audio generation rate. Thereafter, second audio corresponding to a second portion of the decompressed audio data is generated at a second audio generation rate, which differs from the first audio generation rate. The tone of the second audio, however, is substantially equal to the tone of the first audio. The first and second audio are generated as decompressed video data is displayed in image frames.
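One way to picture generating audio at a different rate while keeping its tone is a plain overlap-add (OLA) time stretch, sketched below. This is a swapped-in textbook technique chosen for brevity, not the method of U.S. Pat. No. 5,873,059, and it can introduce audible artifacts that more refined methods avoid. The function names, the frame length of 512 samples, and the use of a Hann window are all assumptions made for illustration.

```python
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def time_stretch_ola(samples, rate, frame_len=512):
    """Change duration by roughly 1/rate without resampling.

    Short windowed frames are read from the input every `analysis_hop`
    samples and written to the output every `synth_hop` samples. Each
    frame is copied unmodified, so the local waveform (and hence the
    perceived pitch) is roughly retained while the duration changes.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    synth_hop = frame_len // 2
    analysis_hop = max(1, round(synth_hop * rate))
    window = hann(frame_len)
    n_frames = max(1, (len(samples) - frame_len) // analysis_hop + 1)
    out_len = (n_frames - 1) * synth_hop + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len  # summed window weights, used for normalization
    for k in range(n_frames):
        src = k * analysis_hop
        dst = k * synth_hop
        for i in range(frame_len):
            if src + i >= len(samples):
                break
            out[dst + i] += samples[src + i] * window[i]
            norm[dst + i] += window[i]
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]
```

Under these assumptions, a rate of 2.0 yields output about half as long as the input, and a rate of 0.5 yields output about twice as long. Simply playing the samples faster would shift the pitch instead.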
The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art, by referencing the accompanying drawings. The use of the same reference number throughout the several figures designates a like or similar element.
a-6d illustrate exemplary adjustments to the display of decompressed frames of video data in accordance with adjustments to the play back rate of decompressed audio data;
The present invention relates to adjusting the playback rate of digitally recorded audio data and corresponding video data of, for example, a v-mail message which has been transmitted via the Internet, an intranet or other wired or wireless data links (hereinafter referred to as a data link) between computer systems or similar devices. The present invention should not be limited to application to audio data and corresponding video data of a v-mail message. Rather, the present invention may find application to playback of any digitally recorded audio data and corresponding video data.
Typically, audio data and corresponding video data of a v-mail message are compressed before being transmitted to a destination computer system, or other similar device, via a data link. The present invention will be described with respect to audio data and corresponding video data transmitted between source and destination computer systems, it being understood that the present invention may have application to data transmitted between other devices. Prior to transmission, the audio data and corresponding video data are typically compressed by the source computer system in accordance with any one of several well-known audio and video compression algorithms, respectively. The compressed audio data and corresponding video data, upon receipt by the destination computer system, are decompressed for subsequent playback by any one of several well-known data decompression techniques.
Audio data, after decompression, may be played back using transducers (e.g., speakers), while video data may be played back using an image display device (e.g., a monitor). The speakers generate audio (e.g., voice sounds) corresponding to the decompressed audio data while the image display device displays a sequence of image frames corresponding to the decompressed video data. The image display device generates full motion video by displaying the image frames in sequence.
The present invention provides a computer system, a method, or a computer readable medium storing instructions executable by a computer system for increasing or decreasing the rate (measured with respect to normal rates) at which decompressed audio speech data is played back while corresponding video data is displayed.
As used herein, the term “microprocessor” generally describes the logic circuitry that responds to and processes basic instructions contained in a memory medium. The term “memory medium” includes an installation medium, e.g., a CD-ROM or floppy disks; a volatile computer system memory such as DRAM, SRAM, Rambus RAM, etc.; or a non-volatile memory such as optical storage or a magnetic medium, e.g., a hard drive. The term “memory” is used interchangeably with “memory medium” herein. The memory may comprise other types of memory or combinations thereof. In addition, the memory may be located in a computer system in which the instructions are executed, or may be located in a second computer system (e.g., computer system 106 in
Computer systems may take various forms. In general, computer systems may include a digital signal processor or application specific integrated circuit for performing distinct functions. Alternatively, computer systems can be broadly defined to encompass any device having a microprocessor that executes instructions from a memory medium. Instructions for implementing the present invention on a computer system can be received by the computer system via a carrier medium. The carrier medium may include the memory media or storage media described above in addition to a communication medium such as a network and/or wireless link which carries instructions as signals such as electrical or electromagnetic signals.
Referring again to
In one embodiment, the audio data and corresponding video data of a v-mail message received by computer system 102 are decompressed in accordance with one or more well-known decompression algorithms. Computer system 102 may include peripherals (not shown in
The computer system 102 may include an input/output (I/O) device which enables a user to moderate the rate or speed at which the decompressed audio is generated by the speakers as the image frames are displayed. More particularly, the computer system 102 may include an input/output device which receives commands to increase or decrease the speed or rate at which decompressed audio data is played back. As will be more fully described below, the increase or decrease in playback rate occurs with little or no loss of voice content. While the audio is generated at an increased or decreased rate, the voice tone of the audio remains substantially the same as the voice tone of the same audio when played back at a normal rate. In other words, the audio is generated at an increased or decreased speed without sounding like a “chipmunk.” U.S. Pat. No. 5,873,059, entitled Method And Apparatus For Decoding And Changing The Pitch Of An Encoded Speech Signal, describes a technique for increasing or decreasing the playback rate of audio while maintaining tone and is incorporated herein by reference. Also, as will be more fully described below, increasing or decreasing the rate at which decompressed audio is played back may also alter the display of corresponding decompressed video data.
With continuing reference to
The graphical user interface 404 may include a playback rate adjustment field or bar 402b for adjusting the rate at which decompressed audio data and corresponding video data are played back. N/P designates normal playback rate, F/P designates fast playback, and S/P designates slow playback. Even though the playback rate of audio is increased or decreased using field 402b, the tone or pitch of the resulting audio is substantially similar to that of audio generated at normal rates (e.g., the rate at which the audio was originally recorded). In one embodiment, the playback of the audio speech data above or below the normal rate employs techniques described in U.S. Pat. No. 5,873,059. Thus, audio generated at an increased or decreased rate (when compared with normal speed) will remain comprehensible to the user. As will be more fully described below, the display of the image frames will be adjusted to account for the increased or decreased rate of the audio generation.
The graphical user interface 404 may further include field 402c which may be used to pause the play back of decompressed audio data stored in memory 314 and corresponding image frames of data from memory 306. Lastly, the graphical user interface 404 may include a field 402d which may be used to fast reverse through data stored in memories 306 and 314 in much the same way as the fast forward field enables fast forwarding through the data described above.
Functions associated with fields, electronic buttons, or electronic bars 402a-402d may be initiated by pointing to and clicking, for example, buttons or bars 402a, 402c, and 402d with a cursor controlled by a mouse. The function associated with bar 402b can be implemented by moving bar 406 left or right using a cursor controlled by a mouse. In another embodiment, the graphical user interface may include fields for receiving numeric data. More particularly, the graphical user interface may include a field for receiving numerical data representing the rate at which decompressed audio and corresponding video data are played back.
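The two input styles described here, a movable bar and a numeric field, can both be reduced to a single playback rate value. The sketch below is illustrative only: the rate limits of 0.5 and 2.0, the convention that the bar position runs from -1.0 (slowest) through 0.0 (N/P) to 1.0 (fastest), and the function names are assumptions that do not appear in the description.

```python
MIN_RATE = 0.5  # hypothetical slowest playback (S/P end of the bar)
MAX_RATE = 2.0  # hypothetical fastest playback (F/P end of the bar)


def slider_to_rate(position):
    """Map a bar position in [-1.0, 1.0] to a playback rate.

    The mapping is exponential so that equal bar movements give equal
    speed ratios, with the centre position giving the normal rate 1.0.
    """
    p = max(-1.0, min(1.0, position))  # clamp out-of-range positions
    if p >= 0:
        return MAX_RATE ** p
    return MIN_RATE ** (-p)


def parse_rate_field(text):
    """Convert numeric field text to a playback rate clamped to the limits."""
    rate = float(text)  # raises ValueError for non-numeric input
    return max(MIN_RATE, min(MAX_RATE, rate))
```

Clamping in both functions keeps the requested rate inside the range in which the generated audio is expected to stay intelligible.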
While decompressed audio can be played back at an increased speed while maintaining tone or pitch, the increase has a limit.
However, when the audio generation rate increases to L with a corresponding change in tone or pitch (i.e., there is simply an increase in the speed at which audio is generated, with no further processing of the audio data to compensate for the resulting change in pitch), the audio quality falls below a threshold AT at which audio comprehension may become compromised. By contrast, where the audio data is processed in accordance with the techniques described in U.S. Pat. No. 5,873,059 prior to audio generation, the rate limit at which the audio degrades into incomprehensible sounds extends to L+1.
Typically, image frames of data stored in memory 306 are displayed on the monitor in sync with corresponding audio data in memory 314 when playback occurs at the normal rate. Normally, the image frames are displayed at a frequency of 30 frames per second. At the normal playback rate, each set of 30 image frames is displayed as a corresponding amount of audio data is played back. Thus, one second's worth of audio data is played back with each corresponding set of 30 image frames when playback occurs at normal speed.
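The correspondence between frames and audio at the normal rate is simple arithmetic, sketched below. The 30 frames per second figure comes from the description; the audio sample rate of 8000 samples per second in the usage note is a hypothetical value chosen only to make the numbers concrete, and the function name is illustrative.

```python
FRAMES_PER_SECOND = 30  # display rate stated in the description


def audio_span_for_frame(frame_index, sample_rate_hz):
    """Return the half-open range (start, end) of audio samples that
    accompany one image frame at the normal playback rate.

    Integer arithmetic is used so that consecutive spans tile the audio
    exactly, even when the sample rate is not a multiple of 30.
    """
    start = frame_index * sample_rate_hz // FRAMES_PER_SECOND
    end = (frame_index + 1) * sample_rate_hz // FRAMES_PER_SECOND
    return start, end
```

With the assumed 8000 samples per second, each frame accompanies 266 or 267 samples, and the 30 spans of one frame set together cover exactly one second of audio.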
As noted above, the playback speed of audio data may be increased or decreased in accordance with the present invention. To ensure an illusion of video continuity, the display of the image frames is adjusted in accordance with the change in the audio generation rate. For example, if the audio playback rate increases, then it may be desirable to omit displaying one or more frames of each 30 image frame set (or of every other 30 frame set) corresponding with the audio data played back. In this fashion, the 30 frames per second display rate is maintained.
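A minimal sketch of this adjustment follows. For each display slot at the fixed 30 frames per second, it picks the stored frame that matches the audio position, so frames are omitted when the rate exceeds 1.0. Repeating frames when the rate is below 1.0 is an assumption made here for symmetry rather than something stated above, and the function name is hypothetical.

```python
import math


def frames_to_display(n_frames, rate):
    """Return the stored frame index to show in each 30 fps display slot.

    rate > 1.0 skips frames (fast playback); rate < 1.0 repeats frames
    (slow playback, an assumed policy); rate == 1.0 shows every frame once.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    n_slots = math.ceil(n_frames / rate)  # display slots needed at 30 fps
    # Clamp to the last frame to guard against floating-point rounding.
    return [min(n_frames - 1, int(slot * rate)) for slot in range(n_slots)]
```

For instance, at a rate of 1.5 a set of 30 stored frames occupies 20 display slots, with every third frame omitted, while the display itself continues to run at 30 frames per second.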
As noted above, audio playback may be slowed below normal.
With an increase or decrease in the playback rate value, the output of clock divide circuit 702 increases or decreases in frequency, thereby increasing or decreasing the rate at which audio data address generator 704 generates sequential memory addresses. Additionally, the increased or decreased clock frequency signals the audio restore circuit to process received audio data in a manner which maintains tone so that the resulting generated audio is comprehensible.
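The relationship between the playback rate value, the divided clock, and the address generation rate can be modeled in software as a rough sketch. The base clock of 1,024,000 Hz and the normal audio rate of 8000 samples per second used in the usage note are hypothetical figures, as are the function names; the description does not specify how clock divide circuit 702 derives its divisor.

```python
def divisor_for_rate(base_clock_hz, normal_sample_rate_hz, rate):
    """Choose a clock divisor so the divided clock runs at
    normal_sample_rate_hz * rate (a smaller divisor gives a faster clock)."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return max(1, round(base_clock_hz / (normal_sample_rate_hz * rate)))


def addresses_generated(base_clock_hz, divisor, duration_s, start_address=0):
    """Sequential memory addresses produced in duration_s seconds when the
    address generator advances once per divided-clock tick."""
    ticks = int(base_clock_hz * duration_s) // divisor
    return range(start_address, start_address + ticks)
```

With the assumed figures, the divisor is 128 at the normal rate, giving 8000 addresses per second. Doubling the rate halves the divisor to 64 and doubles the address rate to 16000 per second.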
Although the present invention has been described in connection with several embodiments, the invention is not intended to be limited to the specific forms set forth herein. On the contrary, it is intended to cover such alternatives, modifications, and equivalents as can be reasonably included within the spirit and scope of the invention as defined by the appended claims.
Relation | Number | Date | Country
---|---|---|---
Parent | 09649852 | Aug 2000 | US
Child | 11438829 | May 2006 | US