This application claims the benefit of Korean Patent Application No. 10-2005-0055116, filed on Jun. 24, 2005, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference.
1. Field of the Invention
The present invention relates to audio signal processing, and more particularly, to a method and apparatus for generating a bitstream of an audio signal, in which an audio signal can be easily extended to a multichannel audio signal, the processing speed of an audio signal can be improved, and channel signals of an audio signal can be processed simultaneously, and an audio encoding/decoding method and apparatus using the method and apparatus.
2. Description of Related Art
The time/frequency mapping unit 100 converts an audio signal in a time domain into signals in a frequency domain. In the time domain, humans perceive little difference between signal characteristics, but in the frequency domain the converted signals range from perceivable to unperceivable in each frequency band according to a human psychoacoustic model. Thus, compression efficiency can be improved by varying the number of bits assigned to each frequency band.
The psychoacoustic modeling unit 110 calculates a masking threshold for each frequency band using a masking phenomenon of the converted signals in the frequency domain.
By using the masking threshold for each frequency band input from the psychoacoustic modeling unit 110, the data processing unit 120 performs signal processing to improve encoding efficiency while minimizing any change in sound quality that can be perceived by humans. The data processing unit 120 uses a signal processing method for improving encoding efficiency, such as temporal noise shaping (TNS), intensity stereo processing, perceptual noise substitution (PNS), or mid/side (M/S) stereo processing.
The quantizing unit 130 performs scalar quantization on frequency signals in each frequency band so that the magnitude of quantization noise in each frequency band is less than a corresponding masking threshold. Thus, humans cannot perceive the quantization noise even though the quantization noise is included in the audio signal. The bitstream generating unit 140 combines the quantized audio signal and information about the encoding to generate a bitstream that fits a predetermined data structure.
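The quantization constraint described above may be illustrated with a minimal sketch. It assumes a uniform scalar quantizer and the common model in which the noise power of such a quantizer is step**2 / 12; neither the quantizer type nor the names step_for_threshold and quantize_band are taken from the related art described here, and the threshold value is invented for illustration.

```python
import math

def step_for_threshold(threshold_power, margin=0.99):
    """Step size whose modelled noise power (step**2 / 12) stays below the
    masking threshold. The margin factor is an illustrative assumption."""
    return margin * math.sqrt(12.0 * threshold_power)

def quantize_band(coeffs, threshold_power):
    """Quantize the coefficients of one frequency band with a uniform step."""
    step = step_for_threshold(threshold_power)
    indices = [round(c / step) for c in coeffs]
    return indices, step

# Hypothetical band: three coefficients and a masking threshold of 0.01.
indices, step = quantize_band([0.9, -0.31, 0.05], 0.01)
reconstructed = [i * step for i in indices]
```

A larger masking threshold yields a larger step, so fewer bits are spent on bands in which the noise is masked.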
When the audio signal to be encoded is a multichannel audio signal, it is generally encoded in predetermined units of encoding, instead of in channel units. A predetermined unit of encoding is a group of one or more channel signals that are encoded together.
For example, when an audio signal includes 5 channel signals, i.e., a stereo channel signal, a mono channel signal, a center channel signal, a surround left channel signal, and a surround right channel signal, the predetermined units of encoding are the stereo channel signal and the mono channel signal that are encoded together, the center channel signal, and the surround left channel signal and the surround right channel signal that are encoded together. Since two channel signals have high redundancy, encoding efficiency can be improved by encoding the two channel signals at the same time.
Conventional audio devices are classified into stereo players and multichannel players. The stereo player is developed to also provide a mono playback function. The multichannel player is developed to also provide a stereo playback function. A bitstream extension method for applying a data structure for generating bitstreams of mono/stereo audio signals to multichannel audio signals is provided in ISO/IEC 13818-3.
When a multichannel audio signal is encoded/decoded using the conventional bitstream data structure, it is difficult to determine whether an audio signal included in a bitstream is a multichannel signal including other channel signals in addition to stereo/mono channel signals. As a result, the audio signal cannot be efficiently processed according to the user's demand or the performance of an audio player. Moreover, since the maximum frame length is predetermined, the total frame length cannot be efficiently used.
An aspect of the present invention provides a method and apparatus for generating a bitstream, in which channel information of an encoded audio signal can be easily detected from a bitstream, and an audio encoding/decoding method and apparatus using the method and apparatus.
An aspect of the present invention also provides a method and apparatus for generating a bitstream, in which the total frame length of a bitstream can be set variably according to the characteristics of an audio signal, and an audio encoding/decoding method and apparatus using the method and apparatus.
An aspect of the present invention also provides a method and apparatus for generating a bitstream, in which a region where each of the encoded audio signals is located is easily detected from a bitstream to simultaneously decode audio signals corresponding to units of encoding, and an audio encoding/decoding method and apparatus using the method and apparatus.
According to an aspect of the present invention, there is provided a method of generating a bitstream of an audio signal using an encoded audio signal and encoding information. The method includes generating a flag indicating whether the encoded audio signal is a multichannel audio signal, generating a bitstream header including the generated flag, and generating the bitstream using the generated bitstream header and the encoded audio signal.
According to another aspect of the present invention, there is provided a method of generating a bitstream using an encoded signal and encoding information. The method includes determining the possible maximum frame length of the bitstream to determine the number of bits assigned to data having frame length information according to the determined maximum frame length, generating a frame length of the bitstream as signal data encoded with the determined number of bits, and generating the bitstream using the generated frame length information data and the encoded signal.
According to still another aspect of the present invention, there is provided an apparatus for generating a bitstream of an audio signal using an encoded audio signal and encoding information. The apparatus includes a flag generating unit, a header generating unit, and a combining unit. The flag generating unit generates a flag indicating whether the encoded audio signal is a multichannel audio signal. The header generating unit generates a bitstream header including the generated flag. The combining unit generates the bitstream using the generated bitstream header and the encoded audio signal.
According to yet another aspect of the present invention, there is provided an apparatus for generating a bitstream using an encoded signal and encoding information. The apparatus includes a number-of-bit determining unit, a frame length data generating unit, and a combining unit. The number-of-bit determining unit determines the possible maximum frame length of the bitstream to determine the number of bits assigned to data having frame length information according to the determined maximum frame length. The frame length data generating unit generates a frame length of the bitstream as signal data encoded with the determined number of bits. The combining unit generates the bitstream using the generated frame length information data and the encoded signal.
According to yet another aspect of the present invention, there is provided a data structure of a bitstream of an encoded audio signal. The data structure includes a bitstream header including information about whether the encoded audio signal is a multichannel audio signal, frame length information data having frame length information of the bitstream, and data of the encoded audio signal.
According to yet another aspect of the present invention, there is provided a method of encoding an audio signal. The method includes encoding channel signals included in the audio signal in units of encoding, generating a bitstream header including a flag indicating whether the encoded audio signal is a multichannel audio signal, and generating a bitstream using the generated bitstream header and the encoded audio signal.
According to yet another aspect of the present invention, there is provided an apparatus for encoding an audio signal. The apparatus includes an encoding unit, a header generating unit, and a bitstream generating unit. The encoding unit encodes channel signals included in the audio signal in units of encoding. The header generating unit generates a bitstream header including a flag indicating whether the encoded audio signal is a multichannel audio signal. The bitstream generating unit generates a bitstream using the generated bitstream header and the encoded audio signal.
According to yet another aspect of the present invention, there is provided a method of decoding an input bitstream of an audio signal. The method includes checking if the audio signal is a multichannel signal using a flag included in a bitstream header of the bitstream and decoding the audio signal according to whether the audio signal is a multichannel signal or not.
According to yet another aspect of the present invention, there is provided an apparatus for decoding an input bitstream of an audio signal. The apparatus includes a multichannel detecting unit and a decoding unit. The multichannel detecting unit checks if the audio signal is a multichannel signal using a flag included in a bitstream header of the bitstream. The decoding unit decodes the audio signal according to whether the audio signal is a multichannel signal or not.
According to yet another aspect of the present invention, there is provided a computer-readable recording medium having recorded thereon a program for implementing the method of generating a bitstream of the audio signal and the audio encoding/decoding method.
Additional and/or other aspects and advantages of the present invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
The above and/or other aspects and advantages of the present invention will become apparent and more readily appreciated from the following detailed description, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
The multichannel determining unit 400 detects channel information of an input audio signal to determine whether the input audio signal includes only stereo/mono channel signals or is a multichannel signal including other channel signals such as a center channel signal or surround left/right channel signals in addition to the stereo/mono channel signals. It is advantageous that the multichannel determining unit 400 determines whether to encode the audio signal as a multichannel signal using encoding information input by a user through a user input unit (not shown). For example, when the user desires to encode the audio signal as the stereo/mono channel signals, it is advantageous that the multichannel determining unit 400 determines the input audio signal to be the stereo/mono channel signals even when the input audio signal includes the stereo/mono channel signals, the center channel signal, and the surround left/right channel signals.
The encoding unit 410 receives the channel count information and the input audio signal from the multichannel determining unit 400 and encodes the input audio signal based on the received channel information. When the input audio signal is a multichannel signal, the encoding unit 410 divides channel signals included in the input audio signal into a predetermined number of units of encoding and performs encoding in units of encoding. When the input audio signal includes 5 channel signals, i.e., a stereo channel signal, a mono channel signal, a center channel signal, a surround left channel signal, and a surround right channel signal, it is advantageous that the units of encoding are the stereo/mono channel signals, the center channel signal, and the surround left/right channel signals.
When the input audio signal is a multichannel signal, the encoding unit 410 encodes the stereo/mono channel signals first and then encodes the other extension channel signals in units of encoding. The extension channel signals include extension channel type information indicating an audio channel configuration. It is advantageous that the extension channel type information is expressed by a channel configuration index. It is advantageous that the channel configuration index has a 3-bit field indicating an audio output channel configuration as follows. The channel configuration index prescribes the number of channels in channel-to-speaker mapping.
A method of encoding an extension channel signal includes encoding the extension channel signal, encoding additional information for the encoding, encoding the extension channel type information indicating the audio channel configuration, and then encoding the length of the extension channel signal.
Referring to
The flag generating unit 500 receives, from the multichannel determining unit 400, the channel count information indicating whether the input audio signal is a multichannel signal and generates a flag MC_PRESENT carrying that information in operation 920. It is advantageous that the flag generating unit 500 generates the flag MC_PRESENT as 0 when the audio signal includes only stereo/mono channel signals and generates the flag MC_PRESENT as 1 when the audio signal includes other channel signals in addition to stereo/mono channel signals.
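The rule for setting the flag MC_PRESENT may be sketched as follows. The channel labels and the function name mc_present are hypothetical and serve only to illustrate the 0/1 rule stated above.

```python
# Hypothetical labels for the base channels; the description does not name them this way.
BASE_CHANNELS = {"stereo", "mono"}

def mc_present(channel_labels):
    """Return 1 if any channel other than stereo/mono is present, else 0."""
    return int(any(label not in BASE_CHANNELS for label in channel_labels))

flag_stereo_only = mc_present(["stereo"])
flag_multichannel = mc_present(["stereo", "center", "surround_left", "surround_right"])
```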
The frame length data generating unit 510 generates data FRAME_LENGTH having frame length information of a generated bitstream in operation 930. It is advantageous that the data FRAME_LENGTH has a variable number of bits and includes a flag having information about the extension of the number of bits when the number of bits of the data FRAME_LENGTH is extended to a number greater than the basic number of bits.
As illustrated in
It is advantageous that the frame length data generating unit 510 calculates the maximum frame length using the number of channels of the audio signal and a required compression rate prior to encoding of the audio signal and then determines the number of bits of the data FRAME_LENGTH according to the calculated maximum frame length.
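One possible layout of a variable-width FRAME_LENGTH field is sketched below. It is an illustration only: the basic width of 9 bits, the 4-bit extension chunks, and the chaining of one extension flag per chunk are assumptions, not the syntax of the embodiment. As in the description, a flag value of 1 means that the field is extended beyond the basic number of bits.

```python
BASE_BITS = 9  # assumed basic number of bits
EXT_BITS = 4   # assumed width of each extension chunk

def encode_frame_length(length):
    """Return a bit string: basic field, then (flag=1, chunk) pairs, then flag=0."""
    if length < 0:
        raise ValueError("frame length must be non-negative")
    bits = format(length & ((1 << BASE_BITS) - 1), f"0{BASE_BITS}b")
    rest = length >> BASE_BITS
    while rest:
        bits += "1" + format(rest & ((1 << EXT_BITS) - 1), f"0{EXT_BITS}b")
        rest >>= EXT_BITS
    return bits + "0"

def decode_frame_length(bits):
    """Return (frame length, number of bits consumed)."""
    pos = BASE_BITS
    value = int(bits[:pos], 2)
    shift = BASE_BITS
    while bits[pos] == "1":  # extension flag set: another chunk follows
        chunk = int(bits[pos + 1:pos + 1 + EXT_BITS], 2)
        value |= chunk << shift
        shift += EXT_BITS
        pos += 1 + EXT_BITS
    return value, pos + 1

short_field = encode_frame_length(300)   # fits in the basic field
long_field = encode_frame_length(1000)   # needs one extension chunk
```

In this sketch a short frame costs only the basic field plus one flag bit, while longer frames, for example those carrying more channels, extend the field as needed.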
The unit length data generating unit 520 generates data ELEMENT_LENGTH having information about the length of encoded data of each of the encoding units of the audio signal in operation 940. For example, when the encoding units of the audio signal are stereo/mono channel signals, a center channel signal, and surround left/right channel signals, the unit length data generating unit 520 generates data ELEMENT_LENGTH having information about the length of the encoded stereo/mono channel signals, the length of the encoded center channel signal, and the length of the encoded surround left/right channel signals.
The offset data generating unit 530 generates data SCALABLE_HEADER having information about a layer that is the reproduction unit of each of the encoding units of the audio signal to distinguish the layer within a bitstream in operation 950. It is advantageous that the data SCALABLE_HEADER has an offset value for each of the layers included in the encoding units. When the audio signal includes only stereo/mono channel signals, offset information of layers included in the encoded stereo/mono channel signals may be calculated as follows.
layer_offset[n]=layer_offset[n−1]+FRAME_LENGTH/total_layer_num (1),
where layer_offset[n] indicates an offset value of an nth layer, FRAME_LENGTH indicates a total frame length, and total_layer_num indicates the total number of layers. It is advantageous that an offset value layer_offset[1] of a first layer is set to 0.
When the audio signal includes extension channel signals in addition to the stereo/mono channel signals, offset information of layers included in each of the encoding units may be calculated as follows.
layer_offset[n]=layer_offset[n−1]+ELEMENT_LENGTH/total_layer_num (2),
where layer_offset[n] indicates an offset value of an nth layer, ELEMENT_LENGTH indicates the length of encoded data of each of the encoding units, and total_layer_num indicates the total number of layers included in the encoding units.
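Equations (1) and (2) share the same recurrence and differ only in the length that is divided among the layers. The following minimal sketch applies the recurrence to both cases. The use of integer division, the example lengths, and the unit names are assumptions made for illustration.

```python
def layer_offsets(total_length, total_layer_num):
    """Apply layer_offset[n] = layer_offset[n-1] + total_length / total_layer_num.
    List index 0 corresponds to the first layer, whose offset is 0."""
    if total_layer_num < 1:
        raise ValueError("total_layer_num must be at least 1")
    step = total_length // total_layer_num  # assumption: integer division
    offsets = [0]
    for _ in range(1, total_layer_num):
        offsets.append(offsets[-1] + step)
    return offsets

# Equation (1): only stereo/mono channel signals, so FRAME_LENGTH is divided.
stereo_only = layer_offsets(1200, 4)

# Equation (2): each encoding unit divides its own ELEMENT_LENGTH.
element_lengths = {"stereo/mono": 600, "center": 150, "surround": 300}
per_unit = {name: layer_offsets(length, 3) for name, length in element_lengths.items()}
```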
The header generating unit 540 generates a bitstream header using the generated data MC_PRESENT, FRAME_LENGTH, ELEMENT_LENGTH, and SCALABLE_HEADER in operation 960. The bitstream generating unit 550 combines the encoded audio signal and the generated bitstream header, thereby generating a bitstream of the audio signal in operation 970.
As illustrated in
Examples of a syntax created for the bitstream header are as follows.
According to the above syntaxes, data FRAME_LENGTH having information about the total frame length and a flag MC_PRESENT having information about whether an audio signal is a multichannel signal are generated. When the flag MC_PRESENT is 1, i.e., the audio signal is a multichannel signal, data ELEMENT_LENGTH having information about the length of encoded data of each of the encoding units of the audio signal is generated. Then data SCALABLE_HEADER having offset information about a layer that is the reproduction unit of each of the encoding units is generated.
The above syntax is created for variably setting the number of bits of the data FRAME_LENGTH having frame length information and the number of bits of the data ELEMENT_LENGTH having information about the length of encoded data of each of the encoding units of the audio signal.
As mentioned above, when more bits than the basic number of bits are assigned to the data FRAME_LENGTH, LengthEnd_flag of the above syntax is set to 1.
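The order in which the header fields are generated may be sketched as follows. For brevity the sketch uses a fixed 16-bit width for every length and offset field, which is an assumption and does not reproduce the variable-width fields of the embodiment. The function name build_header and the example values are likewise hypothetical.

```python
def build_header(frame_length, element_lengths, layer_offsets, width=16):
    """Pack the header as a bit string in the order FRAME_LENGTH, MC_PRESENT,
    ELEMENT_LENGTH (only when MC_PRESENT is 1), SCALABLE_HEADER offsets."""
    mc_present = 1 if element_lengths else 0
    fields = [format(frame_length, f"0{width}b"), str(mc_present)]
    if mc_present:
        fields += [format(n, f"0{width}b") for n in element_lengths]
    fields += [format(o, f"0{width}b") for o in layer_offsets]
    return "".join(fields)

# Stereo/mono only: no ELEMENT_LENGTH fields are written.
stereo_header = build_header(1000, [], [0, 500])
# Multichannel: one ELEMENT_LENGTH per encoding unit precedes the offsets.
mc_header = build_header(1000, [600, 150, 250], [0, 300])
```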
The multichannel detecting unit 1020 reads a flag MC_PRESENT included in a bitstream header of an input bitstream to check if an audio signal included in the bitstream is a multichannel signal in operation 1100. The multichannel detecting unit 1020 may determine that the audio signal includes only stereo/mono channel signals when the flag MC_PRESENT is 0 and determine that the audio signal includes other channel signals in addition to the stereo/mono channel signals when the flag MC_PRESENT is 1.
The frame length detecting unit 1030 reads data FRAME_LENGTH included in the bitstream header of the bitstream to detect the total frame length of the bitstream in operation 1110. The frame length detecting unit 1030 may read the flags indicating whether the number of bits of the data FRAME_LENGTH is extended. From these flags it checks whether the number of bits equals the basic number of bits or is extended, and by how many bits, and then detects the total frame length of the input bitstream from the data FRAME_LENGTH.
If the multichannel detecting unit 1020 determines that the audio signal included in the bitstream is a multichannel signal, the unit length detecting unit 1040 reads data ELEMENT_LENGTH included in the bitstream header of the bitstream and detects the length of encoded data of each of the encoding units included in the bitstream in operation 1120. The layer information detecting unit 1050 reads data SCALABLE_HEADER included in the bitstream header of the bitstream and detects offset information about layers included in the bitstream in operation 1130.
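Operations 1100 through 1130 may be sketched as a sequential read of the header. The sketch assumes fixed 16-bit fields and assumes that the number of encoding units and the number of layers are known to the reader in advance; the class BitReader and the function parse_header are hypothetical names.

```python
class BitReader:
    """Reads unsigned integers from a string of '0'/'1' characters."""
    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read(self, n):
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("bitstream truncated")
        self.pos += n
        return int(chunk, 2)

def parse_header(bits, num_units, num_layers, width=16):
    """Read FRAME_LENGTH, MC_PRESENT, ELEMENT_LENGTH (if multichannel), offsets."""
    reader = BitReader(bits)
    header = {"frame_length": reader.read(width), "mc_present": reader.read(1)}
    header["element_lengths"] = (
        [reader.read(width) for _ in range(num_units)] if header["mc_present"] else []
    )
    header["layer_offsets"] = [reader.read(width) for _ in range(num_layers)]
    return header

# Hand-built example header: frame length 1000, multichannel, 3 units, 2 layers.
example_bits = (
    format(1000, "016b") + "1"
    + "".join(format(n, "016b") for n in [600, 150, 250])
    + "".join(format(o, "016b") for o in [0, 300])
)
parsed = parse_header(example_bits, num_units=3, num_layers=2)
```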
The decoding unit 1010 decodes audio data included in the bitstream using information about the unit length data and the bitstream detected by the bit-unpacking unit 1000 in operation 1140.
If the multichannel detecting unit 1020 determines that the audio signal included in the bitstream is a multichannel signal, the decoding unit 1010 may decode only a channel signal desired by a user using information about the length of each of the encoding units detected from the data ELEMENT_LENGTH. For example, when the bitstream includes an audio signal encoded in units of stereo/mono channel signals, a center channel, and surround left/right channel signals, only a user-desired signal among the three encoded signals may be decoded and reproduced using the detected length of each of the stereo/mono channel signals, the center channel, and the surround left/right channel signals. If an audio player including the audio decoder according to the present invention can play only some of the audio channel signals included in the bitstream, e.g., stereo/mono channel signals, the decoding unit 1010 may be controlled to decode only the stereo/mono channel signals that can be played by the audio player using the information about the length of each of the encoding units.
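The selective decoding described above relies on locating each encoding unit from its length. A minimal sketch follows, assuming that the lengths are expressed in bytes and that the encoding units are stored contiguously in header order; both are assumptions, and the function names are hypothetical.

```python
def split_units(payload, element_lengths):
    """Slice the payload into encoding units using their ELEMENT_LENGTH values."""
    units, start = [], 0
    for length in element_lengths:
        units.append(payload[start:start + length])
        start += length
    return units

def select_units(payload, element_lengths, wanted):
    """Return only the encoding units whose indices are listed in wanted."""
    units = split_units(payload, element_lengths)
    return [units[i] for i in wanted]

# Hypothetical payload: unit 0 = stereo/mono, unit 1 = center, unit 2 = surround.
payload = b"SSSSSSCCRRRR"
stereo_only = select_units(payload, [6, 2, 4], wanted=[0])
```

A player limited to stereo playback could pass only the first unit to its decoder and skip the remaining bytes without parsing them.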
The decoding unit 1010 may simultaneously decode encoded signals included in the bitstream using the information about the length of each of the encoding units detected from the data ELEMENT_LENGTH.
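Because the boundaries of every encoding unit are known from the data ELEMENT_LENGTH before any unit is decoded, the units can be handed to separate workers. The sketch below uses a thread pool from the Python standard library; decode_unit is a stand-in that merely reverses the bytes, not an audio decoder, and byte-based contiguous units are again an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def decode_unit(data):
    """Stand-in for a real per-unit decoder; it only reverses the bytes."""
    return data[::-1]

def decode_all(payload, element_lengths):
    """Decode every encoding unit concurrently; results keep header order."""
    bounds, start = [], 0
    for length in element_lengths:
        bounds.append((start, start + length))
        start += length
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda b: decode_unit(payload[b[0]:b[1]]), bounds))

decoded = decode_all(b"abcdef", [3, 1, 2])
```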
Embodiments of the present invention include computer-readable code on a computer-readable recording medium. A computer-readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of computer-readable recording media include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves.
According to the above-described embodiments of the present invention, a flag having information about whether an audio signal is a multichannel signal is included in a bitstream header of a bitstream, thereby allowing for efficient and rapid encoding/decoding. Furthermore, by variably setting the number of bits of data having frame length information of a bitstream, it is possible to improve encoding/decoding efficiency and easily increase the number of audio channel signals that can be processed at the same time.
Although a few embodiments of the present invention have been shown and described, the present invention is not limited to the described embodiments. Instead, it would be appreciated by those skilled in the art that changes may be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
10-2005-0055116 | Jun 2005 | KR | national |