This application claims priority from Korean Patent Application No. 10-2011-0092560, filed on Sep. 14, 2011 in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field
Apparatuses and methods consistent with exemplary embodiments relate to a signal processing method of down-mixing a plurality of channels, an encoding apparatus thereof, and a decoding apparatus thereof, and more particularly, to a signal processing method of down-mixing n channel signals to one mono-signal, an encoding apparatus thereof, and a decoding apparatus thereof.
2. Description of the Related Art
An encoding apparatus and a decoding apparatus for multi-channel input and output encode and decode an audio signal including a voice, music, or the like by using a predetermined codec, and transceive an encoded signal and a decoded signal. With respect to an audio codec, if there is one input/output channel, the channel is referred to as a mono-channel, if there are two input/output channels, the channels are referred to as stereo channels, and if there are three or more input/output channels, the channels are referred to as multi-channels.
The encoding apparatus that operates according to a multi-channel codec down-mixes n channel signals to m channel signals. Also, when the down-mixing is performed, a spatial parameter is extracted. The encoding apparatus encodes the down-mixed signals and the spatial parameter, and transmits a corresponding transport stream (TS) to the decoding apparatus.
In the down-mixing, in order to reduce the number of output channels, compared to the number of input channels, reverse one to two (R-OTT) conversion or reverse two to three (R-TTT) conversion is performed. Here, the R-OTT conversion indicates conversion in which two input signals are received and then one signal is output, and the R-TTT conversion indicates conversion in which three input signals are received and then two signals are output.
Referring to
As illustrated in
Afterward, n first mono-signals (e.g., ch11, ch12, ch13, and ch14) that are output from the n R-OTT converters (e.g., the R-OTT converters R-OTT1 through R-OTT4), respectively, are input again to n/2 R-OTT converters (e.g., the R-OTT converters R-OTT5 and R-OTT6). Each of the n/2 R-OTT converters (e.g., the R-OTT converter R-OTT5) generates a second mono-signal (e.g., ch21) by down-mixing the first mono-signals (e.g., ch11 and ch12) input thereto, and then generates a spatial parameter (e.g., P11) indicating a correlation between those first mono-signals.
Finally, the R-OTT converter R-OTT7 generates a final down-mixed signal M by down-mixing second mono-signals (e.g., ch21 and ch22), and then generates a corresponding spatial parameter (i.e., P21).
Whenever an R-OTT converted signal is restored, a decoding error occurs. As described above, in order to down-mix the eight input signals to the final down-mixed signal M, the R-OTT conversion is performed three times. Thus, when a signal that has undergone the R-OTT conversion three times is restored, the decoding error accumulates three times. Consequently, when the original input signals ch1 through ch8 are restored by using the final down-mixed signal M and the spatial parameters P1, P2, P3, P4, P11, P12, and P21, the decoding apparatus cannot restore the input signals ch1 through ch8 into their original forms because of the accumulated decoding error. In more detail, a signal magnitude difference and a phase difference occur between the restored signals and the original input signals ch1 through ch8, in proportion to the accumulated decoding error.
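The number of conversion stages and spatial parameters in such a pairwise structure can be illustrated with a minimal sketch. It assumes, purely for illustration, that each R-OTT conversion down-mixes a pair by plain summation and yields one spatial parameter; the function name r_ott_tree and the sample values are hypothetical and are not taken from the related art apparatus.

```python
def r_ott_tree(channels):
    """Pairwise down-mix a power-of-two number of channel values to one value.

    Returns (mono, stages, parameter_count). Each pairwise down-mix stands in
    for one R-OTT conversion and is assumed to yield one spatial parameter.
    """
    count = len(channels)
    if count < 1 or count & (count - 1):
        raise ValueError("this sketch expects a power-of-two channel count")
    level = list(channels)
    stages = 0
    parameter_count = 0
    while len(level) > 1:
        # Assumed down-mix rule: plain sum of each adjacent pair.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        parameter_count += len(level)
        stages += 1
    return level[0], stages, parameter_count


mono, stages, parameter_count = r_ott_tree([1, 2, 3, 4, 5, 6, 7, 8])
print(stages, parameter_count)  # 3 stages and 7 parameters for eight inputs
```

For eight inputs the sketch passes through three stages and counts seven parameters, matching the parameters P1 through P4, P11, P12, and P21 named above.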
As described above, when multi-channel signals are down-mixed several times by using the R-OTT conversion or the R-TTT conversion, a quality of a restored signal deteriorates due to a decoding error.
Thus, a method and an apparatus for preventing the signal quality deterioration that occurs in decoding are needed.
Exemplary embodiments provide a signal processing method capable of preventing signal quality deterioration that occurs in decoding, an encoding apparatus thereof, and a decoding apparatus thereof.
In more detail, exemplary embodiments provide a signal processing method capable of generating or processing a spatial parameter so as to allow a mono-signal to be exactly restored into original n channel input signals when the n channel input signals are down-mixed to the mono-signal, an encoding apparatus thereof, and a decoding apparatus thereof.
According to an aspect of an exemplary embodiment, there is provided a signal processing method performed by an encoding apparatus that down-mixes first through n channel signals to a mono-signal, the signal processing method including: generating a spatial parameter between a reference channel signal that is from among the first through n channel signals, and residual channel signals from among the first through n channel signals except for the reference channel signal; and encoding and transmitting the spatial parameter to a decoding apparatus.
The operation of generating the spatial parameter may include operations of: generating a summation signal by summing the residual channel signals; and generating the spatial parameter by using a correlation between the summation signal and the reference channel signal.
The operation of generating the spatial parameter may include an operation of generating n spatial parameters by using each of the first through n channel signals as the reference channel signal.
The signal processing method may further include an operation of receiving the encoded n spatial parameters and the encoded mono-signal, wherein the receiving is performed by the decoding apparatus.
The signal processing method may further include an operation of restoring the first through n channel signals by using the n spatial parameters and the mono-signal.
The spatial parameter may include an angle parameter indicating a predetermined angle value that denotes a correlation between a signal magnitude of the reference channel signal and signal magnitudes of the residual channel signals.
The operation of generating the spatial parameter may include an operation of generating first through n angle parameters by using each of the first through n channel signals as the reference channel signal, wherein the first through n angle parameters indicate a correlation between a signal magnitude of each of the first through n channel signals that are reference channel signals, and signal magnitudes of the residual channel signals.
Total summation of the first through n angle parameters may be converged to a predetermined value, and the operation of generating the spatial parameter may include an operation of generating the spatial parameter including a k angle residual parameter used to calculate a k angle parameter and angle parameters from among the first through n angle parameters except for the k angle parameter.
The operation of generating the spatial parameter may include operations of: predicting a value of the k angle parameter from among the first through n angle parameters; comparing the predicted value of the k angle parameter with an original value of the k angle parameter; and generating a difference value between the predicted value of the k angle parameter and the original value of the k angle parameter as the k angle residual parameter.
The signal processing method may further include operations of: receiving the spatial parameter including the k angle residual parameter and the angle parameters from among the first through n angle parameters except for the k angle parameter, wherein the receiving is performed by the decoding apparatus; and restoring the k angle parameter by using the received spatial parameter and the predetermined value.
The operation of restoring the k angle parameter may include an operation of subtracting the value of the angle parameters from among the first through n angle parameters except for the k angle parameter from the predetermined value, obtaining a value by compensating for the value of the k angle residual parameter to a value resulting from the subtracting, and then generating the obtained value as the k angle parameter.
According to an aspect of another exemplary embodiment, there is provided a signal processing method performed by an encoding apparatus that down-mixes first through n channel signals to a mono-signal, the signal processing method including: generating a spatial parameter by using a correlation between a reference channel signal that is from among the first through n channel signals, and the mono-signal; and encoding and transmitting the spatial parameter to a decoding apparatus.
According to an aspect of another exemplary embodiment, there is provided an encoding apparatus down-mixing first through n channel signals to a mono-signal, the apparatus including: a down-mixing unit for generating a spatial parameter between a reference channel signal that is from among the first through n channel signals, and residual channel signals from among the first through n channel signals except for the reference channel signal; and an encoder for encoding and transmitting the spatial parameter to a decoding apparatus.
According to an aspect of another exemplary embodiment, there is provided a decoding apparatus including: an inverse-multiplexing unit for receiving a transport stream (TS), and separating a spatial parameter that is encoded; a spatial parameter decoding unit for decoding the spatial parameter; and an up-mixing unit for decoding a mono-signal generated by down-mixing and encoding first through n channel signals, and restoring the first through n channel signals by using the decoded mono-signal and the decoded spatial parameter, wherein the spatial parameter includes at least one of a first spatial parameter between a reference channel signal that is from among the first through n channel signals, and residual channel signals from among the first through n channel signals except for the reference channel signal, and a second spatial parameter between the reference channel signal and the mono-signal.
According to an aspect of another exemplary embodiment, there is provided a decoding method including: decoding an encoded spatial parameter; decoding a mono-signal generated by down-mixing and encoding first through n channel signals; and restoring the first through n channel signals using the decoded mono-signal and the decoded spatial parameter, wherein the spatial parameter includes at least one of a first spatial parameter between a reference channel signal and residual channel signals and a second spatial parameter between the reference channel signal and the mono-signal, the reference channel signal and the residual channel signals being from among the first through n channel signals.
The above and other features and advantages will become more apparent by describing in detail exemplary embodiments with reference to the attached drawings in which:
Hereinafter, a signal processing method, an encoding apparatus, and a decoding apparatus according to one or more exemplary embodiments will be described in detail by explaining exemplary embodiments with reference to the attached drawings.
Expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list.
A spatial parameter contains information used to restore a down-mixed signal into original input channel signals. In more detail, the spatial parameter is generated by using a correlation between the input channel signals, and may broadly include a parameter indicating a signal level difference between the input channel signals, and a parameter indicating the correlation between the input channel signals.
Hereinafter, the parameter indicating the signal level difference between the input channel signals is referred to as ‘first parameter’. In more detail, the first parameter may include a channel level difference (CLD) parameter. The parameter which indicates the correlation, e.g., a similarity, between the input channel signals is referred to as ‘second parameter’ hereinafter. In more detail, the second parameter may include at least one of an inter-channel correlation (ICC) parameter, an overall phase difference (OPD) parameter, and an inter-channel phase difference (IPD) parameter.
Referring to
The encoding apparatus 200 down-mixes and encodes first through n channel signals ch1 through chn to a mono-signal DM.
The down-mixing unit 210 may receive the first through n channel signals ch1 through chn that are multi-channel signals and may generate a spatial parameter between a reference channel signal that is from among the first through n channel signals ch1 through chn, and residual channel signals from among the first through n channel signals ch1 through chn except for the reference channel signal. Hereinafter, a signal obtained by summing the residual channel signals from among the first through n channel signals ch1 through chn except for the reference channel signal is referred to as a ‘first summation signal’. Also, the spatial parameter between the reference channel signal and the first summation signal is referred to as a ‘first spatial parameter’ hereinafter. That is, the down-mixing unit 210 may generate the first spatial parameter between the reference channel signal and the first summation signal.
Also, the down-mixing unit 210 may generate a spatial parameter between the first through n channel signals ch1 through chn and the reference channel signal that is from among the first through n channel signals ch1 through chn. Hereinafter, a signal obtained by summing the first through n channel signals ch1 through chn is referred to as a ‘second summation signal’. Also, the spatial parameter between the reference channel signal and the second summation signal is referred to as a ‘second spatial parameter’ hereinafter. That is, the down-mixing unit 210 may generate the second spatial parameter between the reference channel signal and the second summation signal.
Each of the spatial parameters generated by the down-mixing unit 210 may include at least one of the first parameter indicating relative signal magnitudes of the input channel signals, and the second parameter indicating a correlation between the input channel signals.
Hereinafter, spatial parameter generating operations by the down-mixing unit 210 will be described in detail with reference to
The down-mixing unit 210 generates the mono-signal DM by down-mixing the first through n channel signals ch1 through chn.
The encoder 220 encodes a spatial parameter SP generated by the down-mixing unit 210, and transmits the spatial parameter SP to a decoding apparatus (not shown). Also, the encoder 220 encodes the mono-signal DM generated by the down-mixing unit 210.
In more detail, the encoder 220 encodes the spatial parameter SP and the mono-signal DM generated by the down-mixing unit 210, and converts the encoded spatial parameter SP and mono-signal DM into a transport stream TS. The transport stream TS is transmitted to the decoding apparatus.
Detailed operations of the encoding apparatus 200 are the same as or similar to detailed operations involved in signal processing methods 300 and 400 according to exemplary embodiments, which will be described with reference to
Referring to
Operation 310 may be performed by the down-mixing unit 210.
A spatial parameter SP generated in operation 310 is encoded and transmitted to a decoding apparatus (not shown) (operation 320). In more detail, the spatial parameter SP transmitted in operation 320 may include at least one of the first spatial parameter and the second spatial parameter. In more detail, in operation 320, the spatial parameter SP and the mono-signal DM may be encoded and converted into a transport stream TS, and the transport stream TS may be transmitted to the decoding apparatus.
Operation 320 may be performed by the encoder 220 of
Referring to
(n−1) channel signals from among the first through n channel signals ch1 through chn may be summed or the first through n channel signals ch1 through chn may be summed (operation 420). In more detail, the residual channel signals from among the first through n channel signals ch1 through chn except for a reference channel signal may be summed, and a summed signal indicates the aforementioned first summation signal. Alternatively, all of the first through n channel signals ch1 through chn may be summed, and a summed signal indicates the aforementioned second summation signal.
Then, by using a correlation between the first summation signal generated in operation 420, and the reference channel signal, the aforementioned first spatial parameter may be generated (operation 430). Alternatively, the first spatial parameter may not be generated but the aforementioned second spatial parameter may be generated by using a correlation between the second summation signal generated in operation 420, and the reference channel signal (operation 430).
The reference channel signal may be each of the first through n channel signals ch1 through chn. Thus, the number of the reference channel signals may be n, and the number of the spatial parameters corresponding to the reference channel signals may be n.
Thus, operation 430 may further include an operation of generating n spatial parameters by using the first through n channel signals ch1 through chn as the reference channel signals, respectively.
Operations 420 and 430 may be performed by the down-mixing unit 210, and will now be described in detail with reference to
A spatial parameter SP generated in operation 430 is encoded and transmitted to a decoding apparatus (not shown) (operation 440). Also, the mono-signal DM generated in operation 410 is encoded and transmitted to the decoding apparatus. In more detail, the encoded spatial parameter SP and the encoded mono-signal DM may be included in a transport stream TS and then may be transmitted to the decoding apparatus. The spatial parameter SP included in the transport stream TS indicates a spatial parameter set including the aforementioned first through n spatial parameters.
Operation 440 may be performed by the encoder 220 of
Operations 450 and 460 will now be described in detail with reference to
The decoding apparatus 500 includes an inverse-multiplexing unit 510, a spatial parameter decoding unit 520, and an up-mixing unit 530.
The inverse-multiplexing unit 510 receives a transport stream TS including an encoded spatial parameter EN_SP and an encoded mono-signal EN_DM from the encoding apparatus 200 (operation 450).
In more detail, the inverse-multiplexing unit 510 separates the encoded spatial parameter EN_SP from the transport stream TS and then outputs the encoded spatial parameter EN_SP to the spatial parameter decoding unit 520. Also, the inverse-multiplexing unit 510 separates the encoded mono-signal EN_DM from the transport stream TS and then outputs the encoded mono-signal EN_DM to the up-mixing unit 530.
The spatial parameter decoding unit 520 decodes the encoded spatial parameter EN_SP output from the inverse-multiplexing unit 510. A decoded spatial parameter DE_SP is transmitted to the up-mixing unit 530. Also, the decoded spatial parameter DE_SP may include at least one of the n first spatial parameters and the n second spatial parameters.
The up-mixing unit 530 decodes the encoded mono-signal EN_DM, which is generated by down-mixing and encoding the first through n channel signals ch1 through chn, and restores the first through n channel signals ch1 through chn by using the decoded mono-signal and the decoded spatial parameter DE_SP (operation 460). That is, the up-mixing unit 530 up-mixes the decoded mono-signal by using the n decoded spatial parameters, thereby generating restored first through n channel signals corresponding to the first through n channel signals ch1 through chn described above.
Referring to
Referring to
Referring to
As described above, in a case where the multi-channel signals include three channel signals, the number of the reference channel signals is 3, and three spatial parameters may be generated. The generated spatial parameters are encoded by the encoder 220 and are transmitted to the decoding apparatus 500.
The mono-signal DM obtained by down-mixing the first, second, and third channel signals ch1, ch2, and ch3 is equal to the summation signal of the first, second, and third channel signals ch1, ch2, and ch3, and may be expressed in a manner of DM=ch1+ch2+ch3. Thus, a relation of ch1=DM−(ch2+ch3) is formed.
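The relations above can be checked with a short sketch. It assumes, for illustration only, that each channel signal is a list of samples and that down-mixing is a sample-wise sum, as in DM=ch1+ch2+ch3; the function names and sample values are hypothetical.

```python
def down_mix(channels):
    """Second summation signal: sample-wise sum of all channel signals."""
    return [sum(frame) for frame in zip(*channels)]


def first_summation(channels, k):
    """Sample-wise sum of the residual channels, i.e. all except index k."""
    return [
        sum(s for i, s in enumerate(frame) if i != k)
        for frame in zip(*channels)
    ]


ch1, ch2, ch3 = [1, 2], [10, 20], [100, 200]
dm = down_mix([ch1, ch2, ch3])                      # [111, 222]
residual = first_summation([ch1, ch2, ch3], 0)      # ch2 + ch3 = [110, 220]
restored_ch1 = [d - r for d, r in zip(dm, residual)]
print(restored_ch1 == ch1)  # True: ch1 = DM - (ch2 + ch3)
```

The sketch only verifies the arithmetic relation; in the decoding apparatus 500 the residual sum itself is not received, and the channel signals are restored from the decoded mono-signal and the decoded spatial parameters.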
The decoding apparatus 500 receives and decodes the first spatial parameter that is the spatial parameter described with reference to
Referring to
A spatial parameter between the first channel signal ch1 and the second summation signal 720 is generated by using the first channel signal ch1 as a reference channel signal. In more detail, the spatial parameter including at least one of the first parameter and the second parameter may be generated by using a correlation between the first channel signal ch1 and the second summation signal 720 (ch1, and ch1+ch2+ch3).
Then, a spatial parameter is generated by using the second channel signal ch2 as a reference channel signal and by using a correlation between the second channel signal ch2 and the second summation signal 720 (ch2, and ch1+ch2+ch3). Also, a spatial parameter is generated by using the third channel signal ch3 as a reference channel signal and by using a correlation between the third channel signal ch3 and the second summation signal 720 (ch3, and ch1+ch2+ch3).
The decoding apparatus 500 receives and decodes the first spatial parameter that is the spatial parameter described with reference to
Thus, the first channel signal ch1 may be restored by using the decoded mono-signal and the spatial parameter generated by using the correlation between the first channel signal ch1 and the second summation signal 720 (ch1, and ch1+ch2+ch3). Similarly, the second channel signal ch2 may be restored by using the decoded mono-signal and the spatial parameter generated by using the correlation between the second channel signal ch2 and the second summation signal 720 (ch2, and ch1+ch2+ch3). Also, the third channel signal ch3 may be restored by using the decoded mono-signal and the spatial parameter generated by using the correlation between the third channel signal ch3 and the second summation signal 720 (ch3, and ch1+ch2+ch3).
Referring to
That is, when the related art restored signal 821 is reproduced, due to the signal loss of the related art restored signal 821, which is incurred due to a decoding error or the like, sound quality deteriorates.
Compared to the related art restored signal 821, referring to
Thus, the signal processing method, the encoding apparatus thereof, and the decoding apparatus thereof according to one or more exemplary embodiments may more exactly restore a signal to the original channel signal 810, and may prevent signal loss and sound deterioration due to a decoding error or the like.
A spatial parameter generated by the down-mixing unit 210 may include an angle parameter as a first parameter.
In more detail, at least one of operations 310 and 430 for generating a spatial parameter may include an operation of generating the angle parameter.
The angle parameter indicates a predetermined angle value denoting a correlation between a signal magnitude of a reference channel signal that is from among first through n channel signals ch1 through chn, and signal magnitudes of the residual channel signals from among the first through n channel signals ch1 through chn except for the reference channel signal. Also, the angle parameter may be referred to as a global vector angle (GVA).
The angle parameters indicate angle values denoting relative magnitudes of the reference channel signal and a first summation signal.
The down-mixing unit 210 may generate first through n angle parameters by using the first through n channel signals ch1 through chn as reference channel signals, respectively. Hereinafter, an angle parameter that is generated by using a k channel signal as a reference channel signal is referred to as a k angle parameter.
Referring to
Referring to
In more detail, the first angle parameter (angle 1) 922 may be obtained by performing an inverse-tangent operation on a value obtained by dividing an absolute value of the summation signal 920 (i.e., ch2+ch3) by an absolute value of the first channel signal ch1.
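A minimal sketch of this calculation follows. It assumes that each signal is represented by a single real value (for example, one sample or one band value), that the angle is expressed in degrees, and that a mute reference channel maps to 90 degrees; the function name is hypothetical.

```python
import math


def angle_parameter(reference, residual_sum):
    """Inverse tangent of |residual_sum| / |reference|, in degrees."""
    if reference == 0:
        # Division by zero is avoided; 90 degrees is assumed for a mute reference.
        return 90.0
    return math.degrees(math.atan(abs(residual_sum) / abs(reference)))


ch1, ch2, ch3 = 1.0, 0.25, 0.75
angle_1 = angle_parameter(ch1, ch2 + ch3)  # arctan(|ch2 + ch3| / |ch1|)
```

With these illustrative values the summation signal and the reference channel have equal magnitudes, so the first angle parameter is approximately 45 degrees.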
Referring to
Referring to
In more detail, total summation of n angle parameters calculated by using first through n channel signals as reference channel signals, respectively, is converged to a predetermined value. The converged predetermined value may vary according to a value of n, and thus may be experimentally optimized.
In the graph of
Referring to
However, there is an exceptional case in which the total summation of the angle parameters converges near an X-axis value of 45 units, i.e., near a point 1020 of 270 degrees. The total summation converges near the point 1020 of 270 degrees when each of the angle parameters has a value of 90 degrees because the three channel signals are all mute. In this exceptional case, if the value of one of the three angle parameters is changed to 0, the total summation of the angle parameters becomes 180 degrees. In a case where the three channel signals are all mute, the down-mixed mono-signal also has a value of 0, and a signal obtained by up-mixing and decoding the down-mixed mono-signal has a value of 0. Thus, even though the value of one of the three angle parameters is changed to 0, the up-mixing and decoding results are not changed, so that changing the value of one of the three angle parameters to 0 is not a concern.
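The exceptional case can be illustrated with a minimal sketch. It assumes that each channel is represented by a single real value, that an angle parameter is the inverse tangent of the residual sum magnitude over the reference magnitude, and that a mute reference channel yields 90 degrees. The function name is hypothetical.

```python
import math


def angle_parameters(values):
    """One angle parameter (degrees) per channel, each channel as reference."""
    total = sum(values)
    angles = []
    for v in values:
        if v == 0:
            angles.append(90.0)  # assumed value for a mute reference channel
        else:
            angles.append(math.degrees(math.atan(abs(total - v) / abs(v))))
    return angles


mute = angle_parameters([0.0, 0.0, 0.0])
print(sum(mute))        # 270.0 when all three channels are mute
mute[2] = 0.0           # change one angle parameter to 0
print(sum(mute))        # 180.0
```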
Also, at least one of operations 310 and 430 for generating a spatial parameter may include an operation of generating the spatial parameter including a k angle residue parameter used to calculate a k angle parameter and angle parameters from among the first through n angle parameters except for the k angle parameter. The k angle residue parameter will now be described in detail with reference to
Referring to
When a third angle parameter is the k angle parameter, the k angle residue parameter may be obtained, which will now be described.
As described above, since the total summation of the n angle parameters converges to the predetermined value, a value of the k angle parameter may be obtained by subtracting, from the predetermined value, the sum of the values of the angle parameters from among the first through n angle parameters except for the k angle parameter. In more detail, when the number of angle parameters is 3, unless all of the first, second, and third channel signals ch1, ch2, and ch3 are mute, the total summation of the three angle parameters converges to 180 degrees. Thus, a relation of ‘a value of the third angle parameter=180 degrees−(a value of the first angle parameter+a value of the second angle parameter)’ is provided. By using the relation, the third angle parameter may be predicted.
In more detail, the down-mixing unit 210 predicts the value of the k angle parameter from among the first through n angle parameters. The prediction may be performed by using the aforementioned relation and the predetermined value. A predetermined bit region 1107 indicates a data region including a predicted value of the k angle parameter.
The down-mixing unit 210 compares the predicted value of the k angle parameter and the original value of the k angle parameter. A predetermined bit region 1105 indicates a data region including the value of the third angle parameter calculated in a manner shown in
The down-mixing unit 210 generates a difference value between the predicted value of k angle parameter 1107 and the value of k angle parameter 1105, as the k angle residue parameter. A predetermined bit region 1111 indicates a data region including a value of the k angle residue parameter.
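The prediction and residue generation described above can be sketched as follows. The sketch assumes angle values in degrees, a predetermined value of 180 for three channels, and a residue defined as the original value minus the predicted value; the sign convention, the function name, and the sample angles are assumptions.

```python
PREDETERMINED_TOTAL = 180.0  # assumed convergence value for three channels


def angle_residue(angles, k, total=PREDETERMINED_TOTAL):
    """Return (other_angles, residue) for the k-th angle parameter.

    The k-th angle is predicted as total minus the sum of the other angles;
    the residue is the original k-th angle minus that prediction.
    """
    others = [a for i, a in enumerate(angles) if i != k]
    predicted = total - sum(others)
    return others, angles[k] - predicted


others, residue = angle_residue([60.0, 70.0, 52.5], k=2)
print(others, residue)  # [60.0, 70.0] 2.5
```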
The encoder 220 encodes the spatial parameter including the angle parameters (i.e., parameters included in the bit regions 1101 and 1103) from among the first through n angle parameters except for the k angle parameter, and the k angle residue parameter (i.e., a parameter included in the bit region 1111), and transmits the spatial parameter to the decoding apparatus 500.
Accordingly, the decoding apparatus 500 receives the spatial parameter including the angle parameters from among the first through n angle parameters except for the k angle parameter, and the k angle residue parameter.
The spatial parameter decoding unit 520 of the decoding apparatus 500 restores the k angle parameter by using the received spatial parameter and the predetermined value.
In more detail, the spatial parameter decoding unit 520 may subtract the sum of the values of the angle parameters from among the first through n angle parameters except for the k angle parameter from the predetermined value, may compensate the subtraction result with the value of the k angle residue parameter, and may output the compensated value as the k angle parameter.
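The restoration can be sketched as follows, assuming angle values in degrees, a predetermined value of 180 for three channels, and that compensating means adding a residue defined as the original value minus the predicted value. The function name and these conventions are assumptions.

```python
def restore_angle(other_angles, residue, total=180.0):
    """Restore the k-th angle parameter from the received values."""
    predicted = total - sum(other_angles)  # subtraction from the predetermined value
    return predicted + residue             # compensation with the k angle residue


print(restore_angle([60.0, 70.0], 2.5))  # 52.5
```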
Referring to
The value of the k angle residue parameter requires less data than the value of the k angle parameter. Thus, when a spatial parameter including the angle parameters from among the first through n angle parameters except for the k angle parameter, and the k angle residue parameter, is transmitted to the decoding apparatus 500, the amount of data exchanged between the encoding apparatus 200 and the decoding apparatus 500 may be decreased.
That is, compared to an example of
As described above, the signal processing method, the encoding apparatus thereof, and the decoding apparatus thereof according to one or more exemplary embodiments may prevent signal quality deterioration that may occur when n channel signals are down-mixed to one mono-signal and then up-mixed.
In more detail, the signal processing method, the encoding apparatus thereof, and the decoding apparatus thereof according to one or more exemplary embodiments may generate or process the spatial parameter that allows the mono-signal to be exactly restored to the original channel input signals.
One or more exemplary embodiments can also be embodied as computer readable codes on a computer readable recording medium. The computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, etc. The computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. Moreover, one or more units of the above-described units can include a processor or microprocessor executing a computer program stored in a computer-readable medium.
While exemplary embodiments have been particularly shown and described above, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2011-0092560 | Sep 2011 | KR | national |