For regular broadcasting, the internet, and home use, the 5.1 sound format is well established alongside two-channel stereo and mono. The additional sound formats increase the effort in audio production, in particular the effort of recording and mixing each format. Compatibility with playback devices must also be guaranteed: devices need to be able to play back every sound format regardless of its number of audio channels.
One possibility is to transmit the sound format with the greatest number of audio channels and, if necessary, have the receiver automatically convert the signal to a sound format with fewer audio channels (automatic downmix).
It is also possible to generate the material in all formats during audio production and broadcast those signals simultaneously (simulcast). In this case each sound format can be mixed separately. However, this kind of mixing requires considerable production effort: in most cases it needs additional manpower, noticeably more time, or multiple sets of equipment (e.g. in the case of a live broadcast). The resulting production workload is therefore hardly acceptable. Alternatively, as in the approach described earlier, an automatic downmix can be used.
Methods for automatically transforming a sound format already exist, but further improvements are necessary to achieve satisfactory quality for a wide range of source material.
Automatic downmix methods can be categorised roughly into active and passive methods. Active methods adapt the automatic transformation to the source material, whereas passive methods work independently of the signal. A known passive downmix method is based on the broadcast reference ITU-R BS.775 and is illustrated in
Based on a five channel sound format with the sound channels
left channel (L)
right channel (R)
centre channel (C)
rear left channel (Ls)
rear right channel (Rs),
the known downmix method is designed to lower the level of the centre channel (C), as well as of the rear left channel (Ls) and the rear right channel (Rs), by 3 dB using a damping function 50, 60 or 70. The centre channel lowered by 3 dB is distributed via the sum functions 10 and 20 to the left channel and the right channel, forming a first sum signal (output of sum function 10) and a second sum signal (output of sum function 20). The rear left and rear right signals (Ls) and (Rs), each lowered by 3 dB, are added via the sum functions 30 and 40 to the first and second sum signals to form the left and right channels (L0) and (R0) of the desired two-channel sound format.
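The passive downmix described above can be sketched in a few lines. This is a minimal illustration, not the reference implementation of ITU-R BS.775: the function name `passive_downmix` and the constant `ATTENUATION` are hypothetical, and the sketch assumes that sum function 30 feeds the left output and sum function 40 the right output.

```python
import numpy as np

# -3 dB expressed as a linear gain (about 0.708).
ATTENUATION = 10 ** (-3 / 20)


def passive_downmix(L, R, C, Ls, Rs):
    """Fixed-coefficient five-channel to two-channel downmix.

    All inputs are equally long sequences of time-domain samples.
    The coefficients never depend on the signal (passive method).
    """
    L, R, C, Ls, Rs = (np.asarray(x, dtype=float) for x in (L, R, C, Ls, Rs))
    first_sum = L + ATTENUATION * C      # sum function 10
    second_sum = R + ATTENUATION * C     # sum function 20
    L0 = first_sum + ATTENUATION * Ls    # sum function 30 (assumed: left)
    R0 = second_sum + ATTENUATION * Rs   # sum function 40 (assumed: right)
    return L0, R0
```

Because the gains are constant, coherent signal parts that appear in several input channels add in amplitude while incoherent parts add in power, which is why a fixed downmix can change level relationships in the result.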
For the active method the sum functions according to the block diagram of
In order to reduce such shifts of the phantom sound source, Lexicon has proposed the Logic 7 method, which offers an upmix in addition to the downmix. The multichannel sound can be downmixed to a mono signal as well as to a stereo signal. Furthermore, it is possible, for example, to decode up to 8 channels from a stereo downmix. To this end, the fraction of the centre channel in the downmix is controlled via variable coefficients, and the fractions of the rear left and rear right channels are adapted with further coefficients. For the left channel, a fraction of 0.91 of the rear left channel is combined with a fraction of −0.38 of the rear right channel. The right channel is mixed correspondingly. With this method the levels of both rear channels stay unchanged. A phase shift of 90° makes a later separation of both rear channels from the left and right channels possible. However, the Logic 7 method cannot limit sound tone changes caused by comb filter effects of the phase shift.
Various embodiments of the present invention will now be discussed with reference to the appended drawings.
The object of the invention is to largely compensate for the shift of the phantom sound source, the change in level difference between the coherent and incoherent signal parts as well as the sound tone changes.
The underlying idea of the invention is, while forming the first (L′) and second (R′) sum signals, to dynamically correct the spectral values of overlapping time windows of (k) samples of the left channel (L) and the right channel (R). Likewise, while forming the third and fourth sum signals, the spectral values of overlapping time windows of (k) samples of the first (L′) and second (R′) sum signals are dynamically corrected.
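Working on spectral values of overlapping time windows can be illustrated with a short analysis and synthesis sketch. The text only states that the windows overlap and contain k samples; the square-root Hann window, the 50 % overlap, the real FFT and the function names below are assumptions chosen for illustration.

```python
import numpy as np


def _sqrt_hann(k):
    # Periodic Hann window, square-rooted so that applying it at analysis
    # and again at synthesis sums to one at 50 % overlap.
    return np.sqrt(np.hanning(k + 1)[:k])


def frames_to_spectra(x, k=1024):
    """Split x into half-overlapping windows of k samples and transform each."""
    x = np.asarray(x, dtype=float)
    hop = k // 2
    win = _sqrt_hann(k)
    n_frames = 1 + (len(x) - k) // hop
    return np.array(
        [np.fft.rfft(win * x[i * hop:i * hop + k]) for i in range(n_frames)]
    )


def spectra_to_signal(spectra, k=1024):
    """Inverse-transform each frame and overlap-add the windowed results."""
    hop = k // 2
    win = _sqrt_hann(k)
    out = np.zeros(hop * (len(spectra) - 1) + k)
    for i, spec in enumerate(spectra):
        out[i * hop:i * hop + k] += win * np.fft.irfft(spec, n=k)
    return out
```

With unmodified spectra the round trip reproduces the input everywhere except in the first and last half window, so any change heard in the output stems from the correction applied to the spectral values and not from the windowing itself.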
The invention is explained further while referring to the embodiment shown in
The block diagram shown in
The functional structures of the analysis in correction blocks 100, 200, 300 and 400 in
In
$$A_{\mathrm{soll},\,l}(k) = \sqrt{|l(k)|^2 + |c(k)|^2}$$
If the absolute value Sl(k) is greater than Asoll, l(k), then the value l′(k) of the left channel is determined according to step 104 as:
l′(k)=Asoll, l(k)+(|l(k)+c(k)|−Asoll, l(k))*n,
where n is a factor greater than 0.1 and less than 0.4.
If the absolute value Sl(k) is not greater than the desired value Asoll, l(k), then the spectral value l′(k) of the left channel is determined according to step 105, in which the spectral value l(k) is multiplied by a factor ml(k). The factor ml(k) is greater than 1 and is used to adapt the level, similar to the aforementioned factor n. The product ml(k)*l(k) is added to the spectral value c(k) of the centre channel (i.e., ml(k)*l(k)+c(k)).
In the end, the level adapted signal l′(k), determined either according to ml(k)*l(k)+c(k) or according to Asoll, l(k)+(|l(k)+c(k)|−Asoll, l(k))*n, as discussed above, is put through an inverse transformation, as shown in step 106, to determine the first sum signal L′.
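The two branches for a single spectral value can be sketched as follows. Several points are assumptions rather than statements of the text: the function name `correct_bin` is hypothetical, the compared absolute value is taken to be the magnitude of the plain sum, the first branch keeps the phase of that sum (the formula only yields a magnitude), and the factor m is passed in as a constant because the text only states that it is greater than 1.

```python
import math


def correct_bin(main, added, n=0.25, m=1.1):
    """Level-corrected sum of two complex spectral values of one bin.

    main  -- spectral value that receives the factor m, e.g. l(k)
    added -- spectral value mixed into it, e.g. c(k)
    n     -- factor between 0.1 and 0.4
    m     -- factor greater than 1 (assumed constant in this sketch)
    """
    total = main + added
    # Desired value: magnitude the sum would have if both parts were incoherent.
    target = math.sqrt(abs(main) ** 2 + abs(added) ** 2)
    if abs(total) > target:
        # Coherent parts made the sum too loud: keep only the fraction n
        # of the excess over the desired value.
        magnitude = target + (abs(total) - target) * n
        return total / abs(total) * magnitude  # assumed: phase of the plain sum
    # Otherwise raise the main channel before adding the other one.
    return m * main + added


# In-phase values of equal size: the plain sum would be 2, the desired
# value is sqrt(2), and the corrected magnitude lies between the two.
print(abs(correct_bin(1 + 0j, 1 + 0j)))
```

For that in-phase example the desired value is about 1.414 and the excess is about 0.586, so with n = 0.25 the corrected magnitude comes to roughly 1.56 instead of 2.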
In
$$A_{\mathrm{soll},\,r}(k) = \sqrt{|r(k)|^2 + |c(k)|^2}$$
If the absolute value Sr(k) is greater than Asoll, r(k) then the value r′(k) of the right channel is determined in step 204 as:
r′(k)=Asoll, r(k)+(|r(k)+c(k)|−Asoll, r(k))*n,
where n is a factor greater than 0.1 and less than 0.4.
If the absolute value Sr(k) is not greater than the desired value Asoll, r(k), then the spectral value r′(k) of the right channel is determined according to step 205, in which the spectral value r(k) is multiplied by a factor mr(k). The factor mr(k) is greater than 1 and is used to adapt the level, similar to the aforementioned factor n. The product mr(k)*r(k) is added to the spectral value c(k) of the centre channel (i.e., mr(k)*r(k)+c(k)).
In the end, the level adapted signal r′(k), determined either according to mr(k)*r(k)+c(k) or according to Asoll, r(k)+(|r(k)+c(k)|−Asoll, r(k))*n, as discussed above, is put through an inverse transformation, as shown in step 206, to determine the second sum signal R′.
In
$$A_{\mathrm{soll},\,ls}(k) = \sqrt{|ls(k)|^2 + |l'(k)|^2}$$
If the absolute value Sls(k) is greater than Asoll, ls(k), then the value llRT of the rear left channel is determined in step 304 as:
llRT(k)=Asoll, ls(k)+(|ls(k)+l′(k)|−Asoll, ls(k))*n,
where n is a factor greater than 0.1 and less than 0.4.
If the absolute value Sls(k) is not greater than the desired value Asoll, ls(k), then the spectral value llRT is determined according to step 305, in which the spectral value l′(k) is multiplied by a factor mls(k). The factor mls(k) is greater than one and is used to adapt the level, similar to the aforementioned factor n. The product mls(k)*l′(k) is added to the spectral value ls(k) of the rear left channel (i.e., mls(k)*l′(k)+ls(k)).
In the end, the level adapted signal, determined either according to mls(k)*l′(k)+ls(k) or according to Asoll, ls(k)+(|l′(k)+ls(k)|−Asoll, ls(k))*n, as discussed above, is put through an inverse transformation, as shown in step 306, to determine the third sum signal and therefore the left output signal L.
In
$$A_{\mathrm{soll},\,rs}(k) = \sqrt{|rs(k)|^2 + |r'(k)|^2}$$
If the absolute value Srs(k) is greater than Asoll, rs(k), then the value rlRT of the rear right channel is determined in step 404 as:
rlRT(k)=Asoll, rs(k)+(|r′(k)+rs(k)|−Asoll, rs(k))*n,
where n is a factor greater than 0.1 and less than 0.4.
If the absolute value Srs(k) is not greater than the desired value Asoll, rs(k), then the spectral value rlRT is determined according to step 405, in which the spectral value r′(k) is multiplied by a factor mrs(k). The factor mrs(k) is greater than one and is used to adapt the level, similar to the aforementioned factor n. The product mrs(k)*r′(k) is added to the spectral value rs(k) of the rear right channel (i.e., mrs(k)*r′(k)+rs(k)).
In the end, the level adapted signal, determined either according to mrs(k)*r′(k)+rs(k) or according to Asoll, rs(k)+(|r′(k)+rs(k)|−Asoll, rs(k))*n, as discussed above, is put through an inverse transformation, as shown in step 406, to determine the fourth sum signal and therefore the right output signal R.
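The four correction stages share the same structure and can be chained for one frame of spectral values, as the following sketch shows. It rests on several assumptions: the names `corrected_sum` and `downmix_frame` are hypothetical, the factors n and m are the same constants in every stage, the first branch keeps the phase of the plain sum, and the intermediate sum signals stay in the frequency domain instead of being inverse-transformed and windowed again between the stages.

```python
import numpy as np


def corrected_sum(main, added, n=0.25, m=1.1):
    """Apply the two-branch level correction to every bin of one frame."""
    main = np.asarray(main, dtype=complex)
    added = np.asarray(added, dtype=complex)
    total = main + added
    target = np.sqrt(np.abs(main) ** 2 + np.abs(added) ** 2)  # desired value
    mag = np.abs(total)
    too_loud = mag > target
    # Avoid dividing by zero in bins where the first branch is not used.
    safe_mag = np.where(too_loud, mag, 1.0)
    limited = total / safe_mag * (target + (mag - target) * n)
    boosted = m * main + added
    return np.where(too_loud, limited, boosted)


def downmix_frame(l, r, c, ls, rs, n=0.25, m=1.1):
    """Chain the four correction blocks for one frame of spectral values."""
    l1 = corrected_sum(l, c, n, m)       # block 100: first sum signal
    r1 = corrected_sum(r, c, n, m)       # block 200: second sum signal
    left = corrected_sum(l1, ls, n, m)   # block 300: third sum signal
    right = corrected_sum(r1, rs, n, m)  # block 400: fourth sum signal
    return left, right
```

Each stage treats its first argument as the channel that may be raised by m and its second argument as the channel mixed into it, which mirrors the pairs (l, c), (r, c), (l′, ls) and (r′, rs) used in the text.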
| Number | Date | Country | Kind |
|---|---|---|---|
| 10 2008 056 704.3 | Nov 2008 | DE | national |

| Filing Document | Filing Date | Country | Kind | 371c Date |
|---|---|---|---|---|
| PCT/EP2009/007971 | 11/7/2009 | WO | 00 | 7/14/2011 |