1. Field of the Invention
This invention relates to multichannel audio and more specifically to a multichannel audio format that provides a truly discrete as well as a backward compatible mix for surround-sound, front or other discrete audio channels in cinema, home theater, or music environments.
2. Description of the Related Art
Multichannel audio has become the standard for cinema and home theater, and is gaining rapid acceptance in music, automotive, computers, gaming and other audio applications. Multichannel audio provides a surround-sound environment that greatly enhances the listening experience and the overall presentation of any audio-visual system. The earliest multichannel systems included left, right, center and surround (L, R, C, S) channels. The current standard in consumer applications is 5.1 channel audio, which splits the surround channel into left and right surround channels and adds a subwoofer channel (L, R, C, Ls, Rs, Sub).
The move from stereo to multichannel audio has been driven by a number of factors, paramount among them being the consumers' desire for higher quality audio presentation. Higher quality means not only more channels but higher fidelity channels and improved separation or “discreteness” between the channels. In a truly discrete environment, discrete channels carry discrete audio signals to discrete speakers.
To satisfy this demand, the audio industry had to provide a multichannel mix from the studio or content provider, multichannel encoding/decoding techniques, a medium capable of supporting multichannel audio and multichannel speaker configurations. By its very nature, multichannel audio includes significantly more data than stereo audio, which has to be compressed to fit in the existing formats and on the existing media. With the advent of media such as DVD, new formats such as 5.1 have been developed specifically for multichannel audio to enhance the listening experience.
The extension of multichannel audio beyond the 5.1 standard has once again raised the challenge of developing new encoding/decoding techniques that move the state-of-the-art forward while maintaining backward compatibility with the 5.1 standard. Having become accustomed to discrete audio, the consumer will demand the same performance as more channels are added. Backward compatibility is critical because of the great investment in 5.1 equipment by consumers and professionals alike.
Dolby Prologic™ provided one of the earliest multichannel systems. Prologic squeezes 4 channels (L, R, C, S) into 2 channels (Lt, Rt) by introducing a phase-shifted surround sound term. These 2 channels are then encoded into the existing 2-channel formats. Decoding is a two-step process in which an existing decoder receives Lt, Rt and then a Prologic decoder expands Lt, Rt into L, R, C, S. Because four signals (unknowns) are carried on only two channels (equations), the Prologic decoding operation is only an approximation and cannot provide true discrete multichannel audio. As shown in
Lt=L+0.707C+S(+90°), and (1)
Rt=R+0.707C+S(−90°), (2)
which are carried on the two discrete channels, encoded into the existing two-channel format and recorded on a media 14 such as film.
A matrix decoder 16 decodes the two discrete channels Lt, Rt and expands them into four discrete reconstructed channels Lr, Rr, Cr and Sr. A passive matrix decoder decodes the audio data as follows:
Lr=Lt,
Rr=Rt,
Cr=(Lt+Rt)/2, and
Sr=(Lt−Rt)/2.
In general, the Lr and Rr channels have significant center and surround components and Cr and Sr have left and right components. The reproduced audio signals, although carried on discrete channels to discrete speakers in a speaker configuration 18, are not discrete, but in fact are characterized by significant crosstalk and phase distortion. For this reason passive decoders are rarely used.
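The crosstalk of the passive decode can be illustrated with a minimal sketch. It assumes each channel is represented by a single complex phasor (amplitude and phase at one frequency), so that the ±90° shift of the surround term becomes multiplication by ±1j. The function names are hypothetical and are not part of any Prologic implementation.

```python
# Sketch only: each channel is one complex phasor, so a +/-90 degree phase
# shift is modeled as multiplication by +/-1j (an assumption for illustration).

def prologic_encode(L, R, C, S):
    """Matrix-encode four channels into Lt, Rt per equations (1) and (2)."""
    Lt = L + 0.707 * C + 1j * S   # surround term shifted by +90 degrees
    Rt = R + 0.707 * C - 1j * S   # surround term shifted by -90 degrees
    return Lt, Rt

def passive_decode(Lt, Rt):
    """Passive matrix decode into Lr, Rr, Cr, Sr."""
    return Lt, Rt, (Lt + Rt) / 2, (Lt - Rt) / 2

# Only the left channel carries signal in this example.
Lt, Rt = prologic_encode(L=1.0, R=0.0, C=0.0, S=0.0)
Lr, Rr, Cr, Sr = passive_decode(Lt, Rt)

# Magnitudes of the reconstructed channels.
print(abs(Lr), abs(Rr), abs(Cr), abs(Sr))  # prints: 1.0 0.0 0.5 0.5
```

With a left-only input, half of the left amplitude appears in both the reconstructed center and surround channels, which is the crosstalk described above.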
Active matrix decoders reduce crosstalk and phase distortion but at best approximate a discrete audio presentation. Many different proprietary algorithms are used to perform an active decode and all are based on measuring the power of Lt+Rt, Lt−Rt, Lt and Rt to calculate gain factors Gi whereby,
Lr=G1*Lt+G2*Rt
Rr=G3*Lt+G4*Rt
Cr=G5*Lt+G6*Rt, and
Sr=G7*Lt+G8*Rt.
Active decode provides better compensation based on the power of the signal but crosstalk among components remains and true discrete reproduction is not possible.
The advent of the 5.1 format represented a fundamental shift in multichannel audio, away from squeezing multiple channels into an existing stereo format, with the phase distortion and crosstalk associated with matrix coding, and toward a truly discrete multichannel format, which provides higher fidelity and improved separation and directionality. Furthermore, two additional channels were added. The subwoofer (“Sub”) (0.1 channel) provides enhanced low frequency capability. The surround channel S consists of left Ls and right Rs channels, indicating the consumers' strong preference for true discrete sound even in the surround channels. Each signal (L, C, R, Ls, Rs, Sub) is compressed independently and then mixed together in a 5.1 format thereby maintaining the discreteness of each signal. Dolby AC-3™, Sony SDDS™ and DTS Coherent Acoustics™ are all examples of 5.1 systems.
As shown in
In its cinema products, DTS implemented its 5.1 system with 5 single channel APT-X encoders by taking advantage of the spectral characteristics of the surround and subwoofer channels without sacrificing performance. The use of five rather than six processors reduced system cost. As shown in
Extension to discrete 6.1 and higher multichannel formats is limited by space availability on the media, reliability and the strong desire to maintain backward compatibility with existing 5.1 decoders. Multichannel audio consumes a lot of space on the medium. Providers want to extend playtime, include multiple different audio formats such as 2-channel PCM, Dolby AC-3 and DTS Coherent Acoustics, and add other content such as director's comments and outtakes.
Dolby has developed Dolby EX, as described in PCT Publication WO99/57941, which provides more than two surround-sound channels in the current 5.1 formats and does so without increasing space requirements (number of bits or film space). Dolby EX provides more than two surround sound channels within the format of a digital soundtrack system designed to provide only two surround sound channels. Three main channels are recorded in the discrete soundtrack channels and 3, 4 or 5 surround-sound channels are matrix-encoded and recorded in two discrete surround-sound soundtrack channels. The digital audio stream of the digital soundtrack system designed to provide only two surround sound channels remains unaltered, thus providing compatibility with existing playback equipment. Moreover, the format of the media carrying the digital sound tracks is unaltered. Dolby asserts that the “discreteness” of the digital soundtrack system is not audibly diminished by employing matrix technology to surround sound channels, particularly if active matrix decoding is employed.
Dolby EX introduces phase-shifted surround sound terms to matrix encode the 3, 4 or 5 surround-sound signals into two channels, which facilitates decoding the two channels into 3, 4 or 5 audio channels. The introduction of the phase-shifted terms is essential to Dolby EX as it was to Dolby Prologic. The encoding process is given by the following generalized equations:
Lts=Ls+ΣGi*Si(φi) for i=0, 1, 2, and
Rts=Rs+ΣHi*Si(φi) for i=0, 1, 2
where Gi and Hi are the gain coefficients, Si are the additional surround-sound channels and φi are the phase distortion components. The decoding process is given by the following generalized equations:
Lrs=G1*Lts+G2*Rts
Rrs=G3*Lts+G4*Rts
Crs=G5*Lts+G6*Rts
In the special case of three surround-sound channels (Ls, Rs, Cs), these generalized equations default to the well known mix equations where the Cs channel is reduced by 3 dB and added to the Ls and Rs channels as follows:
Lts=Ls+0.707Cs, and
Rts=Rs+0.707Cs.
It is believed that actual Dolby EX systems phase shift Ls and Rs by plus and minus 45 degrees, respectively, to provide more depth to the surround sound. The QS or SQ matrix systems cited in the PCT Publication teach that technique.
As shown in
It is important to note that the three discrete surround channels do NOT carry discrete signals. The same crosstalk and phase distortion limitations associated with Prologic are now reintroduced into what was a truly discrete multichannel system. While it is true that a listener's sensitivity to position and direction is less for rear signals, true discrete audio reproduction will provide better sound separation and directionality. For the same reasons consumers preferred 2-channel surround over mono surround they will prefer 3-channel discrete surround over matrixed 2 channel surround.
Dolby EX represents a first step toward enhanced multichannel audio. Dolby EX provides additional surround sound channels using existing 5.1 formats without increasing the bit rate. Furthermore, Dolby EX preserves the discrete coding of L, R, C and sub audio signals. However, Dolby EX achieves these desirable results by sacrificing the true discreteness of the surround sound channels. A 3:2:3 system will suffer the same crosstalk limitation as Prologic. 4:2:4 and greater systems will also suffer phase distortion problems due to the matrix decode.
Dolby cannot provide true discrete N.1 audio because audio quality and/or reliability will suffer. The PCT Publication contemplates and then dismisses a new N.1 format for truly discrete audio stating “Although, in theory, additional channels could be carried by reducing the symbol size in order to provide more bits and allowing the storage of more data in the same physical area, such a reduction would introduce unwanted difficulties in the printing process and require substantial modification or recorder and player units in the field.” A true N.1 format would be incompatible with existing hardware and would require at least substantial modification if not total replacement.
Accordingly, there remains an unfulfilled need in the industry to provide a truly discrete multichannel surround sound environment with more than two surround channels while maintaining backward compatibility with existing 5.1 decoders without sacrificing audio quality or reliability.
In view of the above problems, the present invention provides a truly discrete multichannel audio environment with additional discrete audio signals while maintaining backward compatibility with existing decoders.
This is accomplished by providing a truly discrete as well as a backward compatible mix for surround-sound, front or other discrete audio channels for cinema, home theater, or music, in which additional discrete audio signals are mixed with the existing discrete audio channels into a predetermined format such as the 5.1 audio format. These additional discrete audio channels are separately encoded and appended to the predetermined format as extension bits in the bitstream.
In a 5.1 channel environment, the more than two discrete surround-sound audio signals (Ls, Rs, Cs, . . . ) are mixed into two discrete surround-sound channels (Lts, Rts). The front channels (L, R, C, sub) and the mixed surround-sound channels (Lts, Rts) are encoded using a standard 5.1 encoder. The additional discrete surround-sound audio signals (Cs, . . . ) are independently encoded and carried in a discrete extension surround-sound channel that is appended to the 5.1 bitstream as extension bits. The bitstream is compatible with a variety of decoder configurations including existing 5.1 decoders, a 5.1 decoder plus existing matrix decoders, a 5.1 decoder plus a mix decoder and a N.1 decoder. The inclusion of the additional discrete surround-sound audio signals in the bitstream makes possible the reproduction of true discrete multichannel audio when used with either the 5.1 decoder plus the mix decoder or the N.1 decoder.
A 5.1 decoder reads the 5.1 bitstream and ignores the extension bits. The 5.1 decoder decodes the Lts and Rts surround-sound channels and directs the mixed audio signals to the discrete left and right surround-sound speakers. Playback creates the discrete left and right surround-sound signals and a “phantom” surround-sound signal from the center surround (Cs) audio signal and any other additional surround signals that acoustically appears at the center of the left and right surround speakers. The phantom surround is completely devoid of any phase distortion.
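The way a legacy reader can skip appended extension bits may be sketched as follows. The frame layout used here (two 16-bit length fields followed by the core payload and the extension payload) is purely hypothetical and is not the layout of any actual 5.1 format; the function names are likewise assumptions made for illustration.

```python
import struct

# Hypothetical frame layout, assumed for illustration only:
#   2 bytes core payload length, 2 bytes extension payload length,
#   then the core (5.1) payload, then the extension payload.
HEADER = struct.Struct(">HH")

def pack_frame(core, extension=b""):
    """Append the extension payload after the core payload."""
    return HEADER.pack(len(core), len(extension)) + core + extension

def legacy_read(frame):
    """A 5.1-only reader: returns the core payload and ignores the rest."""
    core_len, _ext_len = HEADER.unpack_from(frame)
    start = HEADER.size
    return frame[start:start + core_len]

def extended_read(frame):
    """A reader that also extracts the extension payload."""
    core_len, ext_len = HEADER.unpack_from(frame)
    start = HEADER.size
    core = frame[start:start + core_len]
    ext = frame[start + core_len:start + core_len + ext_len]
    return core, ext

frame = pack_frame(b"5.1-core", b"Cs-ext")
print(legacy_read(frame))    # the core payload only
print(extended_read(frame))  # core payload and extension payload
```

Because the core payload is unchanged and sits where the legacy reader expects it, the same frame serves both readers in this sketch.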
The inclusion of a matrix decoder with the 5.1 decoder decodes the Lts and Rts channels into Lrs, Rrs and Crs matrixed audio signals, which are carried on discrete channels to left, right and center surround speakers. The Lrs, Rrs and Crs audio signals are not discrete and exhibit the crosstalk associated with matrix coding.
The inclusion of a mix decoder with the 5.1 decoder reads the extension bits and decodes the additional surround-sound audio signals (Crs, . . . ). The mix decoder subtracts the weighted surround sound audio signals (Crs, . . . ) from the left and right total surround-sound signals (Lrts, Rrts) to produce truly discrete surround-sound audio signals (Lrs, Rrs, Crs, . . . ), which are carried on discrete channels to discrete speakers. A true N.1 decoder incorporates the 5.1 decoder and mix decoder in a single box. Playback creates a truly discrete (discrete signals carried on discrete channels to discrete speakers) surround-sound environment in which the surround-sound portion exhibits improved sound separation and directionality. Unlike matrix-encoded surround-sound audio, the mix-encoded N.1 channel audio provides discrete playback without crosstalk.
These and other features and advantages of the invention will be apparent to those skilled in the art from the following detailed description of preferred embodiments, taken together with the accompanying drawings, in which:
The present invention provides a multichannel audio format for a truly discrete as well as a backward compatible mix for surround-sound, front or other discrete audio channels in cinema, home theater, or music environments. The additional discrete audio signals are mixed with the existing discrete audio channels into a predetermined format such as the 5.1 audio format. In addition these additional discrete audio channels are encoded and appended to the predetermined format as extension bits in the bitstream. The existing base of multichannel decoders can be used in combination with a mix decoder to reproduce truly discrete N.1 multichannel audio. This allows a consumer or professional to choose whether to keep their existing audio systems and realize some of the benefits of additional surround-sound channels or to upgrade their systems by adding a mix decoder to realize truly discrete multichannel audio for the ultimate listening experience.
It is to be understood that the present approach is applicable to extend any predetermined multichannel audio format, of which 5.1 is the current standard, to a greater number of channels of discrete audio while maintaining backward compatibility to the predetermined format. For example, a true 10.2 format may be adopted for certain very specialized audio systems. At some point after the adoption of such a 10.2 format it may be desirable to extend that format to even more channels. For purposes of clarity, the present invention will be described with reference to a 5.1 channel system without loss of generality.
For purposes of clarity, the present invention will now be described with reference to the drawings in the context of a 5.1 channel system.
The inclusion of the additional surround-sound audio signals in both the two-channel mix and discrete channels eliminates the need to introduce a phase-shift in order to decode the three or more audio channels. As such, mix encoder 114 has more flexibility to mix the surround-sound channels. For example, a coherent mix introduces no phase-shifts or delays. This has the advantage that neither a direct 5.1 decode that produces a “phantom” surround channel nor a 2:3 matrix-decode introduces phase distortion. Alternately, mix encoder 114 could phase-shift the Ls and Rs signals to improve the depth of the matrix decoded surround-sound audio. The key is that the phase term is not needed in order to decode, and that the inclusion of the additional channels in the bitstream allows the mix decoder to reproduce discrete audio for either mix approach.
Assuming a coherent mix, the generalized mixing equations are as follows:
Lts=Ls+ΣGiSi for i=0, 1, 2, . . .
Rts=Rs+ΣHiSi for i=0, 1, 2, . . .
where Gi and Hi are the gain coefficients and Si are the additional surround-sound channels.
In the special case of three surround-sound channels (Ls, Rs, Cs), these generalized equations default to the well known mix equations where the Cs channel is reduced by 3 dB and added to the Ls and Rs channels as follows:
Lts=Ls+0.707Cs, and
Rts=Rs+0.707Cs.
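The coherent mix equations above can be sketched in a few lines. This is a minimal illustration that assumes each signal is a list of samples of equal length; the function name and the sample values are hypothetical.

```python
def mix_surround(ls, rs, extras, g, h):
    """Coherent mix: Lts = Ls + sum(Gi*Si), Rts = Rs + sum(Hi*Si).

    ls, rs : sample lists for the left and right surround signals
    extras : list of additional surround signals Si (each a sample list)
    g, h   : gain coefficients Gi and Hi, one per additional signal
    """
    lts = list(ls)
    rts = list(rs)
    for s, gi, hi in zip(extras, g, h):
        for n, sample in enumerate(s):
            lts[n] += gi * sample   # no phase shift or delay is applied
            rts[n] += hi * sample
    return lts, rts

# Special case: one center-surround signal Cs mixed in at -3 dB (0.707).
ls = [0.5, 0.0, -0.5]
rs = [0.0, 0.25, 0.0]
cs = [1.0, 1.0, 0.0]
lts, rts = mix_surround(ls, rs, [cs], g=[0.707], h=[0.707])
```

Passing more than one signal in `extras`, with matching entries in `g` and `h`, covers the generalized case of four or more surround-sound signals.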
At this one point, a 3:2 mix of a center surround channel, the matrix-encode equations for the Dolby EX system and the mix-encode equations of the present invention each default to the standard technique for mixing a center channel with left and right channels. Although the mix equations are identical at this one point, the system of the present invention is fundamentally different than either Dolby EX or standard mixing practice. In those instances the additional signals are only mixed into the left and right signals thereby sacrificing the ability to reproduce discrete multichannel audio. The present invention details a method for both producing discrete multichannel audio and maintaining backward compatibility. Unlike Dolby EX, this approach requires additional bits (space) to encode the bitstream. However, as evidenced by the earlier adoption of left/right surround to replace mono surround, true discrete surround-sound audio will replace matrix-decoded surround-sound audio.
The bitstream is compatible with a variety of decoder configurations including existing 5.1 decoders, a 5.1 decoder plus existing matrix decoders, a 5.1 decoder plus a mix decoder and a N.1 decoder. Mixing the additional surround-sound signals with the left and right surround signal provides backward compatibility. The inclusion of the additional discrete surround-sound audio signals in the bitstream makes possible the reproduction of true discrete multichannel audio when used with either the 5.1 decoder plus the mix decoder or the N.1 decoder.
As shown in
A conventional 5.1 decoder when used in a 3:2:3 system reproduces the same multichannel audio experience for the encoding techniques described in
As shown in
As discussed above in reference to
As illustrated in
The distinct advantage of the present encoding and formatting techniques over Dolby EX, as illustrated in
As shown in
As shown in
Lsr=Lts−0.707Csr, and
Rsr=Rts−0.707Csr.
The circuit is easily expandable to accommodate more than three surround-sound signals by using additional channel decoders, multipliers and summing nodes.
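The separation step can be sketched as a weighted subtraction per sample. The sketch assumes that the decoded center surround signal Csr is recovered exactly from the extension bits, which ignores any loss introduced by the audio codec; the function name and sample values are hypothetical.

```python
def mix_decode(lts, rts, csr, weight=0.707):
    """Separate the surround signals: Lsr = Lts - w*Csr, Rsr = Rts - w*Csr."""
    lsr = [l - weight * c for l, c in zip(lts, csr)]
    rsr = [r - weight * c for r, c in zip(rts, csr)]
    return lsr, rsr

# Illustrative values: a 3:2 mix of Ls, Rs and Cs built with the same weight.
ls, rs, cs = [0.5, 0.0, -0.5], [0.0, 0.25, 0.0], [1.0, 1.0, 0.0]
lts = [a + 0.707 * c for a, c in zip(ls, cs)]
rts = [a + 0.707 * c for a, c in zip(rs, cs)]

# Assumption: Cs is recovered without loss from the extension bits.
lsr, rsr = mix_decode(lts, rts, csr=cs)
```

Under the lossless assumption, `lsr` and `rsr` match the original `ls` and `rs` to within floating-point rounding, so the center surround contribution is removed rather than merely attenuated as in a matrix decode.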
As shown in
It is important to note that the audio quality obtained by mixing the three or more surround-sound channels into a 5.1 format, appending the additional surround-sound signals as extension bits, and separating the audio signals as just described would be substantially the same as the audio quality associated with a true N.1 format, which would not be backward compatible with 5.1 systems. Any slight advantage of a true N.1 format is easily outweighed by the necessity to provide backward compatibility.
Although the described audio mixing/separating techniques and modified bitstream format are generally applicable to all 5.1 formats, including Dolby AC-3 and Sony SDDS, they are particularly well suited for use with DTS Coherent Acoustics, which has the ability to vary frame size as is described in detail in U.S. Pat. No. 5,978,762. The variable frame size can be used to accommodate the additional surround-sound channels, i.e. the extension bits, by either a) reducing the frame size or b) adaptively changing the frame size. Dolby AC-3 has a fixed frame size with insufficient bits to accommodate the extension bits without sacrificing fidelity of the reconstructed audio signals.
The DTS Coherent Acoustics encoder/decoder can vary its frame size by one bit at a time. DTS Coherent Acoustics has the flexibility to reduce frame size to increase the bit rate to accommodate N.1 systems and particularly the extra extension bits. The reduction of frame size increases the percentage of bits allocated to overhead and reduces the flexibility for bit allocation but allows true discrete N.1 channel audio to be reproduced with sufficient sound quality.
An alternate embodiment for encoding N.1 channel audio (N=3 as depicted) is shown in
The Lts and Rts audio channels are weighted by coefficients C1 and C2 and subtracted from the Ls and Rs audio channels from the 6.1 mix 154, respectively, to produce difference signals dLs and dRs. An encoder 158 encodes Cs, dLs and dRs and passes them to a frame formatter 160 that appends them as extension bits to the 5.1 audio format in the bitstream. Each additional channel added after 6.1 adds one new channel to the extension bits. This approach is not constrained by simple linear equations to mix the signals but requires two additional channels, dLs and dRs to encode the audio data.
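The difference-signal approach can be sketched as follows. The encode step follows the description above; the decode step (adding the weighted Lts and Rts back to dLs and dRs) is an assumption, since the reconstruction is not spelled out here. The values chosen for C1 and C2, the sample data and the function names are hypothetical.

```python
def encode_differences(ls, rs, lts, rts, c1, c2):
    """dLs = Ls - C1*Lts and dRs = Rs - C2*Rts, computed per sample."""
    d_ls = [a - c1 * b for a, b in zip(ls, lts)]
    d_rs = [a - c2 * b for a, b in zip(rs, rts)]
    return d_ls, d_rs

def decode_differences(d_ls, d_rs, lts, rts, c1, c2):
    """Assumed inverse: Ls = dLs + C1*Lts and Rs = dRs + C2*Rts."""
    ls = [d + c1 * b for d, b in zip(d_ls, lts)]
    rs = [d + c2 * b for d, b in zip(d_rs, rts)]
    return ls, rs

# Ls, Rs come from the 6.1 mix; Lts, Rts come from a separately made 5.1 mix,
# so they need not be a simple linear combination of Ls, Rs and Cs.
ls, rs = [0.5, 0.0, -0.5], [0.0, 0.25, 0.0]
lts, rts = [0.9, 0.6, -0.4], [0.5, 0.8, 0.1]
c1 = c2 = 1.0  # hypothetical weights

d_ls, d_rs = encode_differences(ls, rs, lts, rts, c1, c2)
ls_out, rs_out = decode_differences(d_ls, d_rs, lts, rts, c1, c2)
```

Ignoring codec loss, `ls_out` and `rs_out` reproduce `ls` and `rs`, at the cost of carrying dLs and dRs in addition to Cs in the extension bits.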
To this point the invention has been described as a technique for mixing three or more surround-sound channels into the left and right surround-sound channels. Although this is the current application for such techniques, the same techniques can be used to provide a truly discrete as well as a backward compatible mix for additional front channels, side channels, subwoofer or any other discrete channels.
As shown in
While several illustrative embodiments of the invention have been shown and described, numerous variations and alternate embodiments will occur to those skilled in the art. Such variations and alternate embodiments are contemplated, and can be made without departing from the spirit and scope of the invention as defined in the appended claims.
This application is a divisional application of Ser. No. 09/568,355 filed on May 10, 2000 and claims priority of the same.
| Relation | Number | Date | Country |
|---|---|---|---|
| Parent | 09568355 | May 2000 | US |
| Child | 11726976 | Mar 2007 | US |