This application is a U.S. National Stage filing under 35 U.S.C. §371 and 35 U.S.C. §119, based on and claiming priority to PCT/GB2013/051548 for “DOUBLY COMPATIBLE LOSSLESS AUDIO BANDWIDTH EXTENSION” filed Jun. 12, 2013, claiming priority to GB Patent Application No. 1210373.5 filed Jun. 12, 2012.
The invention relates to digital audio signals, and particularly to lossless bandwidth extension schemes that provide compatibility with standard PCM playback.
Many discerning audiophiles and musicians are demanding ‘high resolution’ digital audio, which is normally understood to mean audio sampled at a frequency significantly higher than the 44.1 kHz or 48 kHz of current media and quantised with a resolution better than 16 bits.
Lossily compressed audio is commonplace in the consumer market, but experience has led many people to be suspicious of lossily compressed audio, even of systems that claim to be ‘transparent’. An exception is plain nonadaptive noise-shaped dithered requantisation to a constant bit depth. With proper precautions this is equivalent (according to first-order and second-order statistics of the difference between input and output) to the addition of a constant noise (see J. Vanderkooy and S. P. Lipshitz, “Digital Dither: Signal Processing with Resolution Far below the Least Significant Bit” in Proc. AES 7th Int. Conf. on Audio in Digital Times (Toronto, Ont., Canada, 1989), pp. 87-96), which is considered ‘benign’ as a result of decades of experience with both analogue and digital media.
Two music distribution media dominate the mass market: the compact disc (CD) which has a sampling frequency of 44.1 kHz and a bit-depth of 16 bits, and the Internet download typically heard through a computer or personal player. Although most downloads are lossy-compressed, the computers or players are almost invariably able to handle uncompressed PCM (Pulse Code Modulation) signals at sampling frequencies of 44.1 kHz and 48 kHz. Many can handle bit depths of 24 bits, though some personal players are restricted to 16 bits.
It is commercially unattractive to issue audio recordings in both an audiophile version (having a sampling frequency of typically 96 kHz) and in a format that can be played on mass-market players. The possibility of issuing a recording that is playable on standard mass-market players but also contains hidden information that allows a special decoder to retrieve additional bandwidth has been explored several times previously, including by Komamura (Mitsuya Komamura, “Wide-Band and Wide-Dynamic-Range Recording and Reproduction of Digital Audio”, J. Audio Eng. Soc., Vol. 43, No. 1/2, 1995 January/February). However, none has so far provided standard PCM playback compatibility while addressing the desire for lossless retrieval of an original higher-sampling-rate signal, and none has considered the question of how a decoder may provide an optimal experience to the listener at two different bit depths (for example for both 16-bit and 24-bit players).
According to a first aspect of the present invention a lossless audio encoder is adapted to receive an input digital audio signal at a first sampling rate and to generate therefrom a PCM digital audio output comprising a plurality of samples and having a second sampling rate lower than the first sampling rate, wherein:
Standard “legacy” PCM playback equipment that was not designed for use with the invention will typically receive or play only the top 16 bits, here referred to as the “more significant portions”, of each sample of an audio stream sampled at the second sample rate of typically 44.1 kHz or 48 kHz, and will present the lossy representation to the listener with a bandwidth of approximately 0-20 kHz. The second decoder allows an extended bandwidth to be reproduced from the same 16-bit 44.1 kHz or 48 kHz stream. The first decoder typically expects to receive a 24-bit stream, and so to have access also to the “less significant portion” of each sample, i.e. to the bits beyond the sixteenth. This additional information allows lossless recovery of an input audio signal presented at a first, higher, sampling rate such as 88.2 kHz or 96 kHz, and thereby having a wider audio bandwidth such as 0-40 kHz.
Preferably, the first lossy representation is an accurate representation of the input audio signal other than the effects of time-invariant filtering, sample rate reduction and requantisation that imposes a time-invariant noise floor. If all quantisations, including those within the sample rate reduction, are performed to a constant bit depth and with appropriate dither, the “lossy” representation can be of a standard equivalent to CD quality and would have been considered “audiophile” reproduction only a few years ago. This is in contrast to traditional “lossy codecs” which dynamically adapt the spectral noise floor and sometimes the bandwidth in response to the input signal.
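The requantisation to a constant bit depth with dither mentioned above can be sketched as follows. This is a minimal illustration only, assuming non-subtractive TPDF dither, no noise shaping and a signal range of −1 to +1; the function name requantise_tpdf and the test signal are hypothetical and do not come from the specification.

```python
import random

def requantise_tpdf(samples, bits, rng):
    """Requantise samples in the range -1..+1 to `bits` bits with TPDF dither."""
    step = 2.0 ** (1 - bits)  # one LSB of a signed signal spanning -1..+1
    out = []
    for x in samples:
        # Difference of two uniform variables: triangular PDF, 2 LSB peak to peak.
        dither = (rng.random() - rng.random()) * step
        out.append(round((x + dither) / step) * step)
    return out

rng = random.Random(0)  # fixed seed so the sketch is repeatable
signal = [0.25 * ((n % 97) / 97.0 - 0.5) for n in range(20000)]
quantised = requantise_tpdf(signal, 16, rng)

errors = [q - x for q, x in zip(quantised, signal)]
mean_error = sum(errors) / len(errors)
error_power = sum(e * e for e in errors) / len(errors)
```

With dither of this kind the mean error stays near zero and the error power stays near one quarter of the squared step size whatever the signal, which is the sense in which the requantisation imposes a constant noise floor.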
Preferably, the input digital audio signal is coupled to a lossless bandsplitter having a high frequency output and a low frequency output. In addition it is preferred that the high frequency output of the lossless bandsplitter is coupled to a lossy compression unit having a compressed output and a touchup output, the more significant portions are derived in dependence on the low frequency output of the bandsplitter and in dependence on the compressed output, and the less significant portions are derived in dependence on the touchup output.
The lossless bandsplitter is key to separate treatment of, typically, two halves of the original signal spectrum, the lower half being conveyed as PCM and the upper half being conveyed in a compressed format.
In some embodiments each more significant portion comprises sixteen binary bits. In some embodiments each less significant portion comprises eight binary bits.
In some embodiments the second sampling rate is one half of the first sampling rate. Particular preferred second sampling rates include 48 kHz and 44.1 kHz.
In an encoder of the invention, the second decoder may recover an audio bandwidth equal to the Nyquist frequency that corresponds to the first sampling rate. Alternatively, the second decoder may recover a bandwidth equal to three quarters of the Nyquist frequency that corresponds to the first sampling rate.
The term ‘Nyquist frequency’ is normally understood to mean half the sampling rate of a digital system. Thus typically the first sampling rate is 96 kHz, the second is 48 kHz, the Nyquist frequency that corresponds to the first sampling rate is also 48 kHz and the second decoder will provide lossy reproduction of signals up to that Nyquist frequency, that is 48 kHz. An alternative configuration allows the second decoder to provide lossy reproduction up to 36 kHz, the advantage being a slightly lower noise floor in the range 0-24 kHz.
In some embodiments, the less significant portion is derived in dependence on the output of a lossless compressor fed from the touchup output of the lossy compression unit. The lossless compressor optimises the use of the bits in the less significant portions. Alternatively, if the touchup output is already in compressed or “packed” form, then the separate lossless compressor is not needed.
The less significant portion may also be derived in dependence on the low frequency output of the bandsplitter. This allows a first decoder to recover losslessly an original signal that is quantised more finely than if the low frequency output of the bandsplitter were conveyed entirely within the more significant portion.
Preferably, the low frequency output of the lossless bandsplitter is coupled to a splitter having a first output coupled to the more significant portion and a second output coupled to the less significant portion. Preferably, the splitter comprises a noise-shaping filter. The splitter will provide a quantised and preferably noise-shaped representation of the LF output of the bandsplitter to the more significant portion, while its second output allows the first decoder to restore the information that was removed by the quantisation.
In some embodiments it is preferred that a plurality of bits within the more significant portion are derived in dependence on the output of a subtractor having a first input coupled to the low frequency output of the lossless bandsplitter and a second input coupled to the compressed output of the lossy compression unit. The more significant portion must contain the compressed output in order to support the operation of the second decoder; however the compressed output is a data signal not an audio signal and the purpose of the subtractor is to compensate the effect of this data signal on the audio signal recovered by legacy equipment.
According to a second aspect of the present invention, there is provided apparatus comprising a noise shaper coupled to a lossless audio encoder according to the first aspect. Typically this noise shaper operates at 96 kHz and it reduces the wordwidth of the input signal to the encoder in order to allow the input signal to be conveyed losslessly within the constraint of a 24-bit output word at a sampling frequency of 48 kHz.
According to a third aspect of the present invention, there is provided apparatus comprising a lossless audio encoder according to the first aspect coupled to a losslessly reversible watermarking encoder providing a watermarked output, wherein the apparatus encodes in dependence on configuration parameters and the watermarking encoder buries the configuration parameters in the watermarked output for use by a decoder.
The apparatus may further comprise a noise shaper providing a quantised signal to the input of the lossless audio encoder wherein the noise shaper quantises to a bit depth and the configuration parameters include the bit depth. Additionally, the apparatus may further comprise a chooser unit that chooses a bit depth of the quantisation in order to maximise audio quality consistent with not exceeding the information carrying capacity of the less significant portions.
In this way, the present invention provides a system whereby a high quality wide bandwidth signal can be conveyed over a baseband PCM transmission channel, also performing well if the transmission channel only conveys the top 16 bits and further providing a reasonable rendition of bandlimited audio when an encoded stream is decoded by legacy equipment interpreting the signal as baseband PCM.
According to a fourth aspect of the present invention, there is provided an audio decoder adapted to receive a PCM input digital audio signal comprising a plurality of input samples at a second sampling rate generated by a corresponding audio encoder according to the first aspect, the audio decoder further adapted to generate from the PCM input digital audio signal an output digital audio signal having a first sampling rate higher than the second sampling rate, wherein:
Thus, the decoder of the fourth aspect is intended for use with a corresponding encoder according to the first aspect, whose output when interpreted as a plain PCM signal can satisfy the audiophile criteria such as a noise floor that may be spectrally shaped but does not vary with time. The decoder performs operations of filtering, resampling and quantisation in order to generate the output signal. The comparison signal may be generated by mimicking the decoder's operations of filtering and resampling, but at high precision without the decoder's quantisations. The difference between the output digital signal and the comparison signal thereby isolates quantisation artefacts introduced by the decoder. Since the input to the decoder is preferably a signal that satisfies audiophile criteria, it follows that the comparison signal should also satisfy audiophile criteria, hence the difference between the comparison signal and the output signal should contain only quantisation artefacts that satisfy audiophile criteria, and are therefore equivalent to spectrally shaped noise with stationary statistics. This could be tested either by listening or using a spectrum analyser.
According to a fifth aspect of the present invention, there is provided an audio decoder adapted to receive a PCM input digital audio signal comprising a plurality of input samples at a second sampling rate and to generate therefrom an output digital audio signal having a first sampling rate higher than the second sampling rate, the decoder comprising:
The bandjoiner and decompression unit are required in order to reverse the operations of bandsplitting and compression performed in a corresponding encoder. Full lossless reconstruction requires that the complete input sample be presented to the decoder, but it is also desired to support lossy reconstruction when the less significant portion is missing. For this reason the lossy input to the decompression is fed from the more significant portion of the stream, and it is also desired that the low frequency input to the bandjoiner should be substantially taken from the more significant portion, any dependence on the less significant portion serving merely to improve the resolution of the low frequency signal.
Preferably, the low frequency input of the bandjoiner is derived in dependence on all the bits contained in the more significant portion. The more significant portion contains bits that will be fed to the decompression unit that provides the high frequency input to the lossless bandjoiner. Therefore, it might seem natural to exclude these bits when deriving the low frequency input. These bits will affect the signal heard by the legacy listener who decodes the more significant portion in a standard PCM decoder. However, it is preferred to allow those bits to contribute to the low frequency input. An encoder is then able to compensate these bits by adjusting other bits according to the principle of “subtractive buried data”, in a manner that gives results that are consistent between the decoder of the invention and a standard PCM decoder.
Preferably, the low frequency input of the bandjoiner is also dependent on the less significant portion. This allows the resolution of the signal presented to the low frequency input of the bandjoiner to be improved when the less significant portion is available to the decoder.
It is further preferred that, over the frequency region 0-5 kHz, the difference between the output digital audio signal and a comparison signal is spectrally shaped noise with stationary statistics, wherein the comparison signal is generated from the PCM input digital audio signal by the operations of filtering and resampling to the first sampling rate. Thus, one of the advantages described above in respect of the fourth aspect of the invention may be combined with the advantages provided by the fifth aspect of the invention.
Preferably, the audio decoder is adapted to receive a signal generated by a corresponding audio encoder, wherein the output digital audio signal is an exact replica of a digital audio input signal that was presented to that corresponding audio encoder.
In this way, yet another advantage described above in respect of the fourth aspect may be combined with the advantages provided by the fifth aspect of the invention.
As will be appreciated by those skilled in the art, further adaptations of the lossless audio encoder of the present invention are possible. Moreover, in other aspects, corresponding decoders are contemplated, as are communication systems comprising an encoder and a decoder.
Examples of the present invention will be described in detail with reference to the accompanying drawings, in which:
Lossy Bandwidth Extension
A commercial ‘scalable’ transmission system for consumer audio was described in U.S. Pat. No. 6,226,616 by You et al.: “Sound Quality of Established Low Bit-Rate Audio Coding Systems without loss of Decoder Compatibility”. Starting from an established system of packaging a data stream representing a lossily compressed audio signal into sixteen-bit words that can be transmitted through a standard SPDIF digital audio interface, the enhanced system provides the option of packing further ‘extension streams’ into the same format to allow higher audio quality, in a manner compatible with decoders designed for the original system. However, although SPDIF is often used to convey a PCM stream, the “compatibility” here relates to an established infrastructure of proprietary decoders, not to the devices adapted to play PCM streams without a special decoder, which is an object of the current invention.
Komamura's proposal uses ADPCM (Adaptive Differential Pulse Code Modulation) as the basis for lossy compression. Komamura precedes the ADPCM unit with a downsampler to provide a representation of the HF stream at a rate of 24 kHz, this representation then being compressed to two bits per sample and the two bits serialised into a one-bit stream at 48 kHz. Thus the HF information occupies only one bit of the final 16-bit output, allowing 15 bits of LF resolution. As downsampling is itself a lossy process, Komamura's downsampler and ADPCM unit may be considered together as a lossy compression unit 4. As a result of the downsampling, a decoder is unable to provide unambiguous reconstruction of frequencies up to 48 kHz: the limit is rather 36 kHz.
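The bit budget in the preceding paragraph can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and the derivation of the 36 kHz limit (the 24 kHz crossover plus the Nyquist bandwidth of the 24 kHz HF representation) is an interpretation offered as an assumption.

```python
hf_rate_hz = 24_000    # sampling rate of the downsampled HF representation
adpcm_bits = 2         # bits per HF sample after ADPCM compression
out_rate_hz = 48_000   # sampling rate of the output PCM stream
word_bits = 16         # output word width

hf_bit_rate = hf_rate_hz * adpcm_bits           # bits per second of HF data
hf_bits_per_word = hf_bit_rate // out_rate_hz   # HF bits carried in each output word
lf_bits = word_bits - hf_bits_per_word          # bits left for the LF signal

crossover_hz = out_rate_hz // 2                 # 24 kHz band edge of the LF stream
hf_bandwidth_hz = hf_rate_hz // 2               # Nyquist bandwidth of the HF representation
upper_limit_hz = crossover_hz + hf_bandwidth_hz
```

Under these assumptions the HF data occupies one bit per output word, fifteen bits remain for the LF signal, and the unambiguous upper limit comes out at 36 kHz.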
The “legacy” listener who has no decoder and plays the stream 8 as PCM audio will hear primarily the noise-shaped (or truncated) LF output from the bandsplitter, which should be acceptable as a downsampled and lower-quality version of the original signal 2. However, the least significant bits of the stream 8, containing the compressed HF signal 7, will also contribute to the audio output of the legacy listener's player. The output of an ideal compressor is noise-like, for otherwise it contains redundancy, which in principle could be removed to give improved compression. In practice it may be necessary to provide explicit scrambling to remove tonal artefacts and render the compressor's output truly noise-like. We assume in this document that the compressor 4 contains such scrambling internally if necessary to ensure that its output is composed of binary bits that are statistically independent.
Another assumption throughout this document is that processes such as compression and decompression are instantaneous. In practice they incur signal delay, so that compensating delays must be introduced into parallel signal paths. For clarity, such compensating delays have been omitted from the diagrams and similarly the diagrams do not preclude the organising of signal samples into blocks should this be convenient or necessary for the correct operation of the processing units.
Bandwidth Extension Using Subtractive Buried Data
In
Thus signal 7, interpreted as an audio signal, is fed to the subtractor 15, so that the noise shaper 5 receives the signal 7 in antiphase along with the LF signal to produce a modified 13-bit signal 6′ which is placed into the top thirteen bits B1-B13 of the output word 8. The legacy listener will hear the whole of the output word 8 interpreted as a PCM audio signal, that is the sum of the signals 6′ and 7. Thus the legacy listener will hear the compressor signal 7 both directly via the bottom three bits of the complete word 8, and also in antiphase via the noise shaper in the top thirteen bits of the word 8, and these two presentations of the compressor signal 7 will cancel. This is an instance of “subtractive buried data” as described in M. A. Gerzon and P. G. Craven, “A High-Rate Buried Data Channel for Audio CD,” J. Audio Eng. Soc. Volume 43 Issue 1/2 pp. 3-22; February 1995.
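The cancellation described above can be illustrated with integer arithmetic. The following is a simplified sketch, assuming samples expressed in units of one 16-bit LSB and a plain 13-bit quantiser with no noise-shaping filter; the function names encode_word and decode_data are hypothetical.

```python
DATA_BITS = 3
STEP = 1 << DATA_BITS  # step of the 13-bit quantiser, in 16-bit LSBs

def encode_word(lf, data):
    """Bury 3 bits of data under an LF sample (both in 16-bit LSB units)."""
    # Subtract the data, then quantise to a multiple of 8 (the top thirteen bits).
    top = ((lf - data + STEP // 2) // STEP) * STEP
    # Adding the data back fills the bottom three bits of the output word.
    return top + data

def decode_data(word):
    """Recover the buried data from the bottom three bits."""
    return word & (STEP - 1)

word = encode_word(1000, 5)
legacy_error = word - 1000  # what the legacy listener hears beyond the LF signal
```

Because the data is subtracted before quantisation and added back afterwards, the legacy listener hears the LF signal plus only the 13-bit quantisation error, never the data itself at full amplitude.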
Internally, the noise shaper 5 contains a 13-bit quantiser and a noise-shaping filter. As well as cancelling noise from the compressor signal, the subtractive buried data provides subtractive dither for the 13-bit quantiser. Quantisation artefacts other than additive noise are now at the 16-bit level rather than the 13-bit level. The additive noise at the 13-bit level is shaped by the noise-shaping filter, potentially providing two or more bits of perceptual advantage, while the subtractive dither introduces 4.77 dB less noise than a conventional TPDF dither. Hence the perceived performance may be equivalent to that of a 16-bit system that uses TPDF dither.
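The 4.77 dB figure follows from standard quantisation theory and can be reproduced as below. The sketch assumes a quantiser step q, for which ideal subtractive dither leaves an error power of q²/12, while non-subtractive TPDF dither adds a further 2q²/12 of dither noise; the variable names are illustrative.

```python
import math

q = 1.0                                   # quantiser step size (arbitrary units)
subtractive_power = q * q / 12            # error power with subtractive dither
tpdf_dither_power = 2 * q * q / 12        # power of TPDF dither (two uniform components)
tpdf_total_power = subtractive_power + tpdf_dither_power  # q*q/4 in total

advantage_db = 10 * math.log10(tpdf_total_power / subtractive_power)
advantage_bits = advantage_db / (20 * math.log10(2))      # about 6.02 dB per bit
```

The ratio of the two noise powers is three, giving 10·log10(3), which is about 4.77 dB or roughly 0.8 of a bit.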
The corresponding decoder is shown in
The above-referenced paper by Gerzon and Craven also describes how a non-integer number of bits of other data may be ‘buried’ in the bottom bits of a PCM signal. In particular, it is straightforward to bury a half-integer number of bits in each channel of a two-channel (stereo) stream. For simplicity this description assumes an integer number but it will be clear that the designs described herein can be used with a non-integer number of bits of compressed data.
Lossless Bandwidth Extension—General Considerations
A lossless system is not allowed to throw away information, so a transmission channel must have an information carrying capacity at least as large as the information in the signal to be conveyed. Experience with lossless compression suggests that the redundancy in a 96 kHz audio signal of 16 bits or higher resolution is typically about eight bits. Thus a 16-bit 96 kHz signal might be compressed to a data rate of eight bits per sample, and a 24-bit 96 kHz signal might be compressed to sixteen bits. Thus a 16-bit 96 kHz signal can usually be transmitted through a 16-bit 48 kHz channel. However it will not be compatible, since an optimally compressed signal will appear as full scale white noise if interpreted as a PCM signal. A requirement for PCM compatibility forces redundancy into the PCM signal and thus requires a larger wordwidth.
Thus, it is generally not possible to pack losslessly and with PCM compatibility a 16-bit 96 kHz signal into a 16-bit 48 kHz channel, and neither is it generally possible to pack losslessly and with PCM compatibility a 24-bit 96 kHz signal into a 24-bit 48 kHz channel. However, PCM-compatible lossless packing of a 16-bit 96 kHz signal into a 24-bit 48 kHz channel is usually feasible.
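The capacity argument can be made concrete with the rule of thumb quoted above. The sketch below assumes a typical redundancy of eight bits per 96 kHz sample, as stated in the text; the function name and the fixed redundancy figure are illustrative, since real material varies.

```python
def bits_needed_per_output_sample(input_bits, input_rate_hz, output_rate_hz,
                                  redundancy_bits=8):
    """Compressed information per output sample, under the 8-bit redundancy rule of thumb."""
    compressed_bits = input_bits - redundancy_bits
    return compressed_bits * input_rate_hz / output_rate_hz

need_16 = bits_needed_per_output_sample(16, 96_000, 48_000)
need_24 = bits_needed_per_output_sample(24, 96_000, 48_000)

spare_in_24_bit_channel = 24 - need_16  # room left for PCM-compatible redundancy
```

On these figures a 16-bit 96 kHz signal needs about 16 bits per 48 kHz sample, which fills a 16-bit channel completely and leaves no room for PCM compatibility, but leaves about eight bits spare in a 24-bit channel. A 24-bit 96 kHz signal needs about 32 bits per 48 kHz sample, which exceeds a 24-bit channel.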
Currently “96/24” (i.e., a sampling rate of 96 kHz and bit-depth of 24 bits) is widely regarded as the next step up from the “44/16” of the Compact Disc. However it was realised by Gerzon in 1995 that 96 kHz sampling is highly advantageous for noise shaping, allowing larger perceptual improvements yet with a gentler rise in the high frequency noise spectrum than the 44.1 kHz shapers that have been widely used on CD. The coefficients for Gerzon's 96 kHz shaper, which provides nearly five bits of perceptual improvement, were given in Acoustic Renaissance for Audio, “A Proposal for High-Quality Application of High-Density CD Carriers” private publication (1995 April); reprinted in Stereophile (1995 August); in Japanese in J. Japan Audio Soc., vol. 35 (1995 October); available for download at www.meridian-audio.com/ara. Stuart provides a careful analysis considering the capabilities of human hearing (“Coding for High-Resolution Audio Systems” J. Audio Eng. Soc., Vol. 52, No. 3, 2004 March, see especially FIG. 16) from which one may conclude that a 44.1 kHz sampled digital system properly quantised with TPDF dither (but without noise shaping) to 20.5 bits will always provide sufficient dynamic range as a distribution medium. The non-noise-shaped noise spectral density is reduced by a further 3.4 dB when 96 kHz sampling is used. We can conclude that a 16-bit 96 kHz channel with appropriate noise shaping is entirely adequate as a distribution format, meeting audiophile requirements with some margin to spare.
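The 3.4 dB figure, and one way of reading the conclusion that 16 bits suffice, can be sketched numerically. The calculation below assumes that the same total quantisation noise power is spread over a wider band at the higher sampling rate, and it rounds "nearly five bits" of shaping advantage to 5.0 purely for illustration; it is a rough plausibility check, not the analysis in the cited references.

```python
import math

# Same total noise power spread over a wider Nyquist band gives a lower density.
density_drop_db = 10 * math.log10(96_000 / 44_100)

db_per_bit = 20 * math.log10(2)   # about 6.02 dB per bit of wordwidth
bits_at_44k1 = 20.5               # unshaped TPDF requirement quoted from Stuart
shaping_gain_bits = 5.0           # assumption: "nearly five bits", rounded up

required_bits_96k = bits_at_44k1 - density_drop_db / db_per_bit - shaping_gain_bits
```

Under these assumptions the requirement at 96 kHz with noise shaping comes out at a little under 15 bits, consistent with the statement that a 16-bit 96 kHz channel has some margin to spare.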
Therefore, considering the information-theoretic arguments along with the psychoacoustic arguments, it is both necessary and permissible to requantise a 96 kHz input signal which may have a large bit depth such as 24 bits to a smaller bit depth such as 16 bits. Accordingly, a 96 kHz noise shaper 1 is shown in
In the decoder of
As quantisation is a lossy process, the total processing indicated by
Lossless Band Splitter and Joiner Using ‘Lifting’
The architecture of
In the bandsplitter of
Two lifting steps are now applied. A lifting step adds a function of one signal to another signal:
X′=X+f(Y)
Y′=Y
which can be inverted simply by:
X=X′−f(Y′)
Y=Y′
This is lossless provided function f is exactly consistent (including any quantisation or initialisation of state variables) between the two cases.
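The invertibility of a single lifting step can be demonstrated directly. In this sketch the function f is a hypothetical stand-in that includes an internal quantisation (rounding down to an integer); any function would serve, provided the encoder and decoder evaluate it identically.

```python
import math

def f(y):
    """Hypothetical lifting function with an internal quantisation (round down)."""
    return math.floor(0.75 * y)

def lift(x, y):
    """Forward lifting step: X' = X + f(Y), Y' = Y."""
    return x + f(y), y

def unlift(x_prime, y_prime):
    """Inverse lifting step: X = X' - f(Y'), Y = Y'."""
    return x_prime - f(y_prime), y_prime

restored = unlift(*lift(1234, -567))
```

The inverse never needs to invert f itself. It only recomputes f from the unmodified signal Y and subtracts it, which is why the quantisation inside f does not prevent exact reconstruction.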
In the first lifting step of
is such a filter having n=2 and a delay of 2.5 samples. A filter of length 10-20 taps may be reasonable to furnish an “HF” stream having good rejection of most of the bottom half of the original spectrum, i.e. of frequencies significantly below 24 kHz.
Again assuming that the 2× stream is sampled at 96 kHz, the top half of the original spectrum is aliased down to 0-24 kHz in both the Even and Odd streams that emerge from the de-interleaving unit, but in opposite phase. Thus original signals in the range 24-48 kHz are doubled in amplitude by the first lifting operation, and so the 1×HF output potentially has twice the amplitude of the 2× input. This is why in
The first lifting step in
The two lifting operations will furnish a stream pair (LF, HF) in which the precise response of the LF stream near crossover may not be ideal—it may rise slightly before cutting off. If this is considered a problem, it can be avoided using three lifting operations with adjusted filter shapes.
Each quantisation Q1, Q2 should be to the original step size, for example 2^−16 if the input to the bandsplitter is a 17-bit signal occupying the signal range −1 to +1. The LF and HF outputs of the bandsplitter in
For lossless reconstruction each quantisation Q1, Q2 in the decoder must behave identically to its counterpart in the encoder, for example both rounding up or both rounding down.
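A complete bandsplitter and bandjoiner pair built from two lifting steps can be sketched with deliberately trivial filters. This is an assumption-laden illustration: the one-tap filters below are not the 10-20 tap filters discussed above, and only the second step needs a quantisation here, which rounds down in both the splitter and the joiner. The function names bandsplit and bandjoin are hypothetical.

```python
def bandsplit(samples):
    """Split an even-length list of integers into (LF, HF) at half the rate."""
    even, odd = samples[0::2], samples[1::2]          # de-interleave
    hf = [o - e for e, o in zip(even, odd)]           # first lifting step
    lf = [e + (h >> 1) for e, h in zip(even, hf)]     # second step, rounding down
    return lf, hf

def bandjoin(lf, hf):
    """Exactly reverse bandsplit by undoing the lifting steps in reverse order."""
    even = [l - (h >> 1) for l, h in zip(lf, hf)]     # same rounding as the splitter
    odd = [h + e for e, h in zip(even, hf)]
    out = [0] * (2 * len(lf))
    out[0::2] = even                                  # re-interleave
    out[1::2] = odd
    return out

original = [3, -7, 12, 0, -5, 9, 100, -100]
lf, hf = bandsplit(original)
rebuilt = bandjoin(lf, hf)

# A tone at the input Nyquist frequency appears in HF at twice its amplitude.
lf_nyq, hf_nyq = bandsplit([5, -5, 5, -5])
```

Even with these crude filters the round trip is exact, and the Nyquist-frequency example shows the doubling of HF amplitude that requires one extra bit of headroom in the HF output.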
Lossless Bandwidth Extension—Singly Compatible
Returning to
While the HF signal contains potentially 18 bits of information, in practice its peak level is lower than the theoretical maximum by 35 dB or more, even on ‘vigorous’ commercial recordings. Lossless compression is clearly indicated as a means to reduce the number of bits. Lossless compressors intrinsically produce a variable data rate, which in practice needs to be smoothed by buffering, for example, using a FIFO (First In First Out) buffer. The HF signals produced by bandsplitting appear typically to be more “bursty” than standard audio signals, so buffering is even more important. For clarity, the necessary buffers have not been shown on the diagrams here but it is assumed that such a buffer is built into each lossless compressor and decompressor, as it is in the MLP compression system. Of course, FIFO buffering introduces delay and it is necessary to add a fixed delay in any parallel signal path (such as the LF signal path) so as to maintain time alignment. Again such fixed delays have been omitted from the diagrams for clarity.
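The smoothing role of the FIFO buffer can be illustrated with a toy simulation. The sketch assumes an encoder-side buffer that is filled at the compressor's variable rate and drained at a constant number of bits per sample; the burst pattern and the function name peak_fifo_occupancy are hypothetical and are not derived from any real recording.

```python
def peak_fifo_occupancy(bits_per_sample, drain_bits):
    """Return the largest backlog (in bits) seen while draining at a constant rate."""
    occupancy = 0
    peak = 0
    for produced in bits_per_sample:
        # The buffer cannot go negative: an empty buffer simply means padding is sent.
        occupancy = max(0, occupancy + produced - drain_bits)
        peak = max(peak, occupancy)
    return peak

# Quiet passage, a short burst, then quiet again (bits produced per sample).
produced = [4] * 100 + [12] * 10 + [4] * 100
peak = peak_fifo_occupancy(produced, drain_bits=8)
```

In this toy case a burst that exceeds the channel allocation by four bits per sample for ten samples needs a buffer of 40 bits, after which the backlog drains away during the quieter passage.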
Tests on a corpus of 970 commercial 96 kHz recordings have indicated that with a FIFO buffer of 0.3 seconds, the composite LF and losslessly compressed HF information will fit into 24 bits in 97.6% of cases if quantised to bit depths between 15 bits and 18 bits.
Thus in general, trial encodings with different quantisation depths may be used to establish the largest quantisation depth that may be used for each item to be encoded. It can be seen that coarsening the 96 kHz quantisation reduces the bitwidth required by the composite information in two ways:
However, coarser quantisation also increases the shaped noise in the HF signal. Whether this has a significant effect depends on whether noise dominates signal in the HF path, a matter that may vary with time and so be different at different instants that contribute the data that is stored in the lossless encoder's FIFO buffer at any given time. Empirically, we find that coarsening the 96 kHz quantisation by one bit may reduce the composite information at 48 kHz by one-and-a-half bits.
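The trial-encoding procedure can be sketched as a simple search. The code below assumes a caller-supplied function composite_bits(depth) that performs a trial encoding and reports the bits per 48 kHz sample needed at a given 96 kHz quantisation depth; that function, the toy cost model and the name choose_bit_depth are all hypothetical.

```python
def choose_bit_depth(composite_bits, candidates=range(18, 14, -1), capacity=24):
    """Return the largest candidate depth whose trial encoding fits the capacity."""
    for depth in candidates:              # try the finest quantisation first
        if composite_bits(depth) <= capacity:
            return depth
    return None                           # nothing fits: the item needs other treatment

def toy_cost(depth):
    """Toy model: 21 bits at depth 15, rising by 1.5 bits per extra bit of depth."""
    return 21.0 + 1.5 * (depth - 15)

chosen = choose_bit_depth(toy_cost)
```

With the toy model, depth 18 would need 25.5 bits and is rejected, while depth 17 needs exactly 24 bits and is chosen. A real encoder would replace toy_cost with an actual trial encoding of the item.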
In the case of 16-bit original material, the composite information will often fit directly into 24 bits, in which case the prequantiser shown in
As already indicated, the output 11 of the decoder of
The “legacy” listener without a decoder will hear the output of the encoder interpreted as a PCM signal, thus primarily the LF output of the bandsplitter but potentially also the output of the lossless compressor interpreted as a PCM signal in the bottom bits of a 24-bit word. As already mentioned, this output should be randomised if it is not already a noiselike signal.
The legacy listener is also exposed to any quantisation artefacts produced by the quantisers Q1 and Q2 in
Double Compatibility: Simple Approach
As signal “A” is quantised to thirteen bits, the bandsplitter 3 may also be configured to produce the LF output 15 of thirteen bits which will fit directly into the top thirteen bits B1-B13 of the output word 16. The HF output 28 is then lossily compressed 4 and justified 12 to bits fourteen through sixteen, B14-B16, of the output word 16. Thus, for the 16-bit listener, the more significant portion 8 of the output word 16 provides the same decoding options as did the sixteen-bit word 8 in
To support lossless encoding for the 24-bit listener, an encoder similar to that of
The decoder of
The decompression, subtraction and lossless compression shown in
Accordingly, in
Thus, in some less preferred embodiments, the compression unit 21 may contain the internal subunits shown within the dashed box in
In this description and in the figures, 96 kHz quantisation bit depths such as 13 bits and 15 bits are for illustration only and are not intended to be limiting. The same applies to the 96 kHz frequency itself. Similarly, the 3 bits shown for the lossy compressed output is an example and compression to a smaller number of bits may be used in practice.
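One possible layout of the 24-bit composite word with the illustrative field widths above can be sketched as follows. The sketch assumes a signed 13-bit LF field in bits B1-B13, a 3-bit lossy compressed field in bits B14-B16 and an 8-bit less significant portion; the function names and the use of an unsigned 24-bit container are hypothetical choices.

```python
def pack_word(lf13, hf3, touchup8):
    """Pack a signed 13-bit LF value, 3 compressed bits and 8 touchup bits."""
    if not (-4096 <= lf13 < 4096 and 0 <= hf3 < 8 and 0 <= touchup8 < 256):
        raise ValueError("field out of range")
    more_significant = ((lf13 & 0x1FFF) << 3) | hf3   # 16-bit more significant portion
    return (more_significant << 8) | touchup8         # append less significant portion

def unpack_word(word24):
    """Recover the three fields from a 24-bit composite word."""
    touchup8 = word24 & 0xFF
    more_significant = word24 >> 8
    hf3 = more_significant & 0x7
    lf13 = more_significant >> 3
    if lf13 >= 4096:                                  # restore the sign of the LF field
        lf13 -= 8192
    return lf13, hf3, touchup8

def truncate_to_16(word24):
    """What a 16-bit channel conveys: the more significant portion only."""
    return word24 >> 8

word = pack_word(-1234, 5, 200)
fields = unpack_word(word)
```

A channel that conveys only sixteen bits delivers truncate_to_16(word), from which the LF field and the lossy compressed bits can still be extracted, while the touchup bits are available only when the full 24-bit word arrives.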
Improved Double Compatibility
The scheme of
It is to be noted that the encoder of
The new feature of
The decompressor 22 in
Because the encoder splits the information in the LF signal 15 between the more and less significant portions 8 and 17 of the composite word, it is able to handle a higher precision 96 kHz signal 2 than did the encoder of
The noise-shaped splitter 5′ and joiner 24 may be implemented in various ways.
In
In standard practice the output of the filter 33 would be subtracted directly from the input signal. Here however it must be made possible for the 24-bit decoder to “undo” the effect of the shaper, since noise shaping is a lossy process. Referring to
Given these conditions, the joiner in
Returning to
This process is recursive, since the regenerated splitter's LSB derived thus at a particular sample instant will affect signal 38a at the next sample instant, on account of propagation through the noise shaping filter 33a. It is therefore necessary to ensure that the state variables in the noise shaping filters 33 and 33a are initialised to the same values. It would be natural to set these variables to zero, in both the encoder and decoder, at the beginning of a stream.
The layout of the less significant portion of the composite encoded word is at the implementor's discretion. For example, the LSBs from the shaper and the packed touchup signal could have been interchanged with no effect on the overall operation.
Considering that in some contexts 20-bit audio can be conveyed but 24-bit audio cannot, there may also be the desire to provide triple compatibility, that is to provide advantages balanced between the legacy listener, the 16-bit listener with a decoder, and the 20-bit listener with a decoder, as well as lossless extended-bandwidth reproduction for the 24-bit listener. This may be achieved by further subdivision of the less significant portion of the 24-bit composite word, and a further application of the principles already described.
The references to 16 bits and to 24 bits in this document merely reflect wordwidths popular in current practice, and the invention can equally well be applied with different values for these longer and shorter wordwidths.
In summary, we have described systems that provide a PCM-compatible stream with a variety of decoding options. Although it is necessary to have a decoder to achieve lossless reproduction of an original high-sample-rate signal, the signal provided to the legacy listener thus being described as ‘lossy’, the reduction to lossy is carried out in a manner that is described as ‘benign’ in audiophile circles, using only the operations of time-invariant filtering, sample rate reduction and a requantisation that imposes a time-invariant noise floor.
Number | Date | Country | Kind |
---|---|---|---|
1210373.5 | Jun 2012 | GB | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2013/051548 | 6/12/2013 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2013/186561 | 12/19/2013 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
5956674 | Smyth | Sep 1999 | A |
6226616 | You et al. | May 2001 | B1 |
20100082352 | Fejzo | Apr 2010 | A1 |
20100161321 | Oshikiri | Jun 2010 | A1 |
20110202353 | Neuendorf | Aug 2011 | A1 |
20110224991 | Fejzo | Sep 2011 | A1 |
20120221326 | Grancharov | Aug 2012 | A1 |
Number | Date | Country |
---|---|---|
1396844 | Mar 2004 | EP |
1883067 | Jan 2008 | EP |
0079520 | Dec 2000 | WO |
2007128662 | Nov 2007 | WO |
2013186561 | Dec 2013 | WO |
Entry |
---|
“Patents Act 1977: GB Search Report under Section 17(5)”, Intellectual Property Office, dated Oct. 8, 2012, 3pgs. |
“Patents Act 1977: GB Search Report under Section 17”, Intellectual Property Office, dated Oct. 14, 2013, 4pgs. |
“PCT Notification of Transmittal of the International Search Report and the Written Opinion of the International Searching Authority, or the Declaration”, dated Dec. 20, 2013, for International Application No. PCT/GB2013/051548, 12pgs. |
van der Veen, Michiel et al., “High Capacity Reversible Watermarking for Audio”, Proceedings of SPIE - International Society for Optical Engineering, SPIE - Engineering, US, vol. 5020, Jan. 1, 2003 (Jan. 1, 2003), XP001189791, ISSN: 0277-786X, DOI: 10.1117/12.476858, 11pgs. |
“The New Digital Sound: The promise and complications of high definition digital audio on Blu-ray”, Dec. 2, 2008, retrieved Aug. 10, 2012, retrieved from http://www.hifi-writer.com/he/misc/newsound.htm, 8pgs. |
Thornton, Mike “DTS (DTS-HD) Master Audio Suite”, Feb. 2008, Sound on Sound, retrieved from http://www.soundonsound.com/sos/feb08.articles/ma—suite.htm, 2pgs. (See the Section ‘Core Values’). |
Komamura, Mitsuya “Wide-Band and Wide-Dynamic-Range Recording and Reproduction of Digital Audio”, J. Audio Eng. Soc., vol. 43, No. 1/2, Jan./Feb. 1995, XP-000733674, (pp. 29-39, 11 total pages). |
Number | Date | Country | |
---|---|---|---|
20150154969 A1 | Jun 2015 | US |