The invention relates to a method and to an apparatus for quantisation index modulation for watermarking an input signal, wherein different quantiser curves are used for quantising said input signal.
In known digital audio signal watermarking the audio quality suffers from degradation with each watermark embedding-and-removal step.
One of the dominant approaches for watermarking of multimedia content is called quantisation index modulation, denoted QIM, see e.g. B. Chen, G. W. Wornell, “Quantization Index Modulation: A Class of Provably Good Methods for Digital Watermarking and Information Embedding”, IEEE Transactions on Information Theory, vol. 47(4), pp. 1423-1443, May 2001, or J. J. Eggers, J. K. Su, B. Girod, “A Blind Watermarking Scheme Based on Structured Codebooks”, Proc. of the IEE Colloquium on Secure Images and Image Authentication, pp. 1-6, 10 Apr. 2000, London, GB.
With QIM it is possible to achieve a very high data rate, and the capacity of the watermark transmission is mostly independent of the characteristics of the original audio signal.
In QIM as described by B. Chen and G. W. Wornell and mentioned above, an input value x is mapped by quantisation to a discrete output value y=Qm(x), whereby for each watermark message m a different quantiser Qm is chosen. Therefore the detector can in turn try all possible quantisers and detect the watermark message by finding the quantiser with the smallest quantisation error. J. J. Eggers et al. mentioned above have proposed an extension to QIM in order to achieve better capacity in specific watermark channels: in this α-QIM all input values x are linearly shifted towards the reference value (i.e. towards the centroid of the quantiser) with a constant factor. The watermarked output value y can be considered as being computed by y=Qm(x)+α(x−Qm(x)).
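The two known schemes described above can be sketched as follows. This is a minimal illustration and not the cited authors' implementation: the uniform step size `STEP`, the half-step grid shift per message bit and all function names are assumptions made for the example.

```python
import math

STEP = 1.0  # hypothetical quantiser step size


def quantise(x, m, step=STEP):
    """Q_m(x): uniform quantiser whose grid is shifted by m * step / 2 (assumed layout)."""
    offset = m * step / 2.0
    return step * math.floor((x - offset) / step + 0.5) + offset


def embed_qim(x, m):
    """Plain QIM: the output is the reference value Qm(x) itself."""
    return quantise(x, m)


def embed_alpha_qim(x, m, alpha):
    """alpha-QIM: y = Qm(x) + alpha * (x - Qm(x))."""
    q = quantise(x, m)
    return q + alpha * (x - q)


def detect(y, messages=(0, 1)):
    """Try every quantiser and return the message with the smallest quantisation error."""
    return min(messages, key=lambda m: abs(y - quantise(y, m)))


x = 0.3
print(embed_qim(x, 0), embed_qim(x, 1))          # 0.0 0.5
print(detect(embed_alpha_qim(x, 1, alpha=0.5)))  # 1
```

With α=0.5 the input 0.3 is moved halfway towards the reference value 0.5 of the ‘1’ quantiser, and the detector still finds the smallest quantisation error for m=1.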
The Chen/Wornell processing is by definition non-reversible because information is lost in the quantisation step. The Eggers/Su/Girod processing is reversible, but it is not subject to any time-variable distortion constraint.
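The difference in reversibility can be illustrated with a small sketch. It assumes a uniform quantiser with a hypothetical step size of 1 and a half-step grid shift per message bit; the function names are illustrative only. For 0<α≤1 the output of the Eggers/Su/Girod mapping stays in the same quantiser cell as the input, so the reference value can be recomputed from y and the linear shift can be undone, whereas the plain quantiser maps a whole cell to a single value.

```python
import math


def quantise(x, m, step=1.0):
    """Q_m(x): uniform quantiser shifted by m * step / 2 (assumed layout)."""
    offset = m * step / 2.0
    return step * math.floor((x - offset) / step + 0.5) + offset


def embed_alpha_qim(x, m, alpha):
    """y = Qm(x) + alpha * (x - Qm(x))."""
    q = quantise(x, m)
    return q + alpha * (x - q)


def invert_alpha_qim(y, m, alpha):
    """Undo the shift: x = Qm(y) + (y - Qm(y)) / alpha, valid because Qm(y) == Qm(x)."""
    q = quantise(y, m)
    return q + (y - q) / alpha


# alpha-QIM: the original value is recovered when m and alpha are known.
y = embed_alpha_qim(0.3, 1, alpha=0.5)
x_recovered = invert_alpha_qim(y, 1, alpha=0.5)

# Plain QIM: two different inputs collapse onto the same output value.
same_output = quantise(0.3, 1) == quantise(0.4, 1)
```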
A problem to be solved by the invention is to avoid degradation of the audio quality with each watermark embedding-and-removal step by improving the known QIM processing. This problem is solved by the quantisation method disclosed in claim 1. An apparatus that utilises this method is disclosed in claim 2. A method for corresponding regaining is disclosed in claim 8.
The inventive audio signal watermarking uses specific quantiser curves in the time domain and, in particular, in the transform domain for embedding the watermark message into the audio signal, and it is almost perfectly reversible. The term ‘reversible’ means that the watermark can be removed in order to recover the original PCM samples with high (i.e. near-bit-exact) quality, provided that the watermarked audio signal has not undergone significant signal modification and that the secret key required for detection of the watermark is known.
The inventive reversible quantisation index modulation watermarking processing includes an embedding power constraint, which is important in audio watermarking in order to guarantee that the modifications of the signal due to the watermark embedding are inaudible.
Advantageously, the inventive processing provides robustness and capacity characteristics which are competitive with state-of-the-art, non-reversible watermarking schemes. The invention allows the watermark embedding process to be reversed without significant penalties in terms of data rate, robustness and computational complexity of the watermark scheme, whereby the reversal of the watermark embedding process will deliver almost exactly the original PCM audio signal.
In principle, the inventive quantisation method is suited for quantisation index modulation for watermarking an input signal x, wherein different quantiser curves Qm are used for quantising said input signal x and a current characteristic of said quantiser curve is controlled by the current content of a watermark message m, wherein in said quantising the difference between input value and output value at any position is not greater than T, and said quantising curves Qm are reversible in that for any input value x there is a unique output value y,
and wherein ±T is a value defining the y shift towards y=0 of outer sections of said quantiser curves Qm and is determined by the current psycho-acoustic masking level of said input signal x, and y is the watermarked output signal, and wherein the different quantiser curves Qm are established according to the current value of m by different shifts of the complete quantiser curve in x direction.
In particular, said quantising can be carried out according to y=Qm(x)+max(−T, min(T, α(x−Qm(x)))),
wherein α is a predetermined steepness of the medium section of said quantiser curves Qm, ±T is a value defining the y shift towards y=0 of the other sections of said quantiser curves Qm and is determined by the current psycho-acoustic masking level of said input signal x, and y is the watermarked output signal.
In principle the inventive quantisation apparatus is suited for quantisation index modulation for watermarking an input signal x, wherein different quantiser curves Qm are used for quantising said input signal x and a current characteristic of said quantiser curve is controlled by the current content of a watermark message m, said apparatus including:
In particular, said quantising can be carried out according to y=Qm(x)+max(−T, min(T, α(x−Qm(x)))),
wherein α is a predetermined steepness of the medium section of said quantiser curves Qm, ±T is a value defining the y shift towards y=0 of the other sections of said quantiser curves Qm and is determined by the current psycho-acoustic masking level of said input signal x, and y is the watermarked output signal.
In principle, the inventive regaining method is suited for regaining an original input signal x which has been processed according to said inventive quantisation method, said method including the steps:
Advantageous additional embodiments of the invention are disclosed in the respective dependent claims.
Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in:
Reversible QIM watermarking with embedding power constraint
The invention extends QIM in order:
The related characteristic curve of the quantiser has to fulfil the following two constraints:
An example of a characteristic curve for one of the quantisers for the inventive reversible QIM processing with embedding power constraint is shown in
The computation of this example characteristic curve is defined for scalar input values by
y=Qm(x)+max(−T, min(T, α(x−Qm(x)))),
where m represents the watermark message and Qm denotes the different curves of quantisers used for embedding message m, e.g. one quantiser curve for ‘0’ bits of m and a different quantiser curve for ‘1’ bits.
The value of α is fixed in an application, and the choice of α is a trade-off: if α is near ‘1’, the robustness of the embedded watermark is likely to be inferior to that for lower values of α, because the average shift towards the reference value is smaller than it could be. On the other hand, the higher the value of α, the more reliably the characteristic curve of the embedder can be reversed in noisy conditions. The value of T is adapted to the current psycho-acoustic masking level of the input signal.
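As a rough illustration, the expression y=Qm(x)+max(−T, min(T, α(x−Qm(x)))) can be transcribed term by term for scalar inputs. The sketch below only evaluates the expression as printed; the uniform quantiser with a hypothetical step size of 1, the half-step grid shift for ‘1’ bits and the function names are assumptions and are not taken from the description.

```python
import math


def quantise(x, m, step=1.0):
    """Q_m(x): uniform quantiser shifted by m * step / 2 (assumed layout)."""
    offset = m * step / 2.0
    return step * math.floor((x - offset) / step + 0.5) + offset


def embed(x, m, alpha, t, step=1.0):
    """Evaluate y = Qm(x) + max(-T, min(T, alpha * (x - Qm(x)))) for one scalar input."""
    q = quantise(x, m, step)
    return q + max(-t, min(t, alpha * (x - q)))


# Hypothetical parameters: alpha is fixed, T would follow the masking level.
alpha, t = 0.8, 0.2
y_inner = embed(0.1, 0, alpha, t)  # alpha * 0.1 = 0.08 lies within +/-T
y_outer = embed(0.4, 0, alpha, t)  # alpha * 0.4 = 0.32 is limited to T
```

In this evaluation the first input falls into the medium section, where the curve has steepness α around the reference value, while for the second input the max/min term is limited to T.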
The characteristic curve in
In order to design a fully or nearly reversible audio watermarking system, it is required to utilise filter banks with perfect reconstruction properties. Furthermore, it is highly advantageous in such an application if the filter bank coefficients (e.g. MDCT frequency bins) are mutually independent, i.e. any modification of one coefficient in the embedding process affects only that same coefficient at the decoder side (assuming perfect synchronisation of the signal segments used for analysis). Any interference with other (nearby) coefficients shall be avoided. One example filter bank with these properties is the MDCT.
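The perfect reconstruction property of the MDCT can be checked numerically with a small sketch. It assumes a sine window, a hop size of half the frame length and a 2/N synthesis scaling, which is one common textbook formulation and not necessarily the filter bank used in the embodiment; all names are illustrative.

```python
import math


def sine_window(n_half):
    """Sine window of length 2 * n_half; it satisfies w[n]^2 + w[n + n_half]^2 = 1."""
    size = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def _basis(n, k, n_half):
    return math.cos(math.pi / n_half * (n + 0.5 + n_half / 2.0) * (k + 0.5))


def mdct(frame, window):
    """Windowed MDCT: 2 * n_half input samples give n_half coefficients."""
    n_half = len(frame) // 2
    return [sum(window[n] * frame[n] * _basis(n, k, n_half) for n in range(2 * n_half))
            for k in range(n_half)]


def imdct(coeffs, window):
    """Windowed inverse MDCT: n_half coefficients give 2 * n_half aliased samples."""
    n_half = len(coeffs)
    return [2.0 / n_half * window[n] * sum(coeffs[k] * _basis(n, k, n_half) for k in range(n_half))
            for n in range(2 * n_half)]


def analysis_synthesis(signal, n_half):
    """Run overlapping frames through MDCT and IMDCT and overlap-add the results."""
    window = sine_window(n_half)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        frame = signal[start:start + 2 * n_half]
        for n, value in enumerate(imdct(mdct(frame, window), window)):
            out[start + n] += value
    return out


n_half = 4
signal = [math.sin(0.37 * n) + 0.1 * n for n in range(32)]
rebuilt = analysis_synthesis(signal, n_half)
# Samples covered by two overlapping frames are reconstructed up to rounding error.
max_error = max(abs(a - b) for a, b in zip(signal[n_half:-n_half], rebuilt[n_half:-n_half]))
```

The first and last half frame are covered by one frame only, so the time domain aliasing is not cancelled there and these samples are excluded from the comparison.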
A corresponding example embodiment of an inventive embedder is illustrated in
The inventive quantising processing can be carried out in the time domain, but preferably the signal processing takes place in the frequency domain, i.e. the input signal is fed into an MDCT analysis block and the output watermark signal is produced via an inverse MDCT. Instead of MDCT/IMDCT, any other suitable pair of time-to-frequency domain and frequency-to-time domain transforms can be used, provided that it allows perfect (i.e. bit-exact) reconstruction of the time domain signal. According to the invention, two consecutive MDCT frames are interpreted as the real and imaginary parts of one complex spectrum. Strictly mathematically, this interpretation is wrong. However, it allows an angular spectrum to be defined for the purpose of embedding a watermark. The actual watermark embedding corresponds to the processing described in WO 2007/031423 A1 or WO 2006/128769 A2. For inserting watermark information, only the angles (i.e. the phases) of the pseudo-complex spectrum are modified according to the constraints provided by a psycho-acoustic analysis of the input signal.
The above definition of a pseudo-complex spectrum in the MDCT domain has some advantages compared to a real angular spectrum in the DFT domain as used in WO 2007/031423 A1 or WO 2006/128769 A2:
The embedding of the watermark message m is performed according to the inventive reversible QIM with embedding power constraint as described in connection with
The input values x to the embedding curve from that section are the angles of the pseudo-complex spectrum, and the output values y are used to derive the angles of the additive watermark-only signal (in MDCT domain) y-x. The reference angles are derived from a pseudo-noise sequence according to the principles described in WO 2007/031423 A1 or WO 2006/128769 A2. The amplitudes of the complex values defined by two consecutive MDCT spectra are not modified by the watermark embedder.
The new angles (according to y-x as explained in the previous paragraph), together with the amplitudes of the complex interpretation, are again split into two real-valued, consecutive MDCT spectra. The resulting stream of MDCT spectra is fed into the inverse MDCT filter bank 25 in order to produce the additive watermark signal.
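The handling of the pseudo-complex spectrum can be sketched as follows. The example only shows how two consecutive real-valued frames are combined, how the angles are changed while the amplitudes are kept, and how the result is split again. The frame values, the fixed angle offset and the function names are hypothetical; the derivation of the new angles from the quantiser curve and the pseudo-noise sequence is not reproduced.

```python
import cmath


def to_pseudo_complex(frame_a, frame_b):
    """Interpret two consecutive real-valued frames as real and imaginary parts."""
    return [complex(a, b) for a, b in zip(frame_a, frame_b)]


def modify_angles(spectrum, new_angles):
    """Keep each amplitude and replace each angle by the given new angle."""
    return [cmath.rect(abs(z), phi) for z, phi in zip(spectrum, new_angles)]


def split_frames(spectrum):
    """Split the complex values back into two real-valued, consecutive frames."""
    return [z.real for z in spectrum], [z.imag for z in spectrum]


frame_a = [0.8, -0.2, 0.5]   # hypothetical MDCT coefficients of one frame
frame_b = [0.6, 0.4, -1.2]   # hypothetical MDCT coefficients of the next frame

spectrum = to_pseudo_complex(frame_a, frame_b)
angles = [cmath.phase(z) for z in spectrum]
new_angles = [phi + 0.05 for phi in angles]  # hypothetical angle change
new_a, new_b = split_frames(modify_angles(spectrum, new_angles))
```

Because only the angles are replaced, the amplitude of each pseudo-complex bin computed from `new_a` and `new_b` equals the amplitude computed from `frame_a` and `frame_b`.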
The watermark process is reversible because all analysis steps that are applied in order to derive the additive watermark signal are invariant to the embedding of the watermark. That means, the same additive watermark signal can be derived from the original signal as well as from the watermarked signal. There are, however, two preconditions to this property:
In practice, the watermark embedding process typically will not be 100% reversible if the watermarked output signal of the embedder is quantised to integer values. If, for example, the watermarked signal is quantised to 16 bit integer values, the output signal of a watermark remover will suffer from the quantisation noise of this 16 bit quantiser as compared to the original PCM samples.
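The effect of the output quantisation can be illustrated with a few samples. The sketch makes the simplifying assumption that the remover knows the additive watermark signal exactly; the sample values and names are hypothetical.

```python
import math


def to_int16(sample):
    """Round to the nearest integer and limit to the 16 bit PCM range."""
    return max(-32768, min(32767, math.floor(sample + 0.5)))


original = [1000, -2500, 12345]   # hypothetical original PCM samples
watermark = [0.3, -0.7, 1.2]      # hypothetical additive watermark signal

stored = [to_int16(x + w) for x, w in zip(original, watermark)]
recovered = [s - w for s, w in zip(stored, watermark)]
errors = [r - x for r, x in zip(recovered, original)]
```

Each entry of `errors` is the rounding error introduced when the watermarked sample was stored as an integer. It is bounded by half a quantisation step, but is in general not zero.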
The above example system has been built and used to determine overmarking performance figures. The term ‘overmarking’ means that a sequence of embedding and removal of watermarks has been applied to one original audio signal.
Typically, the quality of the signal degrades according to the number of consecutive overmarkings.
For comparison,
In a special embodiment, the above principles can also be applied in order to provide a full removal of the watermark, leading with high probability to the bit-exact original input PCM samples of the embedder. For this purpose, in a system as depicted in
The invention can be used for applications like:
The inventive processing can also be used in connection with spread spectrum based watermarking techniques.
Number | Date | Country | Kind |
---|---|---|---|
11305883.8 | Jul 2011 | EP | regional |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2012/062194 | 6/25/2012 | WO | 00 | 1/6/2014 |