1. Technical Field
The present invention generally relates to the field of digital signal processing, and more specifically, to a system and method for digital watermarking.
2. Description of the Related Art
In recent years, digital watermarking technology has been widely applied in fields such as copyright protection of multimedia digital signals, release control, consistency checking, broadcast monitoring, and data hiding. The basic idea of digital watermarking is to add information, called a watermark, into digital multimedia signals such as images, graphics, audio and/or video, so that it can be used for various verification purposes in the future. The watermark is essentially a digital signature hidden in a main multimedia signal, providing information such as ownership or rights of use of the main multimedia signal.
From the perspective of extraction and detection of a watermark, watermarking technology may be divided into non-blind, semi-blind, and blind watermarking technology. Upon extraction and decoding, non-blind watermarking needs the original multimedia information and the added reference signal (e.g., a pseudo noise sequence); semi-blind watermarking needs a reference signal and a key for generating the reference signal; and blind watermarking needs only the key.
In digital watermarking technology for audio, the spread spectrum (SS)-modulated digital watermark is a known blind watermarking technology. However, the traditional SS-based watermark only considers the influence of attacking noise on watermark decoding and ignores the interference to watermark decoding introduced by the main audio signal per se, which increases the bit error rate. Moreover, in order to lower the auditory distortion that the watermark causes to the main audio signal, the watermark embedding party usually performs spectrum processing on the reference signal in use, such that a digital watermark decoder performing blind detection can hardly recover accurately the reference signal used by the watermark embedding party. In other words, a reference signal mismatch exists between the watermark embedding party and the decoder party, which degrades decoding performance.
Therefore, a more accurate and robust digital audio watermarking technology is needed in this field.
In order to solve the above problems and other problems in the field, the present invention provides a system and method for digital watermarking.
In one aspect of the present invention, there is provided a system for digital watermarking, the system being adapted to add a watermark to an audio signal generated by a signal source. The system comprises: a spectrum modulator configured to perform spectrum modulation to a watermark bit and a pseudo noise signal to be embedded into the audio signal to generate a modulated signal; a distortion controller coupled to the signal source and the spectrum modulator and configured to shape the modulated signal based on the audio signal, so as to generate a shaped signal satisfying a predetermined distortion constraint; and an interference compensator coupled to the signal source and the distortion controller and configured to generate a compensation signal based on the audio signal, the pseudo noise signal, and the shaped signal, wherein the compensation signal is for compensating for interference to watermark decoding caused by the audio signal.
In another aspect of the present invention, there is provided a method for digital watermarking, the method being adapted to add a watermark to an audio signal generated by a signal source. The method comprises: performing spectrum modulation to a watermark bit and a pseudo noise signal to be embedded into the audio signal to generate a modulated signal; shaping the modulated signal based on the audio signal, so as to generate a shaped signal satisfying a predetermined distortion constraint; and generating a compensation signal based on the audio signal, the pseudo noise signal, and the shaped signal, wherein the compensation signal is for compensating for interference to watermark decoding caused by the audio signal.
The above and other objectives, features and advantages of the present invention will become more comprehensible through reading the following detailed description with reference to the accompanying drawings. In the accompanying drawings, several embodiments of the present invention are illustrated in an exemplary and non-limiting manner, wherein:
In respective figures, same or corresponding reference numerals represent the same or corresponding parts.
In general, according to the embodiments of the present invention, in order to lower as far as possible the interference to watermark decoding caused by the main audio signal acting as the signal carrier, a compensation signal is generated at the watermark embedding party to compensate for such interference, which may effectively reduce the bit error rate at the watermark decoding party. Moreover, in order to overcome the adverse impact on watermark decoding of the distortion control processing that the watermark embedding party applies to a reference signal, in the embodiments of the present invention, generation of the above compensation signal takes into account not only the main audio signal and the original pseudo noise signal, but also the modulated and shaped pseudo noise signal. In this way, it can be assured that the pseudo noise signal recovered at the watermark decoding party matches the one used at the embedding party, thereby further lowering the bit error rate of watermark decoding.
Hereinafter, the principle and spirit of the present invention will be described with reference to several exemplary embodiments illustrated in the drawings. It should be understood that these embodiments are provided only to enable those skilled in the art to better understand and then further implement the present invention, not intended to limit the scope of the present invention in any manner.
Please note that the term “couple” used in the following description describes the connection relationship between two components. For example, “component A is coupled to component B” means that component A is in connection or communication with component B in any appropriate manner. Unidirectional or bidirectional communication of signals or data may take place between two coupled components A and B. As used herein, the term “couple” refers not only to direct coupling (i.e., no further component C exists between component A and component B), but also to indirect coupling (i.e., component A is coupled to a further component C, while the component C is in turn coupled to component B).
Further, in the accompanying drawings, a directional connecting line between components indicates the flow direction of information or a signal between the coupled components and is not intended to limit the coupling manner between the components in any way. Besides, in the description below, a signal may be expressed as a vector, which is common in the art.
Reference is first made to
As shown, the digital watermark system 100 comprises a spectrum modulator 102. The spectrum modulator 102 is configured to perform spectrum modulation on a pseudo noise (PN) signal (denoted as u) and a watermark bit (denoted as b) that is to be embedded into an audio signal (denoted as x), so as to generate a modulated signal (denoted as bu).
According to the embodiments of the present invention, the audio signal x may be generated by any appropriate one or more signal sources (not shown in
The PN signal u may, for example, be a bit sequence of a particular length in which the average value of the bits is zero and the value of each bit is +σn or −σn. The PN signal may be generated by a dedicated PN generator under the control of a key. According to the embodiments of the present invention, the PN generator may be a part of the spectrum modulator 102, or a stand-alone component separate therefrom. The scope of the present invention is not limited in this aspect. For this aspect, an exemplary embodiment will be described with reference to
According to the embodiments of the present invention, the spectrum modulator 102 modulates the PN signal using the watermark bit b. The watermark bit b is a bipolar bit to be embedded into the audio signal x, namely, its value is equal to either +1 or −1. According to the embodiments of the present invention, the watermark bit b may be generated by a component in the system 100 or by another component independent of the system 100. The scope of the present invention is not limited in this aspect.
According to some embodiments of the present invention, the spectrum modulator 102 may realize spread spectrum modulation by multiplying the PN signal u by the watermark bit b, so as to generate a modulated signal bu. Other embodiments are also easily envisaged by those skilled in the art, and the scope of the present invention is not limited in this aspect.
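The key-controlled PN generation and the multiplication described above can be illustrated with a minimal sketch. The function names `generate_pn` and `modulate`, the use of Python's `random` module as the keyed generator, and the balanced-shuffle construction are assumptions made for illustration only; the text does not prescribe a particular generator.

```python
import random

def generate_pn(key, length, sigma=1.0):
    """Return a PN sequence with entries +sigma or -sigma and an exactly zero mean.

    Assumption: the key simply seeds a pseudo-random shuffle. This is a stand-in
    for the dedicated PN generator and is not cryptographically secure.
    """
    if length % 2:
        raise ValueError("length must be even for an exactly zero mean")
    rng = random.Random(key)
    chips = [sigma] * (length // 2) + [-sigma] * (length // 2)
    rng.shuffle(chips)
    return chips

def modulate(b, u):
    """Spread-spectrum modulation: multiply every chip of u by the bipolar bit b."""
    if b not in (1, -1):
        raise ValueError("watermark bit must be +1 or -1")
    return [b * chip for chip in u]

u = generate_pn(key=1234, length=16, sigma=0.5)
bu = modulate(-1, u)
```

Because the same key reproduces the same sequence, a decoder holding only the key can regenerate u, which is what makes blind detection possible.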
The modulated signal bu generated by the spectrum modulator 102 is outputted to a distortion controller 104 in the system 100 to perform distortion control. As shown in
It would be appreciated that after the watermark is added to the original audio signal x, the audio will be audibly distorted to some degree. The distortion controller 104 may keep this distortion within an acceptable extent by shaping the modulated signal bu. Specifically, the distortion controller 104 may modify and adjust the spectrum features of the modulated signal bu based on features of the audio signal x, such that the shaped signal bup satisfies a predetermined constraint on spectrum and other acoustic features. In this way, the distortion of the original audio signal caused by addition of the watermark may be kept within a limit that is imperceptible or acceptable to the user.
Various methods of masking distortion in an audio signal under such a constraint are known in the field. For example, the distortion constraint may be a group of masking thresholds. The masking thresholds may be generated in any appropriate manner, for example, based on statistical empirical values, manual setting, or various acoustic models. As an example, masking based on a psychoacoustic model will be described in detail in the embodiments described with reference to
The shaped signal bup generated by the distortion controller 104 is fed to the interference compensator 106. As shown in
In the traditional SS watermarking technology, only the impact of attack noise on watermark decoding is considered, without addressing the interference caused by the main audio signal per se to watermark decoding. More specifically, in the traditional SS watermarking technology, the resultant signal s is computed as:
s=x+bu
However, in practice, the main audio signal is always far stronger than the attack noise, so the interference to watermark decoding caused by the audio signal per se is usually dominant. In addition, since it is the modulated and shaped PN signal bup that is actually used when adding the watermark, the traditional SS watermarking technology cannot eliminate the impact of spectrum shaping on watermark decoding. In other words, the reference signal recovered at the watermark decoding party does not match the reference signal used by the watermark embedding party.
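The host-signal interference described above can be made concrete with a small sketch. It assumes a correlation-based blind decoder that evaluates <s,u>/<u,u>; the text does not specify the decoder, so `correlate` and the synthetic Gaussian host signal are illustrative assumptions.

```python
import random

def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def embed_traditional(x, u, b):
    """Traditional SS embedding: s = x + b*u."""
    return [xi + b * ui for xi, ui in zip(x, u)]

def correlate(s, u):
    """Normalized correlation <s,u>/<u,u> (assumed decoder statistic)."""
    return inner(s, u) / inner(u, u)

rng = random.Random(7)
n = 64
u = [rng.choice((1.0, -1.0)) for _ in range(n)]
x = [rng.gauss(0.0, 10.0) for _ in range(n)]  # synthetic host, much stronger than u
b = 1

s = embed_traditional(x, u, b)
r = correlate(s, u)          # decoder statistic
host_term = correlate(x, u)  # projection of the host signal onto u
# r equals b + host_term, so a host term below -1 would flip the decoded sign.
```

Even without any attack noise, the statistic is offset from b by the host projection, which is the interference the compensation signal is meant to remove.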
In order to solve the above problem, according to the embodiments of the present invention, the compensation signal generated by the interference compensator 106 eliminates, at the watermark embedding party, the interference to watermark decoding caused by the main audio signal. For example, if the interference to watermark decoding caused by the main audio signal is denoted as x̃ (a scalar, as distinct from the audio signal x), then the compensation signal may be computed as:
y = x̃up
Accordingly, the final signal s may be calculated as follows:
s = x + αbup − y = x + αbup − x̃up = x + (αb − x̃)up
wherein α is a parameter controlling embedding distortion. In this way, the interference x̃ caused by the main audio signal to watermark decoding is compensated at the embedding party. Therefore, the bit error rate at the watermark decoding party can be effectively lowered.
Moreover, it may be seen that when calculating the value of interference x̃ caused by the main audio signal, the interference compensator 106 considers not only the features of the main audio signal x and the PN signal u, but also the main signal component up of the shaped signal bup (namely, the signal component remaining after the watermark bit b is removed from bup). In this way, the impact on watermark decoding of the spectrum shaping executed for the purpose of distortion control may also be effectively compensated. The value of interference x̃ may be calculated in various appropriate manners. Hereinafter, a specific example will be described with reference to
Now, refer to
As shown in
The system 200 may further comprise a spectrum modulator 202, which corresponds to the spectrum modulator 102 in the system 100. As shown in the figure, the spectrum modulator 202 is coupled to the pseudo noise generator 201 and configured to receive a pseudo noise signal u generated by the pseudo noise generator 201. Next, the spectrum modulator 202 modulates the pseudo noise signal u with the to-be-embedded watermark bit b to generate a modulated signal bu. According to the embodiments of the present invention, the spectrum modulator 202 may accomplish the modulation by multiplying the pseudo noise signal u by the watermark bit b. Of course, other manners are also feasible. The scope of the present invention is not limited in this aspect.
The modulated signal bu generated by the spectrum modulator 202 is fed to the distortion controller 204 in the system 200 (which corresponds to the distortion controller 104 in the system 100). In particular, in the embodiment of
The spectrum coefficients resulting from the transform by the analysis filter 2041 are fed to the spectrum adjustor 2042 in the distortion controller 204. The spectrum adjustor 2042 is coupled to the analysis filter 2041 and configured to regulate the spectrum coefficients generated by the analysis filter 2041 based on the predetermined distortion constraint. As mentioned above, the distortion constraint may be derived in various manners. In the example shown in
Specifically, as shown in
The masking threshold generated by the modeler 203 is fed to the spectrum adjustor 2042 as a distortion constraint. Correspondingly, the spectrum adjustor 2042 may regulate the spectrum coefficients to be lower than the masking threshold. Besides, the spectrum adjustor 2042 may also consider various other factors, e.g., the quantization noise of the audio encoder and the like, when adjusting the spectrum coefficients.
The spectrum coefficients regulated by the spectrum adjustor 2042 are fed to a synthesis filter 2043 in the distortion controller 204. The synthesis filter 2043 is coupled to the spectrum adjustor 2042 and configured to transform the regulated spectrum coefficients back to the time domain. For example, when the analysis filter completes the frequency domain transform using FFT, the synthesis filter 2043 may perform the time domain transform using the Inverse Fast Fourier Transform. Of course, other manners are also feasible, and the scope of the present invention is not limited in this aspect. The temporal signal generated by the synthesis filter 2043 is fed as the shaped signal bup from the distortion controller 204 to the interference compensator 206.
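The analysis, adjustment, and synthesis stages described above can be sketched as follows. This is only an illustration under stated assumptions: a naive discrete Fourier transform stands in for the FFT, the adjustment rule (clip each magnitude to its threshold while keeping the phase) is one possible choice, and the thresholds passed in are hypothetical numbers rather than the output of a psychoacoustic model.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform (stand-in for the analysis filter)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(coeffs):
    """Naive inverse transform (stand-in for the synthesis filter), real part only."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def shape(bu, thresholds):
    """Clip each spectral magnitude of bu to its threshold, preserving phase.

    Assumption: thresholds[k] == thresholds[n - k] for k >= 1, so the adjusted
    spectrum stays conjugate-symmetric and the shaped signal is real.
    """
    adjusted = []
    for coeff, limit in zip(dft(bu), thresholds):
        magnitude = abs(coeff)
        adjusted.append(coeff * (limit / magnitude) if magnitude > limit else coeff)
    return idft(adjusted)

bu = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
thresholds = [2.0] * len(bu)  # hypothetical masking thresholds
bup = shape(bu, thresholds)
```

After shaping, no spectral coefficient of bup exceeds its threshold, while coefficients that were already below the threshold pass through unchanged.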
Still with reference to
As an example, according to some embodiments, the interference calculator 2061 may calculate the value of interference x̃ by calculating a signal projection. One feasible manner is to calculate x̃ as follows:
x̃ = <x,u>/<u,u>
where <,> denotes an inner product between two vectors. In this case, the physical meaning of x̃ is the projection of the audio signal x on the PN signal u. However, as mentioned above, in order to keep the distortion of the original audio signal x within an acceptable range, the original PN signal undergoes modulation and shaping. As a result, the watermark decoding party generally cannot accurately recover the reference signal bup used by the watermark embedding party.
Therefore, according to the preferred embodiment of the present invention, the value of interference x̃ to watermark decoding caused by the audio signal may be calculated as follows:
x̃ = <x,u>/<up,u>
In this way, by considering the main signal component of the shaped signal when calculating the value of interference x̃, the problem existing in the prior art can be effectively overcome, which guarantees that the reference signal extracted by the watermark decoding party is consistent with the watermark sequence embedded by the watermark embedding party.
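The effect of the two choices can be checked with a short worked derivation. It assumes, which the text does not state explicitly, that the decoder correlates the received signal with the regenerated PN signal u, and it writes the interference value as x̃, the shaped main signal component as u_p, and the embedded signal as s = x + (αb − x̃)u_p.

```latex
\begin{aligned}
\langle \mathbf{s}, \mathbf{u} \rangle
  &= \langle \mathbf{x}, \mathbf{u} \rangle
     + (\alpha b - \tilde{x}) \langle \mathbf{u}_p, \mathbf{u} \rangle \\[4pt]
\text{with } \tilde{x} = \frac{\langle \mathbf{x}, \mathbf{u} \rangle}{\langle \mathbf{u}_p, \mathbf{u} \rangle}:
\quad
\langle \mathbf{s}, \mathbf{u} \rangle
  &= \langle \mathbf{x}, \mathbf{u} \rangle
     + \alpha b \langle \mathbf{u}_p, \mathbf{u} \rangle
     - \langle \mathbf{x}, \mathbf{u} \rangle
   = \alpha b \langle \mathbf{u}_p, \mathbf{u} \rangle \\[4pt]
\text{with } \tilde{x} = \frac{\langle \mathbf{x}, \mathbf{u} \rangle}{\langle \mathbf{u}, \mathbf{u} \rangle}:
\quad
\langle \mathbf{s}, \mathbf{u} \rangle
  &= \alpha b \langle \mathbf{u}_p, \mathbf{u} \rangle
     + \langle \mathbf{x}, \mathbf{u} \rangle
       \left( 1 - \frac{\langle \mathbf{u}_p, \mathbf{u} \rangle}{\langle \mathbf{u}, \mathbf{u} \rangle} \right)
\end{aligned}
```

Under this assumption, the projection normalized by <up,u> cancels the host term exactly, whereas the projection normalized by <u,u> leaves a residual host term whenever shaping changes the correlation between up and u.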
Return to
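s = x + αbup − x̃up = x + (αb − x̃)up
where x̃ is the value of interference calculated by the interference calculator 2061, and α is a parameter for controlling distortion, for which an appropriate numerical value may be set according to the actual conditions.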
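An end-to-end sketch of this final computation is given below. The `decode` function is a hypothetical sign-of-correlation decoder, and the shaped component `up` is simulated by attenuating each chip of u with a random factor; neither is taken from the text, and real shaping would come from the distortion controller.

```python
import random

def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def embed(x, u, up, b, alpha=1.0):
    """Return s = x + (alpha*b - x_tilde) * up, with x_tilde = <x,u>/<up,u>."""
    x_tilde = inner(x, u) / inner(up, u)
    gain = alpha * b - x_tilde
    return [xi + gain * pi for xi, pi in zip(x, up)]

def decode(s, u):
    """Hypothetical blind decoder: sign of the correlation between s and u."""
    return 1 if inner(s, u) >= 0 else -1

rng = random.Random(42)
n = 128
u = [rng.choice((1.0, -1.0)) for _ in range(n)]
# Stand-in for the shaped component up: each chip of u attenuated independently.
up = [ui * rng.uniform(0.2, 1.0) for ui in u]
x = [rng.gauss(0.0, 20.0) for _ in range(n)]  # synthetic host, far stronger than u

decoded = [decode(embed(x, u, up, b), u) for b in (1, -1)]
```

In this sketch the correlation at the decoder reduces to α·b·<up,u>, so both bit values are recovered even though the host signal is much stronger than the watermark.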
Examples of a system for digital watermarking according to some embodiments of the present invention have been described above with reference to
It should be understood that the particular details and algorithms as described above are all exemplary, and based on the teaching and inspiration provided here, those skilled in the art would easily conceive of an alternative solution to realize the above idea. These alternative solutions all fall within the scope of the present invention.
The systems described above with reference to
As shown in the figure, after the method 300 starts, in step S301, spectrum modulation is performed to a watermark bit b to be embedded into the audio signal and a pseudo noise signal u to generate a modulated signal bu. According to some embodiments of the present invention, the pseudo noise signal, for example, is generated under the control of a key.
Next, the method 300 proceeds to step S302, in which the modulated signal bu is shaped based on the audio signal x to generate a shaped signal bup satisfying a predetermined distortion constraint. According to some embodiments, generating the shaped signal bup may comprise: transforming the modulated signal bu into spectrum coefficients in a frequency domain; adjusting the spectrum coefficients based on the distortion constraint; and transforming the regulated spectrum coefficients back into the time domain to generate the shaped signal bup. According to some embodiments of the present invention, the distortion constraint may be a masking threshold generated based on a psychoacoustic model specifically for the audio signal x.
The method 300 then proceeds to step S303, in which a compensation signal is generated based on the audio signal x, the pseudo noise signal u, and the shaped signal bup. The generated compensation signal will be used for compensating for the interference to watermark decoding caused by the audio signal x. According to some embodiments of the present invention, generating the compensation signal comprises: calculating a signal projection based on the audio signal x, the pseudo noise signal u, and the shaped signal bup, to determine a value of interference x̃ to watermark decoding caused by the audio signal; and generating the compensation signal based on the value of interference x̃ and the modulated signal component.
It should be understood that the method 300 may be executed by the system 100 and/or 200 as described above. Therefore, all features described above with reference to
The devices and their modules involved in the present invention may be implemented by, for example, a very large scale integrated circuit or gate array, a semiconductor such as a logic chip or a transistor, or hardware circuitry of a programmable hardware device such as a field programmable gate array, a programmable logic device, etc. Alternatively or additionally, the embodiments of the present invention may also be implemented through firmware.
It should be noted that although several modules or sub-modules of the devices have been described in detail, such partitioning is merely exemplary and not mandatory. Actually, according to the embodiments of the present invention, the features and functions of two or more devices as described above may be embodied in one device. Conversely, the features and functions of one device as described above may be further partitioned so as to be embodied in multiple devices.
Besides, although the operations of the method of the present invention have been described in particular ordering in the drawings, it does not require or suggest that these operations have to be performed in that particular ordering, or a desired result can only be reached by performing all illustrated operations. On the contrary, the steps described in the flowchart may be changed in their execution orders. Additionally or alternatively, some steps may be omitted; a plurality of steps may be combined into one step for execution, and/or one step may be decomposed into a plurality of steps for execution.
Although the present invention has been described with reference to several preferred embodiments, it should be understood that the present invention is not limited to the disclosed preferred embodiments. The present invention is intended to cover various modifications and equivalent arrangements within the spirit and scope of the appended claims. The scope of the appended claims is to be accorded the broadest interpretation and therefore includes all such modifications and equivalent structures and functions.
Number | Date | Country | Kind |
---|---|---|---|
2014 1 0333164 | Jul 2014 | CN | national |
Number | Name | Date | Kind |
---|---|---|---|
20070136595 | Baum | Jun 2007 | A1 |
Number | Date | Country |
---|---|---|
WO02103695 | Dec 2002 | WO |
Entry |
---|
ISO/IEC 11172-3, “Information technology—Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s—Part 3: Audio,” International Standard, 1993, 158 pages. |
Malvar et al., “Improved Spread Spectrum: A New Modulation Technique for Robust Watermarking,” IEEE Transactions on Signal Processing 51(4): 898-905, Apr. 2003. |
Swanson et al., “Robust audio watermarking using perceptual masking,” Signal Processing 66:337-355, 1998. |
Number | Date | Country | |
---|---|---|---|
20160012826 A1 | Jan 2016 | US |