The present invention relates to a method and a system for selecting bit positions for storing a digital watermark in digital data, and in particular selecting bit positions corresponding to a predetermined spread function.
Because of the increased interest in protecting digital data from illegal copying, watermarking of data has, in recent years, become increasingly popular. Embedding a watermark into a digital data file involves selecting samples from the file and recording data comprising copyright information in selected bits of those samples. Arrangements can then be made so that anyone who accesses or copies the original data file without authorization risks having the extracted watermark expose the lack of legal ownership of the file.
An official and reliable watermark has to be difficult to find and to remove or override. In addition, the watermark should not substantially affect the quality of the original data file.
Copyright protection by way of watermarking has become especially popular in the music industry, where recently there has been a strong increase in illegal downloads and copying. For the process of watermarking audio files, however, an additional consideration is related to the fact that many such files, together with the embedded watermarks, are processed on mobile phones and other hand-held players. In order to minimize cost and extend battery life, such hand-held devices often have slow processors with limited computational capabilities.
Some of the methods developed for embedding watermarks use spread spectrum techniques in the frequency domain. These methods generally require the original audio data for watermark detection. They are also computationally intensive, because of the complex transformations involved in the data processing. Accordingly, these methods are not suitable for processing watermarked files in hand-held devices.
Other techniques embed watermarks in the time domain. Many of these techniques use the least significant data bits of the respective data samples to store the watermark. One disadvantage of this approach is that the stored watermark can be erased without significantly eroding the audio quality, thus undermining the reliability of the protection.
Accordingly, it is desirable to develop a method for embedding a watermark in digital data that is tamper-resistant and relatively simple, so that the verification of the watermark would not require substantial computational capabilities.
According to a first aspect of the invention, there is provided a method for protecting digital audio data comprising a plurality of samples. The method comprises selecting bit positions for storing a digital watermark in the digital audio data in time domain, by choosing a spread function characterising the plurality of the selected bit positions, wherein the spread function comprises at least one Gaussian curve.
Preferably, the at least one Gaussian curve is defined by spread function parameters including;
According to a second aspect of the invention, there is provided a stand-alone or a networked computer system for selecting bit positions for storing a watermark in a digital audio signal, the signal comprising a plurality of samples in time domain. The computer system comprises computational means for spreading the digital watermark data in time domain by storing the data in bit positions of selected ones of the samples. The computational means are programmed to define a spread function characterising the plurality of the selected bit positions used for storing the watermark. The spread function comprises at least one Gaussian curve defined by spread parameters including;
According to a third aspect of the invention, there is provided a computer program product comprising a computer readable medium with a computer program recorded therein for selecting bit positions for storing a watermark in digital audio data in time domain. The digital data comprises a plurality of samples. The computer program comprises means for spreading the digital watermark data by storing the data in bit positions of selected ones of the samples. The spread function characterising the plurality of the selected bit positions used for storing the watermark comprises at least one Gaussian curve defined by spread parameters including;
In the drawings:
Method, system and computer program products for selecting bit positions for storing a digital watermark are described. In the following detailed description of exemplary embodiments of the invention, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. Other embodiments may be utilized, and logical, mechanical, and other changes may be made without departing from the spirit or scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims.
The Spread Function
An important feature of any algorithm for embedding watermarks in the time domain is the choice of bit positions of the samples in the audio file that are to be used for storing the watermark. If the bit positions that contain the watermark data are not sufficiently spread out, a process analogous to “signal jamming” can be used to erase the watermark. For example, using only the least significant bits in the respective audio file for storing the watermark is attractive, since the audio quality will not be perceptibly affected by the watermark. However, the predictability of such an approach exposes the protected audio signal to an “erasing” attack that, when directed to the least significant bits, will also affect the quality of the audio signal only marginally. Conversely, the use of more significant bit positions of selected samples for the watermark will perceptibly degrade the audio quality of the original file.
The described method relies on choosing a weighted mix of bit positions for storing the watermark. In particular, the utilized bit positions follow a normal distribution and can be represented by a Gaussian, also known as “normal” or “bell”, curve. As a result, out of the samples selected for use by the watermark, the number of samples using bit position i for storing the watermark is exponentially smaller than the number of samples using bit position (i-1). Such a ratio offers a good compromise between reliability and distortion introduced by the watermark.
Since linear Pulse Code Modulation (PCM) encoding, where the quantization levels are evenly spaced, is the most common technique used for sampling music, it is assumed for the present process that the original data file is PCM encoded. However, this is not essential, and the discussed algorithm can be easily adapted for other time domain encoding techniques.
μ—The point in time when the Gaussian curve peaks.
σ—The standard deviation of the Gaussian curve.
N—The interval between audio samples. Only those sample points in the curve at intervals of N are selected to embed the watermark.
ip—The bit position of the samples used to store the watermark near the peak of the curve, i.e. the value of i at time μ.
Parameter μ is chosen to be at one of the points in time when some distortion would be tolerable to the auditory senses (e.g., when there is a loud or jarring piece of audio). However, to minimize the effect of the watermark itself on the quality of the audio sample, μ should be positioned at a point in time, when the PCM signal strength is smaller than a selected level 4, chosen to correspond to a signal strength S that is below the local peaks 5 and 6, as shown in
The sample interval N is chosen depending on the length of the desired watermark that is to be embedded and the frequency of embedding that is desired. A typical value can be calculated using the formula:
N=(Sampling frequency*bits per sample*x)/y (1)
where in Eq. (1) the parameters x and y refer to every ‘x’ seconds of audio being protected by a watermark of length ‘y’. For example, if it is assumed that every 30 seconds of CD quality audio (44.1 kHz, 16 bits per sample) needs to be protected by a 40-byte watermark, then N=(44100*16*30)/40=529200, i.e., one in every 529200 bytes in the audio needs to be selected for holding the watermark.
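The arithmetic of Eq. (1) can be checked with a few lines of Python. This is only a sketch: the function name `sample_interval` and its argument names are illustrative and do not appear in the description.

```python
def sample_interval(sampling_hz, bits_per_sample, x_seconds, y_length):
    """Eq. (1): N = (sampling frequency * bits per sample * x) / y.

    x_seconds is the span of audio protected by one watermark and
    y_length is the watermark length, in the units used in the text.
    """
    if y_length <= 0:
        raise ValueError("watermark length y must be positive")
    return (sampling_hz * bits_per_sample * x_seconds) / y_length


# CD quality audio, 30 seconds protected by a watermark of length 40.
n = sample_interval(44_100, 16, 30, 40)
print(n)  # 529200.0
```

Halving x or doubling y halves N, so a denser watermark is obtained by shortening the protected span or lengthening the watermark.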
The parameters σ and ip are chosen on the basis of an engineering compromise between the expected probability for watermark removal attack, and the tolerance to distortions introduced by the watermark. The value chosen for ip also depends on the bits per sample ratio, or the granularity of quantization, characterizing the digital audio sample that is to be protected. An audio recording with higher bits per sample ratio can tolerate a higher ip. For a given i (say i=2), the larger the value of σ, the larger will be the number of selected samples using the second least significant bit for storing the watermark.
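The influence of σ can be illustrated with a short Python sketch. The description does not give the exact mapping from the Gaussian curve to a bit position, so the mapping used here is an assumption made only for illustration: the curve is scaled to peak at ip, rounded to the nearest integer, and never allowed below 1, where i=1 denotes the least significant bit. The function names are hypothetical.

```python
import math


def bit_position(t, mu, sigma, ip):
    """Assumed mapping: Gaussian scaled to peak at ip, floored at 1 (the LSB)."""
    g = math.exp(-((t - mu) ** 2) / (2.0 * sigma ** 2))
    return max(1, round(ip * g))


def count_at_or_above(level, mu, sigma, ip, n, total):
    """Count the selected samples (every n-th) whose bit position is >= level."""
    return sum(
        1
        for t in range(0, total, n)
        if bit_position(t, mu, sigma, ip) >= level
    )


# Same peak position and ip, two different standard deviations.
narrow = count_at_or_above(2, mu=1000, sigma=50, ip=4, n=10, total=2000)
wide = count_at_or_above(2, mu=1000, sigma=200, ip=4, n=10, total=2000)
print(narrow, wide)  # the wider curve puts more samples at i >= 2
```

Under this assumed mapping, widening the curve increases the number of samples that carry watermark bits above the least significant bit, which raises both the resistance to erasure and the distortion.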
The effect of different spread curve combinations can be experimentally determined with the help of a visual drag-and-adjust application program interface (API) that allows variations of σ, μ, N and ip.
Multiple Gaussian curves also can be used across the same audio file to better control the above compromise between security and distortion. For instance,
Additional control over the spread function can be obtained by introducing the functionality of “shift intervals”, as indicated in step 204 in
A shift interval is defined by the parameters (t1, t2, v, p), where t1, t2 are the starting and the finishing points in time, v is the value of i during the time interval t1 to t2, and p is the value of N (period) during that interval. The respective bit positions of points in any Gaussian curve that exists between t1 and t2 are overridden with v. The case when v=p=0 results in a blanking interval. Samples falling within such a blanking interval are exempt from carrying watermarking information. This can be used in critical audio regions, where even an occasional degradation is unacceptable.
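The override behaviour of shift intervals can be sketched in Python as follows. The names `ShiftInterval` and `apply_shifts` are illustrative, the interval is assumed to include both end points, and the sketch models only the bit-position override and the blanking case; the change of sampling period p inside an interval is not modelled.

```python
from typing import NamedTuple, Optional, Sequence


class ShiftInterval(NamedTuple):
    t1: int  # start of the interval
    t2: int  # end of the interval (assumed inclusive)
    v: int   # bit position used during the interval
    p: int   # sample period used during the interval


def apply_shifts(t: int, gaussian_bit: int,
                 shifts: Sequence[ShiftInterval]) -> Optional[int]:
    """Return the bit position at time t, or None inside a blanking interval."""
    for s in shifts:
        if s.t1 <= t <= s.t2:
            if s.v == 0 and s.p == 0:
                return None  # blanking interval: sample carries no watermark
            return s.v       # shift interval overrides the Gaussian value
    return gaussian_bit      # outside all intervals the curve applies


shifts = [ShiftInterval(100, 200, 3, 50), ShiftInterval(300, 400, 0, 0)]
print(apply_shifts(150, 1, shifts))  # 3
print(apply_shifts(350, 2, shifts))  # None
print(apply_shifts(50, 2, shifts))   # 2
```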
The Spread Function Key
The entire information associated with parameters of the Gaussian curves and the shift parameters included in the spread function can be summarized in a “Spread Key” recorded in a look-up table.
For example, a Spread Key can be represented as G+S in the equations below where
G={(μ[1], σ[1], N[1], ip[1], t1[1], t2[1]) . . . (μ[k], σ[k], N[k], ip[k], t1[k], t2[k])} (2)
and
S={(t1[1], t2[1], v[1], p[1]) . . . (t1[m], t2[m], v[m], p[m])} (3)
In Eq (2) G is the set of Gaussian curves and in Eq (3) S is the set of shift intervals within the audio data. The simplest spread function includes only one Gaussian curve and no shift intervals. The spread key in this case is given by Eq (4).
Spread Key=(μ, σ, N, ip) (4)
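One possible in-memory representation of a Spread Key is sketched below in Python. The class and field names are hypothetical; only the parameters themselves (μ, σ, N, ip, t1, t2, v, p) come from Eqs. (2) to (4).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GaussianCurve:
    mu: float                   # time at which the curve peaks
    sigma: float                # standard deviation of the curve
    n: int                      # interval between selected samples
    ip: int                     # bit position at the peak
    t1: Optional[float] = None  # start of validity, None if unrestricted
    t2: Optional[float] = None  # end of validity, None if unrestricted


@dataclass(frozen=True)
class ShiftInterval:
    t1: float
    t2: float
    v: int  # bit position during the interval
    p: int  # sample period during the interval


@dataclass(frozen=True)
class SpreadKey:
    gaussians: Tuple[GaussianCurve, ...]   # the set G of Eq. (2)
    shifts: Tuple[ShiftInterval, ...] = () # the set S of Eq. (3)


# Simplest case, Eq. (4): one Gaussian curve and no shift intervals.
simple_key = SpreadKey(gaussians=(GaussianCurve(mu=1000.0, sigma=50.0, n=10, ip=4),))
print(simple_key)
```

Frozen dataclasses are used so that a key can serve as an immutable record in a look-up table; the numeric values are placeholders.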
The additional data necessary for computing the bit position of the selected samples that are to be used for storing watermark information is the normalized area under the Gaussian curve. This data can be found in tabulated form in standard textbooks on statistics and probability. The embedding software stores the tabulated data in memory.
The Watermark Embedding Algorithm
The algorithm to embed the watermark is as follows:
The above pseudo code is illustrated with the functional description in
An encryption algorithm can be used in conjunction with the spread algorithm described previously. One example of such an algorithm is the widely used “RC4” (also known as “ARC4” or “ARCFOUR”) stream cipher algorithm that produces dissimilar cipher text for each instance of the watermark. The resulting cipher stream is embedded at the bit positions of chosen samples, as calculated by the spread algorithm described previously.
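For reference, the RC4 stream cipher can be written in a few lines of Python. This is a plain textbook sketch of the cipher, not code taken from the described system; the key and the watermark text in the usage lines are placeholders.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    if not key:
        raise ValueError("key must not be empty")
    # Key-scheduling algorithm: permute the state array using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation: XOR each data byte with a keystream byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


watermark = b"(c) Example Owner"          # placeholder watermark text
cipher_stream = rc4(b"secret-key", watermark)
assert rc4(b"secret-key", cipher_stream) == watermark
```

Because encryption and decryption are the same operation, the decoder only needs the key from the lookup table to recover the watermark from the extracted bits.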
If a respective Gaussian curve is valid only during a time interval t1 to t2, the steps outlined in the above watermark embedding algorithm are applied only during that interval. The spread parameters can be generated automatically by software (by following heuristics and/or with the help of a random number generator), through user input using a custom drag-and-adjust program interface, or by a combination of both. The encoder uses the bit positions of the samples generated above for storing the encrypted watermark. Encryption can be accomplished by any suitable encryption algorithm. After embedding the watermark, the spread keys, the encryption keys and the watermark itself are stored in a lookup table to be used later for decoding.
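The storing step can be sketched in Python as follows. The sketch assumes that the spread function has already produced a list of (sample index, bit position) pairs, that bit position 1 denotes the least significant bit, and that samples are held as non-negative integers; the function name `embed_bits` is hypothetical.

```python
def embed_bits(samples, positions, bits):
    """Return a copy of samples with each bit written at its chosen position.

    positions is a list of (sample_index, bit_position) pairs, where
    bit_position 1 is the least significant bit (an assumption here).
    """
    out = list(samples)
    for (idx, pos), bit in zip(positions, bits):
        mask = 1 << (pos - 1)
        # Clear the target bit, then set it if the watermark bit is 1.
        out[idx] = (out[idx] & ~mask) | (mask if bit else 0)
    return out


samples = [0b1010, 0b1111, 0b0000]
positions = [(0, 1), (1, 3), (2, 2)]
print(embed_bits(samples, positions, [1, 0, 1]))  # [11, 11, 2]
```

Each watermark bit changes at most one bit of one sample, so the cost of embedding is a handful of integer operations per selected sample.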
A measure of the relative amount of audio degradation introduced into the protected audio file by the watermark during any time interval can be determined with the use of a standardized table defining a normalized area of a Gaussian curve, by finding the area under the curve during those time intervals.
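The normalized area that such a table provides can also be computed directly from the error function. The Python sketch below substitutes `math.erf` for the table lookup; this substitution and the function names are illustrative choices, not part of the described method.

```python
import math


def normal_cdf(t, mu, sigma):
    """Cumulative area under a normalized Gaussian curve up to time t."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))


def normalized_area(t_a, t_b, mu, sigma):
    """Fraction of the total area under the curve between t_a and t_b."""
    return normal_cdf(t_b, mu, sigma) - normal_cdf(t_a, mu, sigma)


# Roughly 68% of the area lies within one standard deviation of the peak.
print(round(normalized_area(950, 1050, mu=1000, sigma=50), 4))  # 0.6827
```

An interval that contains a larger share of the area contains a larger share of the samples using higher bit positions, and so a larger share of the degradation.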
The proposed method for introducing a spread in the number of bits used for embedding a watermark gives good control over watermark positioning with only a limited amount of retrieval information necessary to be stored. To allow verification of the watermark by an authorized party, the decoder of the verifying party needs to access the lookup table including the spread keys, the encryption keys and the watermark. During decoding, the corresponding spread function parameters are obtained from the lookup table and the spread function is reconstructed. Watermark bits from the identified bit positions are continuously extracted and fed into the stream cipher for deciphering, along with the decryption key that is also obtained from the lookup table. The watermark information is extracted from the audio file and compared with the watermark information stored for this audio work in the lookup table, to effect verification of the authenticity of the watermark.
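The extraction and comparison steps can be sketched in Python as follows. The sketch assumes the same conventions as the rest of this description's parameters suggest (bit position 1 as the least significant bit, samples as non-negative integers), omits the deciphering step, and uses hypothetical function names.

```python
def extract_bits(samples, positions):
    """Read one bit from each (sample_index, bit_position) pair; 1 = LSB."""
    return [(samples[idx] >> (pos - 1)) & 1 for idx, pos in positions]


def verify(samples, positions, expected_bits):
    """Compare the extracted bits with the bits held in the lookup table."""
    return extract_bits(samples, positions) == list(expected_bits)


samples = [0b1011, 0b1011, 0b0010]
positions = [(0, 1), (1, 3), (2, 2)]
print(extract_bits(samples, positions))       # [1, 0, 1]
print(verify(samples, positions, [1, 0, 1]))  # True
```

In a full decoder the extracted bits would first pass through the stream cipher with the stored key, and the deciphered result would be compared with the stored watermark.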
The watermark embedding or extraction can be accomplished in a single pass of the audio data through an audio data processing system that decodes the coded data and verifies the watermark. Of course, if the spread function is present only over part of the audio data, only that part of the audio data needs to be processed. The described method for identifying the bits for embedding the watermark allows the watermark encoding or decoding at high speeds using simple mathematical operations. Notably, the method described herein makes it difficult for malicious attacks to successfully erase or replace the watermark.
Computer Platform
The computer 401 typically includes at least one processor unit 405, and a memory unit 406, for example formed from semiconductor random access memory (RAM) and read only memory (ROM). Here, the processor unit 405 is an example of a processing means which can also be realized with other forms of configuration performing similar functionality. The computer 401 also includes a number of input/output (I/O) interfaces including a video interface 407 that couples to the video display 414, an I/O interface 413 for devices such as the keyboard 402 and mouse 403, and an interface 408 for the external modem 416. In some implementations, the modem 416 may be incorporated within the computer 401, for example within the interface 408. The computer 401 may also have a local network interface 411 which, via a connection 423, permits coupling of the computer 401 to a local computer network 422, known as a Local Area Network (LAN). As also illustrated, the local network 422 may also couple to the wide network 420 via a connection 424, which would typically include a so-called “firewall” device or similar functionality. The interface 411 may be formed by an Ethernet™ circuit card, a wireless Bluetooth™, an IEEE 802.11 wireless arrangement or a combination thereof.
Storage devices 409 are provided and typically include a hard disk drive (HDD) 410. It should be apparent to a person skilled in the art that other devices such as a floppy disk drive, an optical disk drive and a magnetic tape drive (not illustrated) may also be used. The components 405 to 413 of the computer 401 typically communicate via an interconnected bus 404 and in a manner which results in a conventional mode of operation of the computer 401.
Typically, the programming modules that incorporate the method for choosing the bit positions for watermarking are resident on the storage device 409 and read and controlled in execution by the processor 405. Storage of intermediate product from the execution of such programs may be accomplished using the semiconductor memory 406, possibly in concert with the storage device 409. In some instances, the application programs may be supplied to the user encoded on one or more CD-ROM or other forms of computer readable media and read via the corresponding drive, or alternatively may be read by the user from the networks 420 or 422.
If verification is required on the handheld device 425, it can either utilize its own storage and processing means, similar to those described in relation to computer 401, or make use of a wireless network connection to a computer system, such as 401, on which all watermark related processing can be carried out remotely.
While the invention has been hereby described by using an example that is believed to represent the most practical and preferred embodiment, it would be clear to a skilled addressee that other embodiments and variations will also be within the scope of the main concept of the invention. For example, the method for selecting the bit positions of the samples used for storing a watermark has been described here in the context of an audio file that is linearly PCM encoded. However, the method is applicable for other encoding techniques, as well as to video and other types of digital data in the time domain.
The discussed method for selecting the bit positions of the samples used for storing a watermark allows spreading the watermark in the time domain using Gaussian curves. This offers a compact spread information representation. The data processing involved is relatively simple with modest demands on the computational power of the processing device. This facilitates identifying the location of a watermark in real-time even in low-MIPS (Millions of Instructions per Second) devices, such as mobile phones and other handheld media players.
Other advantages of the discussed method include the following;