This invention generally relates to a technology for deriving robust non-local characteristics and quantizing such characteristics for blind watermarking of a digital good.
“Digital goods” is a generic label for electronically stored or transmitted content. Examples of digital goods include images, audio clips, video, digital film, multimedia, software, and data. Digital goods may also be called a “digital signal,” “content signal,” “digital bitstream,” “media signal,” “digital object,” “object,” and the like.
Digital goods are often distributed to consumers over private and public networks—such as Intranets and the Internet. In addition, these goods are distributed to consumers via fixed computer readable media, such as a compact disc (CD-ROM), digital versatile disc (DVD), soft magnetic diskette, or hard magnetic disk (e.g., a preloaded hard drive).
Unfortunately, it is relatively easy for a person to pirate the pristine digital content of a digital good at the expense and harm of the content owners (including the content author, publisher, developer, distributor, etc.). The content-based industries (e.g., entertainment, music, film, etc.) that produce and distribute content are plagued by lost revenues due to digital piracy.
Modern digital pirates effectively rob content owners of their lawful compensation. Unless technology provides a mechanism to protect the rights of content owners, the creative community and culture will be impoverished.
Watermarking
Watermarking is one of the most promising techniques for protecting the content owner's rights in a digital good. Generally, watermarking is a process of altering the digital good such that its perceptual characteristics are preserved. More specifically, a “watermark” is a pattern of bits inserted into a digital good that may be used to identify the content owners and/or the protected rights.
Watermarks are designed to be completely invisible or, more precisely, to be imperceptible to humans and statistical analysis tools. Ideally, a watermarked signal is perceptually identical to the original signal.
A watermark embedder (i.e., encoder) embeds a watermark into a digital good. It typically uses a secret key to embed the watermark. A watermark detector (i.e., decoder) extracts the watermark from the watermarked digital good.
Blind Watermarking
To detect the watermark, some watermarking techniques require access to the original unmarked digital good or to a pristine specimen of the marked digital good. Of course, these techniques are not desirable when the watermark detector is available publicly. If publicly available, then a malicious attacker may get access to the original unmarked digital good or to a pristine specimen of the marked digital good. Consequently, these types of techniques are not used for public detectors.
Alternatively, some watermarking techniques are “blind.” This means that they do not require access to the original unmarked digital good or to a pristine specimen of the marked digital good. Of course, these “blind” watermarking techniques are desirable when the watermark detector is publicly available.
Robustness
Before detection, a watermarked signal may undergo many possible changes by users and by the distribution environment. These changes may include unintentional modifications, such as noise and distortions. Moreover, the marked signal is often the subject of malicious attacks particularly aimed at disabling the detection of the watermark.
Ideally, a watermarking technique should embed detectable watermarks that resist modifications and attacks as long as they result in signals that are of perceptually the same quality. A watermarking technique that is resistant to modifications and attacks may be called “robust.” Aspects of such techniques are called “robust” if they encourage such resistance.
Generally speaking, a watermarking system should be robust enough to handle unintentional noise introduction into the signal (such noise may be introduced by A/D and D/A conversions, compressions/decompressions, data corruption during transmission, etc.).
Furthermore, a watermarking system should be robust enough and stealthy enough to avoid purposeful and malicious detection, alteration, and/or deletion of the watermark. Such an attack may use a “shotgun” approach where no specific watermark is known or detected (but is assumed to exist) or may use a “sharp-shooter” approach where the specific watermark is attacked.
This robustness problem has attracted considerable attention. In general, the existing robust watermark techniques fall into two categories: spread-spectrum and quantization index modulation (QIM).
With the spread-spectrum-type techniques, the watermark indexes the modification to the host data. The host data is the data of the original, unmarked digital signal (i.e., host signal). With typical spread-spectrum watermarking, each bit (e.g., 0s and 1s) of the watermark is embedded into the signal by slightly changing the signal (e.g., adding a pseudorandom sequence that consists of +Δ or −Δ).
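The following is a minimal sketch of the spread-spectrum idea described above, not a reproduction of any particular prior-art system. The function names, the integer key, and the correlation detector are assumptions made for illustration; the only element taken from the text is that a pseudorandom sequence of +Δ or −Δ is added to the host signal.

```python
import random

def spread_spectrum_embed(host, bit, key, delta=1.0):
    """Add a key-seeded +/-delta sequence to the host; its sign carries the bit."""
    rng = random.Random(key)
    chips = [rng.choice((-delta, delta)) for _ in host]
    sign = 1.0 if bit else -1.0
    return [x + sign * c for x, c in zip(host, chips)]

def spread_spectrum_detect(signal, key, delta=1.0):
    """Correlate with the same key-seeded sequence; a positive sum decodes as 1."""
    rng = random.Random(key)
    chips = [rng.choice((-delta, delta)) for _ in signal]
    corr = sum(s * c for s, c in zip(signal, chips))
    return 1 if corr > 0 else 0
```

In this sketch the detector needs only the key, not the host signal, so it is blind; the host itself acts as interference in the correlation sum.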
With quantization index modulation (QIM), the watermark is embedded via indexing the modified host data. The modified host data is the data of the marked digital signal (i.e., marked host signal). This is discussed in more detail below.
Those of ordinary skill in the art are familiar with conventional techniques and technology associated with watermarks, watermark embedding, and watermark detecting. In addition, those of ordinary skill in the art are familiar with the typical problems associated with proper watermark detection after a marked signal has undergone changes (e.g., unintentional noise and malicious attacks).
Desiderata of Watermarking Technology
Watermarking technology has several highly desirable goals (i.e., desiderata) to facilitate protection of copyrights of content publishers. Below are listed several of such goals.
Perceptual Invisibility. The embedded information should not induce perceptual changes in the signal quality of the resulting watermarked signal. The test of perceptual invisibility is often called the “golden eyes and ears” test.
Statistical Invisibility. The embedded information should be quantitatively imperceptible to any exhaustive, heuristic, or probabilistic attempt to detect or remove the watermark. The complexity of successfully launching such attacks should be well beyond the computation power of publicly available computer systems. Herein, statistical invisibility is expressly included within perceptual invisibility.
Tamperproofness. An attempt to remove the watermark should damage the value of the digital good well above the hearing threshold.
Cost. The system should be inexpensive to license and implement on both programmable and application-specific platforms.
Non-disclosure of the Original. The watermarking and detection protocols should be such that the process of proving digital good content copyright both in-situ and in-court, does not involve usage of the original recording.
Enforceability and Flexibility. The watermarking technique should provide strong and undeniable copyright proof. Similarly, it should enable a spectrum of protection levels, which correspond to variable digital good presentation and compression standards.
Resilience to Common Attacks. Public availability of powerful digital good editing tools requires that the watermarking and detection process be resilient to attacks spawned from such consoles.
False Alarms & Misses
When developing a watermarking technique, one does not want to increase the probability of a false alarm. That is when a watermark is detected, but none exists. This is something like finding evidence of a crime that did not happen. Someone may be falsely accused of wrongdoing.
As the probability of false alarms increases, the confidence in the watermarking technique decreases. For example, people often ignore car alarms because they know that more often than not it is a false alarm rather than an actual car theft.
Likewise, one does not want to increase the probability of a miss. This is when the watermark of a signal is not properly detected. This is something like overlooking a key piece of evidence at a crime scene. Because of this, a wrongdoing may never be properly investigated. As the probability of misses increases, the confidence in the watermarking technique decreases.
Ideally, the probabilities of a false alarm and a miss are zero. In reality, a compromise is often made between them. Typically, a decrease in the probability of one increases the probability of the other. For example, as the probability of false alarm is decreased, the probability of a miss increases.
Consequently, a watermarking technique is needed that minimizes both while finding a proper balance between them.
Quantization Index Modulation (QIM)
To that end, some have proposed embedding a watermark by indexing the signal (e.g., host data) during the watermark embedding. This technique is called quantization index modulation (QIM) and it was briefly introduced above.
In general, quantization means to limit the possible values of a magnitude or quantity to a discrete set of values. Quantization may be thought of as a conversion from non-discrete (e.g., analog or continuous) values to discrete values. Alternatively, it may be a conversion between discrete values with differing scales. Quantization may be accomplished mathematically through rounding or truncation. Typical QIM refers to embedding information by first modulating an index or sequence of indices with the embedded information and then quantizing the host signal with the associated quantizer or sequence of quantizers. A quantizer is a class of discontinuous, approximate-identity functions.
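As an illustration of conventional, single-coefficient QIM, the sketch below uses two uniform scalar quantizers whose reconstruction points are offset from each other by half a step, with the watermark bit selecting the quantizer. The step size, the half-step offset, and the function names are assumptions chosen for illustration only.

```python
def qim_embed(value, bit, step=4.0):
    """Quantize `value` with the quantizer selected by `bit`.

    Quantizer 0 uses the points {k * step}; quantizer 1 uses the same
    points shifted by step / 2.
    """
    offset = 0.0 if bit == 0 else step / 2.0
    return round((value - offset) / step) * step + offset

def qim_decode(value, step=4.0):
    """Return the bit whose quantizer has the reconstruction point nearest `value`."""
    d0 = abs(value - qim_embed(value, 0, step))
    d1 = abs(value - qim_embed(value, 1, step))
    return 0 if d0 <= d1 else 1
```

With a step of 4.0, a marked value survives a perturbation of less than 1.0 (one quarter of the step) before the decoder selects the wrong quantizer.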
The major proponents of such QIM techniques are Brian Chen and Gregory W. Wornell (i.e., Chen-Wornell). In their words, they have proposed “dither modulation in which the embedded information modulates a dither signal and the host signal is quantized with an associated dither quantizer” (from Abstract of Chen-Wornell article from the IEEE Trans. Inform. Theory).
See the following documents for more details on Chen-Wornell's proposals and on QIM:
However, a key problem with conventional QIM is that it is susceptible to attacks and distortions. Conventional QIM relies upon local characteristics relative to a specific representation of a signal (e.g., in the time or frequency domain). To quantize, conventional QIM relies exclusively upon the values of “individual coefficients” of the representation of the signal. An example of such an “individual coefficient” is the color of an individual pixel of an image.
When quantizing, only the local characteristics of an “individual coefficient” are considered. These local characteristics may include value (e.g., color, amplitude) and relative positioning (e.g., positioning in time and/or frequency domains) of an individual bit (e.g., pixel).
Modifications, whether from an attack or from some type of unintentional noise, can change local characteristics of a signal quite dramatically. For example, these modifications may have a dramatic effect on the color of a pixel or the amplitude of a bit of sound. However, such modifications have little effect on non-local characteristics of a signal.
Accordingly, a new and robust watermarking technique is needed that, like QIM watermarking techniques, finds the proper balance between minimizing the probability of false alarms and the probability of misses. However, such a technique should be less susceptible than conventional QIM to attacks and distortions that affect the local characteristics of a signal.
Described herein is a technology for deriving robust non-local characteristics and quantizing such characteristics for blind watermarking of a digital good.
This technology finds the proper balance between minimizing the probability of false alarms (i.e., detecting a non-existent watermark) and the probability of misses (i.e., failing to detect an existing watermark). One possible technique is quantization index modulation (QIM) watermarking. However, conventional QIM is susceptible to attacks and distortions to the local characteristics of a digital good.
The technology, described herein, performs QIM based upon non-local characteristics of the digital good. Non-local characteristics may include statistics (e.g., averages, median) of a group of individual parts (e.g., pixels) of a digital good.
This summary itself is not intended to limit the scope of this patent. Moreover, the title of this patent is not intended to limit the scope of this patent. For a better understanding of the present invention, please see the following detailed description and appended claims, taken in conjunction with the accompanying drawings. The scope of the present invention is pointed out in the appended claims.
The same numbers are used throughout the drawings to reference like elements and features.
In the following description, for purposes of explanation, specific numbers, materials and configurations are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that the present invention may be practiced without the specific exemplary details. In other instances, well-known features are omitted or simplified to clarify the description of the exemplary implementations of the present invention, and thereby better explain the present invention. Furthermore, for ease of understanding, certain method steps are delineated as separate steps; however, these separately delineated steps should not be construed as necessarily order dependent in their performance.
The following description sets forth one or more exemplary implementations of a Derivation and Quantization of Robust Non-Local Characteristics for Blind Watermarking that incorporate elements recited in the appended claims. These implementations are described with specificity in order to meet statutory written description, enablement, and best-mode requirements. However, the description itself is not intended to limit the scope of this patent.
The inventors intend these exemplary implementations to be examples. The inventors do not intend these exemplary implementations to limit the scope of the claimed present invention. Rather, the inventors have contemplated that the claimed present invention might also be embodied and implemented in other ways, in conjunction with other present or future technologies.
An example of an embodiment of a Derivation and Quantization of Robust Non-Local Characteristics for Blind Watermarking may be referred to as “exemplary non-local QIM watermarker.”
Incorporation by Reference
The following co-pending patent applications are incorporated by reference herein (which are all assigned to the Microsoft Corporation):
The one or more exemplary implementations, described herein, of the present claimed invention may be implemented (in whole or in part) by a non-local QIM watermarking architecture 100 and/or by a computing environment like that shown in
In general, the exemplary non-local QIM watermarker derives robust non-local characteristics of a digital good. It quantizes such characteristics for blind watermarking of the digital good.
The exemplary non-local QIM watermarker minimizes the probability of false alarms (i.e., detecting a non-existent watermark) and the probability of misses (i.e., failing to detect an existing watermark). It does so by employing quantization index modulation (QIM) watermarking. However, it does not employ conventional QIM techniques because they are susceptible to attacks and distortions to the local characteristics of a digital good.
Local Characteristics
Conventional QIM relies upon local characteristics within a signal (i.e., a digital good). To quantize, conventional QIM relies exclusively upon the values of “individual elements” of the host signal. When quantizing, only the local characteristics of an “individual element” are considered. These local characteristics may include value (e.g., color, amplitude) and relative positioning (e.g., positioning in time and/or frequency domains) of an individual bit (e.g., pixel).
Modifications, whether from an attack or from unintentional noise, can change local characteristics of a signal quite dramatically. For example, these modifications may have a dramatic effect on the color of a pixel or the amplitude of a bit of sound. However, such modifications have little effect on non-local characteristics of a signal.
Non-Local Characteristics
Non-local characteristics are representative of general characteristics of a group or collection of individual elements. Such a group may be called a segment. Non-local characteristics are not representative of the individual local characteristics of the individual elements; rather, they are representative of the group (e.g., segments) as a whole.
The non-local characteristics may be determined by a mathematical or statistical representation of a group. For example, such a characteristic may be an average of the color values of all pixels in a group. Consequently, such non-local characteristics may also be called “statistical characteristics.” Local characteristics do not have statistical characteristics because each is a fixed value for a given category. Thus, no statistics are derived from a single value.
The non-local characteristics are not local characteristics. They are not global characteristics. Rather, they are in between. Consequently, they may also be called “semi-global” characteristics.
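The difference in robustness between a local characteristic and a non-local one can be illustrated with a small numerical sketch. The segment size, the pixel range, and the zero-mean uniform perturbation below are assumptions chosen only for illustration; they are not drawn from the described implementations.

```python
import random

def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

rng = random.Random(7)
segment = [rng.uniform(0, 255) for _ in range(400)]   # e.g., a 20x20 group of pixels
noisy = [p + rng.uniform(-10, 10) for p in segment]   # zero-mean perturbation

# Local view: the largest change suffered by any single pixel.
max_pixel_change = max(abs(a - b) for a, b in zip(segment, noisy))
# Non-local view: the change in the average over the whole group.
mean_change = abs(mean(noisy) - mean(segment))
```

Individual pixels may move by nearly the full perturbation amplitude, while the group average moves far less because the perturbations largely cancel.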
Brief Overview
Given an original, unmarked good, the exemplary non-local QIM watermarker derives robust characteristics that are not local in nature. For example, the exemplary non-local QIM watermarker may employ randomized non-invertible transforms to produce robust non-local characteristics that can be modified without perceptual distortion. These characteristics are typically represented statistically and/or mathematically.
To embed the watermark, the exemplary non-local QIM watermarker performs quantization of these non-local characteristics in one or more dimensional lattices or vector spaces. The marked good that results from the exemplary non-local QIM watermarker is robust against unintentional and intentional modifications (e.g., malicious attacks). Examples of malicious attacks include de-synching, random bending, and many other benchmark attacks (e.g., Stirmark attacks).
Exemplary Non-Local QIM Watermarking Architecture
The watermark embedding system 132 applies the watermark to a digital signal from the content storage 130. Typically, the watermark identifies the content producer 122, providing a signature that is embedded in the signal and cannot be cleanly removed.
The content producer/provider 122 has a distribution server 134 that distributes the watermarked content over the network 124 (e.g., the Internet). A signal with a watermark embedded therein represents to a recipient that the signal is being distributed in accordance with the copyright authority of the content producer/provider 122. The server 134 may further compress and/or encrypt the content using conventional compression and encryption techniques prior to distributing the content over the network 124.
Typically, the client 126 is equipped with a processor 140, a memory 142, and one or more content output devices 144 (e.g., display, sound card, speakers, etc.). The processor 140 runs various tools to process the marked signal, such as tools to decompress the signal, decrypt the data, filter the content, and/or apply signal controls (tone, volume, etc.). The memory 142 stores an operating system 150 (such as a Microsoft® Windows 2000® operating system), which executes on the processor. The client 126 may be embodied in many different ways, including a computer, a handheld entertainment device, a set-top box, a television, an appliance, and so forth.
The operating system 150 implements a client-side watermark detecting system 152 to detect watermarks in the digital signal and a content loader 154 (e.g., multimedia player, audio player) to facilitate the use of content through the content output device(s) 144. If the watermark is present, the client can identify its copyright and other associated information.
The operating system 150 and/or processor 140 may be configured to enforce certain rules imposed by the content producer/provider (or copyright owner). For instance, the operating system and/or processor may be configured to reject fake or copied content that does not possess a valid watermark. In another example, the system could load unverified content with a reduced level of fidelity.
Exemplary Non-Local QIM Watermark Embedding System
The watermark embedding system 200 includes an amplitude normalizer 210, a transformer 220, a partitioner 230, a segment-statistics calculator 240, a segment quantizer 250, a delta-sequence finder 260, and a signal marker 270.
The amplitude normalizer 210 obtains a digital signal 205 (such as an audio clip). It may obtain the signal from nearly any source, such as a storage device or over a network communications link. As its name implies, it normalizes the amplitude of the signal.
The transformer 220 receives the amplitude-normalized signal from the normalizer 210. The transformer 220 puts the signal in canonical form using a set of transformations. Specifically, discrete wavelet transformation (DWT) may be employed (particularly, when the input is an image) since it compactly captures significant signal characteristics via time and frequency localization. Other transformations may be used. For instance, shift-invariant and shape-preserving “complex wavelets” and any overcomplete wavelet representation or wavelet packet are good candidates (particularly for images).
The transformer 220 also finds the DC subband of the initial transformation of the signal. This DC subband of the transformed signal is passed to the partitioner 230.
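As a rough sketch of obtaining a low-frequency (DC) subband, the code below repeatedly applies the approximation step of a Haar wavelet decomposition, computed here as 2x2 block averages. The choice of the Haar wavelet, the number of levels, and the function names are assumptions for illustration; the description above does not fix a particular wavelet.

```python
def haar_lowpass_2d(image):
    """One level of the 2-D Haar approximation band, as 2x2 block averages.

    The orthonormal Haar approximation coefficients equal these averages
    multiplied by a constant factor of 2. Odd trailing rows or columns
    are dropped.
    """
    rows = len(image) // 2 * 2
    cols = len(image[0]) // 2 * 2
    return [[(image[r][c] + image[r][c + 1] +
              image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, cols, 2)]
            for r in range(0, rows, 2)]

def dc_subband(image, levels=2):
    """Apply the low-pass step `levels` times and return the coarsest band."""
    band = image
    for _ in range(levels):
        band = haar_lowpass_2d(band)
    return band
```

For example, a 4x4 image reduced over two levels yields a single value equal to the mean of all sixteen pixels.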
The partitioner 230 separates the transformed signal into multiple, pseudorandomly sized, pseudorandomly positioned, adjacent, non-contiguous segments (i.e., partitions). A secret key K is the seed for pseudorandom number generation here. This same K may be used to reconstruct the segments by an exemplary non-local QIM watermark detecting system 400.
For example, if the signal is an image, it might be partitioned into two-dimensional polygons (e.g., rectangles) of pseudorandom size and location. In another example, if the signal is an audio clip, a two-dimensional representation (using frequency and time) of the audio clip might be separated into two-dimensional polygons (e.g., triangles) of pseudorandom size and location.
In this implementation, the segments do not overlap. They are adjacent and non-contiguous. In alternative implementations, the segments may be overlapping.
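One simple way to obtain non-overlapping rectangles of pseudorandom size and position from a secret key is sketched below. The grid-of-cuts construction, the minimum segment size, and the function names are assumptions for illustration; the description above permits other polygon shapes and layouts.

```python
import random

def random_cuts(length, min_size, rng):
    """Pseudorandom cut positions from 0 to `length`, each gap at least `min_size`."""
    cuts = [0]
    while length - cuts[-1] >= 2 * min_size:
        remaining = length - cuts[-1]
        # Keep at least `min_size` for whatever follows this cut.
        step = rng.randint(min_size, min(2 * min_size - 1, remaining - min_size))
        cuts.append(cuts[-1] + step)
    cuts.append(length)
    return cuts

def partition(height, width, key, min_size=8):
    """Return rectangles (r0, r1, c0, c1) that tile a height x width band.

    The secret key seeds the generator, so the same key always
    reproduces the same segments.
    """
    rng = random.Random(key)
    row_cuts = random_cuts(height, min_size, rng)
    col_cuts = random_cuts(width, min_size, rng)
    return [(r0, r1, c0, c1)
            for r0, r1 in zip(row_cuts, row_cuts[1:])
            for c0, c1 in zip(col_cuts, col_cuts[1:])]
```

Because the generator is seeded only by the key, a detector holding the same key can regenerate the identical segments without access to the original signal.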
For each segment generated by the partitioner 230, the segment-statistics calculator 240 calculates statistics. These statistics may be, for example, any finite-order moments of a segment. Examples of such statistics include the mean, the median, and the standard deviation.
Generally, the statistics calculations for each segment are independent of the calculations of other segments. However, other alternatives may involve calculations dependent on data from multiple segments.
A suitable statistic for such calculation is the mean (e.g., average) of the values of the individual bits of the segment. Other suitable statistics and their robustness are discussed in Venkatesan, Koon, Jakubowski, and Moulin, “Robust image hashing,” Proc. IEEE ICIP 2000, Vancouver, Canada, September 2000. In this document, no information embedding was considered, but similar statistics were discussed.
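A per-segment statistics calculation might look like the following sketch. The rectangle representation (r0, r1, c0, c1), the nested-list band, and the function names are assumptions for illustration; the statistics computed are the three named above.

```python
import statistics

def segment_values(band, rect):
    """Collect the values of `band` inside the rectangle (r0, r1, c0, c1)."""
    r0, r1, c0, c1 = rect
    return [band[r][c] for r in range(r0, r1) for c in range(c0, c1)]

def segment_statistics(band, rect):
    """Mean, median, and population standard deviation of one segment."""
    values = segment_values(band, rect)
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "stdev": statistics.pstdev(values),
    }
```

Each call depends only on the values inside its own rectangle, matching the case in which the calculations for different segments are independent.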
For each segment, the segment quantizer 250 applies a multi-level (e.g., 2, 3, 4) quantization (i.e., high-dimensional, vector-dimensional, or lattice-dimensional quantization) on the output of the segment-statistics calculator 240 to obtain quantized data. Of course, other levels of quantization may be employed. The quantizer 250 may be adaptive or non-adaptive.
This quantization may be done randomly also. This may be called randomized quantization (or randomized rounding). This means that the quantizer may randomly decide to round up or round down. It may do it pseudorandomly (using the secret key). This adds an additional degree of robustness and helps hide the watermark.
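Randomized rounding of a segment statistic can be sketched as follows. The two-level scalar lattice, the half-step offset between the lattices for bits 0 and 1, and the rule that the probability of rounding up grows with proximity to the upper point are all assumptions made for illustration; the text above states only that the quantizer may pseudorandomly round up or down.

```python
import math
import random

def randomized_quantize(stat, bit, step, rng):
    """Quantize `stat` onto the lattice for `bit`, rounding up or down at random."""
    offset = 0.0 if bit == 0 else step / 2.0
    lower = math.floor((stat - offset) / step) * step + offset
    upper = lower + step
    # Assumed rule: the closer `stat` is to `upper`, the likelier it rounds up.
    p_up = (stat - lower) / step
    return upper if rng.random() < p_up else lower
```

Seeding `rng` from the secret key (for example, `random.Random(key)`) makes the rounding decisions pseudorandom and repeatable for the key holder. Either outcome lies on the lattice selected by the bit, so a nearest-point decoder still recovers the bit.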
The delta-sequence finder 260 finds a pseudorandom sequence Z that estimates the difference (i.e., delta) between the original transformed signal X and the combination of segments of quantized statistics. The pseudorandom sequence Z may also be called the delta-sequence. For example, if the statistic is averaging, then Z is chosen such that Avg_i(X + Z) = Avg_i(X̂) = μ̂_i, where X̂ is the marked signal and μ̂_i is the average for segment i.
When finding a delta-sequence Z, it is desirable to minimize the perceptual distortion; therefore, some perceptual distortion metrics may be employed to create this sequence. Thus, in creating Z, the criteria may include a combination that minimizes the visual distortions on X+Z (compared to X̂) and minimizes the distance between the statistics of X+Z and the quantized statistics of X.
The signal marker 270 marks the signal with delta-sequence Z so that X̂ = X + Z. The signal marker may mark the signal using QIM techniques. This marked signal may be publicly distributed to consumers and clients.
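For the averaging statistic, one possible construction of a delta-sequence is sketched below: pseudorandom positive weights are scaled so that the mean of the segment plus Z lands on the target (quantized) mean. The weight range and the function name are assumptions for illustration, and the sketch ignores the perceptual distortion criteria mentioned above.

```python
import random

def delta_sequence(segment, target_mean, rng):
    """Pseudorandom Z such that mean(segment + Z) equals target_mean.

    Equality holds up to floating-point rounding.
    """
    n = len(segment)
    gap = target_mean - sum(segment) / n
    weights = [rng.uniform(0.5, 1.5) for _ in range(n)]
    scale = gap * n / sum(weights)   # makes the mean of Z equal to `gap`
    return [w * scale for w in weights]
```

A usage example under the same assumptions: with `segment = [10.0, 12.0, 9.0, 13.0]` and a target mean of 12.0, adding the returned Z element-wise produces a marked segment whose average is 12.0.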
The functions of aforementioned components of the exemplary non-local QIM watermark embedding system 200 of
Methodological Implementation of the Exemplary Non-Local QIM Watermark Embedding
At 310 of
At 314, the exemplary non-local QIM watermarker partitions the transformed signal into multiple, pseudorandomly sized, pseudorandomly positioned, adjacent, non-contiguous segments. A secret key K is the seed for pseudorandom number generation here. This same K may be used to reconstruct the segments in the watermark detecting process.
For example, if the signal is an image, it might be separated into two-dimensional polygons (e.g., rectangles) of pseudorandom size and location. In another example, if the signal is an audio clip, a two-dimensional representation (using frequency and time) of the audio clip might be separated into two-dimensional polygons (e.g., trapezoids) of pseudorandom size and location.
In this implementation, the segments do not overlap. They are adjacent and non-contiguous. In alternative implementations, the segments may be overlapping.
For each segment of the signal, the exemplary non-local QIM watermarker repeats blocks 320 through 330. Thus, each segment is processed in the same manner.
At 322 of
At 324, the exemplary non-local QIM watermarker finds a pseudorandom sequence Z such that the statistic (e.g., the average) of this sequence combined with the signal segment equals the statistic of the watermarked signal segment. This is also equal to the quantized statistic of the signal segment.
At 330, the process loops back to 320 for each unprocessed segment. If all segments have been processed, then it proceeds to 340.
At 340 of
Exemplary Non-Local QIM Watermark Detecting System
The watermark detecting system 400 includes an amplitude normalizer 410, a transformer 420, a partitioner 430, a segment-statistics calculator 440, a segment MAP decoder 450, a watermark-presence determiner 460, a presenter 470, and a display 480.
The amplitude normalizer 410, the transformer 420, the partitioner 430, and the segment-statistics calculator 440 of the watermark detecting system 400 of
For each segment, the segment MAP decoder 450 determines a quantized value using the maximum a posteriori (MAP) decoding scheme. In general, MAP techniques involve finding (and possibly ranking by distance) all objects (in this instance, quantized values) in terms of their distance from a “point.” In this example, the “point” is the statistics calculated for a given segment. Nearest-neighbor decoding is one specific instance of MAP decoding. Those of ordinary skill in the art understand and appreciate MAP and nearest-neighbor techniques.
In addition, the segment MAP decoder 450 determines a confidence factor based upon the distance between the quantized values and the statistics calculated for a given segment. If they are close together, then the factor will indicate a high degree of confidence. If they are distant, then it may indicate a low degree of confidence. Furthermore, the decoder 450 combines the confidence factors of the segments to get an overall confidence factor. That combination may be most any statistical combination (e.g., addition, average, median, standard deviation, etc.).
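Nearest-neighbor decoding with a per-segment confidence factor can be sketched as follows. The two interleaved scalar lattices, the linear confidence scale from 0.0 to 1.0, and the use of the average as the overall combination are assumptions for illustration.

```python
def nearest_point(stat, bit, step):
    """Nearest reconstruction point of the lattice assigned to `bit`."""
    offset = 0.0 if bit == 0 else step / 2.0
    return round((stat - offset) / step) * step + offset

def decode_segment(stat, step=4.0):
    """Return (bit, confidence) for one segment statistic.

    Confidence is 1.0 when `stat` sits exactly on a lattice point and 0.0
    when it is equidistant from the two lattices (a distance of step / 4).
    """
    d0 = abs(stat - nearest_point(stat, 0, step))
    d1 = abs(stat - nearest_point(stat, 1, step))
    bit = 0 if d0 <= d1 else 1
    confidence = 1.0 - min(d0, d1) / (step / 4.0)
    return bit, confidence

def overall_confidence(confidences):
    """Combine per-segment confidence factors by averaging."""
    return sum(confidences) / len(confidences)
```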
The watermark-presence determiner 460 determines whether a watermark is present. The determiner may use some distortion metric d(w,ŵ) and some threshold T to decide whether a watermark is present. A normalized Hamming distance may be used.
The presenter 470 presents one of three indications: “watermark present,” “watermark not present,” and “unknown.” If the confidence factor is low, it may indicate “unknown.” In addition, it may present the confidence-indication value.
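The determination and the three-way indication can be sketched together as below. The threshold values and the function names are hypothetical; the normalized Hamming distance is computed as the ordinary Hamming distance divided by the length of the inputs.

```python
def normalized_hamming(w, w_hat):
    """Fraction of positions at which two equal-length bit vectors differ."""
    if len(w) != len(w_hat):
        raise ValueError("bit vectors must have equal length")
    return sum(a != b for a, b in zip(w, w_hat)) / len(w)

def indication(w, w_hat, confidence, threshold=0.25, min_confidence=0.5):
    """Return one of the three indications for a decoded watermark w_hat.

    `threshold` plays the role of T; `min_confidence` is an assumed cutoff
    below which the result is reported as unknown.
    """
    if confidence < min_confidence:
        return "unknown"
    if normalized_hamming(w, w_hat) <= threshold:
        return "watermark present"
    return "watermark not present"
```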
This information is presented on the display 480. Of course, this display may be any output device. It may also be a storage device.
The functions of aforementioned components of the exemplary non-local QIM watermark detecting system 400 of
Methodological Implementation of the Exemplary Non-Local QIM Watermark Detecting
At 510 of
At 512, it finds a transform of the amplitude-normalized signal and gets a significant frequency subband. That may be a low or the lowest subband (e.g., the DC subband). Generally, the subband selected may be one that represents the signal in a manner that helps further robust watermarking and detection of the watermark. The lower frequency subbands are suitable because they tend to remain relatively invariant after signal perturbation.
The result of this block is a transformed signal. When watermarking an image, an example of a suitable transformation is discrete wavelet transformation (DWT). When watermarking an audio clip, an example of a suitable transformation is MCLT (Modulated Complex Lapped Transform). However, most any other similar transformation may be performed in alternative implementations.
At 514, the exemplary non-local QIM watermarker partitions the transformed signal into multiple, pseudorandomly sized, pseudorandomly positioned, adjacent, non-contiguous segments. It uses the same secret key K as the seed for its pseudorandom number generation here. Therefore, this process generates the same segments as in the embedding process.
For each segment of the signal, the exemplary non-local QIM watermarker repeats blocks 520 through 530. Thus, each segment is processed in the same manner.
At 522 of
At 526, it measures how close each decoded value is to a quantized value and tracks such measurements. The data provided by such measurements may provide an indication of the confidence of a resulting watermark-presence determination.
At 530, the process loops back to 520 for each unprocessed segment. If all segments have been processed, then it proceeds to 540.
At 540 of
At 544, based upon the watermark-presence determination of block 540 and the confidence-indication of block 544, the exemplary non-local QIM watermarker provides one of three indications: watermark present, watermark not present, and unknown. In addition, it may present the confidence-indication value.
At 550, the process ends.
Probability of False Alarms and Misses
There is a relationship among the probability of false alarm (PF), the probability of miss (PM), and the average size of the segments. PF is the probability of declaring that a watermark is present even though it is not. PM is the probability of declaring that a watermark is not present although it is indeed present. Generally, PF tends to increase with the average segment size, whereas PM tends to decrease with it.
For example, suppose the average segment size is extremely small, so small that each segment is equivalent to an individual element (e.g., a single pixel in an image). In that situation, watermarks are embedded in single coefficients, which is equivalent to the conventional schemes based on local characteristics. For such a case, PF is very small and PM is high. Conversely, suppose the average segment size is extremely large, so large that it is the maximum size of the signal (e.g., the whole image). In that situation, PM is presumably very low whereas PF is high.
Other Implementation Details
For the following descriptions of an implementation of the exemplary non-local QIM watermarker, assume the following:
The input signal $X$ is an image. Of course, the signal may be of other types in other implementations; for this example implementation, however, the signal is an image. Let $w$ be the watermark (a binary vector) to be embedded in $X$, and let a random binary string $K$ be the secret key. The output of the watermark encoder, $\hat{X}$, is the watermarked image in which the information $w$ is hidden via the secret key $K$. This image will possibly undergo various attacks, yielding $\tilde{X}$. A watermark decoder, given $\tilde{X}$ and the secret key $K$, outputs $\hat{w}$. It is required that $\hat{X}$ and $X$ are approximately the same for all practical purposes and that $\tilde{X}$ is still of acceptable quality. The decoder uses some distortion metric $d(w, \hat{w})$ and some threshold $T$ to decide whether the watermark is present. The exemplary non-local QIM watermarker uses a normalized Hamming distance, which is the ratio of the usual Hamming distance to the length of the inputs.
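The normalized Hamming distance and the threshold test can be written in a few lines of Python. The function names and the example threshold are hypothetical; only the definition of the distance (Hamming distance divided by input length) is taken from the text.

```python
def normalized_hamming(w, w_hat):
    """Hamming distance between two equal-length bit vectors, divided by their length."""
    if len(w) != len(w_hat) or not w:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a != b for a, b in zip(w, w_hat)) / len(w)


def watermark_present(w, w_hat, threshold):
    """Declare the watermark present when the distance falls below the threshold T."""
    return normalized_hamming(w, w_hat) < threshold


distance = normalized_hamming([1, 0, 1, 1], [1, 1, 1, 0])  # two of four bits differ
```

Here distance evaluates to 0.5, so with an illustrative threshold of 0.25 the watermark would be declared not present.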
WM via Quantization
The exemplary non-local QIM watermarker computes a vector $\mu$ by applying a forward transformation $T_F$ on $X$. The exemplary non-local QIM watermarker assumes that $X$ is a grayscale image; if not, a linear transformation may be applied to a color image to obtain the intensity image. (For color images, most of the energy is concentrated in the intensity plane.) Once the information hiding process is done via quantization, the result is $\hat{\mu}$, and the watermarked data $\hat{X}$ are obtained by applying the transform $T_R$ on $\hat{\mu}$. The exemplary non-local QIM watermarker randomizes the process using a random string derived from the secret key $K$ as a seed to a pseudo-random generator.
The scheme is generic, so most any transform may be used. In addition, $T_F = (T_R)^{-1}$ need not hold, nor need the inverses of these transforms even exist. The exemplary non-local QIM watermarker does not constrain the space where quantization occurs. The decoder applies the transform $T_F$ on its input, $Y$, to get the output $\mu_Y$, and then applies approximate Maximum Likelihood (ML) estimation of the possibly embedded sequence to obtain $\hat{w}_Y$. Through a pseudo-random generator, $K$ determines many randomization functions of the transform, quantization, and estimation stages. Once $\hat{w}_Y$ is found, it is compared with the embedded watermark $w$; if the distance between them is close to 0 (or less than a threshold $T$), then it is declared that the watermark is present; otherwise, it is declared not present. This contrasts with the natural measure for spread spectrum techniques, which yields a high value of correlation if the watermark is present.
More Details of this Methodological Implementation
The transforms $T_F$ and $T_R$ help enhance robustness. For $T_F$ to retain significant image characteristics, a discrete wavelet transformation (DWT) may be used at the initial stage. Next, semi-global (i.e., non-local) statistics of segments of the image are determined. Local statistics are not robust.
For example, the exemplary non-local QIM watermarker may compute the first order statistics of an original image signal and of several attacked versions of it over random rectangles of fixed size, and then find the average mean squared error between these statistics. The error decays monotonically as the size of the rectangles is increased.
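The decay of the error with rectangle size can be checked numerically with a small Python experiment. This is a sketch under assumptions that do not come from the text: the image is synthetic, the attack is modeled as independent Gaussian noise on every pixel, rectangles are squares, and all names and parameter values are hypothetical.

```python
import random


def rect_mean(img, top, left, side):
    """Mean of the side-by-side square whose top-left corner is (top, left)."""
    total = 0.0
    for r in range(top, top + side):
        for c in range(left, left + side):
            total += img[r][c]
    return total / (side * side)


def mean_stat_mse(original, attacked, side, trials, rng):
    """Average squared difference of rectangle means over random placements."""
    rows, cols = len(original), len(original[0])
    err = 0.0
    for _ in range(trials):
        top = rng.randrange(rows - side + 1)
        left = rng.randrange(cols - side + 1)
        diff = rect_mean(original, top, left, side) - rect_mean(attacked, top, left, side)
        err += diff * diff
    return err / trials


rng = random.Random(0)
size = 64
original = [[rng.uniform(0, 255) for _ in range(size)] for _ in range(size)]
# Assumed attack model: independent Gaussian noise added to every pixel.
attacked = [[pixel + rng.gauss(0, 10) for pixel in row] for row in original]

errors = {side: mean_stat_mse(original, attacked, side, 200, rng) for side in (1, 4, 16)}
for side, err in errors.items():
    print(f"side={side:2d}  mean squared error={err:.3f}")
```

Under this noise model the error of a rectangle mean shrinks roughly in proportion to the number of pixels averaged, which is why the single-pixel statistic (side 1) is far less stable than the 16-by-16 one.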
In $T_F$, $\mu$ is set to be the estimated first order statistics of randomly chosen rectangles in the wavelet domain. Here $T_F$ is not invertible if the number of rectangles is less than the number of coefficients. To choose $T_R$, one might first generate a pseudo-random sequence $p$ in the image domain that is visually approximately unnoticeable, pass it through $T_F$, find the corresponding statistics $\mu_p$, and compute the necessary scaling factors $\alpha$ such that the pseudorandom sequence $p$, scaled by $\alpha$ and added to $X$, yields the averages $\hat{\mu}$, the quantized version of $\mu$. This implementation uses randomly chosen non-overlapping rectangles. The exemplary non-local QIM watermarker defines two quantizers, $Q_0$ and $Q_1$, which map vectors in $\mathbb{R}^n$ to some nearby chosen lattice points.
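Here is a formal description. Define $Q_i := \{(q_{ij}, S_{ij}) \mid j \in \mathbb{Z}\}$, where $q_{ij} \in \mathbb{R}^n$ are the reconstruction points for $Q_i$, $S_{ij} \subset \mathbb{R}^n$ are the corresponding quantization bins, $i \in \{0, 1\}$, and $j \in \mathbb{Z}$. For each $j$, $S_{0j} \cap \left[\left(\bigcup_{k \neq j} S_{0k}\right) \cup \left(\bigcup_{k} S_{1k}\right)\right] = S_{1j} \cap \left[\left(\bigcup_{k \neq j} S_{1k}\right) \cup \left(\bigcup_{k} S_{0k}\right)\right] = \emptyset$. Here $q_{0j}$ and $q_{1j}$ are determined in a pseudo-random fashion (using $K$ as the seed of the random number generator) and $n$ is the dimension of quantization. The exemplary non-local QIM watermarker defines $Q_0[\alpha] = q_{0j}$ for $\alpha \in S_{0j}$, and likewise for $Q_1$. Both $Q_0$ and $Q_1$ are derived from a single quantizer, $Q$, such that (a) the set of reconstruction points of $Q$ is equal to $\{q_{0j}\}_j \cup \{q_{1j}\}_j$ and (b) for all $j, k$, $\min_{l \in \mathbb{Z}} d_L$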
Consider, for example, $n = 1$. The exemplary non-local QIM watermarker finds $\{E(q_{0j})\}$ and $\{E(q_{1j})\}$ meeting the requirements mentioned above. Then randomization regions that are neighborhoods in $\mathbb{R}^n$ around $\{E(q_{0j})\}$ and $\{E(q_{1j})\}$ are introduced. Then $\{q_{0j}\}$ and $\{q_{1j}\}$ are randomly chosen from these regions using some suitable probability distributions. The sizes of the randomizing regions and their shapes are input parameters. Let $n_R$ be the number of rectangles, each indexed suitably. Let $R$ be the length of the binary watermark vector $w$ to be hidden in the rectangles. For a vector $v$, denote its $i$-th entry by $v(i)$. Let $L$ denote the number of levels of DWT that is applied.
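For the one-dimensional case, the construction of the two quantizers can be sketched in Python as follows. This is only an illustration under stated assumptions: the expected reconstruction points of Q0 and Q1 are taken to interleave on a uniform grid of spacing delta, each randomization region is an interval of half-width jitter, the distribution is uniform, and the names recon_point, quantize, decode, and secret are hypothetical.

```python
import math
import random


def recon_point(secret, bit, j, delta, jitter):
    """j-th reconstruction point of Q_bit: expected position plus keyed jitter."""
    expected = (2 * j + bit) * delta              # Q0 and Q1 points interleave
    rng = random.Random(f"{secret}:{bit}:{j}")    # deterministic per (key, bit, j)
    return expected + rng.uniform(-jitter, jitter)


def quantize(value, bit, secret, delta=8.0, jitter=1.0):
    """Map value to the nearest reconstruction point of Q0 (bit=0) or Q1 (bit=1)."""
    j0 = math.floor((value / delta - bit) / 2)
    # With jitter < delta / 2 the nearest point is among these neighbours.
    candidates = [recon_point(secret, bit, j, delta, jitter) for j in range(j0 - 1, j0 + 3)]
    return min(candidates, key=lambda q: abs(q - value))


def decode(value, secret, delta=8.0, jitter=1.0):
    """Return the bit whose quantizer has the closer reconstruction point."""
    d0 = abs(value - quantize(value, 0, secret, delta, jitter))
    d1 = abs(value - quantize(value, 1, secret, delta, jitter))
    return 0 if d0 < d1 else 1


marked0 = quantize(37.3, 0, secret="K")   # statistic carrying bit 0
marked1 = quantize(37.3, 1, secret="K")   # statistic carrying bit 1
```

Because the jitter is derived from the key, a party without the key does not know the exact reconstruction points, while a decoder holding the key recovers the bit by a nearest-point comparison.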
Encoding
Let $N$ be the number of pixels in $X$ and $S := [s_1, \ldots, s_N]$ be the vector that consists of the pixels of $X$ sorted in ascending order. Create subvectors $s_L := [s_t, \ldots, s_{t'}]$, where $t := \mathrm{round}(N(1-\beta)\gamma)$ and $t' := \mathrm{round}(N(1+\beta)\gamma)$, and $s_H := [s_u, \ldots, s_{u'}]$, where $u := \mathrm{round}(N[1-(1+\beta)\gamma])$ and $u' := \mathrm{round}(N[1-(1-\beta)\gamma])$. Here $\mathrm{round}(r)$ equals the nearest integer to $r$, and the system parameters satisfy $0 < \beta, \gamma \ll 1$. Let $m$ and $M$ be the mean values of the elements of $s_L$ and $s_H$, respectively. Apply the point operator $P[x] = 255\,(x - m)/(M - m)$ to each element of $X$ to get $X'$.
Find the $L$-level DWT of $X'$. Let $X_A$ be its DC subband.
Partition $X_A$ into $n_R$ random non-overlapping rectangles; calculate $\mu$ such that $\mu(i)$ is the mean value of the coefficients within rectangle $i$, $i \le n_R$.
Let $\mu_i := [\mu(n(i-1)+1), \ldots, \mu(ni-1), \mu(ni)]$, $i \le R$. Use $Q_0$ and $Q_1$ to quantize: for $i \le R$, if $w(i) = 0$, quantize $\mu_i$ using $Q_0$; otherwise, quantize $\mu_i$ using $Q_1$. Concatenating the quantized $\{\mu_i\}$ yields $\hat{\mu}$.
Find the differences between the original statistics and the quantized statistics: $d = \hat{\mu} - \mu$.
Generate a pseudo-random sequence $p = [p_{ij}]$ in the spatial (i.e., the original image) domain as follows: choose $p_{ij}$ randomly and uniformly from $\{0, 1\}$ if $s_{t'} < X_{ij} < s_t$, where $t' = \mathrm{round}(N\gamma)$ and $t = \mathrm{round}(N(1-\gamma))$; otherwise set $p_{ij} = 0$. Now apply $L$-level DWT to the matrix $p$, extract the DC subband of the output, and call it $p_W$. Now compute the corresponding statistics similar to the step
Compute the scaling factors $\alpha$ such that $\alpha(i) = d(i)/\mu_p(i)$, $i \le n_R$.
For each rectangle whose index is $i$, multiply all coefficients of $p_W$ within that rectangle by $\alpha(i)$; let the resulting vector be $\hat{p}_W$.
Apply inverse DWT on $\hat{p}_W$; let the output be $\hat{p}$.
Compute the watermarked data: $X'' = X' + \hat{p}$. Apply the inverse of the point operator (see step 1) on $X''$ to get $\hat{X}$.
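The core of the encoding steps above can be sketched in Python as follows. This is a simplified illustration, not the full procedure: the forward and inverse DWT are omitted and a flat list stands in for the DC subband, the high-intensity band is assumed to mirror the low-intensity band, the quantizers are plain interleaved lattices of spacing delta (no keyed randomization), one bit is embedded per rectangle (n = 1), and all names and parameter values are hypothetical.

```python
import random


def point_operator(pixels, beta=0.1, gamma=0.05):
    """Map pixels with P[x] = 255 * (x - m) / (M - m); return (mapped, m, M)."""
    s = sorted(pixels)
    n = len(s)

    def band_mean(lo_frac, hi_frac):
        # Indices are 0-based and clamped; the text uses 1-based indices.
        lo = min(n - 1, max(0, round(n * lo_frac)))
        hi = min(n - 1, max(lo, round(n * hi_frac)))
        band = s[lo:hi + 1]
        return sum(band) / len(band)

    m = band_mean((1 - beta) * gamma, (1 + beta) * gamma)          # low band
    M = band_mean(1 - (1 + beta) * gamma, 1 - (1 - beta) * gamma)  # high band (assumed mirror)
    if M == m:
        raise ValueError("image has no usable intensity range")
    return [255.0 * (x - m) / (M - m) for x in pixels], m, M


def quantize(value, bit, delta=8.0):
    """Nearest lattice point: even multiples of delta for bit 0, odd for bit 1."""
    return (2 * round((value / delta - bit) / 2) + bit) * delta


def embed(coeffs, rects, bits, p, delta=8.0):
    """Shift each rectangle mean of coeffs onto the lattice selected by its bit."""
    out = list(coeffs)
    for idx, bit in zip(rects, bits):
        mu = sum(coeffs[k] for k in idx) / len(idx)    # rectangle statistic
        d = quantize(mu, bit, delta) - mu              # quantized minus original
        mu_p = sum(p[k] for k in idx) / len(idx)       # statistic of p
        if mu_p == 0:
            raise ValueError("p has zero mean in a rectangle")
        alpha = d / mu_p                               # scaling factor
        for k in idx:
            out[k] += alpha * p[k]                     # add the scaled sequence
    return out


rng = random.Random(7)
coeffs = [rng.uniform(0, 255) for _ in range(32)]      # stand-in for the DC subband
rects = [list(range(0, 8)), list(range(8, 20)), list(range(20, 32))]
bits = [1, 0, 1]
p = [rng.choice((0, 1)) for _ in coeffs]
for idx in rects:
    p[idx[0]] = 1                                      # keep every rectangle mean of p nonzero
marked = embed(coeffs, rects, bits, p)
normalized, m, M = point_operator(list(range(100)))
```

After embed runs, the mean of each rectangle in marked sits on the lattice chosen by its bit, because adding alpha times p changes the rectangle mean by exactly alpha times the mean of p, which equals d.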
Decoding (Input Y):
Similar to the first part of the encoding above, find the corresponding point operator on $Y$. Now apply the operator to $Y$, apply $L$-level DWT to the output, extract the DC subband, and call it $Y_A$.
Apply the partitioning procedure of the encoding above to $Y_A$ and find its statistics $\mu_Y$. Let $\mu_Y(i)$ be the $i$-th element.
Using $Q_0$ and $Q_1$, approximate ML estimation is carried out as follows to find the decoded sequence $\hat{w}_Y$: Let $\mu_Y^i := [\mu_Y(n(i-1)+1), \ldots, \mu_Y(ni-1), \mu_Y(ni)]$. For rectangles indexed by $n(i-1)+1, \ldots, ni-1, ni$, $i \le R$, let $r_0(i)$ be the closest point to $\mu_Y^i$ among the reconstruction points of $Q_0$; likewise, let $r_1(i)$ be the closest point to $\mu_Y^i$ among the reconstruction points of $Q_1$. If $d_{L^2}(\mu_Y^i, r_0(i)) < d_{L^2}(\mu_Y^i, r_1(i))$, then assign $\hat{w}_Y(i) = 0$; otherwise, assign $\hat{w}_Y(i) = 1$.
Compute $d(\hat{w}_Y, w)$. If the result is less than the threshold $T$, declare that the watermark was detected; otherwise, declare it not present.
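The nearest-point decoding of the statistics can be sketched in Python for a quantization dimension greater than one. The sketch assumes, purely for illustration, that the reconstruction points of Q0 are vectors whose components are all even multiples of delta and those of Q1 are vectors whose components are all odd multiples; under that assumption the closest point can be found component by component. The names and values are hypothetical.

```python
import math


def nearest_lattice_point(vec, bit, delta=8.0):
    """Closest point to vec in the assumed lattice for the given bit."""
    return [(2 * round((v / delta - bit) / 2) + bit) * delta for v in vec]


def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def decode_bits(mu_y, n, num_bits, delta=8.0):
    """Decode one bit per group of n consecutive rectangle statistics."""
    bits = []
    for i in range(num_bits):
        group = mu_y[n * i : n * (i + 1)]
        r0 = nearest_lattice_point(group, 0, delta)   # closest point of Q0
        r1 = nearest_lattice_point(group, 1, delta)   # closest point of Q1
        bits.append(0 if l2(group, r0) < l2(group, r1) else 1)
    return bits


# Four statistics, n = 2, so two bits are decoded.
decoded = decode_bits([0.5, 16.3, 8.2, 23.9], n=2, num_bits=2)
```

In this example the first pair of statistics lies near the even-multiple point (0, 16) and the second pair near the odd-multiple point (8, 24), so decoded is [0, 1]. The decoded sequence would then be compared with the embedded watermark using the normalized Hamming distance.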
Exemplary Computing System and Environment
The exemplary computing environment 900 is only one example of a computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the computer and network architectures. Neither should the computing environment 900 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the exemplary computing environment 900.
The exemplary non-local QIM watermarker may be implemented with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use include, but are not limited to, personal computers, server computers, thin clients, thick clients, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The exemplary non-local QIM watermarker may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The exemplary non-local QIM watermarker may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The computing environment 900 includes a general-purpose computing device in the form of a computer 902. The components of computer 902 can include, but are not limited to, one or more processors or processing units 904, a system memory 906, and a system bus 908 that couples various system components including the processor 904 to the system memory 906.
The system bus 908 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures can include an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus, also known as a Mezzanine bus.
Computer 902 typically includes a variety of computer readable media. Such media can be any available media that is accessible by computer 902 and includes both volatile and non-volatile media, removable and non-removable media.
The system memory 906 includes computer readable media in the form of volatile memory, such as random access memory (RAM) 910, and/or non-volatile memory, such as read only memory (ROM) 912. A basic input/output system (BIOS) 914, containing the basic routines that help to transfer information between elements within computer 902, such as during start-up, is stored in ROM 912. RAM 910 typically contains data and/or program modules that are immediately accessible to and/or presently operated on by the processing unit 904.
Computer 902 may also include other removable/non-removable, volatile/non-volatile computer storage media. By way of example,
The disk drives and their associated computer-readable media provide non-volatile storage of computer readable instructions, data structures, program modules, and other data for computer 902. Although the example illustrates a hard disk 916, a removable magnetic disk 920, and a removable optical disk 924, it is to be appreciated that other types of computer readable media which can store data that is accessible by a computer, such as magnetic cassettes or other magnetic storage devices, flash memory cards, CD-ROM, digital versatile disks (DVD) or other optical storage, random access memories (RAM), read only memories (ROM), electrically erasable programmable read-only memory (EEPROM), and the like, can also be utilized to implement the exemplary computing system and environment.
Any number of program modules can be stored on the hard disk 916, magnetic disk 920, optical disk 924, ROM 912, and/or RAM 910, including by way of example, an operating system 926, one or more application programs 928, other program modules 930, and program data 932. Each of such operating system 926, one or more application programs 928, other program modules 930, and program data 932 (or some combination thereof) may include an embodiment of an amplitude normalizer, a transformer, a partitioner, a segment-statistics calculator, a segment quantizer, a delta-sequence finder, and a signal marker.
A user can enter commands and information into computer 902 via input devices such as a keyboard 934 and a pointing device 936 (e.g., a “mouse”). Other input devices 938 (not shown specifically) may include a microphone, joystick, game pad, satellite dish, serial port, scanner, and/or the like. These and other input devices are connected to the processing unit 904 via input/output interfaces 940 that are coupled to the system bus 908, but may be connected by other interface and bus structures, such as a parallel port, game port, or a universal serial bus (USB).
A monitor 942 or other type of display device can also be connected to the system bus 908 via an interface, such as a video adapter 944. In addition to the monitor 942, other output peripheral devices can include components such as speakers (not shown) and a printer 946 which can be connected to computer 902 via the input/output interfaces 940.
Computer 902 can operate in a networked environment using logical connections to one or more remote computers, such as a remote computing device 948. By way of example, the remote computing device 948 can be a personal computer, portable computer, a server, a router, a network computer, a peer device or other common network node, and the like. The remote computing device 948 is illustrated as a portable computer that can include many or all of the elements and features described herein relative to computer 902.
Logical connections between computer 902 and the remote computer 948 are depicted as a local area network (LAN) 950 and a general wide area network (WAN) 952. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.
When implemented in a LAN networking environment, the computer 902 is connected to a local network 950 via a network interface or adapter 954. When implemented in a WAN networking environment, the computer 902 typically includes a modem 956 or other means for establishing communications over the wide network 952. The modem 956, which can be internal or external to computer 902, can be connected to the system bus 908 via the input/output interfaces 940 or other appropriate mechanisms. It is to be appreciated that the illustrated network connections are exemplary and that other means of establishing communication link(s) between the computers 902 and 948 can be employed.
In a networked environment, such as that illustrated with computing environment 900, program modules depicted relative to the computer 902, or portions thereof, may be stored in a remote memory storage device. By way of example, remote application programs 958 reside on a memory device of remote computer 948. For purposes of illustration, application programs and other executable program components such as the operating system are illustrated herein as discrete blocks, although it is recognized that such programs and components reside at various times in different storage components of the computing device 902, and are executed by the data processor(s) of the computer.
Computer-Executable Instructions
An implementation of an exemplary non-local QIM watermarker may be described in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
Exemplary Operating Environment
The operating environment is only an example of a suitable operating environment and is not intended to suggest any limitation as to the scope or use of functionality of the exemplary non-local QIM watermarker(s) described herein. Other well known computing systems, environments, and/or configurations that are suitable for use include, but are not limited to, personal computers (PCs), server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, wireless phones and equipment, general- and special-purpose appliances, application-specific integrated circuits (ASICs), network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
Computer Readable Media
An implementation of an exemplary non-local QIM watermarker may be stored on or transmitted across some form of computer readable media. Computer readable media can be any available media that can be accessed by a computer. By way of example, and not limitation, computer readable media may comprise “computer storage media” and “communications media.”
“Computer storage media” include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.
“Communication media” typically embodies computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism. Communication media also includes any information delivery media.
The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above are also included within the scope of computer readable media.
Although the invention has been described in language specific to structural features and/or methodological steps, it is to be understood that the invention defined in the appended claims is not necessarily limited to the specific features or steps described. Rather, the specific features and steps are disclosed as preferred forms of implementing the claimed invention.
This application is a continuation of and claims priority to U.S. patent application Ser. No. 09/843,279, filed Apr. 24, 2001 now U.S. Pat. No. 7,020,775, the disclosure of which is incorporated by reference herein.
Number | Date | Country |
---|---|---|
20050125671 A1 | Jun 2005 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09843279 | Apr 2001 | US |
Child | 11012922 | | US |