Digital watermarking is a technology which allows undetectable information to be hidden in an electronic file. The presence of the watermark is not apparent to a user and generally does not negatively affect the electronic file to which it is added. The watermark information is typically used to identify the originator of the electronic file, but can be used for other purposes, such as to confirm a legitimate user of the electronic file, to determine whether certain content has been aired or played, or many other uses.
Digital audio watermarking conceals a watermark in a digital audio file. In many instances, the digital audio file is a discrete audio file. In some instances, digital audio watermarks are used by entities in the television transmission area to provide revenue outside of agreements between broadcasters and television networks or studios. A broadcaster or television network often would like to know whether content supplied by such an entity contains an audio watermark, and to prevent detection of that audio watermark by a third party device.
There are a number of different methodologies available to conceal an audio watermark in an audio file. Some of these methodologies include direct current (DC) watermarking, phase encoding, spread spectrum watermarking and echo watermarking. Direct current (DC) watermarking involves concealing the watermark data in lower frequency components of an audio signal. The lower frequency components are below the threshold of human perception.
Phase encoding conceals the watermark data by encoding the watermark data in an artificial phase signal. Spread spectrum watermarking uses direct sequence spread spectrum (DSSS) to spread the watermark data signal over the entire audible frequency spectrum such that it approximates white noise. Echo watermarking conceals watermark data by distorting an audio signal in a way that causes the human auditory system to perceive the watermarked audio file as environmental distortion. Spread spectrum watermarking is one of the more widely used digital audio watermarking techniques.
These watermarking methodologies generally require complex systems to implement and detect. Therefore, there is a need for a way of efficiently and easily preventing audio watermark detection.
Embodiments of a system to prevent audio watermark detection include content having a video portion and an audio portion, the audio portion having a watermark, an audio/video separator configured to separate the video portion and the audio portion, and a random number generator configured to generate a random number corresponding to a shifted frequency. The system also includes a frequency shift element configured to apply the shifted frequency to the audio portion to alter a spectrum of the watermark so as to prevent detection of the watermark by a device seeking to recover the watermark. The system also includes an audio resampler configured to resample the audio portion to restore the audio portion to an original length, and an audio/video combiner configured to combine the video portion and the audio portion.
Other embodiments are also provided. Other systems, methods, features, and advantages of the invention will be or become apparent to one with skill in the art upon examination of the following figures and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the invention, and be protected by the accompanying claims.
The invention can be better understood with reference to the following figures. The components within the figures are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the invention. Moreover, in the figures, like reference numerals designate corresponding parts throughout the different views.
The system and method to prevent audio watermark detection can be implemented in a number of systems, including in a video server. While the system and method to prevent audio watermark detection will be described herein as being implemented in a video playout server located at a network, the system and method to prevent audio watermark detection can be implemented in any device that processes an audio signal that may contain an audio watermark.
The system and method to prevent audio watermark detection can be implemented in hardware, software, or a combination of hardware and software. When the system and method to prevent audio watermark detection is implemented in software, as in one or more preferred embodiments, the software processes an audio file to alter a spectrum of the watermark so as to prevent a device from detecting the watermark. The software can be stored in a memory and executed by a suitable instruction execution system (e.g., a microprocessor).
The software for the system and method to prevent audio watermark detection comprises an ordered listing of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions.
In the context of this document, a “computer-readable medium” can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The computer-readable medium can be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: a portable computer diskette (magnetic), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory) (electronic), an optical fiber (optical), and a portable compact disc read-only memory (CD-ROM) (optical). Note that the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
The network 102 includes a network video source 108, which provides network video source programming to a video playout server 200 over connection 126. The video playout server 200 processes the network video programming and provides the network video programming over connection 128. The signal on connection 128 is a network audio/video feed and is provided to the satellite communication uplink station 132 for transmission.
The video playout server 200 includes an embodiment of the system and method to prevent audio watermark detection. In accordance with an embodiment of the system and method to prevent audio watermark detection, the video playout server 200 receives content that includes an audio watermark. The content that includes an audio watermark can be in the form of a watermarked content file 112, in the form of watermarked content 114, or any other watermarked content. The watermarked content file 112 is typically an electronic file that includes electronic content having an audio portion and a video portion. The audio portion may include an audio watermark. As an example, the watermarked content file 112 can be a television commercial, or other content, that is provided to the network 102 as an Internet Protocol (IP) file, or any other type of electronic file.
The watermarked content 114 is typically an analog or digital tape that includes content having an audio portion and a video portion. The audio portion may include an audio watermark. As an example, the watermarked content 114 can be a television commercial, or other content, that is provided to the network 102 in the form of a video tape.
The watermarked content file 112 is provided to a video catch server 104 over connection 116. The video catch server 104 can be implemented as a computing device that receives the electronic files representing the watermarked content file 112 via the World Wide Web (WWW), via a dedicated IP connection, or via another suitable connection. The video catch server 104 provides the watermarked content file 112 over connection 122 to the video playout server 200. The connection 122 can be implemented as an IP data connection, or another suitable data connection.
The watermarked content 114 is provided to a video tape recorder 106 over connection 118. The video tape recorder 106 can be implemented as a dedicated video tape playback device that plays the watermarked content 114 when the watermarked content 114 is provided via a conventional video tape. The video tape recorder 106 provides the watermarked content 114 over connection 124 to the video playout server 200. The connection 124 can be implemented as an audio/video connection, or another suitable connection.
The video playout server 200 includes an audio/video file intake element 202, which receives the output of the video catch server 104 (
When the watermarked content is provided as an electronic file, such as the watermarked content file 112, the audio/video file intake element 202 processes the watermarked content file 112 and provides an electronic version of the watermarked content file 112 over connection 206 to the audio/video separator 212. In an embodiment, the audio/video file intake element 202 can be, for example, an Ethernet IP data connection through which the watermarked content file 112 is introduced to the video playout server 200 in its native format and stored in the memory 226.
When the watermarked content is provided conventionally as an audio/video segment on tape, the audio/video intake element 204 processes the watermarked content 114 and provides the audio and video of the watermarked content 114 to the audio/video separator 212 over connection 208. In an embodiment, the audio/video intake element 204 can be, for example, an SDI (Serial Digital Interface) audio/video connection that presents the watermarked content 114 in a real-time audio/video stream. A file 228 is created on the video playout server 200, typically in the memory 226, and the watermarked content 114 is captured into that file 228. Depending on the resolution of the video playout server 200, the audio/video content may be re-sampled to a lower or higher resolution (bit-depth).
The audio/video separator 212 separates the content into an audio portion and a video portion and provides the audio portion over connection 214 to the audio watermark reposition system 250. The video portion is provided over connection 216 such that it bypasses the audio watermark reposition system 250. The function of the audio/video separator 212 is the same regardless of whether or not the audio is watermarked or whether or not the audio is delivered as part of a streaming audio/video stream or as a file.
The audio watermark reposition system 250 includes a frequency shift element 252, and an audio resampler 258. The frequency shift element 252 includes a random number generator 254 and a band pass filter (BPF) 255. In accordance with an embodiment of the system and method to prevent audio watermark detection, the random number generator 254 is used to generate a random number that corresponds to an analog or digital frequency shift. The frequency shift is applied to the watermarked audio portion on connection 214 by the frequency shift element 252. If the audio signal on connection 214 includes an audio watermark, the application of the frequency shift will alter a spectrum of the watermark so as to prevent detection of the watermark by the device 144 (
In an embodiment, the frequency shift is continuously variable in that the random number generator 254 can continually generate different random numbers so as to discretely change and/or continuously vary the frequency shift. For example, the randomly generated number, and the corresponding time or frequency shift, can vary on the order of 0.2%, 0.4%, etc., or according to other orders. The frequency shift applied to the audio portion by the frequency shift element 252 can be applied in the time domain or in the frequency domain. With regard to a time domain frequency shift, the duration of the audio portion is changed. This change in duration can be expressed as expanding or contracting the length of the audio portion by a desired percentage. For example, a 30 second (900 frames of video at 30 frames per second) audio portion expanded by 0.2% would have a duration of 30.06 seconds, or approximately 30 seconds and 2 frames. Changing the duration of the audio portion also changes the duration of a watermark located within the audio portion. Changing the duration of the watermark changes the wavelength of the watermark, which changes the frequency of the watermark, thereby corrupting the watermark and making detection difficult.
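The arithmetic of the time domain example can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the 1 kHz tone is chosen for illustration and is not a value taken from any embodiment.

```python
def stretched_duration(duration_s: float, stretch_percent: float) -> float:
    """Duration after expanding (positive) or contracting (negative) by a percentage."""
    return duration_s * (1.0 + stretch_percent / 100.0)


def stretched_frequency(freq_hz: float, stretch_percent: float) -> float:
    """Frequency of a component after the same stretch: a longer wavelength
    in time corresponds to a proportionally lower frequency."""
    return freq_hz / (1.0 + stretch_percent / 100.0)


if __name__ == "__main__":
    fps = 30.0  # video frame rate used in the example in the text
    new_len = stretched_duration(30.0, 0.2)
    print("duration (s):", round(new_len, 3))
    print("duration (frames):", round(new_len * fps, 1))
    # Hypothetical 1 kHz watermark tone, used only to show the direction
    # and rough size of the change in frequency.
    print("tone (Hz):", round(stretched_frequency(1000.0, 0.2), 2))
```

A contraction is expressed by passing a negative percentage, which shortens the duration and raises the frequency of every component by the same ratio.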
With regard to a frequency domain shift, the audio portion is typically passed through the band pass filter 255 to select the frequency range of audio to shift. For example, a typical filter pass band can be 0 to 5 kilohertz (kHz). The construction of a band pass filter having a pass band of 0 to 5 kHz is known to those skilled in the art. Then, the frequency of the audio portion can be shifted by a percentage of the pass band. For example, 0.2% of a 5 kHz pass band equates to a frequency shift of 10 Hz, thereby relocating the watermark in frequency and making detection improbable.
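One possible way to realize such a shift is sketched below. The text does not specify how the frequency shift element 252 moves the spectrum, so the technique used here (multiplying a complex-valued, analytic signal by a complex exponential) is an assumption, as are all function names. A real-valued audio signal would first have to be converted to analytic form, for example with a Hilbert transform, which is omitted here.

```python
import cmath
import math


def shift_from_passband(passband_hz: float, percent: float) -> float:
    """Frequency shift in Hz expressed as a percentage of the pass band."""
    return passband_hz * percent / 100.0


def frequency_shift(analytic, shift_hz: float, sample_rate: float):
    """Move every component of a complex (analytic) signal by shift_hz."""
    return [
        x * cmath.exp(2j * math.pi * shift_hz * n / sample_rate)
        for n, x in enumerate(analytic)
    ]


def tone_level(signal, freq_hz: float, sample_rate: float) -> float:
    """Normalized magnitude of the signal at one frequency (single-bin DFT)."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate)
        for n, x in enumerate(signal)
    )
    return abs(acc) / len(signal)


if __name__ == "__main__":
    fs = 48000.0   # sampling rate mentioned in the text
    n = 4800       # 0.1 s of audio, giving a 10 Hz analysis resolution
    # Hypothetical watermark tone at 1 kHz, represented as a complex exponential.
    tone = [cmath.exp(2j * math.pi * 1000.0 * k / fs) for k in range(n)]
    shift = shift_from_passband(5000.0, 0.2)
    moved = frequency_shift(tone, shift, fs)
    print("shift (Hz):", shift)
    print("level at original frequency:", tone_level(moved, 1000.0, fs))
    print("level at shifted frequency:", tone_level(moved, 1000.0 + shift, fs))
```

A detector that looks for the tone only at its original frequency would find little or no energy there after the shift, which is the effect described above.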
The audio signal having the spectrum of the audio watermark altered is provided over connection 256 to the audio resampler 258. The audio resampler 258 resamples the audio to return the audio portion on connection 256 to the proper length. For example, the audio resampler 258 can resample the audio portion on connection 256 at a 48 kHz sampling rate to restore the audio portion to its original length. While other sampling rates can be used, 48 kHz is a generally accepted sampling rate for audio/video servers. Other sampling rates exist for other applications, such as 44.1 kHz for CD authoring. The process would work equally well in those domains.
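The length restoration step can be sketched as a resampler that maps a clip of any length onto a target number of samples. Linear interpolation is used here only because it is short and self-contained; it is an assumption and a stand-in, and a practical resampler would normally use band-limited filtering. The function name is hypothetical.

```python
def resample_to_length(samples, target_len: int):
    """Resample a sequence to exactly target_len samples by linear interpolation."""
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if not samples:
        raise ValueError("samples must not be empty")
    if target_len == 1 or len(samples) == 1:
        return [float(samples[0])] * target_len

    # Map output index i onto a fractional position in the input.
    scale = (len(samples) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


if __name__ == "__main__":
    # One second of audio at 48 kHz that was lengthened by 0.2% (48096 samples)
    # is brought back to 48000 samples. The ramp is placeholder data.
    stretched = [float(k) for k in range(48096)]
    restored = resample_to_length(stretched, 48000)
    print(len(restored))
```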
The resampled audio signal is provided over connection 218 to an audio/video combiner 220. The audio/video combiner 220 combines the video signal on connection 216 with the resampled audio signal on connection 218 and provides the combined signal over connection 128.
In block 302, watermarked content is received by the video playout server 200. The watermarked content can be in the form of a program element file, such as the watermarked content file 112 (
In block 306, the random number generator 254 is used to generate a random number that corresponds to a frequency shift. The frequency shift is selected from a predetermined range, and can be, for example, a percentage of the duration of the audio portion, or can be a percentage of the frequency pass band of the audio portion. In an embodiment, the random number generator 254 selects a number from a range of numbers. The range of numbers correlates to a series of pre-selected frequency shifts that would then be applied to the watermarked audio content. As mentioned above, the frequency shifts can be, for example, a percentage of the duration of the audio portion (e.g., 0.2%) or a percentage of the pass band of the audio portion (e.g., 0.2% of a 5 kHz pass band). Other percentages can be applied, depending on the application.
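The selection in block 306 can be sketched as a random index into a table of pre-selected shifts. The table contents are an assumption: the text names 0.2% and 0.4% as examples, and the remaining entries, like the function names, are hypothetical.

```python
import random

# Hypothetical table of pre-selected shifts, in percent. Only 0.2 and 0.4
# appear as examples in the text; the other entries are assumptions.
PRESELECTED_SHIFTS_PERCENT = (0.2, 0.4, 0.6, 0.8)


def select_shift(rng: random.Random, shifts=PRESELECTED_SHIFTS_PERCENT) -> float:
    """Pick a random number from a range and map it to a pre-selected shift."""
    index = rng.randrange(len(shifts))
    return shifts[index]


def shift_as_seconds(shift_percent: float, duration_s: float) -> float:
    """Interpret the shift as a percentage of the duration of the audio portion."""
    return duration_s * shift_percent / 100.0


def shift_as_hertz(shift_percent: float, passband_hz: float) -> float:
    """Interpret the shift as a percentage of the pass band of the audio portion."""
    return passband_hz * shift_percent / 100.0


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so each run selects independently
    percent = select_shift(rng)
    print("selected shift (%):", percent)
    print("as time change (s) for a 30 s clip:", shift_as_seconds(percent, 30.0))
    print("as frequency shift (Hz) for a 5 kHz pass band:", shift_as_hertz(percent, 5000.0))
```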
In block 308, it is determined whether the audio portion, also referred to as the audio selection, exceeds a predetermined length. Depending on the length of the audio portion, one or more different frequency shifts might be applied to different portions of the selected audio portion to alter a spectrum of the watermark. A typical predetermined length of time for the audio portion may be one minute. However, other predetermined periods of time are possible. If it is determined in block 308 that the audio selection does not exceed the predetermined length of time, which can be, for example, one minute, then, in block 312, the frequency shift element 252 applies the frequency shift to the audio portion based on the frequency shift selected in block 306. Applying the frequency shift to an audio portion that contains an audio watermark alters a spectrum of the watermark, thus making subsequent detection of the audio watermark by the device 144 (
The magnitude of the frequency shift between the watermark tone 552 and the watermark tone 522 (and similarly, the series of watermark tones 550 relative to the series of watermark tones 520) is determined by the value of the frequency shift determined in block 306 and applied to the audio portion in block 312. In this manner, the watermark tone 552 (and all the watermark tones in the series of watermark tones 550) appears at a frequency that is different than the original frequency, which makes detection by an external device (e.g., device 144 of
In the example shown in
The spectrum of the watermark 656 is also expanded in time so that the duration of the watermark 656 is lengthened by 0.033 seconds, illustrated at 658. In this manner, the frequency of the watermark 656 is sufficiently altered so as to corrupt the watermark, thereby making detection difficult.
Returning now to
If, in block 308 it was determined that the audio selection exceeds the predetermined length of time, then, the process proceeds to
In block 402, the frequency shift element 252 applies a frequency shift to the audio portion based on the frequency shift selected in block 306. Applying the frequency shift to an audio portion that contains an audio watermark alters a spectrum of the watermark, thus making subsequent detection of the audio watermark by the device 144 (
In block 404, it is determined whether the predetermined time period has elapsed. If the predetermined time period has not elapsed, then the process returns to block 402. If it is determined in block 404 that the predetermined time period has elapsed, then, in block 406, the random number generator 254 is used to generate a different random number that corresponds to a different frequency shift, and the frequency shift element 252 applies the different frequency shift to the audio portion based on the newly selected frequency shift.
In block 408, it is determined whether the predetermined time period has elapsed. If the predetermined time period has not elapsed, then the process returns to block 406. If it is determined in block 408 that the predetermined time period has elapsed, then, in block 412, it is determined whether the audio clip is complete. If it is determined in block 412 that the audio clip is not complete, then the process returns to block 406. If it is determined in block 412 that the audio clip is complete, then the process proceeds to block 414, where the audio portion is resampled using a sampling rate of 48 kHz to restore the audio clip to its original length. In block 416, the audio/video combiner 220 recombines the audio portion and the video portion. In block 418, the video playout server 200 provides the content having the spectrum of the audio watermark altered over connection 128.
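The decision made in block 308 and the per-period selection of blocks 402 through 412 can be sketched as a planning step that assigns one shift to each predetermined time period of the clip. This is a simplified illustration: the function name, the table of shifts, and the representation of the plan as (start, end, shift) tuples are all assumptions, and the application of each shift to the audio is left out.

```python
import random


def plan_shifts(clip_seconds: float, period_seconds: float, shifts, rng: random.Random):
    """Return a list of (start_s, end_s, shift_percent) segments for a clip.

    A clip that does not exceed the predetermined period gets a single shift.
    A longer clip gets a new, different shift each time the period elapses.
    """
    if clip_seconds <= period_seconds:
        return [(0.0, clip_seconds, rng.choice(shifts))]

    plan = []
    start = 0.0
    previous = None
    while start < clip_seconds:
        end = min(start + period_seconds, clip_seconds)
        # Exclude the previous value so consecutive segments use different shifts.
        candidates = [s for s in shifts if s != previous] or list(shifts)
        shift = rng.choice(candidates)
        plan.append((start, end, shift))
        previous = shift
        start = end
    return plan


if __name__ == "__main__":
    # Hypothetical table of shifts in percent and a one minute period.
    table = (0.2, 0.4, 0.6, 0.8)
    for segment in plan_shifts(150.0, 60.0, table, random.Random()):
        print(segment)
```

After every segment has been shifted according to such a plan, the audio portion would be resampled to its original length and recombined with the video portion, as described for blocks 414 and 416.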
While various embodiments of the invention have been described, it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible that are within the scope of the invention.
Number | Date | Country |
---|---|---|
20110243327 A1 | Oct 2011 | US |