The present invention relates generally to data communications, and more particularly to communications systems and approaches involving use of a projection of the media for authenticating media.
Secure data communication is important in many communication environments, and in particular to network-based communications, to ensure that received data is legitimate. For instance, media authentication is important in many applications of content delivery via untrusted intermediaries, such as peer-to-peer (P2P) file sharing or P2P multicast streaming. In these applications, many differently encoded versions of the original media file might exist. Moreover, transcoding and bitstream truncation at intermediate nodes might be required, giving rise to further diversity. On the other hand, intermediaries might tamper with the contents for a variety of reasons, such as interfering with the distribution of a particular file, piggybacking unauthentic content, or generally discrediting a particular distribution system.
Distinguishing the legitimate diversity of encodings from malicious manipulation is a major technical challenge for media authentication systems, and is particularly challenging in environments involving lossy communications. Generally, lossy communication refers to the communication of data in which some data is lost or changed, such as during compression (e.g., JPEG) or manipulation, and lossy channels are those involving data transfer that is characterized by lossy communication. Example channels that employ lossy communications include packet-based channels such as the Internet, mobile device networks and telephone networks. Past approaches to distinguishing legitimate variations from malicious manipulation have included the use of watermarks and media hashes.
A “fragile” watermark can be embedded into the host signal waveform without perceptual distortion. Users can confirm the authenticity by extracting the watermark from the received content. The system design should ensure that the watermark survives lossy compression, but that it “breaks” as a result of a malicious manipulation. Unfortunately, watermarking authentication is not backward compatible with previously encoded contents; unmarked contents cannot be authenticated later. Embedded watermarks might also increase the bit-rate required when compressing a media file.
Media hashing achieves verification of previously encoded media (as well as localization of tampering) by using an authentication server to supply authentication data to the user. Media hashes are inspired by cryptographic digital signatures, but unlike cryptographic hash functions, media hash functions are supposed to offer proof of perceptual integrity. With a cryptographic hash, a single-bit difference leads to an entirely different hash value. With a media hash, by contrast, two media signals that are perceptually indistinguishable should have identical hash values. A common approach to media hashing is extracting features that have perceptual importance and should survive compression. Authentication data is generated by compressing these features or generating their hash values. The user checks the authenticity of the received content by comparing the features or their hash values to the authentication data. However, limitations in this approach relating to compression and otherwise have hindered the successful application of hashing to media authentication.
These and other issues have presented challenges to the communication of data, and particularly to the communication of legitimate media data.
The present invention is exemplified in a number of implementations and applications, including embodiments directed to addressing the above-mentioned issues, and some of which are summarized below.
According to an example embodiment of the present invention, media is authenticated using a projection of the media. A distributed encoding of the projection is provided and algorithmically decoded using the media and an editing characteristic for the media, to provide a decoded projection. A condition of authenticity of the media is determined based on the projection of the media and the decoded projection. In some embodiments, the projection has a size that is a function of an editing characteristic of the media, such as that relating to compression or other editing.
According to another example embodiment of the present invention, media is authenticated as follows. A projection of media is generated and processed through a cryptographic hash function to generate a media digest. The media digest is sent to a media recipient together with a distributed encoding of the projection. At the recipient, the encoded projection is decoded using the media as side data together with an editing characteristic of the media to provide a decoded projection therefrom. This decoded projection has characteristics relative to a degree of editing or distortion of the media (side data). The decoded projection is processed through a cryptographic hash function to generate a media digest of the decoded projection, which is compared with the media digest of the projection of the media. A condition of authenticity of the media is determined in response to this comparison.
The above summary is not intended to describe each illustrated embodiment or every implementation of the present invention. The figures, detailed description and claims that follow more particularly exemplify these embodiments.
The invention may be more completely understood in consideration of the detailed description of various embodiments of the invention that follows in connection with the accompanying drawings, in which:
While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit the invention to the particular embodiments described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.
The present invention is believed to be useful for a variety of authentication and coding applications, and the invention has been found to be particularly suited for use with media authentication approaches and systems. While the present invention is not necessarily limited to such applications, various aspects of the invention may be appreciated through a discussion of various examples using this context.
According to an example embodiment of the present invention, communicated media is authenticated using a projection of the media, an encoding of the projection, and the media-to-be-authenticated. The encoding is decoded using the media-to-be-authenticated as side information, together with information characterizing editing characteristics of the media-to-be-authenticated. A degree of variation or distortion of the decoding, relative to the projection, is used to determine the legitimacy of the media-to-be-authenticated.
In a more particular embodiment, an extension of hashing for media authentication is based upon distributed source coding. An authentication server provides a user with a Slepian-Wolf encoded media waveform projection (bitstream), and the user attempts to decode this bitstream using media-to-be-authenticated as side information. The Slepian-Wolf result indicates that the lower the distortion between the side information and the original (from which the media-to-be-authenticated is originated), the fewer authentication bits are required for correct decoding. By correctly choosing the size of the authentication data, legitimate encoding variations of the media (e.g., due to compression and reconstruction) are distinguished from illegitimate modifications. That is, accurate projection decoding using media-to-be-authenticated that is highly-correlated to the original source requires a low rate of authentication data. However, if the media-to-be-authenticated is poorly-correlated due to illegitimate editing, the same low rate will be insufficient for use as side data and for corresponding authentication.
These approaches and others described herein are applicable to implementation with previously encoded media, regardless of the transmission or storage format. In addition, these approaches are applicable to the authentication of various types of media content such as images, audio, voice, video and 3D graphics. Such authentication is amenable to use in applications including those relating to creative works, surveillance data, and content distribution, with approaches to the latter involving both user-based verification that content is in fact a sincere version of the original, and content distributor-based assurance that its media is being redistributed without dramatic modifications such as unauthorized advertisements.
In connection with various example embodiments, the term projection or projection of media is used to characterize a portion of media, such as an image, audio or video. In this regard, a portion (projection) of media data is used to facilitate the detection or determination of a condition of authenticity of the media from which the projection was taken or derived. Such a condition of authenticity may involve, for example, detecting variation in the media due to unauthorized tampering, or variation in the media due to expected editing characteristics as described herein.
Turning now to the figures,
A comparator 150 determines a condition of authenticity of the media-to-be-authenticated based upon the projection 120 and the decoded projection 142. For instance, where the decoded projection 142 corresponds to the projection 120 with expected editing characteristics, the comparator 150 determines that the media-to-be-authenticated is legitimate. However, where the decoded projection corresponds to the projection 120 with expected editing characteristics as well as other characteristics relating to an illegitimate modification of the media-to-be-authenticated, the comparator 150 determines that the media-to-be-authenticated is illegitimate.
The encoder arrangement 202 uses a projection 210 of the media to be authenticated 205 to generate authentication data. Specifically, a distributed encoder 220 encodes the projection 210 and sends an encoded projection 222 over the communications channel 204 to a recipient employing the decoder arrangement 206. A hash function/encryptor 230 generates a cryptographic hash of the projection 210, encrypts the hash and sends the encrypted hash 232 over the communications channel 204 to a recipient employing the decoder arrangement 206.
A decoder function 240 at the decoder arrangement 206 uses media to be authenticated 205 as side information to decode the projection 222 and generate a decoded projection 242 that is sent to a hash function 244. In some applications, the decoder arrangement 206 uses an editing characteristic of the media to be authenticated 205, such as that relating to compression or filtering, in generating the decoded projection 242. The hash function 244 generates a decoded hash 246 and sends the decoded hash to a binary comparator 260. A decryptor 250 at the decoder arrangement 206 decrypts the encrypted hash 232 to generate decrypted hash 252, and sends the decrypted hash to the binary comparator 260. The binary comparator 260 compares the decoded hash 246 with the decrypted hash 252 and generates an output 270 that is responsive to a degree of distortion in the media to be authenticated 205, thus providing an indication of the authenticity of the media to be authenticated (and any illegitimate characteristics thereof).
In various applications, the size of the projection 210 is set in accordance with an expected or acceptable degree of distortion in the media to be authenticated 205, and in this context, is selectively generated by the encoder arrangement 202. For example, where a known or estimated amount of distortion is determined for a particular set or type of media, the size of the projection 210 is chosen to facilitate favorable decoding at the decoder 240 when the media to be authenticated 205 exhibits an amount of distortion up to this known or estimated amount. Correspondingly, at or below the known or estimated amount of distortion, the comparator 260 generates an output 270 that is favorable (i.e., indicates that the media to be authenticated is legitimate). In this regard, if illegitimate data is included with the media to be authenticated, such as an advertisement or malicious data, the amount of distortion in the media to be authenticated is beyond the known or estimated amount. This distortion is reflected in the decoded projection 242 and, correspondingly, in the comparator output 270.
More specifically and referring to left-hand side of
The authentication arrangement 304 (e.g., a recipient/decoder), shown in the right-hand side of
A pseudorandom projection based on a randomly drawn seed Ks is applied to the original media x at block 320 and the projection coefficients are quantized at block 322 to yield a projection X. In some implementations, the random projection is blockwise and the block partition is fixed, pseudorandomly assigned or content adaptive. In addition, the quantization intervals at block 322 are fixed or pseudorandomly dithered in different applications.
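The difference between fixed and pseudorandomly dithered quantization intervals can be illustrated with a minimal sketch. The function name `quantize`, the uniform step size, and the use of a per-coefficient uniform offset as the dither are assumptions made for illustration only; the text above does not specify how the dither is drawn.

```python
import math
import random


def quantize(coeffs, step, seed=None):
    """Uniform quantizer over projection coefficients.

    With seed=None the quantization intervals are fixed. With a seed, each
    coefficient's interval grid is shifted by a pseudorandom offset in
    [0, step), which is one possible reading of "pseudorandomly dithered".
    """
    rng = random.Random(seed) if seed is not None else None
    indices = []
    for value in coeffs:
        offset = rng.uniform(0.0, step) if rng is not None else 0.0
        indices.append(math.floor((value + offset) / step))
    return indices
```

Because the offsets are derived from the seed, an authentication arrangement that receives the same seed can reproduce the same dithered intervals when quantizing its own projection.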
A Slepian-Wolf encoder 330 derives a Slepian-Wolf bitstream S(X) from X based on rate-adaptive low-density parity-check (LDPC) codes. A cryptographic hash function 340 generates a cryptographic hash value of X (a media digest) and an asymmetric encryptor 350 signs the hash value with a private key to generate a digital signature D(X, Ks) that includes the signed hash and a seed Ks. For general information regarding distributed source encoding, and for specific information regarding approaches to Slepian-Wolf (distributed source) encoding as may be implemented with these or other embodiments, reference may be made to D. Varodayan, A. Aaron, and B. Girod, “Rate-adaptive codes for distributed source coding,” EURASIP Signal Processing Journal, Special Section on Distributed Source Coding, vol. 86, no. 11, pp. 3123-3130, November 2006, which is fully incorporated herein by reference.
In some applications, authentication data are generated as described above by a server upon request. The server uses a different random seed Ks in responding to each request, and the seed is provided to the authentication arrangement 304 as part of the authentication data to mitigate an attack that confines malicious editing to the nullspace of the projection. For example, where implemented for image media, based on the random seed, for each 16×16 non-overlapping block Bi, a 16×16 pseudorandom matrix Pi is generated by drawing its elements independently from a Gaussian distribution N(1, σz²) and normalizing so that ‖Pi‖₂ = 1. The term σz is chosen at 0.2 empirically. The inner product ⟨Bi, Pi⟩ is quantized into an element of X.
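The blockwise projection for image media may be sketched as follows. This is an illustrative sketch only: the names `projection_matrix` and `project_image`, the quantizer step size, and the treatment of the norm as the Frobenius norm (the 2-norm of the matrix read as a vector) are assumptions not taken from the description above.

```python
import math
import random

BLOCK = 16      # block side length, as in the text
SIGMA_Z = 0.2   # Gaussian standard deviation, as in the text


def projection_matrix(rng, n=BLOCK, sigma=SIGMA_Z):
    """Draw an n x n matrix with entries from N(1, sigma^2), scaled to unit norm."""
    p = [[rng.gauss(1.0, sigma) for _ in range(n)] for _ in range(n)]
    norm = math.sqrt(sum(v * v for row in p for v in row))
    return [[v / norm for v in row] for row in p]


def project_image(image, seed, step=16.0):
    """Return one quantized projection coefficient per non-overlapping block.

    `image` is a list of rows of pixel values whose dimensions are multiples
    of BLOCK. `seed` plays the role of Ks. `step` is an assumed uniform
    quantizer step size.
    """
    rng = random.Random(seed)
    coeffs = []
    for r in range(0, len(image), BLOCK):
        for c in range(0, len(image[0]), BLOCK):
            p = projection_matrix(rng)
            inner = sum(
                image[r + i][c + j] * p[i][j]
                for i in range(BLOCK)
                for j in range(BLOCK)
            )
            coeffs.append(int(round(inner / step)))
    return coeffs
```

Calling `project_image` twice with the same image and seed yields the same coefficients, which is what allows the authentication arrangement to project the media-to-be-authenticated in the same way as the original.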
The rate of the Slepian-Wolf bitstream S(X) is selected to set the degree of statistical similarity between the media-to-be-authenticated and the original media that is required in order to declare the media-to-be-authenticated as legitimate (i.e., authentic). If the conditional entropy H(X|Y) exceeds the bit-rate R (e.g., in bits per pixel for image media), X can no longer be decoded correctly. Therefore, the rate of S(X) is chosen to distinguish between the different joint statistics induced in the media contents by legitimate and illegitimate channel states. Accordingly, a Slepian-Wolf bit-rate that is just sufficient to authenticate a worst permissible quality is used at the encoder 330, facilitating the detection of illegitimate media data while permitting the acceptance of media that is simply distorted via acceptable communication conditions.
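The rate selection can be illustrated numerically under a simplified model. The sketch below assumes, purely for illustration, that X is a uniform binary source and that Y differs from X with a fixed crossover probability, so that H(X|Y) equals the binary entropy of that probability. The crossover values, the rate, and the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a binary variable that equals 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def decodable(rate, crossover):
    """Slepian-Wolf condition R >= H(X|Y) for the assumed binary model."""
    return rate >= binary_entropy(crossover)


# Hypothetical channel states: legitimate compression flips 5% of the bits,
# tampering flips 30% of the bits.
RATE = 0.4  # bits per source bit, chosen between the two conditional entropies
legitimate_ok = decodable(RATE, 0.05)
tampered_ok = decodable(RATE, 0.30)
```

In this model a rate of 0.4 lies between the two conditional entropies, so only the legitimate state satisfies the condition. The condition is an asymptotic bound, and a practical code generally needs some margin above H(X|Y).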
At the authentication arrangement 304, the media-to-be-authenticated y is authenticated using authentication data S(X) and D(X, Ks). The media-to-be-authenticated y is projected to Y via random projection 360 in the same way as the projection of media x to X during authentication data generation as described above. A Slepian-Wolf decoder 335 reconstructs X′ from the Slepian-Wolf bitstream S(X) using Y as side information. Decoding is via LDPC belief propagation initialized according to the statistics of the legitimate channel state at the worst permissible quality for the given original media. For general information regarding decoding, and for specific information regarding approaches respectively for using side information in reconstruction and belief propagation approaches that may be implemented in connection with these or other example embodiments, reference may be made to the following: A. Aaron, S. Rane, E. Setton, and B. Girod, “Transform domain Wyner-Ziv codec for video,” in SPIE Visual Communications and Image Processing Conference, San Jose, Calif., 2004; to A. Wyner, J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Transactions on Information Theory, vol. 22, no. 1, pp. 1-10, January 1976; and to A. Liveris, Z. Xiong, and C. Georghiades, “Compression of binary sources with side information at the decoder using LDPC codes,” IEEE Communications Letters, vol. 6, no. 10, pp. 440-442, October 2002, all of which are fully incorporated herein by reference.
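The principle of decoding with side information can be shown with a toy syndrome code. In the sketch below, a (7,4) Hamming code stands in for the rate-adaptive LDPC codes, and an exhaustive nearest-word search stands in for belief propagation. Both substitutions, and the function names, are assumptions made to keep the example small.

```python
from itertools import product

# Parity-check matrix of the (7,4) Hamming code; column j encodes j+1 in binary.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(bits):
    """Compute the 3-bit syndrome H * bits over GF(2)."""
    return tuple(sum(h * b for h, b in zip(row, bits)) % 2 for row in H)


def sw_encode(x):
    """Encoder: transmit only the syndrome of the 7-bit word x."""
    return syndrome(x)


def sw_decode(s, y):
    """Decoder: return the word with syndrome s closest to side information y."""
    candidates = (c for c in product((0, 1), repeat=7) if syndrome(c) == s)
    return min(candidates, key=lambda c: sum(a != b for a, b in zip(c, y)))


x = (1, 0, 1, 1, 0, 0, 1)           # stands in for the projection X
s = sw_encode(x)                    # 3 bits sent instead of 7
y_legit = (1, 0, 1, 0, 0, 0, 1)     # side information with one bit changed
y_tampered = (0, 1, 1, 1, 0, 0, 1)  # side information with two bits changed
```

Here `sw_decode(s, y_legit)` recovers `x`, whereas `sw_decode(s, y_tampered)` returns a different word, so a digest computed from the latter would not match the digest of `x`.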
The media digest of X′ is computed using cryptographic hash function 370 and compared at 380 to the media digest that is output at 340, decrypted via asymmetric decryption 355 from the digital signature D(X, Ks) using a public key. In these contexts, one example approach to generating the media digest involves using a digital signature algorithm that generates a string of bits as a function of the source content (here, the projection) and an encryption key. If the media digests match (e.g., are identical via binary comparison), the media-to-be-authenticated y is determined to be authentic, and if there is no match, the media-to-be-authenticated y is determined to be inauthentic.
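The binary comparison of media digests may be sketched as follows. SHA-256 is assumed here as the cryptographic hash function, since the description does not name one. The signature verification step is omitted, and `signed_digest` is assumed to have already been recovered from D(X, Ks) using the public key. The function names and the serialization of the projection are likewise illustrative.

```python
import hashlib
import hmac


def media_digest(projection):
    """SHA-256 digest of a sequence of integer projection coefficients."""
    data = ",".join(str(int(v)) for v in projection).encode("ascii")
    return hashlib.sha256(data).digest()


def is_authentic(decoded_projection, signed_digest):
    """Compare the digest of the decoded projection X' with the signed digest of X."""
    return hmac.compare_digest(media_digest(decoded_projection), signed_digest)
```

Because the hash is cryptographic, any single coefficient that is decoded incorrectly changes the digest, so the comparison succeeds only when the decoded projection equals the original projection exactly.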
In certain embodiments where the media-to-be-authenticated y is determined to be inauthentic, the decoder 335 requests incremental authentication data to infer the location of illegitimate editing by one of several editing models supplied at 365. The editing models are categorized into groups including legitimate editing and illegitimate editing groups. The legitimate editing may include various compression methods, up/down sampling, geometric transformations and format conversion. The illegitimate editing may include tampering, replacement of content, or one of many other malicious modifications to the media. In some implementations, rate-adaptive distributed source codes are implemented so that more information can be sent to the receiver (decoder 335) incrementally to offer additional functions such as tampering localization, some of which are described further below. The location of illegitimate editing is determined and used to facilitate the future communication of authentic media.
As shown in the examples of
As discussed briefly in connection with different applications above, tampering localization is selectively carried out in connection with various embodiments. With media that is determined to be illegitimate, a portion of encoded data is used to identify tampered pixels with confidence, while correctly classifying untampered blocks. In some applications, a Slepian-Wolf bitstream of less than about 10% of the compressed media size is used for tampering localization, and in other applications, the contiguity of the tampered regions in a decoding model is used to facilitate a low-bitstream size for localization. The following describes more particular approaches to localization with media including image data, using the system 300 in
An authentication problem such as that discussed above with
Where rate-adaptive LDPC codes are used for Slepian-Wolf coding as described with
The decoder 335 applies a sum-product algorithm using the factor graph in
As would be apparent to the skilled artisan, various types of electronic circuits can be used to implement the modules or functional blocks discussed above. Depending on the application and available implementation resources, the blocks discussed above in connection with
The references cited in the above-referenced provisional patent application, to which priority is claimed and which is fully incorporated herein by reference, describe various approaches that may be implemented in connection with one or more example embodiments of the present invention, and are also fully incorporated herein by reference.
The various embodiments described above are provided by way of illustration only and should not be construed to limit the invention. Based on the above discussion and illustrations, those skilled in the art will readily recognize that various modifications and changes may be made to the present invention without strictly following the exemplary embodiments and applications illustrated and described herein. For instance, such changes may include modifying the order and content of the various authentication steps, using different legitimate or illegitimate variations of media-to-be-authenticated in decoding steps, or using different distributed encoding/decoding approaches. Such modifications and changes do not depart from the true spirit and scope of the present invention, which is set forth in the following claims.
This patent document claims the benefit, under 35 U.S.C. §119(e), of U.S. Provisional Patent Application Ser. No. 60/957,945, entitled Authenticated Media Communication System and Approach and filed on Aug. 24, 2007; this patent application is fully incorporated herein by reference.
Number | Date | Country
---|---|---
60957945 | Aug 2007 | US