The invention pertains to the identification of a photographic camera model. More particularly, the invention relates to a system for identifying a photographic camera model on the basis of a digital photograph that has undergone the whole set of processing steps of the acquisition chain, or that has been compressed according to the JPEG standard. The invention also relates to a method for implementing such a system. These systems find significant applications in determining the provenance of a photograph.
Digital forensics, or the search for proofs in digital media, has undergone significant development over the last decade. In this field, the proposed procedures fall into two categories, depending on whether one wishes to identify the photographic camera model or the camera itself (an instance of a given model).
Generally, identification procedures are passive or active. In the case of active procedures, the digital data representing the content of the image are modified so as to insert an identifier (so-called watermarking procedure). When the inspected image does not contain any watermark, the acquisition camera must be identified on the basis of the data of the image.
Two key problems are identified in relation to forensics: identification of the origin of the image and detection of false images (see [1]-[3] and the references incorporated into these documents). Identification of the origin of the image is aimed at verifying whether a given digital image was acquired by a specific photographic camera (i.e. an instance) and/or at determining its model. The detection of false images is aimed at detecting any act of manipulation such as splicing, removal or addition in an image. Two approaches, active and passive, exist for solving these problems. Digital watermarking is considered to be an active approach. There are nonetheless a few limitations (cf. [3]), since the incorporation mechanism must be available and the credibility of the information incorporated in the image remains arguable. The passive approach has been increasingly studied over the last decade. Neither a watermark nor prior information about the image, including the availability of the original image, is required in its mode of operation. Passive forensics procedures rely on the fingerprints of the photographic camera, left in the image, to identify its origin and verify its authenticity. These fingerprints are introduced by the image acquisition processing chain; see references [4]-[6] for an insight into the various steps and the structure of the various processing stages within a digital photographic camera.
The passive forensics procedures proposed for the problem of identifying the origin of the image can be divided into the following two fundamental categories. The procedures of the first category are based on the hypothesis that differences exist between camera models, whether in respect of the image processing techniques or in respect of the technological components. Indeed, the aberration of the objective (cf. [7]), the "Color Filter Array" (CFA) and its interpolation (demosaicing) algorithm (cf. [8]-[11]), and JPEG compression (see references [12], [13]) are considered to be influential factors for identifying the model of the photographic camera, while white balancing algorithms [14] are used for identifying the source camera. On the basis of these factors, a set of features is constructed and used in a machine learning algorithm. The main challenge is that the image processing techniques remain identical or similar and that the components, produced by a few manufacturers, are shared between models of photographic cameras. Moreover, as in all applications of machine learning, it is difficult to select a precise set of features. Furthermore, the analytical establishment of the detection performance remains an open problem (cf. [15]).
The procedures of the second category are aimed at identifying the unique characteristics, or fingerprints, of the acquisition camera. The "Sensor Pattern Noise" (SPN), or characteristic noise of a sensor, arises from the imperfections of the process for manufacturing the photographic sensor and from the non-uniformity of the photo-electronic conversion due to the lack of homogeneity of the silicon wafers (also called "Photo-Response Non-Uniformity" or PRNU). This is a unique fingerprint, see references [17]-[21]. Moreover, the procedures based on the presence of non-uniformity noise (PRNU) are also used in reference [22] for identifying the model of the photographic camera. These procedures are based on the hypothesis that the fingerprint obtained from an image in the TIFF or JPEG format contains traces of the intrinsic watermark, namely the SPN containing information about the model of the photographic camera.
It should be noted that the two main components of the Sensor Pattern Noise (SPN) are the Fixed Pattern Noise (FPN) and the Photo-Response Non-Uniformity (PRNU). The FPN, which is used in reference [23] for identifying the camera, is generally compensated for in a photographic camera by subtracting a dark frame from the output image. Consequently, the Fixed Pattern Noise (FPN) is not a robust fingerprint and is not used in subsequent works. The PRNU is utilized directly in certain works, see references [17], [18], [21]. The ability to reliably extract this noise from the image is the main challenge in this category. Another challenge is the falsification of the origin of the image due to counter-forensic activities, see reference [24]. However, the existing procedures are designed with very limited use of the theory of hypothesis testing and of statistical image models. Consequently, their performance is not yet analytically established.
Most image forensics procedures rely on sensor noise, see references [22], [25], or on characteristics tied to the processing operations within the photographic camera, see reference [10]. Most digital photo cameras export images in the JPEG format. The ability to extract such image characteristics is called into question because JPEG compression may seriously damage them. In the patent application referenced as [25] we proposed to utilize the parameters (a, b) to passively identify a photographic camera model. This procedure is based on the heteroscedasticity of the noise present in a RAW image. A RAW image is an image that has not undergone any of the post-acquisition processing of the processing chain. The patent application referenced as [25] shows perfect detection performance for the identification of a photographic camera model on the basis of RAW images, of uncompressed images or of lossless compressed images. However, it is useful to extend this procedure to TIFF images or to images compressed in the JPEG format.
The issue tackled in the present invention is that of the passive identification of an acquisition photographic camera model on the basis of a given compressed image. Passive identification is understood to mean taking a decision in the case where the image is not assumed to contain any information identifying the source. The problems that it is envisaged to solve are 1) to establish that a photograph has not been taken by a given camera when the photograph is compromising, or 2) conversely, to guarantee that an inspected photograph has indeed been taken by one camera rather than by another. The examples that may be given are numerous, including: is a given photographic camera responsible for the photograph of a confidential document (such as those available on WikiLeaks); could a given photograph of a pedopornographic nature have been acquired with the photographic camera of a suspect; has the copy of a contract been scanned with the client's camera; the photograph of a document makes it possible to copy and trace the document; etc.
The aim of the invention is to provide a system for identifying a photographic camera model on the basis of images acquired with a known photographic camera model and of a photograph in the form of a compressed image, said photograph having undergone post-acquisition processing and obeying the linear relation between the expectation and the variance of the pixels $\sigma^2_{y_{m,n}} = a\,\mu_{y_{m,n}} + b$, where a and b are two parameters characterizing said photographic camera model, and $\mu_{y_{m,n}}$ and $\sigma^2_{y_{m,n}}$ are respectively the mathematical expectation and the variance of the pixel $y_{m,n}$ at the position (m,n) after the post-acquisition processing. The system is characterized in that it comprises an image processing device able to provide an analytical relation between the parameters (α, β) of the statistical distribution model of the DCT (Discrete Cosine Transformation) coefficients and the parameters (a, b) of the photographic camera, in the form $\beta_{p,q}^{-1} = c_{p,q}\,\alpha_{p,q} + d_{p,q}$, so that the parameters (c, d) constitute fingerprints characterizing a photographic camera model and depend both on the frequency (p, q) and on the parameters a and b. The system furthermore comprises a device for executing statistical hypothesis tests on the DCT coefficients and a statistical analysis device, so as to determine whether said photograph has been taken by said photographic camera model or by another photographic camera model. The images for which the acquisition model is known constitute the reference for the identification of the photographic camera model.
Advantageously, the analysis device provides an indication about the identification of said photographic camera model by certifying the accuracy of the identification with a previously defined precision.
The invention further relates to a method implemented in said system, which comprises the following steps:
Advantageously, the statistical tests are executed as a function of a prescribed constraint on the probability of error.
According to the invention, the photograph is a compressed image, according to the JPEG compression standard.
Advantageously, the photograph is a reference, or inter-frame, image belonging to a video stream, and compressed according to the MPEG compression standard.
The invention further relates to the use of the method hereinabove for the detection in an unsupervised manner of the falsification of a zone of a photograph.
Moreover, the invention relates to the use of the method hereinabove for the detection, in a supervised manner, of the falsification of a zone of a photograph. This detection is carried out by testing whether an a priori known zone originates from the same photographic camera as the remainder of the inspected image.
The invention further relates to the use of the method hereinabove in the search for proofs on the basis of a compromising image.
The invention relates to the application of the method hereinabove in specialized software, in the search for proofs on the basis of digital media.
Other characteristics, details and advantages of the invention will emerge on reading the description which follows, with reference to the appended figures, which illustrate:
For greater clarity, identical or similar elements are tagged by identical reference signs in all the figures.
It is on the basis of this photograph 3 that the system 1 will determine the photographic camera model that captured it. This system is composed of a photo analyzer which examines this photograph 3. The photograph 3 takes the form of a compressed file suitable for the processing which follows. Files of JPEG type, or any image arising from the decompression of an image previously compressed according to the JPEG standard, are suitable for this processing.
The system 1 can be implemented on a computer of PC type. This system 1 is furnished with an input facility 10 so as to be able to enter the data of the photograph 3. These data are processed by a processing facility 12 which implements a processing which will be explained hereinbelow. A device for executing statistical hypothesis tests on the DCT coefficients and a statistical analysis device 14 will provide an indication about the identification of the photographic camera model responsible for said photograph.
According to the method in accordance with the invention, in the first step, the digital photograph 2 is viewed as one or more matrices whose elements represent the value of each of the pixels. In the case of a gray level image, the photograph can be represented by a single matrix:
$Z = \{z_i\}$ with $1 \leq i \leq L$
For color images, three distinct colors are usually used: red, green and blue. In this case, an image may be regarded as 3 distinct matrices, one matrix per color channel:
$Z = \{z_i^k\}$ with $1 \leq i \leq L$ and $1 \leq k \leq 3$
The second step of the method consists in separating the various color channels, when the analyzed image is in color. The series of operations is carried out in an identical manner on each of the matrices representing the color channels; we therefore consider that the image is represented by a single matrix (the index k is omitted). A minimal sketch of these first two steps is given below.
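For concreteness, the following sketch implements the matrix representation and the separation of the color channels. It assumes an 8-bit image read with the Pillow library; the function name is purely illustrative and not part of the invention.

```python
import numpy as np
from PIL import Image

def load_channels(path):
    """Read a photograph and return its color channels as separate matrices.

    A gray-level image yields a single matrix Z = {z_i}; an RGB image yields
    one matrix per channel, each processed identically afterwards.
    """
    img = np.asarray(Image.open(path))
    if img.ndim == 2:                    # gray-level image: a single matrix
        return [img.astype(np.float64)]
    # color image: three matrices, one per channel (red, green, blue)
    return [img[..., k].astype(np.float64) for k in range(3)]
```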
Noise present in digital photographs exhibits the property of being heteroscedastic: the stochastic properties of the noise are not constant over the whole set of pixels of the image.
Because of the large number of photons incident on the sensors, it is possible to approximate with high precision the process of counting by a Gaussian random variable.
Today, the JPEG image format has established itself as the standard for digital images. Most digital photographic cameras and software packages encode images in this format. The use of JPEG compression is a matter of balance between storage size and image quality: an image compressed with a high compression factor requires little storage space, to the detriment of a lesser visual quality.
The Discrete Cosine Transformation (DCT) operation is one of the key steps of JPEG compression. Given an arbitrary image Z, the DCT operation is applied to each block of 8×8 pixels of Z as follows
where $z_{m,n}$ represents a pixel inside an 8×8 block of Z, with 0 ≤ m ≤ 7 and 0 ≤ n ≤ 7, $I_{p,q}$ designates the two-dimensional DCT coefficient, and
The term $T_q$ is derived by analogy with the term $T_p$. The DCT coefficient at the position (0, 0), called the direct current (DC) coefficient, represents the mean value of the pixels in the 8×8 block. The remaining 63 coefficients are called the alternating current (AC) coefficients. Two main advantages of the DCT operation are its near-optimal decorrelation and its energy compaction: after the DCT operation, the energy is situated mainly in the low frequencies, while the high frequencies mainly contain noise components.
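As an illustration of this step, the following sketch applies a 2-D DCT to each 8×8 block of a channel using SciPy. The normalization terms $T_p$, $T_q$ of the patent's own equation are not reproduced above, so the orthonormal convention of scipy.fft.dctn is assumed here.

```python
import numpy as np
from scipy.fft import dctn

def blockwise_dct(channel, block=8):
    """Apply the 2-D DCT to each 8x8 block of a single-channel image.

    Returns an array of shape (n_blocks_y, n_blocks_x, 8, 8) whose entry
    [i, j, p, q] is the DCT coefficient I_{p,q} of block (i, j).
    """
    h, w = channel.shape
    h, w = h - h % block, w - w % block          # drop incomplete border blocks
    blocks = channel[:h, :w].reshape(h // block, block, w // block, block)
    blocks = blocks.transpose(0, 2, 1, 3)        # shape (by, bx, 8, 8)
    return dctn(blocks, axes=(-2, -1), norm="ortho")
```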
The modeling of the distributions of the DCT coefficients has been widely studied in the literature. The Laplacian model (cf. [27]), the generalized Gaussian model (cf. [28]) and the generalized Gamma model (cf. [29]) have been proposed for the AC coefficients, but the Laplacian distribution remains a dominant choice in image processing because of its simplicity and relative precision. In our earlier works, we established a rigorous mathematical framework to model the AC coefficients; see reference [6] for more details. As the DC coefficient represents the mean value of the pixels in each 8×8 block, its distribution cannot be derived directly because of the heterogeneity of a natural image. Let I be an AC coefficient; for the sake of clarity, the index of said AC coefficient is omitted. Because of the variability of the variance from block to block, the probability density function (pdf) of I is given by a doubly stochastic model (cf. [27]) in the following manner
$f_I(x) = \int_0^{\infty} f_{I\mid\sigma^2}(x\mid\sigma^2)\, f_{\sigma^2}(\sigma^2)\,\mathrm{d}\sigma^2 \qquad (3)$
where $f_X(x)$ designates the probability density function (pdf) of a random variable X and $\sigma^2$ designates the variance of the block considered. It is assumed that the pixels inside a block are independently and identically distributed (cf. [27]). Given a constant block variance $\sigma^2$, the distribution of the AC coefficient I can be approximated by a zero-mean Gaussian distribution, by virtue of Lindeberg's central limit theorem (CLT) for correlated random variables (cf. [27], [30])
Furthermore, the distribution of the variance σ2 over the whole set of blocks constituting an image may be modeled in an approximate manner by the gamma distribution law G (α, β) cf. [6] defined by the following probability density function (pdf):
where α is a positive shape parameter, β is a positive scale parameter, and Γ(⋅) represents the gamma function. On the basis of (3), (4) and (5), the statistical distribution model of the AC coefficient I is given as:
On the basis of [31] and by using the integral representation of the modified Bessel function, denoted Kν(⋅), the integral (6) can be written
where $K_\nu(x)$ represents the modified Bessel function [31, chapter 5.5]. The model of the AC coefficients proposed in (7) includes the Laplacian and Gaussian distributions as particular cases (cf. [6]). As indicated in [6], this model surpasses the Laplace and Gauss models, but at the cost of more complex expressions. By using the Laplace approximation [32] (see more details in Annex A), an approximation of the function $f_I(x)$ can be given as:
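Since the closed form (7) is not reproduced in the text above, the following sketch rebuilds the density of an AC coefficient from the doubly stochastic construction described hereinabove (a zero-mean Gaussian mixed over a Gamma-distributed block variance), using the standard integral identity that yields the modified Bessel function, and checks it against a Monte Carlo histogram. The constant in front of the Bessel function is therefore a derivation made here, not a quotation of equation (7).

```python
import numpy as np
from scipy.special import kv, gammaln

def f_I(x, alpha, beta):
    """Density of an AC coefficient I obtained by mixing a zero-mean Gaussian
    N(0, sigma^2) over sigma^2 ~ Gamma(alpha, beta) (shape/scale convention).

    Closed form derived from the identity
    int_0^inf t^(nu-1) exp(-A/t - B*t) dt = 2 (A/B)^(nu/2) K_nu(2 sqrt(A*B)).
    """
    x = np.abs(np.asarray(x, dtype=float)) + 1e-12    # avoid the singular point x = 0
    nu = alpha - 0.5
    log_c = np.log(2.0) - 0.5 * np.log(2 * np.pi) - gammaln(alpha) - alpha * np.log(beta)
    log_pdf = (log_c + (alpha / 2 - 0.25) * np.log(x**2 * beta / 2)
               + np.log(kv(nu, x * np.sqrt(2.0 / beta))))
    return np.exp(log_pdf)

# Monte Carlo check of the doubly stochastic construction
rng = np.random.default_rng(0)
alpha, beta = 2.0, 5.0
sigma2 = rng.gamma(shape=alpha, scale=beta, size=200_000)
samples = rng.normal(0.0, np.sqrt(sigma2))
hist, edges = np.histogram(samples, bins=100, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - f_I(centers, alpha, beta))))   # small gap: densities agree
```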
It should be noted that other polynomial expansions of the modified Bessel function $K_\nu(x)$ are also given in [31], so a polynomial approximation of $f_I(x)$ could also be derived. However, these approximations are not considered in the present patent application. The precision of the approximation given in (8) depends on the choice of the functions g(t) and h(t) (see relation (56) further on). The main advantage of relation (8) is to provide an approximation in the form of an exponential function. This approximate model is used to simplify the calculation of the likelihood functions as well as the calculation of the threshold of the likelihood ratio used in the proposed tests. The estimation of the parameters is performed on the basis of the exact model defined in relation (7).
It may be noted that this approximate model is a particular case of the generalized gamma distribution model when γ=1 (the variable γ is defined in [29, Eq. (6)]). An example is given in
The empirical data are extracted from a real image of the Dresden image database (see [33]) previously compressed using the JPEG compression standard. The parameters are estimated from the empirical data using the exact model of the distribution of the DCT coefficients. The exact and approximate models of the DCT coefficient are represented with the parameters estimated by maximum likelihood. The main drawback of the approximate model is that, when x tends to 0, the function (8) tends to 0 when α>1 and tends to infinity when α<1. This leads to an inaccuracy in the neighborhood of 0, as represented in
$E[I^2] = E[\sigma^2] = \alpha\beta \qquad (9)$
$E[I^4] = 3\,E[\sigma^4] = 3\,\alpha\beta^2(\alpha+1) \qquad (10)$
where $E[X]$ designates the mathematical expectation of a random variable X. Consequently, the method of moments (MM) allows the estimation of the parameters (α, β) as follows:
where $\hat{m}_2$ and $\hat{m}_4$ are respectively the second and the fourth empirical moments of I. The maximum likelihood (ML) estimation of the parameters (α, β) is defined as the solution of the maximization problem
where N is the number of coefficients at the same frequency. It should be noted that the likelihood function is differentiable but the calculation of the derivative appears extremely difficult. As there is no exact analytical form for the maximum likelihood estimation (13), it is proposed to solve the maximization problem numerically with the aid of the Nelder-Mead optimization procedure [34]. The MM estimations $(\hat{\alpha}_{MM}, \hat{\beta}_{MM})$ are taken as the initial solution of the optimization algorithm.
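A minimal sketch of this estimation step is given below. It assumes the density f_I of the previous sketch; the method-of-moments formulas are derived here from the moment relations (9) and (10), and the refinement uses SciPy's Nelder-Mead optimizer, as described above.

```python
import numpy as np
from scipy.optimize import minimize

def estimate_alpha_beta(coeffs):
    """Estimate (alpha, beta) of the AC-coefficient model for one DCT frequency.

    Method of moments, from E[I^2] = alpha*beta and E[I^4] = 3*alpha*beta^2*(alpha+1):
        alpha_MM = 3*m2^2 / (m4 - 3*m2^2),   beta_MM = m2 / alpha_MM,
    then refined by numerical maximum likelihood with Nelder-Mead, using the
    density f_I defined in the previous sketch.
    """
    m2 = np.mean(coeffs**2)
    m4 = np.mean(coeffs**4)
    alpha0 = 3 * m2**2 / max(m4 - 3 * m2**2, 1e-9)
    beta0 = m2 / alpha0

    def neg_log_lik(theta):
        a, b = theta
        if a <= 0 or b <= 0:
            return np.inf
        return -np.sum(np.log(f_I(coeffs, a, b) + 1e-300))

    res = minimize(neg_log_lik, x0=[alpha0, beta0], method="Nelder-Mead")
    return res.x   # (alpha_ML, beta_ML)
```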
As discussed hereinabove, one of the main advantages of the DCT operation is the compaction of the energy: the energy tends to be situated mainly in the low frequencies while the high frequencies mainly contain noise components. Consequently, there exists a difference of scale between the DCT coefficients, and the DCT coefficients do not share the same parameters (α, β). As the estimation of the parameters (α, β) is performed separately for each frequency, we designate by $(\alpha_{p,q}, \beta_{p,q})$ the parameters associated with the DCT coefficient $I_{p,q}$.
5.A The Intrinsic Prints of the Photographic Camera
In our earlier works, references [25] and [35], the parameters (a, b) were utilized to passively identify a photographic camera model, where a and b represent parameters characteristic of a photographic camera. This procedure is based on the heteroscedasticity of the noise present in a RAW image: the stochastic properties of the noise are not constant over the whole set of pixels of the image. More precisely, the value of each pixel depends linearly on the number of incident photons. This model, which represents all the noise contaminating the RAW image, gives the variance of the noise as a linear function of the mathematical expectation of the pixel and obeys the following relation:
$y_{m,n} \sim \mathcal{N}\!\left(\mu_{y_{m,n}},\; a\,\mu_{y_{m,n}} + b\right)$
where $y_{m,n}$ is the measured value of the RAW pixel at the position (m,n) and $\mu_{y_{m,n}}$ its mathematical expectation. Even though this procedure shows an almost perfect detection performance, there are two main limitations. Firstly, it concentrates on RAW images, which may not be available in practice. Indeed, the most difficult part when extending this procedure to other image formats, for example TIFF and JPEG, is the impact of the post-acquisition processing (demosaicing, white balancing and gamma correction) as well as of the compression, since demosaicing causes spatial correlation between the pixels and the non-linear operations destroy the linear relation between the expectation and the variance of the pixel. Secondly, the proposed fingerprint, defined by the parameters (a, b), depends on the ISO sensitivity. Although this is not crucial in practice, since there are only a few ISO sensitivity settings and a small number of images suffices to estimate the reference parameters (a, b) for each ISO sensitivity, it is desirable to rely on a fingerprint which is invariant with respect to the content of the image and which is robust to non-linear transformation operations (for example the gamma correction).
To produce a complete color image on output and improve its visual quality, a RAW image requires post-acquisition processing, for example demosaicing, white balancing and gamma correction. To extend the method referenced in [21] to compressed images, let us assume that the effect of the demosaicing and white balancing algorithms on the heteroscedastic relation between the expectation and the variance of the pixel is negligible. Then the compressed image obeys the relation between the expectation and the variance of the pixels $\sigma^2_{y_{m,n}} = a\,\mu_{y_{m,n}} + b$, where a and b are two parameters characterizing a photographic camera model, and $\mu_{y_{m,n}}$ and $\sigma^2_{y_{m,n}}$ are respectively the mathematical expectation and the variance of the pixel $y_{m,n}$ at the position (m,n).
We shall provide an analytical relation between the parameters (α, β) of the model of the DCT (Discrete Cosine Transformation) coefficients and the parameters (a, b) of the photographic camera. It is assumed that the pixels in each 8×8 block are independent and identically distributed. Moreover, we assume that the effect of the demosaicing and white balancing algorithms on the heteroscedastic relation between the expectation and the variance of the pixel is negligible. The gamma correction is defined by the following transformation, applied to each pixel independently:
where |⋅| designates the absolute value and γ is the correction factor (typically, γ=2.2). Here, $y_{m,n}$ denotes the white-balanced pixel and $\eta_{m,n}$ is a zero-mean signal representing the pixel noise at the position (m,n), with variance $\sigma^2_{\eta_{m,n}} = \mathrm{Var}[\eta_{m,n}] = a\,\mu_{y_{m,n}} + b$. The first-order Taylor series expansion of $(1+x)^{1/\gamma}$ at x=0 leads to:
Taking the expectation and the variance on both sides of equation (16), we obtain:
where $\mu_{z_{m,n}}$ and $\sigma^2_{z_{m,n}}$ represent, respectively, the expectation and the variance of the pixel $z_{m,n}$. Relation (17) is justified under the hypothesis $\eta_{m,n} \ll \mu_{y_{m,n}}$. Moreover, it may be noted that the variance of the gamma-corrected pixel $\sigma^2_{z_{m,n}}$ is directly proportional to the variance of the noise $\sigma^2_{\eta_{m,n}}$ and inversely proportional to the expectation $\mu_{z_{m,n}}$. By taking the variance on both sides of equation (1), it follows that $\mathrm{Var}[I_{p,q}] = \mathrm{Var}[z_{m,n}] = \sigma^2_{z_{m,n}}$; this is justified under the hypothesis that the pixels are independent and identically distributed within the 8×8 block. It follows from equation (9) that
Furthermore, from (9) and (10), we obtain
$\mathrm{Var}[I_{p,q}^2] = E[I_{p,q}^4] - \left(E[I_{p,q}^2]\right)^2 = 2\,\alpha_{p,q}^2\beta_{p,q}^2 + 3\,\alpha_{p,q}\beta_{p,q}^2. \qquad (19)$
Moreover, a calculation on the basis of (1) yields
where it follows from (16) that
Consequently, from (19)-(22) we obtain:
$(S_{p,q}+1)\,\alpha_{p,q}\beta_{p,q} + 3\,\beta_{p,q} = 4\,S_{p,q}\,\mu_{z} \qquad (23)$
For the sake of simplification, by solving equations (18) and (23) and again using the Taylor series, we obtain a nearly linear relation
$\beta_{p,q}^{-1} = c_{p,q}\,\alpha_{p,q} + d_{p,q} \qquad (24)$
where the parameters $(c_{p,q}, d_{p,q})$ depend both on the frequency (p, q) and on the parameters (a, b) of the photographic camera. Consequently, the parameters $(c_{p,q}, d_{p,q})$ can be used as a fingerprint for identifying the model of the photographic camera.
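Since the display equations (16) and (17) are not reproduced in the text above, the following recap sketches, under the stated hypotheses, the first-order reasoning that links the heteroscedastic RAW model to the gamma-corrected pixel; the exponents shown are those implied by a first-order Taylor expansion and may differ in detail from the patent's own expressions.

```latex
z_{m,n} = \lvert y_{m,n}\rvert^{1/\gamma}
        = \bigl(\mu_{y_{m,n}} + \eta_{m,n}\bigr)^{1/\gamma}
  \approx \mu_{y_{m,n}}^{1/\gamma}\Bigl(1 + \tfrac{\eta_{m,n}}{\gamma\,\mu_{y_{m,n}}}\Bigr),
\qquad
\mu_{z_{m,n}} \approx \mu_{y_{m,n}}^{1/\gamma},
\qquad
\sigma^2_{z_{m,n}} \approx \frac{\sigma^2_{\eta_{m,n}}}{\gamma^{2}\,\mu_{y_{m,n}}^{\,2-2/\gamma}}
                  = \frac{a\,\mu_{y_{m,n}} + b}{\gamma^{2}\,\mu_{z_{m,n}}^{\,2(\gamma-1)}} .
```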
5.B—Estimation of the Print of the Photographic Camera
It is important to recall that JPEG compression involves three basic steps, namely the DCT operation, the uniform quantization of the DCT coefficients with a quantization matrix, and an entropy coding of the quantized values. Consequently, it might seem that the fingerprint of the photographic camera $(c_{p,q}, d_{p,q})$, which is given by the mathematical analysis hereinabove, could be estimated directly from the original quantized DCT coefficients extracted from the JPEG file. However, because of the effect of the quantization, the information which is not visually significant is discarded: the high frequencies contain mostly zeros, so that there are insufficient statistics to estimate the parameters $(c_{p,q}, d_{p,q})$. Furthermore, the parameters $(c_{p,q}, d_{p,q})$ estimated at low frequencies may be contaminated because their DCT coefficients are strongly influenced by the content of the image.
Let Z be a color image with three components Z={Zc} where c∈{R, G, B} designates the red, green and blue channel index. In order to attenuate the impact of the content of the image, we propose to delete the content of the image from the image Z given in the spatial domain by using a denoising filter D on each color channel to obtain a residual image Wc
$W_c = Z_c - D(Z_c) \qquad (25)$
The residual image with 3 components {Wc} is converted into a monochrome image as
The residual image is then transformed into the DCT domain:
$I_W = \mathrm{DCT}(W)$
where $I_W$ is the image of the DCT coefficients of the monochrome residual image W.
For the sake of clarity, the image $I_W$ is arranged in 64 vectors of DCT coefficients following the "zig-zag" order used in the JPEG compression standard. Let $I^W_k = (I^W_{k,1}, \ldots, I^W_{k,N})^T$, k∈{1, …, 64}, be the vector of length N representing the k-th DCT coefficient, where $I^W_{k,i}$, 1 ≤ i ≤ N, designates the value of the k-th DCT coefficient of block i and $U^T$ designates the transpose of the matrix U. In an analogous manner, the parameters characterizing the distribution of $I^W_k$ are designated $(\alpha_k, \beta_k)$ and the fingerprints of the photographic camera are denoted $(c_k, d_k)$.
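A minimal sketch of this residual pipeline is given below. The denoising filter D and the monochrome combination are placeholders (a Gaussian filter and a plain average, respectively), since the patent's exact choices are not reproduced above; only the zig-zag ordering follows the JPEG convention.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.fft import dctn

# zig-zag scan order of an 8x8 block, as used by the JPEG standard
ZIGZAG = sorted(((p, q) for p in range(8) for q in range(8)),
                key=lambda pq: (pq[0] + pq[1],
                                pq[0] if (pq[0] + pq[1]) % 2 else pq[1]))

def dct_vectors(channels, sigma=1.0):
    """Build the 64 vectors I^W_k of DCT coefficients of the residual image.

    channels : list of three float arrays (R, G, B).
    Returns an array of shape (64, N), one row per zig-zag frequency k.
    """
    # residual W_c = Z_c - D(Z_c), with a Gaussian filter standing in for D
    residuals = [c - gaussian_filter(c, sigma) for c in channels]
    w = np.mean(residuals, axis=0)               # placeholder monochrome combination
    h, wd = (s - s % 8 for s in w.shape)
    blocks = w[:h, :wd].reshape(h // 8, 8, wd // 8, 8).transpose(0, 2, 1, 3)
    coeffs = dctn(blocks, axes=(-2, -1), norm="ortho").reshape(-1, 8, 8)
    return np.stack([coeffs[:, p, q] for p, q in ZIGZAG])   # shape (64, N)
```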
The mathematical analysis performed hereinabove is based on the strong hypothesis that the pixels are identically distributed inside a small 8×8 block. This hypothesis may not be realistic because of the presence of edges or of details in a natural image. Consequently, to satisfy the mathematical framework in an approximate manner, we work solely on the homogeneous 8×8 blocks and we perform the estimation of the parameters on the selected data. Before the selection of a block, the saturated blocks must be excluded because they falsify the mathematical framework hereinabove. A pixel is designated as saturated if its gray level value lies in the zone corresponding to the lower or upper 10% of the dynamic range (typically the dynamic range is [0, 255] for an 8-bit image; the saturated zones are therefore [0, 25] and [230, 255]). A block is designated as saturated if at least one of its pixels is saturated. The selection of the blocks is carried out by a simple non-adaptive method based on the standard deviation of the block. The idea is that, in a homogeneous block with no edge or detail, the standard deviation of the block ought to be low. The standard deviation of the block is estimated by a robust estimator: the median absolute deviation (MAD). Consequently, a block i is selected if the following two conditions are fulfilled
where k ∈ {2, …, 64} and $I_{k,i}$ designates the k-th DCT coefficient in the block i of the gray-level image
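The two selection conditions themselves are not reproduced above; the sketch below implements the criteria described in the text (no saturated pixel, low MAD-based standard deviation) with an assumed threshold value.

```python
import numpy as np

def select_blocks(gray, tau=2.0):
    """Return the indices of homogeneous, non-saturated 8x8 blocks.

    A block is rejected if any pixel falls in the lower or upper 10% of the
    dynamic range (zones [0, 25] and [230, 255] for 8-bit data), or if its
    MAD-based standard-deviation estimate exceeds the threshold tau
    (an assumed value, not taken from the patent).
    """
    h, w = (s - s % 8 for s in gray.shape)
    blocks = gray[:h, :w].reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3)
    blocks = blocks.reshape(-1, 64)
    not_saturated = (blocks.min(axis=1) > 25) & (blocks.max(axis=1) < 230)
    med = np.median(blocks, axis=1, keepdims=True)
    mad_std = 1.4826 * np.median(np.abs(blocks - med), axis=1)   # robust std estimate
    return np.flatnonzero(not_saturated & (mad_std < tau))
```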
In practice, the fingerprint of the photographic camera $(c_k, d_k)$ must be estimated on the basis of a single image. Consequently, several pairs $(\alpha_k, \beta_k)$ obtained from a single image are necessary. Let
The OLS estimations $(\hat{c}_k, \hat{d}_k)$ are unbiased and asymptotically equivalent to the maximum likelihood (ML) estimations, since the ML estimations $(\hat{\alpha}_{k,l}, \hat{\beta}_{k,l})$ follow an asymptotic Gaussian distribution (cf. [37]). Consequently, the estimations $(\hat{c}_k, \hat{d}_k)$ also follow a Gaussian distribution. In order to ensure that there are sufficient statistics for the estimation, we use L=200 and N=4096 (i.e. about 10% of
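A minimal sketch of this estimation is given below. The way the coefficients are split into L groups is an assumption (the defining equation is not reproduced above), and the (α, β) estimation reuses estimate_alpha_beta from the earlier sketch.

```python
import numpy as np

def estimate_fingerprint(coeff_vec, L=200):
    """Estimate the camera fingerprint (c_k, d_k) for one DCT frequency k.

    coeff_vec : 1-D array of DCT coefficients I^W_k over the selected blocks.
    The coefficients are split into L disjoint groups (an assumed scheme);
    (alpha, beta) is estimated on each group with estimate_alpha_beta(), then
    (c_k, d_k) is obtained by ordinary least squares on 1/beta = c_k*alpha + d_k.
    """
    groups = np.array_split(np.asarray(coeff_vec, dtype=float), L)
    alphas, inv_betas = [], []
    for g in groups:
        a, b = estimate_alpha_beta(g)
        alphas.append(a)
        inv_betas.append(1.0 / b)
    A = np.column_stack([alphas, np.ones(len(alphas))])
    (c_k, d_k), *_ = np.linalg.lstsq(A, np.asarray(inv_betas), rcond=None)
    return c_k, d_k
```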
Even if the content of the image is removed, the low-frequency DCT coefficients of $I^W_k$ may still be slightly affected by it. Consequently, it is more relevant to use the high-frequency DCT coefficients of $I^W_k$. An example is given in
The present invention is aimed at identifying the models of photographic cameras on the basis of DCT statistics. This part analyzes the case of two photographic camera models, 0 and 1. Each photographic camera model j, j∈{0,1}, is characterized by the parameters $(c_{k,j}, d_{k,j})$, where k designates the frequency, with k∈{2, …, 64}. In a binary hypothesis test, the inspected image Z is acquired either by the photographic camera model 0 or by the photographic camera model 1. The test objective is to decide between two hypotheses defined by:
where the distribution parameterized by $(\alpha_k, \beta_{k,j})$ represents the statistical distribution of the DCT coefficients $I^W_{k,i}$ under the hypothesis $\mathcal{H}_j$. As explained previously, the emphasis is placed on a prescribed guarantee of the probability of false alarm. Consequently, we define
to be the class of tests whose probability of false alarm is less than the prescribed bound α0. Here, $\mathbb{P}_j(\cdot)$ represents the probability of an event under the hypothesis $\mathcal{H}_j$, with j∈{0,1}, and the supremum over θ should be understood as taken over all admissible values of the parameters of the model. Among all the tests of the class $K_{\alpha_0}$, the aim is to find a test δ which maximizes the power function, defined by the probability of correct detection:
$\beta_\delta = \mathbb{P}_1\left[\delta(Z) = \mathcal{H}_1\right]$
The problem defined in (30) evinces three fundamental difficulties in identifying the model of the photographic camera. Firstly, even if all the model parameters $(\alpha_k, c_{k,j}, d_{k,j})$ are known, the most powerful test, namely the LRT, has never been studied for this problem. The second difficulty relates to the nuisance parameters $\alpha_k$, which are unknown in practice. A possible approach for coping with unknown nuisance parameters consists in eliminating them by using the invariance principle (cf. [38]). This approach was discussed in references [39], [40] and has achieved good performance in certain applications (cf. [41]). However, this approach cannot be applied here since the tested hypotheses do not possess the "symmetry" properties required for the application of the invariance principle in statistics. Another approach is to design a GLRT test by replacing the unknown parameters with maximum likelihood (ML) estimations (cf. [42]). Finally, the two hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$ are composite since the parameters of the photographic camera $(c_{k,j}, d_{k,j})$ are unknown.
For the sake of clarity, we assume that the parameters of the photographic camera $(c_{k,0}, d_{k,0})$ are known and we solve only the problem in which the alternative hypothesis is composite, in other words the problem in which the parameters of the photographic camera $(c_{k,1}, d_{k,1})$ are not known. It should be noted that a test which maximizes the detection power whatever the value of $(c_{k,1}, d_{k,1})$ could exist. Our main objective is to study the LRT test and to design the GLRT test so as to address the second and third difficulties.
Furthermore, it should be stressed that the GLRT test designed with unknown nuisance parameters $\alpha_k$, when the parameters of the photographic camera are known, may be interpreted as a closed hypothesis test in which a given image is acquired either by the photographic camera model 0 or by the photographic camera model 1. By contrast, the GLRT test designed with unknown photographic camera parameters $(c_{k,1}, d_{k,1})$ becomes an open hypothesis test in which a given image either is acquired by the photographic camera model 0 or is not; indeed, the given image may have been acquired by an unknown photographic camera model. Consequently, the two proposed tests may be applied depending on the requirements of the context.
6.A—Test of the Likelihood Ratio for Two Simple Hypotheses
When all the parameters of the model are known, by virtue of the Neyman-Pearson lemma [38, theorem 3.2.1], the most powerful test δ which solves the problem (30) is the LRT test given by the following decision rule:
where the decision threshold τ is the solution of the equation
$\mathbb{P}_0\left[\Lambda(Z) \geq \tau\right] = \alpha_0 \qquad (32)$
which ensures that the LRT test is in the class $K_{\alpha_0}$. By using the approximation of the DCT coefficient model defined in (8), the likelihood ratio (LR) of an observation $I^W_{k,i}$ is defined by
The statistical distribution of the likelihood ratio (LR), denoted Λ(Z), under each hypothesis is given by
where the notation designates convergence in distribution and $(m_j, v_j)$ designate respectively the mean and the variance of the LR Λ(Z), defined further on in (66) and (67). As a natural RAW image is heterogeneous, it is proposed to normalize the LR Λ(Z) so as to fix the decision threshold independently of the content of the image. The normalized LR is defined by
Consequently, the corresponding test δ* is rewritten as follows:
Normalization of the LR Λ(Z) allows the test δ* to be applicable to any natural RAW image, since the normalized LR Λ*(Z) follows the standard Gaussian distribution under the hypothesis $\mathcal{H}_0$. The decision threshold τ* and the power function $\beta_{\delta^*}$ are obtained through the following theorem:
Theorem 1:
Assuming that all the parameters of the model $(\alpha_k, c_{k,j}, d_{k,j})$ are exactly known, the decision threshold τ* and the power function $\beta_{\delta^*}$ of the test δ* are given by
where Φ(⋅) and Φ$^{-1}$(⋅) designate respectively the cumulative distribution function of the standard Gaussian distribution and its inverse function.
This decision threshold guarantees that the false-alarm probability, that is to say the probability of deciding that the photograph does not originate from the photographic camera model 0 although this is actually the case, will be equal to α0.
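The closed-form expressions of Theorem 1 are not reproduced above; the sketch below shows the usual Neyman-Pearson computation for a statistic that is Gaussian under each hypothesis and normalized with the H0 moments, which is the form suggested by the surrounding text.

```python
from scipy.stats import norm

def lrt_threshold_and_power(alpha0, m0, v0, m1, v1):
    """Decision threshold and power of a normalized LRT, assuming the statistic
    is Gaussian N(m0, v0) under H0 and N(m1, v1) under H1 and has been
    normalized with the H0 moments (a standard closed form, given here as an
    illustration rather than as the patent's own equations).
    """
    tau_star = norm.ppf(1.0 - alpha0)            # guarantees P_H0(reject) = alpha0
    tau = m0 + tau_star * v0 ** 0.5              # corresponding un-normalized threshold
    power = 1.0 - norm.cdf((tau - m1) / v1 ** 0.5)
    return tau_star, power

# example: false-alarm probability fixed at 1%
print(lrt_threshold_and_power(0.01, m0=0.0, v0=1.0, m1=3.0, v1=1.2))
```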
As we propose to design the LRT test by using the approximation function for the DCT coefficient model defined in (8), it is desirable to evaluate the loss of power between the theoretical LRT test and the approximate LRT. In the theoretical LRT test, we use the exact DCT coefficient model, defined in (7), and the mean as well as the variance of the LR are calculated numerically by calculating integrals. The power function of the two LRT tests is represented in
In fact, the LRT test is aimed at taking a decision by using the ratio between the likelihood function of a given image under the alternative hypothesis $\mathcal{H}_1$, characterized by the parameters of the photographic camera $(c_{k,1}, d_{k,1})$, and the likelihood function under the null hypothesis $\mathcal{H}_0$, characterized by the parameters of the photographic camera $(c_{k,0}, d_{k,0})$. If this ratio is below a threshold, the null hypothesis is accepted; conversely, the alternative hypothesis is accepted. Consequently, the smaller the distance between the two points $(c_{k,0}, d_{k,0})$ and $(c_{k,1}, d_{k,1})$, the more difficult the identification of the photographic camera.
6.B—Test of the Generalized Likelihood Ratio
In this part, two GLRT tests are designed to deal with unknown parameters. It is proposed to replace the unknown parameters with their maximum likelihood (ML) estimations defined in (33). Assuming that the parameters of the photographic camera $(c_{k,0}, d_{k,0})$ and $(c_{k,1}, d_{k,1})$ are known, the first GLRT test is designed as follows
where, again to ensure that $\hat{\delta}_1$ is in the class $K_{\alpha_0}$, the threshold $\hat{\tau}_1$ is the solution of the equation
$\mathbb{P}_0\left[\hat{\Lambda}_1(Z) \geq \hat{\tau}_1\right] = \alpha_0 \qquad (40)$
and the generalized likelihood ratio (GLR) $\hat{\Lambda}_1(I^W_{k,i})$ is given by
Here, the maximum likelihood (ML) estimation $\hat{\alpha}_k$ is given by the previously proposed procedure and $\hat{\beta}^{-1}_{k,j} = c_{k,j}\,\hat{\alpha}_k + d_{k,j}$, j∈{0, 1}. The ML estimation $\hat{\alpha}_k$ converges asymptotically to its true value $\alpha_k$. Consequently, in accordance with Slutsky's theorem [38, theorem 11.2.11], we obtain the asymptotic statistical distribution of the GLR $\hat{\Lambda}_1(Z)$ under each hypothesis
As in the case of the LRT test, it is proposed to normalize the GLR $\hat{\Lambda}_1(Z)$. However, the expectation $m_0$ and the variance $v_0$ are not defined since the parameter $\alpha_k$ is unknown in practice. We replace $\alpha_k$ with $\hat{\alpha}_k$ in (66) and (67) to obtain the estimations of $m_0$ and of $v_0$, designated respectively $\hat{m}_0^{(1)}$ and $\hat{v}_0^{(1)}$. Consequently, the normalized GLR $\hat{\Lambda}^*_1(Z)$ is defined by:
and the corresponding test $\hat{\delta}^*_1$ is rewritten as follows
Again using Slutsky's theorem [38, theorem 11.2.11], we obtain the decision threshold and the power function of the test $\hat{\delta}^*_1$.
Theorem 2.
When the parameters of the photographic camera $(c_{k,0}, d_{k,0})$ and $(c_{k,1}, d_{k,1})$ are known, the decision threshold and the power function of the test $\hat{\delta}^*_1$ are given by
When the parameters of the photographic camera $(c_{k,1}, d_{k,1})$ are not known, we can also design the GLRT test according to the procedure hereinabove. The unknown parameters $(c_{k,1}, d_{k,1})$ are replaced with the OLS estimations $(\hat{c}_{k,1}, \hat{d}_{k,1})$ defined in (33)
where $\hat{\beta}^{-1}_{k,0} = c_{k,0}\,\hat{\alpha}_k + d_{k,0}$ and $\hat{\beta}^{-1}_{k,1} = \hat{c}_{k,1}\,\hat{\alpha}_k + \hat{d}_{k,1}$.
As the OLS estimations $(\hat{c}_{k,1}, \hat{d}_{k,1})$ are consistent, the GLR $\hat{\Lambda}_2(Z) = \sum$
where $\hat{m}_0^{(2)}$ and $\hat{v}_0^{(2)}$ are the estimations of the expectation $m_0$ and of the variance $v_0$ obtained on replacing $\alpha_k$ with $\hat{\alpha}_k$ and $(c_{k,1}, d_{k,1})$ with $(\hat{c}_{k,1}, \hat{d}_{k,1})$. The corresponding test $\hat{\delta}^*_2$ is rewritten as follows
In accordance with Slutsky's theorem, the decision threshold and the power function of the test $\hat{\delta}^*_2$ are given by the following theorem:
Theorem 3.
When the parameters of the photographic camera $(c_{k,0}, d_{k,0})$ are known and the parameters $(c_{k,1}, d_{k,1})$ are not known, the decision threshold and the power function of the test $\hat{\delta}^*_2$ are given by:
The two proposed GLRT tests can be applied in practical contexts. The first GLRT test $\hat{\delta}^*_1$ is aimed at deciding whether a given image was acquired by the photographic camera model 0 or by the photographic camera model 1, while the GLRT test $\hat{\delta}^*_2$ is aimed at verifying whether the given image was acquired by the photographic camera model 0.
This decision threshold guarantees that the false-alarm probability, that is to say the probability of deciding that the photograph does not originate from the photographic camera model 0 whereas this is actually the case, will be equal to α0.
The proposed procedure can be extended to images arising from a video stream. A video stream is composed of a succession of images displayed at a fixed rate. Video compression is a data compression procedure which consists in reducing the quantity of data while minimizing the impact on the visual quality of the video. The benefit of video compression is to reduce the costs of storing and transmitting video files. Video sequences contain a great deal of statistical redundancy, both in the temporal domain and in the spatial domain. The fundamental statistical property on which compression techniques are based is the correlation between pixels: this correlation is both spatial, the adjacent pixels of a current image being similar, and temporal, the pixels of past and future images also being very close to the current pixel. Video compression algorithms of MPEG type use the DCT (discrete cosine transform) on blocks of 8×8 pixels to efficiently exploit the spatial correlations between neighboring pixels of the same image. Thus, in the method according to the invention, the photograph may be an image arising from a video stream and compressed according to the MPEG standard.
It is possible to envisage another case of using the system to identify a photographic camera model with a view to determining whether a zone of the image has not been falsified, by copying/pasting from another photograph or by deletion of an element.
Finally, it is possible to envisage another case of using the system to identify a photographic camera with a view to determining, in a supervised manner, if a zone of the image has not been falsified (by copying/pasting from another photograph or by deletion of an element). Here, “supervised” is intended to mean the fact that the user wishes to make sure of the integrity of a previously defined zone. The principle is then to apply the identification procedure to the two “sub-images” arising respectively from the zone targeted by the user and from its complement (the remainder of the image). If the inspected element originates from another photograph and has been added by copying/pasting, the noise properties will be different, something that the proposed system will be capable of identifying (by assuming that the photographs were not taken under the same acquisition conditions and with the same photographic camera model, which seems reasonable).
The proposed procedure for identifying the photographic camera model addresses the two weaknesses of the procedures briefly presented in the prior art: 1) their performance is not established and, 2) these procedures can be foiled by the calibration of the photographic camera. Since the proposed procedure relies on the noise properties inherent in the acquisition of compressed photographs, it is applicable whatever the post-acquisition processing applied by a user (in particular with a view to improving visual quality). Furthermore, the parametric modeling of the statistical distribution of the value of the pixels in the frequency domain makes it possible to establish analytically the performance of the proposed test. This advantage makes it possible in particular to ensure compliance with a prescribed constraint on the probability of error.
The main fields of application of the invention are on the one hand, the search for proof on the basis of a “compromising” image and, on the other hand, the guarantee that a photograph was acquired by a given camera.
The proposed procedure can be extended to checking the integrity of a photograph. The aim is then to guarantee that a photograph has not been modified/falsified since its acquisition. This makes it possible for example to detect photographs comprising elements originating from a different photographic camera, i.e. imported after acquisition, or else to ensure the integrity of a scanned or photographed document (a legal document for example).
The method of the invention can be implemented in specialized software from software publishers, for the search for proof on the basis of digital media. The method according to the invention can be used in court with a view to providing a decision aid tool.
7.A—Results on a Large Database
We shall briefly describe the Laplace approximation [43]. The Laplace procedure is aimed at providing an approximation of the integrals of the form ∫exp (−g (t)) dt when the function g (t) attains the global minimum at t*. Let us consider the integral
$I = \int e^{-g(t)}\,\mathrm{d}t \qquad (52)$
The Taylor expansion of the function g (t) at t* gives:
where g′(t) and g″(t) represent respectively the first and second derivatives of the function g(t). Given that the function g(t) attains a minimum at t*, g′(t*) = 0. Consequently, the integral I can be approximated as
This integral takes the form of the Gaussian integral. We obtain:
A generalization was made in [32] with an arbitrary function h(t)
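For reference, the standard Laplace result being invoked here is $\int h(t)\,e^{-g(t)}\,\mathrm{d}t \approx h(t^*)\,e^{-g(t^*)}\sqrt{2\pi/g''(t^*)}$. The following sketch checks it numerically on the classic factorial integral (Stirling's formula), an example chosen here purely for illustration.

```python
import numpy as np
from scipy.integrate import quad
from math import factorial

# Laplace approximation: if g attains its global minimum at t*,
#   int h(t) exp(-g(t)) dt  ~  h(t*) exp(-g(t*)) * sqrt(2*pi / g''(t*)).
# Classic check with n! = int t^n e^{-t} dt = int exp(-(t - n*log t)) dt:
# here g(t) = t - n*log(t), t* = n, g''(t*) = 1/n, which gives Stirling's formula.
n = 10
exact, _ = quad(lambda t: t**n * np.exp(-t), 0, np.inf)
approx = np.exp(-(n - n * np.log(n))) * np.sqrt(2 * np.pi * n)   # = n^n e^{-n} sqrt(2*pi*n)
print(factorial(n), exact, approx)   # 3628800, ~3628800, ~3598696 (about 0.8% error)
```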
On the basis of relation (6), the DCT coefficient model of fI (x) is rewritten as follows
The function g(t) attains its minimum at $t^* = |x|\sqrt{\beta/2}$ and its second derivative is $g''(t) = x^2/t^3$. Consequently, the function $f_I(x)$ can be approximated as follows:
Note that on the basis of relation (33) it is necessary to define the first two moments of the variable |I|. Given a known variance σ2, the random variable I is the Gaussian variable of zero mean with the variance σ2. Thus, the random variable |I| follows the half-normal distribution cf. [44]. Consequently, we obtain
On the basis of the law of total expectation, the mathematical expectation of |I| is given by
Furthermore, the variance of |I| is given by
Consequently, it follows from (33) that the first two moments of the LR $\Lambda(I^W_{k,i})$ under each hypothesis are given by:
where $E_j[\cdot]$ and $\mathrm{Var}_j[\cdot]$ designate respectively the mathematical expectation and variance under the hypothesis $\mathcal{H}_j$. Because of the large number of coefficients in a natural image, by virtue of Lindeberg's central limit theorem (CLT) [38, theorem 11.2.5], the statistical distribution of the LR Λ(Z) is given by
where the notation designates convergence in terms of distribution and
with $E_j[\Lambda(I^W_{k,i})]$ and $\mathrm{Var}_j[\Lambda(I^W_{k,i})]$ defined respectively in (63) and (64).