This application is based upon and claims priority to Chinese Patent Application No. 202311561214.1, filed on Nov. 22, 2023, the entire contents of which are incorporated herein by reference.
The present disclosure relates to the field of Deepfake detection, and in particular to an active-defense detection method based on facial landmark watermarking.
In recent years, Deepfake technology has spread into an increasing number of fields in academia and industry, and has been widely applied to multimedia such as video, audio, and images to generate fake multimedia products, raising various legal and ethical issues. To combat the threat posed by Deepfake technology, a new research branch called Deepfake detection has emerged. Currently, Deepfake detection mainly focuses on passive detection, that is, detection based on the artifacts left in fake faces after generation. When detecting deepfaked images or videos, passive detection methods rely only on passive defense and ex-post evidence collection. This means that they can neither prevent the generation and propagation of deepfakes nor avoid the potential harm caused by fake content.
At present, in methods based on semi-fragile watermarks, a watermark can be used to detect authenticity but cannot achieve tracing. In addition, in methods based on robust watermarks, the watermark embedded into the image is a randomly generated or fixed watermark, and unique watermarks cannot be generated for each individual.
In order to overcome the above shortcomings in the prior art, the present disclosure provides an active-defense detection method based on facial landmark watermarking, which can generate unique watermarks for each individual and achieve traceability and detection functions.
In order to solve the technical problem, the present disclosure adopts the following technical solution.
The active-defense detection method based on facial landmark watermarking includes the following steps:
Further, the step a) includes:
Further, the step b) includes:
Further, the step c) includes:
Preferably, in the step c-2), the convolutional layer of the original image processing unit includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1; and the atrous convolutional layer of the original image processing unit includes 64 channels and a convolutional kernel with a size of 3, a dilation rate of 2, a stride of 1, and a padding of 1. In the step c-3), the first convolutional layer, the second convolutional layer, and the third convolutional layer in the first branch each include 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1; the first convolutional layer and the second convolutional layer in the second branch each include 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1; and the average pooling layer in the second branch has a window size of 4. In the step c-4), the linear layer of the watermark processing unit includes 256 input nodes and 256 output nodes; the convolutional layer of the watermark processing unit includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1; the atrous convolutional layer of the watermark processing unit includes 64 channels and a convolutional kernel with a size of 3, a dilation rate of 2, a stride of 1, and a padding of 1; and the first deconvolutional layer and the second deconvolutional layer of the watermark processing unit each include 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. In the step c-5), the first convolutional layer of the encoder includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1; and the second convolutional layer of the encoder includes 3 channels and a convolutional kernel with a size of 1, a stride of 1, and a padding of 1.
Further, the step d) includes:
Further, the step e) includes:
Preferably, in the step e-1), the first convolutional layer of the decoder includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1; the first atrous convolutional layer of the decoder includes 64 channels and a convolutional kernel with a size of 3, a dilation rate of 2, a stride of 1, and a padding of 1; the second convolutional layer of the decoder includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1; the second atrous convolutional layer of the decoder includes 64 channels and a convolutional kernel with a size of 3, a dilation rate of 2, a stride of 1, and a padding of 1; and the flatten layer and the fully connected layer of the decoder each include 256 neurons.
Further, the step f) includes:
The present disclosure has the following beneficial effects. The present disclosure extracts facial landmarks from an original image and converts the extracted facial landmarks into a binary watermark. The present disclosure embeds the binary watermark into the original image to acquire a watermark image, allowing the watermark image to undergo a non-malicious/malicious operation to form a noise image or a malicious image. In this way, the model is robust to the non-malicious/malicious operation. The present disclosure introduces facial landmarks to generate a unique watermark for each individual and achieve traceability and detection functions.
The present disclosure is further described below with reference to specific embodiments.
An active-defense detection method based on facial landmark watermarking includes the following steps.
a) n facial images are acquired to form facial image set I = {I1, I2, . . . , Ii, . . . , In}, where Ii denotes an i-th facial image, i ∈ {1, . . . , n}. The i-th facial image Ii, i ∈ {1, . . . , n}, is preprocessed to acquire preprocessed i-th facial image Icover_i, thereby acquiring preprocessed facial image set Icover.
b) Facial landmarks are extracted from the preprocessed i-th facial image Icover_i and are converted into watermark Wm.
c) An encoder is constructed, and the i-th facial image Icover_i and the watermark Wm are input into the encoder to acquire watermark image Iwm.
d) The watermark image Iwm is injected into a noise pool to acquire noise image Inoise, and the watermark image Iwm is injected into a malicious pool to acquire malicious image Idep.
e) A decoder is constructed, and the noise image Inoise or the malicious image Idep is input into the decoder to acquire final watermark Wm1.
f) It is determined whether the noise image Inoise and the malicious image Idep are real or fake images based on the final watermark Wm1.
The present disclosure converts the extracted facial landmarks into a binary watermark. The present disclosure embeds the binary watermark into the original image to acquire a watermark image, allowing the watermark image to undergo a non-malicious/malicious operation to form a noise image. In this way, the model is robust to the non-malicious/malicious operation. The present disclosure can generate a unique watermark for each individual and achieve traceability and detection functions. The present disclosure is based on the idea of adversarial attacks. Typically, active defense involves two methods. Firstly, adversarial perturbations are added to images or videos to distort the content generated by Deepfake, achieving the effect of "knowing it is fake at a glance". Secondly, adversarial watermarks are added to images or videos. Unlike adding perturbations, adding watermarks relies on training the watermark to be robust. At present, in methods based on semi-fragile watermarks, a watermark can only detect authenticity and cannot achieve tracing. In addition, in methods based on robust watermarks, the watermarks embedded into the image are randomly generated or fixed, and unique watermarks cannot be generated for each individual.
In an embodiment of the present disclosure, the step a) is as follows.
a-1) The n facial images are acquired from a CelebA-HQ dataset to form the facial image set I. The CelebA-HQ dataset includes 30,000 facial images with different identities, each with a resolution of 1024×1024.
a-2) The i-th facial image Ii is resized by the resize() function in the Python Imaging Library (PIL) into a 256×256 image, thereby acquiring the preprocessed i-th facial image Icover_i and acquiring the preprocessed facial image set Icover={Icover_1, Icover_2, . . . , Icover_i, . . . , Icover_n}.
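For illustration, a minimal preprocessing sketch in Python, assuming the images are loaded from disk with PIL (the file path and helper name are hypothetical):

```python
from PIL import Image

def preprocess_face(path):
    # Step a-2): resize a 1024x1024 CelebA-HQ face image to 256x256.
    img = Image.open(path).convert("RGB")
    return img.resize((256, 256))

# Hypothetical usage:
# I_cover_i = preprocess_face("celeba_hq/00001.jpg")
```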
In an embodiment of the present disclosure, the step b) is as follows.
b-1) The facial landmarks in the preprocessed i-th facial image Icover_i are detected by a Dlib facial landmark detection algorithm to acquire facial landmark set Lm including m facial landmarks, Lm = {l1, l2, . . . , lm}, m = 68, where {l1, l2, . . . , l17} are landmarks of a jawline, {l18, l19, . . . , l22} are landmarks of a right eyebrow, {l23, l24, . . . , l27} are landmarks of a left eyebrow, {l28, l29, . . . , l36} are landmarks of a nose, {l37, l38, . . . , l42} are landmarks of a right eye, {l43, l44, . . . , l48} are landmarks of a left eye, and {l49, l50, . . . , l68} are landmarks of a mouth.
b-2) The i-th landmark li is defined by horizontal coordinate xi and vertical coordinate yi. A value of the horizontal coordinate xi is mapped to an integer range of 0-15 through a linear transformation, and the value is converted by the bin() function in Python into binary representation Wx, as sketched below.
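A minimal sketch of steps b-1) and b-2), assuming Dlib's standard 68-point predictor file and a simple form for the linear mapping (the exact mapping, and the handling of the vertical coordinate, are assumptions not spelled out in this excerpt):

```python
import dlib

# Assumed model file: dlib's standard 68-point landmark predictor.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmark_bits(image, width=256):
    # Step b-1): detect the 68 landmarks on a 256x256 face image
    # (image: an 8-bit RGB numpy array).
    face = detector(image, 1)[0]
    shape = predictor(image, face)
    bits = []
    for p in shape.parts():
        # Step b-2): linearly map the x coordinate into the integer range
        # 0-15 (this exact mapping is an assumption) and convert with bin().
        x16 = min(15, max(0, p.x * 16 // width))
        bits.append(bin(x16)[2:].zfill(4))
    return "".join(bits)
```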
In an embodiment of the present disclosure, the step c) is as follows.
c-1) The encoder is constructed, including an original image processing unit, a watermark processing unit, a first convolutional layer, a batch normalization (BatchNorm) layer, an activation function layer, and a second convolutional layer.
c-2) The original image processing unit of the encoder is constructed, including a convolutional layer, a BatchNorm layer, a first rectified linear unit (ReLU) activation function, an atrous convolutional layer, a second ReLU activation function, a Dropout layer, a first CPC module, a second CPC module, and a third CPC module. The i-th facial image Icover_i is input into the convolutional layer, the BatchNorm layer, and the first ReLU activation function of the original image processing unit in sequence to acquire image feature Fcover_1. The image feature Fcover_1 is input into the atrous convolutional layer, the second ReLU activation function, and the Dropout layer of the original image processing unit in sequence to acquire image feature Fcover_2.
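A minimal sketch of this unit, assuming a PyTorch implementation (the framework is not specified in the disclosure) and the layer hyperparameters given for step c-2); the dropout rate is an assumption:

```python
import torch.nn as nn

class OriginalImageUnit(nn.Module):
    # Sketch of the original image processing unit of step c-2).
    def __init__(self):
        super().__init__()
        self.block1 = nn.Sequential(  # -> image feature Fcover_1
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
        )
        self.block2 = nn.Sequential(  # -> image feature Fcover_2
            # Atrous convolution with the stated dilation 2 and padding 1;
            # note this slightly reduces each spatial dimension.
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, dilation=2),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),  # dropout rate assumed
        )

    def forward(self, x):
        return self.block2(self.block1(x))
```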
c-3) The first CPC module, the second CPC module, and the third CPC module are constructed, each including a first branch and a second branch, where the first branch includes a first convolutional layer, a first BatchNorm layer, a first ReLU activation function, a second convolutional layer, a second BatchNorm layer, a second ReLU activation function, a third convolutional layer, a third BatchNorm layer, and a third ReLU activation function in sequence, while the second branch includes an average pooling layer, a first convolutional layer, a ReLU activation function, and a second convolutional layer in sequence. The image feature Fcover_2 is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function in the first branch of the first CPC module in sequence to acquire image feature Fcover_2_1. The image feature Fcover_2_1 is input into the second convolutional layer, the second BatchNorm layer, and the second ReLU activation function in the first branch of the first CPC module in sequence to acquire image feature Fcover_2_2. The image feature Fcover_2_2 is input into the third convolutional layer, the third BatchNorm layer, and the third ReLU activation function in the first branch of the first CPC module in sequence to acquire image feature Fcover_2_3. The image feature Fcover_2 is input into the second branch of the first CPC module to acquire image feature Fcover_3. The image feature Fcover_3 and the image feature Fcover_2_3 are subjected to element-wise multiplication to acquire image feature Fcover_4. The image feature Fcover_4 and the image feature Fcover_2 are subjected to corresponding-elements addition to acquire image feature Fcover_5. The image feature Fcover_5 is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function in the first branch of the second CPC module in sequence to acquire image feature Fcover_5_1. The image feature Fcover_5_1 is input into the second convolutional layer, the second BatchNorm layer, and the second ReLU activation function in the first branch of the second CPC module in sequence to acquire image feature Fcover_5_2. The image feature Fcover_5_2 is input into the third convolutional layer, the third BatchNorm layer, and the third ReLU activation function in the first branch of the second CPC module in sequence to acquire image feature Fcover_5_3. The image feature Fcover_5 is input into the second branch of the second CPC module to acquire image feature Fcover_6. The image feature Fcover_6 and the image feature Fcover_5_3 are subjected to element-wise multiplication to acquire image feature Fcover_7. The image feature Fcover_7 and the image feature Fcover_5 are subjected to corresponding-elements addition to acquire image feature Fcover_8. The image feature Fcover_8 is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function in the first branch of the third CPC module in sequence to acquire image feature Fcover_8_1. The image feature Fcover_8_1 is input into the second convolutional layer, the second BatchNorm layer, and the second ReLU activation function in the first branch of the third CPC module in sequence to acquire image feature Fcover_8_2. The image feature Fcover_8_2 is input into the third convolutional layer, the third BatchNorm layer, and the third ReLU activation function in the first branch of the third CPC module in sequence to acquire image feature Fcover_8_3. 
The image feature Fcover_8 is input into the second branch of the third CPC module to acquire image feature Fcover_9. The image feature Fcover_9 and the image feature Fcover_8_3 are subjected to element-wise multiplication to acquire image feature Fcover_10. The image feature Fcover_10 and the image feature Fcover_8 are subjected to corresponding-elements addition to acquire image feature Fcover_11.
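A sketch of a single CPC module, again assuming PyTorch. The interpolation that restores the pooled second branch to the input resolution before the element-wise multiplication is an assumption, since the disclosure does not state how the sizes are matched:

```python
import torch.nn as nn
import torch.nn.functional as F

class CPCModule(nn.Module):
    # Sketch of one CPC module from step c-3): a three-stage conv-BN-ReLU
    # first branch, and a pooled conv second branch whose output gates the
    # first branch element-wise, followed by a residual addition.
    def __init__(self, channels=64):
        super().__init__()
        def stage():
            return nn.Sequential(
                nn.Conv2d(channels, channels, 3, stride=1, padding=1),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
        self.branch1 = nn.Sequential(stage(), stage(), stage())
        self.branch2 = nn.Sequential(
            nn.AvgPool2d(4),  # window size 4, per step c-3)
            nn.Conv2d(channels, channels, 3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, stride=1, padding=1),
        )

    def forward(self, x):
        g = F.interpolate(self.branch2(x), size=x.shape[-2:])  # assumption: resize back
        return g * self.branch1(x) + x  # element-wise multiply, then residual add
```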
c-4) The watermark processing unit of the encoder is constructed, including a linear layer, a convolutional layer, a first BatchNorm layer, a first ReLU activation function, an atrous convolutional layer, a second ReLU activation function, a first Dropout layer, a first deconvolutional layer, a second BatchNorm layer, a third ReLU activation function, a second deconvolutional layer, a fourth ReLU activation function, a second Dropout layer, a first CPC module, a second CPC module, and a third CPC module. The watermark Wm is input into the linear layer of the watermark processing unit to acquire watermark feature f1. The watermark feature f1 is input into the convolutional layer, the first BatchNorm layer, and the first ReLU activation function of the watermark processing unit in sequence to acquire watermark feature f2. The watermark feature f2 is input into the atrous convolutional layer, the second ReLU activation function, and the first Dropout layer of the watermark processing unit in sequence to acquire watermark feature f3. The watermark feature f3 is input into the first deconvolutional layer, the second BatchNorm layer, and the third ReLU activation function of the watermark processing unit in sequence to acquire watermark feature f4. The watermark feature f4 is input into the second deconvolutional layer, the fourth ReLU activation function, and the second Dropout layer of the watermark processing unit in sequence to acquire watermark feature f5. The watermark feature f5 is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function in the first branch of the first CPC module in sequence to acquire watermark feature fm_5_1. The watermark feature fm_5_1 is input into the second convolutional layer, the second BatchNorm layer, and the second ReLU activation function in the first branch of the first CPC module in sequence to acquire watermark feature fm_5_2. The watermark feature fm_5_2 is input into the third convolutional layer, the third BatchNorm layer, and the third ReLU activation function in the first branch of the first CPC module in sequence to acquire watermark feature fm_5_3. The watermark feature f5 is input into the second branch of the first CPC module to acquire watermark feature fm_6. The watermark feature fm_6 and the watermark feature fm_5_3 are subjected to element-wise multiplication to acquire watermark feature fm_7. The watermark feature fm_7 and the watermark feature f5 are subjected to corresponding-elements addition to acquire watermark feature fm_8. The watermark feature fm_8 is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function in the first branch of the second CPC module in sequence to acquire watermark feature fm_8_1. The watermark feature fm_8_1 is input into the second convolutional layer, the second BatchNorm layer, and the second ReLU activation function in the first branch of the second CPC module in sequence to acquire watermark feature fm_8_2. The watermark feature fm_8_2 is input into the third convolutional layer, the third BatchNorm layer, and the third ReLU activation function in the first branch of the second CPC module in sequence to acquire watermark feature fm_8_3. The watermark feature fm_8 is input into the second branch of the second CPC module to acquire watermark feature fm_9. The watermark feature fm_9 and the watermark feature fm_8_3 are subjected to element-wise multiplication to acquire watermark feature fm_10. 
The watermark feature fm_10 and the watermark feature fm_8 are subjected to corresponding-elements addition to acquire watermark feature fm_11. The watermark feature fm_11 is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function in the first branch of the third CPC module in sequence to acquire watermark feature fm_11_1. The watermark feature fm_11_1 is input into the second convolutional layer, the second BatchNorm layer, and the second ReLU activation function in the first branch of the third CPC module in sequence to acquire watermark feature fm_11_2. The watermark feature fm_11_2 is input into the third convolutional layer, the third BatchNorm layer, and the third ReLU activation function in the first branch of the third CPC module in sequence to acquire watermark feature fm_11_3. The watermark feature fm_11 is input into the second branch of the third CPC module to acquire watermark feature fm_12. The watermark feature fm_12 and the watermark feature fm_11_3 are subjected to element-wise multiplication to acquire watermark feature fm_13. The watermark feature fm_13 and the watermark feature fm_11 are subjected to corresponding-elements addition to acquire watermark feature f6.
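A sketch of the watermark processing unit of step c-4), reusing the CPCModule class sketched after step c-3). The reshape of the 256-dimensional watermark to a 1×16×16 map after the linear layer, and the dropout rate, are assumptions:

```python
import torch.nn as nn

class WatermarkUnit(nn.Module):
    # Sketch of the watermark processing unit of step c-4).
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(256, 256)  # 256 input nodes, 256 output nodes
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, stride=1, padding=1, dilation=2),  # atrous conv
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),  # rate assumed
            nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),  # first deconv
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),  # second deconv
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),  # rate assumed
        )
        self.cpc = nn.Sequential(CPCModule(), CPCModule(), CPCModule())

    def forward(self, wm):
        f1 = self.linear(wm).view(-1, 1, 16, 16)  # watermark feature f1 (reshape assumed)
        f5 = self.body(f1)                        # watermark features f2..f5
        return self.cpc(f5)                       # watermark feature f6
```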
c-5) The image feature Fcover_11 and the watermark feature f6 are subjected to corresponding-elements addition to acquire feature F1. The feature F1 is input into the first convolutional layer, the BatchNorm layer, and the activation function layer of the encoder in sequence to acquire feature F2. The feature F2 is input into the second convolutional layer of the encoder to acquire the watermark image Iwm.
In the encoder, all convolutional layers, deconvolutional layers, and atrous convolutional layers are two-dimensional.
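A sketch of the fusion stage of step c-5), assuming PyTorch, that the activation function layer is a ReLU, and that Fcover_11 and f6 have been brought to the same shape before the corresponding-elements addition:

```python
import torch.nn as nn

class EncoderFusion(nn.Module):
    # Sketch of step c-5) of the encoder.
    def __init__(self):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),  # first conv layer
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),  # activation function layer (type assumed)
        )
        # Second convolutional layer of the encoder, hyperparameters as stated:
        # 3 channels, kernel size 1, stride 1, padding 1.
        self.out = nn.Conv2d(64, 3, kernel_size=1, stride=1, padding=1)

    def forward(self, f_cover_11, f6):
        f1 = f_cover_11 + f6  # corresponding-elements addition -> feature F1
        f2 = self.block(f1)   # feature F2
        return self.out(f2)   # watermark image Iwm
```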
In this embodiment, preferably, in the step c-2), the convolutional layer of the original image processing unit includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. The atrous convolutional layer of the original image processing unit includes 64 channels and a convolutional kernel with a size of 3, a dilation rate of 2, a stride of 1, and a padding of 1. In the step c-3), the first convolutional layer, the second convolutional layer, and the third convolutional layer in the first branch each include 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. The first convolutional layer and the second convolutional layer in the second branch each include 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. The average pooling layer in the second branch has a window size of 4. In the step c-4), the linear layer of the watermark processing unit includes 256 input nodes and 256 output nodes. The convolutional layer of the watermark processing unit includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. The atrous convolutional layer of the watermark processing unit includes 64 channels and a convolutional kernel with a size of 3, a dilation rate of 2, a stride of 1, and a padding of 1. The first deconvolutional layer and the second deconvolutional layer of the watermark processing unit each include 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. In the step c-5), the first convolutional layer of the encoder includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. The second convolutional layer of the encoder includes 3 channels and a convolutional kernel with a size of 1, a stride of 1, and a padding of 1.
In an embodiment of the present disclosure, the step d) is as follows.
d-1) The noise pool is constructed, including Identity noise, Dropout noise, Crop noise, GaussianNoise noise, SaltPepper noise, GaussianBlur noise, MedBlur noise, and joint photographic experts group (JPEG) compression noise. The watermark image Iwm is injected into the noise pool, and a noise randomly selected from the noise pool is added to the watermark image Iwm to form the noise image Inoise, as sketched below. The eight noises are added by implementing the source code described in the paper "MBRS: Enhancing Robustness of DNN-based Watermarking by Mini-Batch of Real and Simulated JPEG Compression". This is available in the prior art and will not be elaborated herein.
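A hypothetical sketch of the random selection in step d-1); the identity lambdas are placeholders standing in for the MBRS implementations of each distortion:

```python
import random

# Each entry should be the corresponding MBRS distortion; placeholders here.
NOISE_POOL = {
    "Identity":      lambda img: img,
    "Dropout":       lambda img: img,  # placeholder for MBRS Dropout noise
    "Crop":          lambda img: img,  # placeholder for MBRS Crop noise
    "GaussianNoise": lambda img: img,  # placeholder
    "SaltPepper":    lambda img: img,  # placeholder
    "GaussianBlur":  lambda img: img,  # placeholder
    "MedBlur":       lambda img: img,  # placeholder
    "JPEG":          lambda img: img,  # placeholder for simulated JPEG compression
}

def apply_random_noise(i_wm):
    # Step d-1): add a randomly selected noise to the watermark image Iwm.
    return random.choice(list(NOISE_POOL.values()))(i_wm)
```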
d-2) The malicious pool is constructed, including a simple swapping (SimSwap) model, an information bottleneck disentanglement for identity swapping (InfoSwap) model, a unified face reenactment and swapping (UniFace) model, and attribute manipulation algorithms (for manipulating nose, mouth, eyes, jawline, and eyebrow attributes). The watermark image Iwm is injected into the malicious pool, and is manipulated by a model or attribute manipulation algorithm randomly selected from the malicious pool to form the malicious image Idep. The SimSwap model achieves face swapping through the source code described in the paper "SimSwap: An Efficient Framework for High Fidelity Face Swapping". The InfoSwap model achieves face swapping through the source code described in the paper "Information Bottleneck Disentanglement for Identity Swapping". The UniFace model achieves face swapping through the source code described in the paper "Designing One Unified Framework for High-Fidelity Face Reenactment and Swapping". The manipulation of the shape of attributes such as the nose, mouth, eyes, jawline, and eyebrows is achieved through the source code described in the paper "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation". This is available in the prior art and will not be elaborated herein.
In an embodiment of the present disclosure, the step e) is as follows.
e-1) The decoder is constructed, including a first convolutional layer, a first BatchNorm layer, a first ReLU activation function, a first atrous convolutional layer, a second ReLU activation function, a first Dropout layer, a first CPC module, a second CPC module, a third CPC module, a second convolutional layer, a second BatchNorm layer, a third ReLU activation function, a second atrous convolutional layer, a fourth ReLU activation function, a second Dropout layer, a flatten layer, and a fully connected layer. The noise image Inoise or the malicious image Idep is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function of the decoder in sequence to acquire image feature N1. The image feature N1 is input into the first atrous convolutional layer, the second ReLU activation function, and the first Dropout layer of the decoder in sequence to acquire image feature N2. The image feature N2 is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function in the first branch of the first CPC module in sequence to acquire image feature N2_1. The image feature N2_1 is input into the second convolutional layer, the second BatchNorm layer, and the second ReLU activation function in the first branch of the first CPC module in sequence to acquire image feature N2_2. The image feature N2_2 is input into the third convolutional layer, the third BatchNorm layer, and the third ReLU activation function in the first branch of the first CPC module in sequence to acquire image feature N2_3. The image feature N2 is input into the second branch of the first CPC module to acquire image feature N3. The image feature N3 and the image feature N2_3 are subjected to element-wise multiplication to acquire image feature N4. The image feature N4 and the image feature N2 are subjected to corresponding-elements addition to acquire image feature N5. The image feature N5 is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function in the first branch of the second CPC module in sequence to acquire image feature N5_1. The image feature N5_1 is input into the second convolutional layer, the second BatchNorm layer, and the second ReLU activation function in the first branch of the second CPC module in sequence to acquire image feature N5_2. The image feature N5_2 is input into the third convolutional layer, the third BatchNorm layer, and the third ReLU activation function in the first branch of the second CPC module in sequence to acquire image feature N5_3. The image feature N5 is input into the second branch of the second CPC module to acquire image feature N6. The image feature N6 and the image feature N5_3 are subjected to element-wise multiplication to acquire image feature N7. The image feature N7 and the image feature N5 are subjected to corresponding-elements addition to acquire image feature N8. The image feature N8 is input into the first convolutional layer, the first BatchNorm layer, and the first ReLU activation function in the first branch of the third CPC module in sequence to acquire image feature N8_1. The image feature N8_1 is input into the second convolutional layer, the second BatchNorm layer, and the second ReLU activation function in the first branch of the third CPC module in sequence to acquire image feature N8_2. 
The image feature N8_2 is input into the third convolutional layer, the third BatchNorm layer, and the third ReLU activation function in the first branch of the third CPC module in sequence to acquire image feature N8_3. The image feature N8 is input into the second branch of the third CPC module to acquire image feature N9. The image feature N9 and the image feature N8_3 are subjected to element-wise multiplication to acquire image feature N10. The image feature N10 and the image feature N8 are subjected to corresponding-elements addition to acquire image feature N11. The image feature N11 is input into the second convolutional layer, the second BatchNorm layer, and the third ReLU activation function of the decoder in sequence to acquire image feature N12. The image feature N12 is input into the second atrous convolutional layer, the fourth ReLU activation function, and the second Dropout layer of the decoder in sequence to acquire image feature N13. The image feature N13 is input into the flatten layer of the decoder to acquire image feature N14. The image feature N14 is input into the fully connected layer of the decoder to acquire the final watermark Wm1.
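A sketch of the decoder of step e-1), assuming PyTorch and reusing the CPCModule class sketched after step c-3); the dropout rates are assumptions, and LazyLinear is used so the input size of the fully connected layer is inferred at first use:

```python
import torch.nn as nn

class Decoder(nn.Module):
    # Sketch of the decoder of step e-1).
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, stride=1, padding=1, dilation=2),  # first atrous conv
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),  # rate assumed
        )
        self.cpc = nn.Sequential(CPCModule(), CPCModule(), CPCModule())
        self.tail = nn.Sequential(
            nn.Conv2d(64, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, stride=1, padding=1, dilation=2),  # second atrous conv
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),  # rate assumed
            nn.Flatten(),
            nn.LazyLinear(256),  # fully connected layer with 256 outputs
        )

    def forward(self, x):
        return self.tail(self.cpc(self.head(x)))  # final watermark Wm1 (logits)
```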
In this embodiment, preferably, in the step e-1), the first convolutional layer of the decoder includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. The first atrous convolutional layer of the decoder includes 64 channels and a convolutional kernel with a size of 3, a dilation rate of 2, a stride of 1, and a padding of 1. The second convolutional layer of the decoder includes 64 channels and a convolutional kernel with a size of 3, a stride of 1, and a padding of 1. The second atrous convolutional layer of the decoder includes 64 channels and a convolutional kernel with a size of 3, a dilation rate of 2, a stride of 1, and a padding of 1. The flatten layer and the fully connected layer of the decoder each include 256 neurons.
In an embodiment of the present disclosure, the step f) is as follows.
f-1) Constant count1 is defined with an initial value of 0. It is determined, bit by bit, whether the binary values at corresponding positions of the final watermark Wm1 and the watermark Wm are the same. For each position at which the binary values of the final watermark Wm1 and the watermark Wm differ, the constant count1 is incremented by 1, and the final value of the constant count1 is divided by 256 to acquire bit error rate Ebit, as sketched below.
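A minimal sketch of the bit error rate computation of step f-1), assuming the watermarks are 256-character binary strings (or any equal-length bit sequences):

```python
def bit_error_rate(wm1, wm):
    # Step f-1): count the bit positions at which the two 256-bit watermarks
    # differ, then divide by the watermark length to obtain Ebit.
    count1 = sum(b1 != b2 for b1, b2 in zip(wm1, wm))
    return count1 / 256
```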
f-2) If the bit error rate Ebit is less than 0.5, it indicates that the final watermark Wm1 is the watermark Wm of the i-th facial image Icover_i, and the face in the i-th facial image Icover_i does not change, achieving a traceability function. Therefore, the noise image Inoise is a real image. If the bit error rate Ebit is greater than or equal to 0.5, the noise image Inoise is a fake image.
f-3) The malicious image Idep contains traces of manipulation. Therefore, the i-th facial image Icover_i in the step b) is replaced with the malicious image Idep, and the step b) is repeated to acquire watermark W′m.
f-4) Constant count2 is defined with an initial value of 0. It is determined, bit by bit, whether the binary values at corresponding positions of the watermark W′m and the watermark Wm are the same. For each position at which the binary values of the watermark W′m and the watermark Wm differ, the constant count2 is incremented by 1, and the final value of the constant count2 is divided by 256 to acquire bit error rate E′bit.
f-5) It is determined that the malicious image Idep is a real image if the bit error rate E′bit is less than or equal to 0.5. It is determined that the malicious image Idep is a fake image if the bit error rate E′bit is greater than 0.5. Since the watermark in the malicious image Idep can be robustly recovered by the decoder, the trustworthy original image with the watermark Wm can be tracked through matching between facial landmarks and the watermark.
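A compact sketch of the decision rules of steps f-2), f-4), and f-5), using the bit_error_rate helper sketched above (the threshold conventions follow the disclosure):

```python
def judge_noise_image(wm1, wm):
    # Step f-2): the noise image Inoise is real iff Ebit < 0.5; in that case
    # the recovered watermark also traces back to the original face.
    return bit_error_rate(wm1, wm) < 0.5

def judge_malicious_image(wm_prime, wm):
    # Steps f-4) and f-5): W'm is re-extracted from the suspect image and
    # compared to Wm; the malicious image Idep is real iff E'bit <= 0.5.
    return bit_error_rate(wm_prime, wm) <= 0.5
```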
Table 1 shows the quantitative comparison of the bitwise restoration accuracy of watermarks after common image processing operations and malicious face swapping operations on the CelebA-HQ dataset at 256×256 resolution. The robustness of the watermark is measured by the accuracy of watermark restoration. The method proposed by the present disclosure achieves an average accuracy of 98.95% under common image processing operations, which is superior to the state-of-the-art methods: the average accuracy is improved by 14.29% compared to MBRS and by 18.56% compared to FaceSigns. The generalization ability to different face swapping algorithms is also evaluated. The method proposed by the present disclosure restores watermarks with an average accuracy of 98.05%, which is improved by 47.82% compared to MBRS and by 47.94% compared to FaceSigns.
Finally, it should be noted that the above descriptions are only preferred embodiments of the present disclosure, and are not intended to limit the present disclosure. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments, or equivalently substitute some technical features thereof. Any modification, equivalent substitution, improvement, etc. within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.
Number | Date | Country | Kind
---|---|---|---
202311561214.1 | Nov. 22, 2023 | CN | national