The present invention relates to interpretation maps of convolutional neural networks, and more particularly, to interpretation maps of convolutional neural networks having certifiable robustness using Rényi differential privacy.
Convolutional neural networks have been successfully demonstrated on a variety of different computer vision applications, such as image classification, object detection, and semantic segmentation. With each of these applications, it is important to understand why the convolutional neural network makes the correct prediction.
With computer vision applications, an interpretation map is used to explain which part of an input image plays a more important role in the prediction of convolutional neural networks. However, it has been shown that many interpretation maps such as Simple Gradient, Integrated Gradient, DeepLIFT (Deep Learning Important FeaTures) and GradCam (Gradient-weighted Class Activation Mapping) are vulnerable to imperceptible input perturbations. See, for example, Ghorbani et al., “Interpretation of Neural Networks Is Fragile,” The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), pp. 3681-3688 (July 2019). In other words, slight perturbations to an input image can undesirably cause a significant discrepancy in its coupled interpretation map, while keeping the predicted label unchanged. These slight perturbations can be imperceptible to the human eye and can be generated by measurement bias or by adversaries (i.e., adversarial perturbations).
Perturbations can create confusion between the model interpreter and the classifier, which diminishes the trustworthiness of systems that use the interpretations in down-stream actions such as making medical recommendations, source code captioning, and transfer learning. Thus, the robustness of the interpretation maps against perturbations is essential.
Differential privacy (DP) has recently been introduced as a tool to improve the robustness of machine learning algorithms. Differential privacy-based robust machine learning processes can provide theoretical robustness guarantees against adversarial attacks. Rényi differential privacy (RDP) is a generalization of the standard notion of DP. It has been proven that analyzing the robustness of convolutional neural networks using RDP can provide a stronger theoretical guarantee than standard DP. See, for example, Bai et al., “Certified Adversarial Robustness with Additive Noise,” 33rd Conference on Neural Information Processing Systems (NeurIPS 2019) (hereinafter “Bai”). For instance, as provided in Bai, model robustness based on Rényi divergence between the outputs of models for natural and adversarial examples shows a higher upper bound on the tolerable size of perturbations as compared to standard DP.
Thus, RDP-based techniques for certifiable prediction robustness in the interpretation maps of convolutional neural networks would be desirable.
The present invention provides interpretation maps of convolutional neural networks having certifiable robustness using Rényi differential privacy. In one aspect of the invention, a method for generating an interpretation map is provided. The method includes: adding generalized Gaussian noise to an image x to obtain T noisy images, wherein the generalized Gaussian noise constitutes perturbations to the image x; providing the T noisy images as input to a convolutional neural network; calculating T noisy interpretations of output from the convolutional neural network corresponding to the T noisy images; re-scaling the T noisy interpretations using a scoring vector ν to obtain T re-scaled noisy interpretations; and generating the interpretation map using the T re-scaled noisy interpretations, wherein the interpretation map is robust against the perturbations.
Advantageously, the resulting interpretation map has certifiable robustness against perturbations which can be generated by measurement bias or even by adversaries, i.e., adversarial perturbations, and which can be imperceptible to the eye. Without such a robust interpretation, these perturbations can lead to a significant discrepancy in the associated interpretation map.
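The generalized Gaussian noise can be drawn from a generalized normal distribution (μ,σ,b). For instance, a random variable X follows the generalized normal distribution (μ,σ,b) if its probability density function is:
$$f(x)=\frac{b\,\sqrt{\Gamma(3/b)/\Gamma(1/b)}}{2\sigma\,\Gamma(1/b)}\exp\left[-\left(\sqrt{\frac{\Gamma(3/b)}{\Gamma(1/b)}}\cdot\frac{|x-\mu|}{\sigma}\right)^{b}\right],$$
wherein μ corresponds to an expectation of X, σ corresponds to a standard deviation of X, b corresponds to a shape factor of X, and Γ(·) refers to the gamma function.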
Introducing this generalized Gaussian noise to the input image generates T ‘votes,’ which are then aggregated using the scoring vector ν. The scoring vector ν can be designed according to a sigmoid function. According to an exemplary embodiment, the scoring vector ν=(ν1, . . . , νn). By way of example only, ν1≥ . . . ≥νn can be set for the scoring vector ν.
A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.
As provided above, many common interpretation maps of convolutional neural networks are vulnerable to external perturbations. For instance, slight perturbations to an input image, which may even be imperceptible to the human eye, can lead to a significant discrepancy in the associated interpretation map. These perturbations can be generated by measurement bias or even by adversaries, i.e., adversarial perturbations.
Advantageously, provided herein are Rényi differential privacy (RDP)-based techniques for certifiable prediction robustness against such perturbations. An RDP-based process can guarantee that its output distribution is insensitive to small perturbations of the input. In the context of the present invention, the term ‘privacy’ generally refers to an amount of information that is known by adversaries from the output. In other words, the more information known by adversaries the less privacy there is, and vice versa. Advantageously, as will be described in detail below, generalized Gaussian noise is added to an image to obtain T noisy images, which are then re-scaled using a scoring vector. The interpretation map generated from the re-scaled noisy images is certifiably robust against external perturbations.
Notably, the present RDP-based interpretation approach offers provable and certifiable top-k robustness. That is, the top-k important attributions of the interpretation map are provably robust under any input perturbation with bounded d-norm (for any d≥1, including d=∞). Further, the present RDP-based interpretation approach offers approximately 10% better experimental robustness than existing approaches in terms of the top-k attributions, and can provide a smooth tradeoff between robustness and computational efficiency. Thus, the certifiably robust interpretation map generated in accordance with the present techniques is also referred to herein as a Rényi-Robust-Smooth map. Experimentally, its top-k attributions are twice as robust as those of existing approaches when the computational resources are highly constrained.
As will be described in detail below, digital images are provided as input to a convolutional neural network for analysis such as for classification. By way of example only, if an input digital image provided to the convolutional neural network is that of a butterfly, then ideally the convolutional neural network outputs the classification (Butterfly). As highlighted above, interpretation maps are used to show how important the pixels of the input image are in the process of the image classification.
One important aspect of this process is whether the predictions made by the convolutional neural network are trustworthy. For instance, if slight imperceptible perturbations to the input image can cause vastly different interpretations and potential misclassification or misidentification of the image (see above), then this calls into question the trustworthiness of the predictions. Advantageously, the present techniques provide certifiably robust interpretation maps against d-norm perturbations, where d∈[1,∞] (i.e., d∈[1,∞] represents [1,∞)∪{∞}). Advantageously, the defender need not know the type of perturbation, as long as it is a d-norm perturbation. However, a stronger robustness guarantee can be provided if the exact perturbation type is known. The term ‘defender’ as used herein refers to a user(s) of the present techniques to defend against adversarial (or other) perturbations to the input image provided, e.g., by an adversary. The term ‘perturbation type’ generally refers to the method used to generate the adversarial perturbations.
In the following description, the interpretations of a convolutional neural network for image classification are used as an illustrative, non-limiting example, where the input is a digital image x, and the output is a label in C. An interpretation of this convolutional neural network explains why the convolutional neural network makes this decision by showing the importance of the features in the classification. An interpreter g: ℝⁿ×C→ℝⁿ maps the (image, label) pairs to an n-dimensional interpretation map. Both the input image x and interpretation map m are treated as vectors of length n. The dimension of the output can be different from that of the input; ℝⁿ is used for both to simplify the notation. The output of g consists of pixel-level attribution scores, which reflect the impact of each pixel on making the prediction. Because the perturbations will not change the prediction of the convolutional neural network, the label part of the input may be omitted when the context is clear (one input of the interpretation is the classification, i.e., the prediction of the neural network). When perturbations are introduced (e.g., by an adversary), x is replaced with its perturbed version x̃ while keeping the predicted label unchanged. It is assumed that the perturbation is constrained by the d-norm, i.e., ∥x−x̃∥_d ≜ (Σ_{i=1}^{n} |x_i−x̃_i|^d)^{1/d} ≤ L. When d=∞, the constraint becomes max_i |x_i−x̃_i| ≤ L.
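By way of example only, the d-norm constraint on a perturbation can be checked with the following minimal Python sketch. The function names `perturbation_norm` and `within_budget` are hypothetical and are introduced here for illustration only; the images are assumed to be given as flat numeric arrays.

```python
import numpy as np

def perturbation_norm(x, x_tilde, d):
    """Return the d-norm of x - x_tilde; d may be any value >= 1 or np.inf."""
    diff = np.asarray(x, dtype=float).ravel() - np.asarray(x_tilde, dtype=float).ravel()
    if np.isinf(d):
        # d = infinity: largest absolute per-pixel change
        return float(np.max(np.abs(diff)))
    return float(np.sum(np.abs(diff) ** d) ** (1.0 / d))

def within_budget(x, x_tilde, d, L):
    """True if the perturbed image stays within a d-norm ball of size L around x."""
    return perturbation_norm(x, x_tilde, d) <= L
```

For instance, a perturbation of (3, 4) has a 1-norm of 7, a 2-norm of 5 and an ∞-norm of 4, so whether it is within a budget of L=5 depends on which norm is used.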
According to an exemplary embodiment, interpretation robustness is measured by the overlapping ratio between the top-k components of g(x,C) and the top-k components of g(x̃,C). Here, V_k(g(x,C), g(x̃,C)) is used to denote this ratio. For example, V_2((1, 2, 3), (2, 1, 2))=0.5, because the 2nd and 3rd components are the top-2 components of (1, 2, 3) while the 1st and 3rd components are the top-2 components of (2, 1, 2), so that only one of the two top-2 components (the 3rd) is shared. It is notable that the relative order between top-k components is not taken into account.
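The overlapping ratio described above can be sketched, by way of a non-limiting example, as follows. The helper names are hypothetical. Ties are broken here in favor of the lower index, which is an assumption made for this sketch and is not specified in the description.

```python
import numpy as np

def top_k_indices(v, k):
    """Indices of the k largest components of v (ties go to the lower index)."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(-v, kind="stable")  # descending order of value
    return set(order[:k].tolist())

def top_k_overlap(v, v_tilde, k):
    """Fraction of the top-k components shared by v and v_tilde; order within the top-k is ignored."""
    shared = top_k_indices(v, k) & top_k_indices(v_tilde, k)
    return len(shared) / k

ratio = top_k_overlap([1, 2, 3], [2, 1, 2], k=2)  # the worked example from the text
```

Under this tie-breaking assumption, the sketch reproduces the ratio of 0.5 from the example above.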
To formally define the top-k overlapping ratio, the set of top-k components T_k(·) of a vector is first defined as,
$$T_k(\mathbf{x}) \triangleq \{x : x \in \mathbf{x} \wedge \#\{x' : x' \in \mathbf{x} \wedge x' \ge x\} \le k\}. \tag{1}$$
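Using the above notations, the top-k overlap ratio V_k between any pair of vectors x and x̃ is then defined as:
$$V_k(\mathbf{x},\tilde{\mathbf{x}}) \triangleq \frac{1}{k}\,\#\left[T_k(\mathbf{x}) \cap T_k(\tilde{\mathbf{x}})\right]. \tag{2}$$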
In one exemplary embodiment, β-Top-k Robustness (interpretation robustness) is defined as follows. For a given input x with label C, an interpretation method g(·,C) is β-Top-k robust to a d-norm perturbation of size L if for any x̃ s.t. ∥x−x̃∥_d≤L,
$$V_k\big(g(x,C),\,g(\tilde{x},C)\big) \ge \beta. \tag{3}$$
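As highlighted above, RDP is a novel generalization of standard DP. RDP uses Rényi divergence as the measure of difference between distributions. For instance, for two distributions P and Q with the same support S, the Rényi divergence of order α>1 is:
$$D_\alpha(P\,\|\,Q) \triangleq \frac{1}{\alpha-1}\ln \mathbb{E}_{x\sim Q}\left[\left(\frac{P(x)}{Q(x)}\right)^{\alpha}\right].$$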
In one exemplary embodiment, a generalized RDP is adopted by considering ‘adjacent’ inputs whose d-norm distance is no larger than L. Standard RDP assumes that the 0-norm distance between adjacent inputs is no more than 1. For instance, a randomized function g̃ is (α,ε,L)-Rényi differentially private to d distance, if for any pair of inputs x and x̃ s.t. ∥x−x̃∥_d≤L,
$$D_\alpha\big(\tilde{g}(x)\,\|\,\tilde{g}(\tilde{x})\big) \le \epsilon. \tag{4}$$
As in standard DP, smaller values of ε correspond to a more private g̃(·). RDP generalizes the standard DP, which is RDP with α=∞. As provided above, x represents the input image and g̃ represents a (randomized) interpretation algorithm (a tilde is placed on top of all randomized functions).
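By way of illustration only, the Rényi divergence of order α>1 between two discrete distributions with the same support can be computed as in the following sketch. It assumes the standard definition D_α(P∥Q) = (1/(α−1))·ln Σ_i Q_i·(P_i/Q_i)^α; the function name `renyi_divergence` is hypothetical.

```python
import numpy as np

def renyi_divergence(p, q, alpha):
    """Renyi divergence of order alpha > 1 between discrete distributions p and q.

    Both arguments are probability vectors with the same support (all entries > 0).
    """
    if alpha <= 1:
        raise ValueError("this sketch only covers orders alpha > 1")
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.log(np.sum(q * (p / q) ** alpha)) / (alpha - 1.0))

# Identical distributions are at divergence zero; the value grows as they differ.
same = renyi_divergence([0.5, 0.5], [0.5, 0.5], alpha=2.0)
different = renyi_divergence([0.5, 0.5], [0.25, 0.75], alpha=2.0)  # equals ln(4/3)
```

A smaller divergence between the output distributions for x and x̃ corresponds to a smaller ε in the RDP condition.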
It is assumed that the randomized function g̃(·) is Rényi differentially private. Thus, for any input x, g̃(x) (which is a distribution) is insensitive to small perturbations on x. For instance, consider a deterministic algorithm h(x) ≜ E_g̃[g̃(x)] that outputs the expectation of g̃(x). It is thus intuitive that h(x) is also insensitive to small perturbations on x. In other words, the RDP property of g̃(·) leads to the robustness of h(·).
A new robustness notion, Rényi robustness, is provided herein whose merits are two-fold. First, the (α,ε,L)-RDP property of g̃(·) directly leads to the (α,ε,L)-Rényi robustness of E[g̃(·)]. Thus, like RDP, Rényi robustness also has many desirable properties. Second, Rényi robustness is closely related to other robustness notions. For instance, when α=∞, the robustness parameter ε measures the average relative change of output when the input is perturbed. Further, as will be described in detail below, Rényi robustness is closely related to β-top-k robustness.
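According to an exemplary embodiment, for a given input x∈A, a deterministic algorithm h(·):A→[0,1]^n is (α,ε,L)-Rényi robust to d-norm perturbation if for any x and x̃ s.t. ∥x−x̃∥_d≤L,
$$R_\alpha\big(h(x)\,\|\,h(\tilde{x})\big) = \frac{1}{\alpha-1}\ln\left(\sum_{i=1}^{n} h(\tilde{x})_i\left(\frac{h(x)_i}{h(\tilde{x})_i}\right)^{\alpha}\right) \le \epsilon,$$
wherein h(·)_i refers to the i-th component of h(·).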
Given the above overview,
As shown in step 102, the inputs to the process are an image x, a base interpretation method g, a scoring vector ν, a number of samples T, and a convolutional neural network. Thus, the image x may also be referred to herein as an ‘input image’. According to an exemplary embodiment, the input image x is a digital image. In general, a digital image is composed of a collection of elements or pixels. Each pixel has associated therewith a numeric representation of its intensity or gray level. According to an exemplary embodiment, the number of samples T is user-defined and depends, for example, on the computation resources the user has.
As highlighted above, the interpreter g maps features in the input image x to an n-dimensional interpretation map. According to an exemplary embodiment, the interpreter g is a software tool that indicates how important the input pixels in the image x are to the output classification, e.g., via importance scores. For instance, the more important a pixel is, the larger its value will be in the interpretation maps, and vice versa.
As will be described in detail below, independently and identically distributed (i.i.d.) generalized Gaussian noise will be added to each pixel in the input image x to generate a plurality of noisy images from the input image x. Introducing this external noise to the input image generates T ‘votes,’ which are then aggregated using the scoring vector ν.
Convolutional neural networks are a type of deep neural network that are often applied in computer vision applications. In machine learning and cognitive science, deep neural networks are a family of statistical learning models inspired by the biological neural networks of animals, and in particular the brain. Deep neural networks may be used to estimate or approximate systems and cognitive functions that depend on a large number of inputs and weights of the connections which are generally unknown.
Deep neural networks are often embodied as so-called “neuromorphic” systems of interconnected processor elements that act as simulated “neurons” that exchange “messages” between each other in the form of electronic signals. See, for example,
Similar to the so-called ‘plasticity’ of synaptic neurotransmitter connections that carry messages between biological neurons, the connections in a deep neural network that carry electronic messages between simulated neurons are provided with numeric weights that correspond to the strength or weakness of a given connection. The weights can be adjusted and tuned based on experience, making deep neural networks adaptive to inputs and capable of learning. For example, a deep neural network for image recognition is defined by a set of input neurons (see, e.g., input layer 202 in deep neural network 200) which may be activated by the pixels of an input image. After being weighted and transformed by a function determined by the network's designer, the activations of these input neurons are then passed to other downstream neurons, which are often referred to as ‘hidden’ neurons (see, e.g., hidden layers 204 and 206 in deep neural network 200). This process is repeated until an output neuron is activated (see, e.g., output layer 208 in deep neural network 200). The activated output neuron determines what image was read.
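By way of a non-limiting illustration, the layer-by-layer propagation described above (input layer, hidden layers, output layer) can be sketched as a small fully-connected forward pass. The layer sizes, the random weights, the ReLU transformation and the softmax output are all assumptions chosen for this sketch and do not describe deep neural network 200 itself.

```python
import numpy as np

def relu(z):
    """Transformation applied to the weighted activations of a hidden layer."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Turn output-layer activations into class probabilities."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(pixels, weights, biases):
    """Pass input activations through the hidden layers and then the output layer."""
    activation = pixels
    for W, b in zip(weights[:-1], biases[:-1]):
        activation = relu(W @ activation + b)            # hidden layers
    return softmax(weights[-1] @ activation + biases[-1])  # output layer

rng = np.random.default_rng(0)
sizes = [16, 8, 8, 3]  # hypothetical: 16 input pixels, two hidden layers, 3 classes
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
probabilities = forward(rng.random(16), weights, biases)
predicted_class = int(np.argmax(probabilities))  # the most strongly activated output neuron
```

In this sketch, adjusting the entries of `weights` corresponds to tuning the connection strengths based on experience.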
Instead of utilizing the traditional digital model of manipulating zeros and ones, deep neural networks such as deep neural network 200 create connections between processing elements that are substantially the functional equivalent of the core system functionality that is being estimated or approximated. For example, IBM's SyNapse computer chip is the central component of an electronic neuromorphic machine that attempts to provide similar form, function and architecture to the mammalian brain. Although the IBM SyNapse computer chip uses the same basic transistor components as conventional computer chips, its transistors are configured to mimic the behavior of neurons and their synapse connections. The IBM SyNapse computer chip processes information using a network of just over one million simulated “neurons,” which communicate with one another using electrical spikes similar to the synaptic communications between biological neurons. The IBM SyNapse architecture includes a configuration of processors (i.e., simulated “neurons”) that read a memory (i.e., a simulated “synapse”) and perform simple operations. The communications between these processors, which are typically located in different cores, are performed by on-chip network routers.
Referring back to methodology 100 of
Notably, the i.i.d. generalized Gaussian noise in step 104 is drawn from a generalized normal distribution (GND) (μ,σ,b). A random variable X follows the generalized normal distribution (μ,σ,b) if its probability density function is:
$$f(x)=\frac{b\,\sqrt{\Gamma(3/b)/\Gamma(1/b)}}{2\sigma\,\Gamma(1/b)}\exp\left[-\left(\sqrt{\frac{\Gamma(3/b)}{\Gamma(1/b)}}\cdot\frac{|x-\mu|}{\sigma}\right)^{b}\right],$$
wherein μ, σ and b correspond to the expectation, standard deviation and shape factor of X, respectively, and Γ(·) refers to the gamma function. This GND generalizes Gaussian distributions (b=2) and Laplacian distributions (b=1).
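By way of example only, the following sketch evaluates the density of a generalized normal distribution and draws samples from it. It assumes the usual parameterization in which σ is the standard deviation, so that the scale equals σ·sqrt(Γ(1/b)/Γ(3/b)). The function names are hypothetical, and sampling through a gamma variate is one common technique rather than a method stated in the description.

```python
import math
import numpy as np

def gnd_scale(sigma, b):
    """Scale a such that a GND with shape b has standard deviation sigma."""
    return sigma * math.sqrt(math.gamma(1.0 / b) / math.gamma(3.0 / b))

def gnd_pdf(x, mu, sigma, b):
    """Density of the generalized normal distribution with mean mu, std sigma, shape b."""
    a = gnd_scale(sigma, b)
    x = np.asarray(x, dtype=float)
    return b / (2.0 * a * math.gamma(1.0 / b)) * np.exp(-(np.abs(x - mu) / a) ** b)

def sample_gnd(mu, sigma, b, size, rng):
    """Draw i.i.d. GND samples: (|X - mu| / a)^b follows a Gamma(1/b, 1) distribution."""
    a = gnd_scale(sigma, b)
    magnitude = rng.gamma(shape=1.0 / b, scale=1.0, size=size) ** (1.0 / b)
    sign = rng.choice([-1.0, 1.0], size=size)
    return mu + a * sign * magnitude

rng = np.random.default_rng(0)
noise = sample_gnd(mu=0.0, sigma=2.0, b=4.0, size=200_000, rng=rng)
```

With b=2 the density coincides with a Gaussian density, and with b=1 it coincides with a Laplacian density whose variance is σ².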
The shape parameter d* is set according to the prior knowledge of the defender. It is assumed that the defender knows that the perturbations are based on d-norm wherein d≤dprior. For instance, dprior=∞ means the defender has no prior knowledge about the perturbations. According to an exemplary embodiment, the shape parameter d* is set according to the following setting:
In other words, d* is dprior rounded up to the next even number, except when dprior=1 or dprior is sufficiently large. An upper threshold is set on d* because the (ln n)-norm is close to the ∞-norm in practice.
As shown in
In step 106, the interpreter g is used to calculate T noisy interpretation maps (or simply ‘interpretations’) g̃_t*(x) ≜ g(x+δ_t) from the output of the convolutional neural network corresponding to the T noisy (input) images. As provided above, according to an exemplary embodiment, the interpreter g is a software tool that indicates how important the pixels in the input image x are to the output classification, e.g., via importance scores. For instance, the more important a pixel is, the larger its value will be in the interpretation maps, and vice versa. By way of example only, in one exemplary embodiment, the interpreter g converts the T noisy images into saliency maps. A saliency map is an image that illustrates the unique quality of each pixel in the image. Saliency maps simplify, or in some other way change, the representation of an image into something that is simpler and easier to analyze.
An interpretation map such as a saliency map is a mapping of abstract concepts such as predictions from a convolutional neural network into a form that a human user can understand and use. For instance, users can interpret data in the form of images or text that they can view/read to understand. On the other hand, data containing sequences of unknown words and/or symbols are abstract and cannot be interpreted by users. Thus, an interpreter is needed to convert the data to a human-understandable form. According to an exemplary embodiment, commercially available interpreter software such as DeepLIFT (Deep Learning Important FeaTures) and/or Integrated Gradients is used.
In step 108, the T noisy interpretations g̃_t* are re-scaled (i.e., normalized) using a scoring vector ν=(ν1, . . . , νn) to obtain T re-scaled noisy interpretations. As will be described in detail below, the scoring vector ν is designed according to a sigmoid function. The T noisy interpretations g̃_t* after re-scaling are denoted by g̃_t(x), where g̃_t(x)_i=ν_j if and only if g̃_t*(x)_i is ranked j-th in g̃_t*(x). Without loss of generality, ν1≥ . . . ≥νn is set for the scoring vector ν. This assumption guarantees that the pixels ranked higher in g̃_t(x) will contribute more to the Rényi-Robust-Smooth map.
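The rank-based re-scaling of step 108 can be sketched, by way of a non-limiting example, as follows. The particular sigmoid used to build the scoring vector (centered at rank k with a hypothetical `steepness` parameter) is an assumption made for illustration, since the description only states that ν is designed according to a sigmoid function and is non-increasing.

```python
import numpy as np

def sigmoid_scoring_vector(n, k, steepness=0.1):
    """Hypothetical non-increasing scoring vector: close to 1 for ranks above k, close to 0 below."""
    ranks = np.arange(1, n + 1)
    return 1.0 / (1.0 + np.exp(steepness * (ranks - k)))

def rescale_by_rank(interpretation, v):
    """Replace each attribution score by v_j, where j is the score's rank (1 = largest)."""
    flat = np.asarray(interpretation, dtype=float).ravel()
    order = np.argsort(-flat, kind="stable")  # order[j] is the pixel ranked (j + 1)-th
    rescaled = np.empty_like(flat)
    rescaled[order] = v                        # the pixel ranked j-th receives v_j
    return rescaled.reshape(np.shape(interpretation))

v = sigmoid_scoring_vector(n=3, k=2)
example = rescale_by_rank([0.2, 0.9, 0.5], [3.0, 2.0, 1.0])  # -> scores 1, 3, 2 by pixel
```

Because only the ranks of the noisy attribution scores are used, the re-scaled interpretation does not depend on the absolute magnitude of the raw scores.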
In step 110, a certifiably robust interpretation map m is generated using the re-scaled T noisy interpretations g̃_t(x), i.e.,
$$m = \frac{1}{T}\sum_{t=1}^{T}\tilde{g}_t(x).$$
Namely, according to an exemplary embodiment, an averaging of the re-scaled T noisy interpretations is performed in step 110 to generate interpretation map m. By ‘averaging’ it is meant that every pixel of the output interpretation map m is the arithmetic average of the corresponding pixels in the re-scaled T noisy interpretations. Advantageously, the interpretations generated in accordance with the present techniques are certifiably robust against external perturbations. As highlighted above, small perturbations to the input image x that might be imperceptible to the human eye can undesirably lead to different interpretations. With the present techniques, however, the interpretations are robust to external perturbations, meaning that bounded external perturbations leave the top-k attributions of the interpretations largely unchanged.
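By way of example only, the sequence of adding noise, interpreting, re-scaling and averaging can be sketched end to end as follows. This is a minimal sketch under stated assumptions: the noise is Gaussian (the generalized normal case with shape factor 2), and `toy_interpreter` is a hypothetical stand-in for a real interpreter such as DeepLIFT or Integrated Gradients applied to a convolutional neural network.

```python
import numpy as np

def renyi_robust_smooth(x, interpreter, v, T, sigma, rng):
    """Average T rank-rescaled interpretations of noisy copies of the image x."""
    x = np.asarray(x, dtype=float)
    total = np.zeros(x.size)
    for _ in range(T):
        delta = rng.normal(0.0, sigma, size=x.shape)   # i.i.d. noise on every pixel
        noisy_interp = np.asarray(interpreter(x + delta), dtype=float).ravel()
        order = np.argsort(-noisy_interp, kind="stable")
        rescaled = np.empty(x.size)
        rescaled[order] = v                            # pixel ranked j-th gets v_j
        total += rescaled
    return (total / T).reshape(x.shape)                # per-pixel arithmetic average

# Hypothetical toy interpreter: attribution = |weight * pixel| for a fixed linear model.
toy_weights = np.array([5.0, 0.1, 2.0, 0.1])

def toy_interpreter(image):
    return np.abs(toy_weights * image)

rng = np.random.default_rng(0)
scoring = np.array([1.0, 0.5, 0.0, 0.0])  # non-increasing scoring vector
m = renyi_robust_smooth(np.ones(4), toy_interpreter, scoring, T=50, sigma=0.05, rng=rng)
```

In this toy setting the first and third pixels dominate for every noise draw, so the resulting map assigns them the two largest scores.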
An exemplary implementation of methodology 100 is now described by way of reference to
In step S1, i.i.d. generalized Gaussian noise is added to the input image 302. The i.i.d. generalized Gaussian noise constitutes perturbations to the input image 302. According to an exemplary embodiment, the i.i.d. generalized Gaussian noise is added to each pixel of the input image 302. In the same manner as described above, step S1 is independently repeated T times (e.g., from about 20 times to about 200 times and ranges therebetween) on the input image 302 to obtain T images with noise (i.e., T noisy images 304).
It is notable that these slight perturbations to the input image 302 are imperceptible to the human eye. For instance, if one compares input image 302 to the T noisy images 304 shown in
The T noisy images 304 serve as input to a convolutional neural network which will analyze the image data, and in step S2 an interpreter is used to calculate T noisy interpretations 306 of the output from the convolutional neural network corresponding to the T noisy images 304. According to an exemplary embodiment, commercially available interpreter software such as DeepLIFT (Deep Learning Important FeaTures) and/or Integrated Gradients is used. According to an exemplary embodiment, the T noisy interpretations 306 are in the form of saliency maps.
In step S3, the T noisy interpretations 306 are re-scaled using a scoring vector ν to obtain T re-scaled noisy interpretations 308. According to an exemplary embodiment, the scoring vector ν=(ν1, . . . , νn), and ν1≥ . . . ≥νn is set for the scoring vector ν.
In step S4, a certifiably robust interpretation map 310 is generated using the re-scaled T noisy interpretations 308. According to an exemplary embodiment, an averaging of the re-scaled T noisy interpretations is performed in step S4 to generate interpretation map 310. As provided above, averaging means that every pixel of the output interpretation map 310 is the arithmetic average of the corresponding pixels in the re-scaled T noisy interpretations 308.
The robustness of the expected output, denoted as E[g̃(x)], of the present Rényi-Robust-Smooth process is now described by way of reference to workflow 400 shown in
As highlighted above, the noise added to the image can guarantee the RDP property of g̃(x). See description of Theorem 3, below. According to the intuitions of the RDP-robustness connection described above, it is expected that E[g̃(x)] is robust to input perturbations. Here, m=E[g̃(x)] and m̃=E[g̃(x̃)].
As highlighted above, Theorem 1 connects RDP with Rényi robustness. Namely, if a randomized function g̃(·) is (α,ε,L)-RDP to d distance, then E_g̃[g̃(·)] is (α,ε,L)-Rényi robust to d distance.
As highlighted above, Theorem 2 connects Rényi robustness and β-top-k robustness. To simplify the notation, it is assumed that the Rényi-Robust-Smooth map m=(m_1, . . . , m_n) is normalized (∥m∥_1=1). Further, is used to denote the i-th largest component in m. k_0 ≜ ⌊(1−β)k⌋+1 is used to denote the minimum number of changes needed to violate the β-ratio overlapping of top-k (i.e., to make the overlapping ratio less than β). S={k−k_0+1, . . . , k+k_0} is used to denote the set of the last k_0 components in the top-k and the top k_0 components out of the top-k. In Theorem 2, let function h(·) be (α,ε,L)-Rényi robust to d distance. Then, m=h(x) is β-Top-k robust to a d-norm perturbation of size L, if
In other words, Theorem 2 shows the level of Rényi robustness required by the β-top-k robustness condition. Regarding ε_robust, there are two terms inside of the ln(·) function. The first term corresponds to the minimum information loss to replace k_0 items from the top-k. The second term corresponds to the unchanged components in the process of replacement.
Combining the results in Theorem 1 and Theorem 2, it is seen that β-top-k robustness can be guaranteed by the Rényi differential privacy property on the randomized interpretation algorithm g̃(x)=ƒ_ν(g(x+δ,C)), where ƒ_ν is the re-scaling function using scoring vector ν. Next, the RDP property of g̃(·) is shown. In the following description, Γ(·) is used to denote the gamma function and 𝟙(·) is used to denote the indicator function. The following equations are used to simplify notation,
Regarding noise level and RDP, Theorem 3 provides that, for any re-scaling function ƒ_ν(·), letting g̃(x)=ƒ_ν(g(x+δ)) where δ∼GND(0,σ²I,d*), g̃ has the following properties with respect to d distance:
wherein, according to the post processing property of RDP, D_α(g̃(x)∥g̃(x̃))≤D_α(x+δ∥x̃+δ). When d∈[1,2], it is always the case that d*≥d and ∥x−x̃∥_d≥∥x−x̃∥_{d*}. Thus, all conclusions requiring ∥x−x̃∥_{d*}≤L will also hold when the requirement becomes ∥x−x̃∥_d≤L. Then, 2° of Theorem 3 follows by standard conclusions of RDP on Gaussian and Laplacian mechanisms. In Lemma 4 (see below), the RDP bound is proven for generalized normal mechanisms of even shape parameters. This bound can be directly applied to bound the KL-privacy of generalized Gaussian mechanisms.
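By way of illustration only, the standard RDP result for the Gaussian mechanism referred to above can be checked numerically. For two one-dimensional Gaussians with the same standard deviation σ whose means differ by L, the Rényi divergence of order α is known to equal αL²/(2σ²). The sketch below compares this closed form with a direct numerical integration; the function name and the grid parameters are assumptions chosen for this example.

```python
import numpy as np

def gaussian_renyi_numeric(alpha, L, sigma, num=200_001):
    """Numerically integrate the Renyi divergence between N(0, sigma^2) and N(L, sigma^2)."""
    half_width = abs(alpha * L) + 10.0 * sigma          # wide enough to cover the integrand
    z = np.linspace(-half_width, half_width, num)
    dz = z[1] - z[0]
    log_norm = -0.5 * np.log(2.0 * np.pi * sigma ** 2)
    log_p = log_norm - z ** 2 / (2.0 * sigma ** 2)        # log density of N(0, sigma^2)
    log_q = log_norm - (z - L) ** 2 / (2.0 * sigma ** 2)  # log density of N(L, sigma^2)
    # E_Q[(P/Q)^alpha] = integral of p^alpha * q^(1 - alpha), evaluated in log space
    integrand = np.exp(alpha * log_p + (1.0 - alpha) * log_q)
    return float(np.log(np.sum(integrand) * dz) / (alpha - 1.0))

def gaussian_renyi_closed_form(alpha, L, sigma):
    """Closed-form Renyi divergence for equal-variance Gaussians shifted by L."""
    return alpha * L ** 2 / (2.0 * sigma ** 2)

numeric = gaussian_renyi_numeric(alpha=2.0, L=1.0, sigma=1.0)
exact = gaussian_renyi_closed_form(alpha=2.0, L=1.0, sigma=1.0)
```

The divergence shrinks as σ grows, which reflects how a larger noise level tolerates a larger perturbation size L for the same ε.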
Lemma 4: for any positive even shape factor b, letting x∼GND(0,σ,b) and x̃∼GND(L,σ,b) provides
it is always the case that d*≥d and 1° of this case holds according to the same reason as 2° (see above). When
∥x−x̃∥_{d*} ≤ n^{1/d*−1/d}·∥x−x̃∥_d ≤ e·∥x−x̃∥_d. Thus, 1° of Theorem 3 also holds when
Combining the conclusions of Theorem 1-3, the theoretical robustness of Rényi-Robust-Smooth shown in
It has been assumed thus far that the value of m=E[g̃(x)] can be computed efficiently. However, there may not even exist a closed form expression of m, and thus m may not be computed efficiently. Here, the robustness-computational efficiency trade-off is evaluated when m is approximated through sampling. That is, m is approximated using (1/T)Σ_{t=1}^{T} g̃_t(x) (the same procedure as methodology 100). To simplify notation, β̂ is used to denote the calculated top-k robustness from table 500. β is used to denote the real robustness parameter of m.
It is notable that the present approach will become more computationally efficient when the number of samples T becomes smaller. Theorem 5 shows that the Rényi robustness parameter ε_robust is more likely to be large when the number of samples T becomes larger. The conclusion on attack size L or robustness parameter β follows by applying Theorem 5 to table 500 or Theorem 2, respectively. Namely, according to Theorem 5, letting ε̂_robust denote the estimated Rényi robustness parameter from T samples, then
$$\Pr\left[\epsilon_{robust} \ge (1-\delta_{\epsilon})\,\hat{\epsilon}_{robust}\right] \ge 1-\mathrm{negl}(T), \tag{9}$$
wherein negl(·) refers to a negligible function.
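By way of a non-limiting illustration, the effect of the number of samples T on the sampled approximation of m can be observed with the following sketch. It only measures the sampling error of the averaged map against a large-T reference and does not evaluate the bound of Theorem 5. The toy interpreter, the Gaussian noise (shape factor 2), the scoring vector and all numeric settings are assumptions made for this example.

```python
import numpy as np

def estimate_map(x, w, v, T, sigma, rng):
    """Monte Carlo estimate of m from T noisy, rank-rescaled toy interpretations."""
    noise = rng.normal(0.0, sigma, size=(T, x.size))     # T independent noise draws
    interp = np.abs(w * (x + noise))                      # hypothetical toy interpreter
    order = np.argsort(-interp, axis=1, kind="stable")
    ranks = np.argsort(order, axis=1, kind="stable")      # ranks[t, i] = 0 for the largest score
    return v[ranks].mean(axis=0)                          # average of the re-scaled maps

rng = np.random.default_rng(0)
x = np.ones(8)
w = np.linspace(1.0, 2.0, 8)      # hypothetical per-pixel weights
v = np.linspace(1.0, 0.0, 8)      # non-increasing scoring vector
reference = estimate_map(x, w, v, 50_000, 0.3, rng)

# Mean absolute deviation from the reference for increasing sample counts.
errors = {
    T: float(np.mean(np.abs(estimate_map(x, w, v, T, 0.3, rng) - reference)))
    for T in (10, 100, 1000)
}
```

A smaller T requires fewer interpreter evaluations but yields a noisier estimate of m, which is the trade-off between computational efficiency and robustness discussed above.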
As provided above, a digital image x such as image 604 is provided as input to the present process. By way of example only, this original input image 604 can be provided by a user 606 to computer-based apparatus 602. Computer-based apparatus 602 is then configured to carry out one or more of the steps of methodology 100 of
The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
Turning now to
Apparatus 700 includes a computer system 710 and removable media 750. Computer system 710 includes a processor device 720, a network interface 725, a memory 730, a media interface 735 and an optional display 740. Network interface 725 allows computer system 710 to connect to a network, while media interface 735 allows computer system 710 to interact with media, such as a hard drive or removable media 750.
Processor device 720 can be configured to implement the methods, steps, and functions disclosed herein. The memory 730 could be distributed or local and the processor device 720 could be distributed or singular. The memory 730 could be implemented as an electrical, magnetic or optical memory, or any combination of these or other types of storage devices. Moreover, the term “memory” should be construed broadly enough to encompass any information able to be read from, or written to, an address in the addressable space accessed by processor device 720. With this definition, information on a network, accessible through network interface 725, is still within memory 730 because the processor device 720 can retrieve the information from the network. It should be noted that each distributed processor that makes up processor device 720 generally contains its own addressable memory space. It should also be noted that some or all of computer system 710 can be incorporated into an application-specific or general-use integrated circuit.
Optional display 740 is any type of display suitable for interacting with a human user of apparatus 700. Generally, display 740 is a computer monitor or other similar display.
The present techniques are further illustrated by the following non-limiting examples. The robustness and interpretation accuracy of the present Rényi-Robust-Smooth process were evaluated experimentally. It was found that the present approach performs better in terms of both robustness and interpretation accuracy as compared to other approaches such as Sparsified SmoothGrad (a sparsified version of the SmoothGrad method).
In this implementation, the PASCAL Visual Object Classes Challenge 2007 dataset (VOC2007) was used to evaluate both interpretation robustness and interpretation accuracy. The annotated object positions in VOC2007 were compared with the interpretation maps to benchmark the accuracy of the interpretation methods. The Visual Geometry Group (University of Oxford) VGG-16 network was adopted as the convolutional neural network backbone for all the experiments. See Simonyan et al., “Very Deep Convolutional Networks for Large-Scale Image Recognition,” arXiv:1409.1556 (April 2015) (14 pages). Simple Gradient was used as the base interpretation method g (denoted as g (·) in the input of methodology 100—see above).
The evaluation was focused on the most challenging case of ℓd-norm perturbation: an ℓ∞-norm perturbation. The robustness and accuracy of the present process were evaluated under the standard ℓ∞-norm perturbation against the top-k component of Simple Gradient. The detailed configuration of the ℓ∞-norm perturbation on top-k overlap was as follows:
Inputs: an integer k, a learning rate lr, an input image x∈ℝ^n, an interpretation method g (·,·), a maximum ℓ∞-norm perturbation L, and the number of iterations T.
In one exemplary implementation, the following parameters were used: ℓ∞-norm perturbation size L=8/256≈0.03, learning rate lr=0.5 and the number of iterations T=300.
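The two quantities named in this configuration, the top-k overlap and the bound L on the perturbation, can be illustrated with a minimal NumPy sketch. The helper names `top_k_overlap` and `project_linf` are hypothetical and the update rule of the attack itself is not reproduced here; the sketch only assumes that overlap is measured as the fraction of shared top-k pixel indices and that the perturbed image is clipped back into the ball of radius L around x after each iteration.

```python
import numpy as np

def top_k_overlap(map_a, map_b, k):
    """Fraction of the k largest-valued pixel indices shared by two maps (assumed metric)."""
    top_a = set(np.argsort(map_a.ravel())[-k:].tolist())
    top_b = set(np.argsort(map_b.ravel())[-k:].tolist())
    return len(top_a & top_b) / k

def project_linf(x_adv, x, radius):
    """Clip x_adv so that every pixel stays within `radius` of the original image x."""
    return np.clip(x_adv, x - radius, x + radius)

# Toy usage with the perturbation size quoted above (L = 8/256).
L = 8 / 256
x = np.linspace(0.0, 1.0, 12).reshape(3, 4)
x_adv = project_linf(x + 0.5, x, L)  # an over-large step is pulled back to the ball
```

In an iterative attack, a projection of this kind would typically follow each gradient step of size lr, so that the final perturbation never exceeds L in any pixel.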
According to the noise setup described above, the shape factor of the GND was set to d*=10 in consideration of the VOC2007 dataset image size. To compare with other approaches, the standard deviation of the noise was fixed at 0.1. The scoring vector ν used to aggregate “votes” is designed according to a sigmoid function. It is provided that
$\nu_i = Z^{-1} [1 + e^{\eta (i - k^*)}]^{-1}$,
where $Z = \sum_{i'=1}^{n} [1 + e^{\eta (i' - k^*)}]^{-1}$ is the normalization factor. k* and η are user-defined parameters that control the shape of ν, wherein $\eta = 10^{-4}$ is set for the purposes of the present description.
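As a hedged illustration of the sigmoid-shaped scoring vector, the following sketch assumes that each entry of ν is the sigmoid term appearing in the normalization factor Z, divided by Z, so that the entries sum to one. The function name `scoring_vector` and the example values of n and k* are hypothetical.

```python
import numpy as np

def scoring_vector(n, k_star, eta=1e-4):
    """Sigmoid-shaped, non-increasing scores for ranks 1..n, normalized to sum to 1."""
    ranks = np.arange(1, n + 1)
    raw = 1.0 / (1.0 + np.exp(eta * (ranks - k_star)))  # one sigmoid term per rank
    return raw / raw.sum()                               # division by Z

# Hypothetical sizes chosen only for illustration.
nu = scoring_vector(n=100, k_star=20)
```

With a value of η as small as the one quoted above, the scores decay very gently with rank; a larger η would concentrate the weight on ranks below k*.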
The β-top-k robustness of Rényi-Robust-Smooth was compared with that of Sparsified SmoothGrad. Rényi-Robust-Smooth obtains not only a tighter robustness bound, but also stronger robustness against the ℓ∞-norm attack in comparison with the base interpretation method. For both Sparsified SmoothGrad and the present approach, the number of samples T was set to 50.
With regard to experimental Robustness,
The dash lines in
Accuracy is another main concern of interpretation maps, that is, the extent to which the main attributes of an interpretation overlap with the annotated objects. A generalization of a pointing game was used to evaluate the performance of the present attribution approach. In a standard pointing game, an interpretation method calculates the interpretation map and compares it with the annotated object. If the top pixel in the interpretation map is within the object, a score of “+1” will be granted to the object. Otherwise, a score of “−1” will be granted. The pointing game score is the average score of all objects.
However, the top-1 pixel is not the only pixel that affects the quality of interpretation maps. Hence, the idea of the pointing game is generalized here to the top-k pixels by checking the ratio of the top-k pixels of the interpretation map that fall within the region of the object and applying the same procedure as the standard pointing game.
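The standard pointing game and its top-k generalization can be sketched as follows. This is a minimal illustration under stated assumptions: the annotated object is given as a boolean mask, the function names are hypothetical, and the sketch stops at the ratio of top-k pixels inside the object because the text does not specify how that ratio is converted into a score.

```python
import numpy as np

def pointing_game_score(interp_map, object_mask):
    """Return +1 if the single highest-valued pixel lies inside the object, else -1."""
    top = np.unravel_index(np.argmax(interp_map), interp_map.shape)
    return 1 if object_mask[top] else -1

def top_k_inside_ratio(interp_map, object_mask, k):
    """Fraction of the k highest-valued pixels that lie inside the object mask."""
    top_idx = np.argsort(interp_map.ravel())[-k:]
    return float(object_mask.ravel()[top_idx].mean())

# Toy 2x2 example: only the top-right pixel belongs to the object.
interp_map = np.array([[0.1, 0.9], [0.8, 0.2]])
object_mask = np.array([[False, True], [False, False]])
score = pointing_game_score(interp_map, object_mask)     # top pixel (0.9) is inside
ratio = top_k_inside_ratio(interp_map, object_mask, k=2)  # 0.9 inside, 0.8 outside
```

Averaging `pointing_game_score` over all annotated objects gives the pointing game score described above.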
Computational efficiency is another main concern of interpretation methods. However, most previous works on certified robustness of interpretations did not pay much attention to computational efficiency. If the settings of those works are applied to larger images (e.g., ImageNet or VOC figures) or complex convolutional neural networks (e.g., VGG or GoogLeNet), it may take hours to generate a single interpretation map. Thus, the performance of the present approach was experimentally verified when the number of generated noisy interpretations T is no larger than 30. Here, T can be used as a measure of computational resources because the time to compute one noisy interpretation map is similar in the present approach and Sparsified SmoothGrad. It is also notable that the time taken by other steps is negligible in comparison with computing the noisy interpretations. Table 900 in
The following is a proof of the Theorem 1 that was described in conjunction with the description of
Applying generalized Radon's inequality provides,
The (α,∈)-RDP property of ‘gt(x) provides
Using the condition of ∥ν∥1=1 and νi≤1,
$R_\alpha([m] \| [\tilde{m}]) \le \epsilon$.
At this point, RDP has been connected to robustness as measured by the Rényi divergence.
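For readers who want to evaluate the quantity bounded here, the Rényi divergence of order α between two discrete distributions can be computed directly from its standard definition, $R_\alpha(P \| Q) = \frac{1}{\alpha - 1} \ln \sum_i p_i^{\alpha} q_i^{1-\alpha}$. The sketch below is illustrative only: the function name is hypothetical, and it assumes both inputs are strictly positive vectors that sum to one.

```python
import numpy as np

def renyi_divergence(p, q, alpha):
    """Rényi divergence of order alpha (alpha > 0, alpha != 1) between discrete p and q."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0))

# Toy distributions: sum p^2 / q = 1 + 1/3, so the order-2 divergence is ln(4/3).
p = [0.5, 0.5]
q = [0.25, 0.75]
d2 = renyi_divergence(p, q, alpha=2.0)
```

A check of this kind can be used to compare the divergence between a normalized interpretation map and its perturbed counterpart against a bound ∈.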
The following is a proof of the Theorem 2 that was described in conjunction with the description of
In the remainder of the proof of Theorem 2, mi and {tilde over (m)}i are used to represent [mi] and [{tilde over (m)}i] respectively. Without loss of generality, it is assumed that m1≥ . . . ≥mn. Then, it is shown that {tilde over (m)}1≥ . . . ≥mk−k
It can be seen that s([mi]∥[{tilde over (m)}]) reaches the minimum under the same condition as Rα([mi]∥[{tilde over (m)}]). To outline the proof, the following claims are proven:
1° [Natural] To reach the minimum, there are exactly k0 different components in the top-k of [mi] and [{tilde over (m)}i].
2° To reach the minimum, {tilde over (m)}k−k
2°′ To reach the minimum, {tilde over (m)}k+1, . . . , {tilde over (m)}k+k
3° To reach the minimum, one must have {tilde over (m)}i≥{tilde over (m)}j for any i≤j.
It can be seen that the above claims on [{tilde over (m)}] are equivalent to the following Karush-Kuhn-Tucker (KKT) condition:
Solving this, it is known that s([mi]∥[{tilde over (m)}]) reaches minimum when
wherein
Plugging in the above condition provides,
It is then known that the β-top-k robustness condition will be fulfilled if the Rényi divergence does not exceed the above value.
The following are proofs of the claims used in the proof of Theorem 2 (see above):
1° [Natural] To reach the minimum, there are exactly k0 different components in the top-k of [m] and [{tilde over (m)}].
Proof. Assume that i1, . . . , ik
is replaced by
in other words, there are k0+j displacements in the top-k of {tilde over (m)} while there are k0+j−1 displacements in the top-k of {tilde over (m)}(2). Thus,
it is provided that s([mi]∥[{tilde over (m)}(2)])−s([mi]∥[{tilde over (m)}])≤0. Thus, reducing the number of misplacements in top-k can reduce the value of s([m]∥[{tilde over (m)}]). If at least k0 displacements are required, then the minimum of s([m]∥[{tilde over (m)}]) must be reached when there are exactly k0 displacements.
2° To reach the minimum, {tilde over (m)}k−k
Proof. Assume that i1, . . . , ik
Note that {tilde over (m)}i
2°′ To reach the minimum, {tilde over (m)}k+1, . . . , {tilde over (m)}k+k
The following is a proof of the Lemma 4 that was described in conjunction with the description of
Because the behavior of Dα when α→1 is of interest, to simplify notation, it is provided that δ=α−1. When α−1→0, a first-order approximation is applied to exp(·),
It is notable that the first order approximation is exact when α→1. Thus, all “≈” become “=” when α→1. Because
is the probability density function (PDF) of GND(0,σ,b), then
By applying first-order approximation to ln(·),
provides,
When b−i is odd,
is an odd function and the integral will be zero. When b−i is even, it will become an even function. Thus,
Through substituting
Finally, Lemma 4 follows by the definition of Gamma function.
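The two facts this proof relies on, that odd integrands against a symmetric density vanish and that the remaining even terms reduce to Gamma functions, can be illustrated with the closed-form raw moments of a zero-mean generalized normal distribution. The sketch assumes the common parametrization with density proportional to exp(−|x/a|^b), where the scale a is chosen so that the standard deviation equals σ; the parametrization used in the derivation above may differ, and the function names are hypothetical.

```python
import math

def gnd_scale(sigma, b):
    """Scale a such that a density proportional to exp(-|x/a|**b) has standard deviation sigma."""
    return sigma * math.sqrt(math.gamma(1.0 / b) / math.gamma(3.0 / b))

def gnd_raw_moment(p, sigma, b):
    """E[X**p] for the zero-mean GND: zero for odd p, a Gamma-function ratio for even p."""
    if p % 2 == 1:
        return 0.0  # odd integrand against a symmetric density
    a = gnd_scale(sigma, b)
    return a**p * math.gamma((p + 1.0) / b) / math.gamma(1.0 / b)

# Shape factor 10 and standard deviation 0.1, as in the experimental setup.
second_moment = gnd_raw_moment(2, sigma=0.1, b=10)
# With b = 2 the GND is Gaussian, whose fourth moment is 3 * sigma**4.
gaussian_fourth = gnd_raw_moment(4, sigma=1.0, b=2)
```

The Gaussian case (b = 2) serves as a sanity check, since its moments are well known.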
The following is a formal version of Theorem 5 (with proof) that was described in conjunction with the description of
Considering that ∥m∥1=1, it is provided that ∈robust=−ln(1+ϕ(m)). To simplify notation, let
Thus, $\phi(m) = 2k_0\psi(m) - \sum_{i \in S} m_i^*$.
Lemma 6: using the above notations:
Proof. The following statement is first proven: if |{circumflex over (m)}i−mi|≤δ for all i∈[n], then it always holds that:
wherein ϕ(·) is a concave function when all components of the input vector are positive. Thus, if ϕ(m)≥ϕ({circumflex over (m)}),
Then, Inequality 10 follows by the fact that
Then, according to Hoeffding bound:
$\Pr[\forall i \in [n], |\hat{m}_i - m_i| \le \delta] \ge 1 - n \cdot \exp(-2T\delta^2)$.
Because [∀i∈[n],|{circumflex over (m)}i−mi|≤δ] is a special case of [ϕ(m)≤ϕ({circumflex over (m)})+δ], Lemma 6 follows.
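To give a feel for the numbers involved, the lower bound 1 − n·exp(−2Tδ²) stated above can be evaluated directly, and inverted to find how many noisy interpretations T are needed for a target confidence. The function names and the example values of n, δ and the failure probability below are hypothetical and chosen only for illustration.

```python
import math

def coverage_lower_bound(n, T, delta):
    """Lower bound on Pr[all n components are within delta], using the bound as stated."""
    return 1.0 - n * math.exp(-2.0 * T * delta**2)

def samples_needed(n, delta, failure_prob):
    """Smallest integer T for which n * exp(-2 * T * delta**2) <= failure_prob."""
    return math.ceil(math.log(n / failure_prob) / (2.0 * delta**2))

# Hypothetical example: 10 components, tolerance 0.1, at most 5% failure probability.
T_min = samples_needed(n=10, delta=0.1, failure_prob=0.05)
bound = coverage_lower_bound(n=10, T=T_min, delta=0.1)
```

Because the bound degrades linearly in n but improves exponentially in T, the required T grows only logarithmically with the number of components.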
Theorem 5 (formal) Using the notations above,
Proof. Theorem 5 follows by applying ∈robust=−ln(1+ϕ(m)) to Lemma 6.
Although illustrative embodiments of the present invention have been described herein, it is to be understood that the invention is not limited to those precise embodiments, and that various other changes and modifications may be made by one skilled in the art without departing from the scope of the invention.