This disclosure relates generally to machine learning systems, and more specifically to deep generative models that are robust to adversarial attacks.
In general, machine learning systems, such as deep neural networks, are susceptible to adversarial attacks. As an example, a machine learning system may be attacked via its input. Such adversarial attacks include perturbations on the input that cause a change in the output of the machine learning system. For instance, when the adversarial attacks relate to sensor data, the perturbations on the sensor data may cause the machine learning system to behave in a non-desired manner, for example, by providing incorrect output data, thereby resulting in negative consequences. While there is some work relating to adversarial attacks in classification settings and, to a lesser extent, in other supervised settings such as object detection or image segmentation, there does not appear to be much work on providing generative models with defensive solutions to adversarial attacks.
The following is a summary of certain embodiments described in detail below. The described aspects are presented merely to provide the reader with a brief summary of these certain embodiments and the description of these aspects is not intended to limit the scope of this disclosure. Indeed, this disclosure may encompass a variety of aspects that may not be explicitly set forth below.
According to at least one aspect, a computer-implemented method includes obtaining input data. The input data includes sensor data and a radius of an ℓp norm ball of admissible perturbations. The method includes generating input bounding data based on the input data. The method includes generating first bounding data and second bounding data by propagating the input bounding data on first and second outputs of an encoder network. The method includes generating third bounding data, which is associated with a latent variable and which is based on the outputs of the encoder network. The method includes generating fourth bounding data by propagating the third bounding data on an output of a decoder network. The method includes establishing a robustness certificate with respect to the input data by generating a lower bound of an evidence lower bound (ELBO) based on the first bounding data, the second bounding data, the third bounding data, and the fourth bounding data. The method includes updating the encoder network and the decoder network based on the robustness certificate such that the machine learning system, which includes the encoder network and the decoder network, is robust with respect to defending against the admissible perturbations.
According to at least one aspect, a system includes an actuator, a sensor system, a non-transitory computer readable medium, and a control system. The sensor system includes at least one sensor. The non-transitory computer readable medium stores a machine learning system having an encoder network and a decoder network that are trained based on a robustness certificate that lower bounds a loss function of the machine learning system. The control system is operable to control the actuator based on communications with the sensor system and the machine learning system. The control system includes at least one electronic processor that is operable to obtain input data that includes sensor data from the sensor system and perturbation data from a disturbance, wherein the sensor data is perturbed by the perturbation data. The input data is processed via the machine learning system. Output data is generated via the machine learning system. The output data is a reconstruction of the sensor data. The output data is associated with a likelihood that is unperturbed by the perturbation data. The likelihood corresponds to an evidence lower bound (ELBO). The sensor data and the output data are in-distribution data, which correspond to a model distribution associated with the machine learning system. The machine learning system identifies and processes the input data as being within a range of the in-distribution data even if the perturbation data is constructed to make the machine learning system identify and process the input data as being out-of-distribution data that is outside of the model distribution.
According to at least one aspect, a non-transitory computer readable medium includes at least computer-readable data, which when executed by an electronic processor, is operable to implement a method for training a machine learning system to be robust to perturbations. The method includes obtaining input data that includes sensor data and a radius of an ℓp norm ball of admissible perturbations. The method includes generating input bounding data based on the input data. The method includes generating first bounding data and second bounding data by propagating the input bounding data on first and second outputs of an encoder network. The method includes generating third bounding data, which is associated with a latent variable and which is based on the outputs of the encoder network. The method includes generating fourth bounding data by propagating the third bounding data on an output of a decoder network. The method includes establishing a robustness certificate with respect to the input data by generating a lower bound of an evidence lower bound (ELBO) based on the first bounding data, the second bounding data, the third bounding data, and the fourth bounding data. The method includes updating the encoder network and the decoder network based on the robustness certificate such that the machine learning system, which includes the encoder network and the decoder network, is robust with respect to defending against the admissible perturbations.
These and other features, aspects, and advantages of the present invention are discussed in the following detailed description with reference to the accompanying drawings, throughout which like characters represent similar or like parts.
The embodiments described herein have been shown and described by way of example, and many of their advantages will be understood from the foregoing description; it will be apparent that various changes can be made in the form, construction, and arrangement of the components without departing from the disclosed subject matter or without sacrificing one or more of its advantages. Indeed, the described forms of these embodiments are merely explanatory. These embodiments are susceptible to various modifications and alternative forms, and the following claims are intended to encompass and include such changes and not be limited to the particular forms disclosed, but rather to cover all modifications, equivalents, and alternatives falling within the spirit and scope of this disclosure.
As described herein, the embodiments relate to applications of provably robust training in the context of generative models. More specifically, the embodiments construct provable bounds on loss functions in the context of unsupervised generative models rather than supervised classification tasks. In an example embodiment, for instance, the provably robust training relates to at least one generative model, such as a variational auto-encoder (VAE). In this regard, a certifiably robust lower bound is defined on the variational lower bound of the likelihood, and this lower bound is then optimized during training to generate a provably robust VAE (“PROVAE”). Also, these provably robust generative models are evaluated to be substantially more robust to adversarial attacks (e.g., an adversary trying to perturb inputs so as to drastically lower their likelihood under the generative model) compared to a control group of generative models.
The control system 120 is configured to obtain the sensor data directly or indirectly from one or more sensors of the sensor system 110. Upon receiving input data (e.g., the sensor data and/or image data based on the sensor data), the control system 120 is configured to process this input data via a processing system 140 in connection with a machine learning system 200. In this regard, the processing system 140 includes at least one processor. For example, the processing system 140 includes an electronic processor, a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor, a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), any suitable processing technology, or any combination thereof. Upon processing this input data, the processing system 140 is operable to generate output data via the machine learning system 200. Additionally or alternatively, the processing system 140 is operable to generate classification data that classifies the output data of the machine learning system 200. Also, the processing system 140 is operable to provide control data to an actuator system 170 based on the classification data and/or the output data of the machine learning system 200.
In an example embodiment, the machine learning system 200 is stored in a memory system 160. In an example embodiment, the memory system 160 is a computer or electronic storage system, which is configured to store and provide access to various data to enable at least the operations and functionality, as disclosed herein. In an example embodiment, the memory system 160 comprises a single device or a plurality of devices. In an example embodiment, the memory system 160 can include electrical, electronic, magnetic, optical, semiconductor, electromagnetic, or any suitable technology. For instance, in an example embodiment, the memory system 160 can include random access memory (RAM), read only memory (ROM), flash memory, a disk drive, a memory card, an optical storage device, a magnetic storage device, a memory module, any suitable type of memory device, or any combination thereof. In an example embodiment, with respect to the control system 120 and/or processing system 140, the memory system 160 is local, remote, or a combination thereof (e.g., partly local and partly remote). For example, the memory system 160 can include at least a cloud-based storage system (e.g. cloud-based database system), which is remote from the processing system 140 and/or other components of the control system 120.
In an example embodiment, the machine learning system 200 includes at least one deep neural network. More specifically, the deep neural network includes at least one trained, provably robust generative model (“PROVAE”). In response to input data, the processing system 140 (in connection with the machine learning system 200) is operable to generate output data that is a reconstruction of the input data. For example, when the input data is sensor data (and/or image data based on the sensor data), the processing system 140 is operable to generate output data via the machine learning system 200 in which the output data is a reconstruction of the sensor data. As another example, when the input data includes sensor data (and/or image data based on the sensor data) that is perturbed by perturbation data, the processing system 140 is operable to generate output data via the machine learning system 200 in which the output data is a reconstruction of the sensor data, whereby the likelihood is not corrupted by the perturbation data. This feature of the machine learning system 200 is advantageous in providing a defensive solution to adversarial attacks in that such perturbation data does not cause drastic changes in the likelihood and/or output data of the machine learning system 200.
In addition, the system 100 includes other components that contribute to an operation of the control system 120 in relation to the sensor system 110 and the actuator system 170. For example, as shown in
Additionally or alternatively to the first application (
Upon completing the training process 204 with the in-distribution data 202, the trained provably robust generative model (e.g., PROVAE 200A) is generated and ready for operation.
In an example embodiment, as shown in
Upon receiving training data 302 (e.g., sensor data and/or image data based on the sensor data), the processing system 310 is configured to train the generative model in connection with the machine learning data 304. In this regard, the processing system 310 includes at least one processor. For example, the processing system 310 includes an electronic processor, a CPU, a GPU, a microprocessor, an FPGA, an ASIC, any suitable processing technology, or any combination thereof. In an example embodiment, the processing system 310 communicates with the memory system 300 to generate the trained provably robust generative model (“PROVAE”) 200A based on the training data 302 and the machine learning data 304.
In general, the VAE is trained based upon a bound on the log-likelihood, which the processing system 310 is configured to further bound in the adversarial setting. Specifically, the VAE is trained based upon the so-called evidence lower bound (ELBO) L(x), which expresses the probability p(x) in terms of a latent variable z ∈ ℝk and then bounds the likelihood as

log p(x) = log ∫ p(x|z) p(z) dz ≥ 𝔼z∼q(z|x)[log p(x|z)] − KL(q(z|x) ∥ p(z)) ≡ L(x)  (1)
where q(z|x) is a so-called variational distribution that attempts to approximate the posterior p(z|x) (in which case the bound is tight), but does so via a more tractable distribution class. In the VAE setting, the processing system 310 selects
q(z|x) = 𝒩(z; μθ(x), σθ²(x)I)  (2)

p(x|z) = 𝒩(x; gθ(z), σ0²I)  (3)

p(z) = 𝒩(z; 0, I)  (4)

where μθ(x) and σθ²(x) are the encoder networks that predict the mean and variance of the Normal distribution q from the input x, and gθ(z) is the decoder network that generates a sample in input space given a latent vector z.
Under these assumptions, the ELBO has the following explicit form:

L(x) = −(1/(2σ0²)) 𝔼ε∼𝒩(0,I)[∥x − gθ(μθ(x) + σθ(x)·ε)∥2²] − KL(𝒩(μθ(x), σθ²(x)I) ∥ 𝒩(0, I)) + c  (5)

where c is a constant. In general, the encoder and decoder networks are jointly trained to maximize the lower bound, as represented by the following equation:

maximizeθ Σi=1n L(xi)  (6)
using, for example, stochastic gradient descent, where the processing system 310 replaces the sampling procedure z ∼ 𝒩(μθ(x), σθ²(x)I) with the equivalent process z = μθ(x) + σθ(x)·ε, ε ∼ 𝒩(0, I), to draw a sample and ensure that the mean and variance terms can be backpropagated through via a so-called reparameterization technique.
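As a concrete illustration of equations 1-6 and the reparameterization technique, the following is a minimal sketch in PyTorch; the `encoder` and `decoder` callables and the flattened tensor shapes are hypothetical stand-ins rather than the exact networks of this disclosure.

```python
# Minimal sketch of the ELBO of equations 1-6 with the reparameterization
# technique. `encoder` and `decoder` are hypothetical callables standing in
# for the networks mu_theta/sigma_theta and g_theta described herein.
import torch

def elbo(x, encoder, decoder, sigma0_sq=1.0):
    mu, sigma = encoder(x)                        # mu_theta(x), sigma_theta(x)
    eps = torch.randn_like(sigma)                 # eps ~ N(0, I)
    z = mu + sigma * eps                          # reparameterized sample
    recon = decoder(z)                            # g_theta(z)
    # E_q[log p(x|z)] up to the additive constant c of equation 5
    log_px_z = -(x - recon).pow(2).flatten(1).sum(dim=1) / (2.0 * sigma0_sq)
    # KL(N(mu, sigma^2 I) || N(0, I)) in closed form
    kl = 0.5 * (mu.pow(2) + sigma.pow(2) - sigma.pow(2).log() - 1.0).sum(dim=1)
    return (log_px_z - kl).mean()                 # maximized during training
```

Training then ascends the gradient of this quantity with respect to the encoder and decoder parameters, which is possible precisely because the sampling step has been rewritten as a deterministic function of ε.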
The processing system 310 is configured to obtain a lower bound L̲(x) of the ELBO over all the possible perturbations δ ∈ Δ(x), such that L̲(x) ≤ L(x+δ) ≤ log p(x+δ). This lower bound provides a certificate of robustness of the ELBO: the effect on the ELBO of any possible perturbation in Δ(x) is lower bounded by L̲(x). The optimization of the lower bound L̲ effectively trains the network to be robust to the strongest possible out-of-distribution attack within Δ(x) (an ℓ∞ ball of radius εtrain around x).
In order to lower bound the ELBO, the processing system 310 performs interval bound propagation (IBP) throughout the layers of μθ, σθ, and gθ such that the processing system 310 obtains bounds for the propagation of the admissible perturbations on the input space in terms of the ELBO. The processing system 310 is thus configured to bound both the Kullback-Leibler (KL) divergence of the perturbed input, KL(q(z|x+δ) ∥ p(z)), and the expected value of the perturbed conditional log-likelihood, which is proportional to −(1/(2σ0²))∥x + δ − gθ(z)∥2². To do so, the processing system 310 performs IBP on the encoder networks μθ and σθ, and IBP on the decoder network gθ.
As preliminaries to the method 400, the processing system 310 propagates lower and upper bounds on building blocks of the encoder and decoder networks. In general, the building blocks include at least linear and convolution layers, and monotonic element-wise activation functions. These features enable the processing system 310 to sequentially connect the different interval bounds, from input to output of the deep neural network (e.g., the VAE). In this disclosure, for convenience and lightness of notation, the upper bound of a quantity ν is denoted as ν̄ and its lower bound as ν̲.
With respect to linear operators, the processing system 310 considers Wν to be a linear operator W applied to ν, and splits W = W⁺ + W⁻ into its element-wise positive part W⁺ = max(W, 0) and negative part W⁻ = min(W, 0). Given lower and upper bounds ν̲ and ν̄, the upper and lower bounds of Wν are then

Wν ≤ W⁺ν̄ + W⁻ν̲  (7)

Wν ≥ W⁺ν̲ + W⁻ν̄  (8)

These bounds apply to fully connected layers and convolution layers alike, as both are linear operators (with a bias term, if present, added to both bounds).
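A minimal sketch of the interval propagation of equations 7-8 through a linear layer follows; the weight and bias tensor shapes are illustrative assumptions.

```python
# Interval propagation through a linear layer (equations 7-8): split the
# weight matrix into positive and negative parts so each output bound uses
# the input bound that makes it extremal.
import torch

def interval_linear(W, b, v_lo, v_hi):
    W_pos = W.clamp(min=0.0)                     # W+
    W_neg = W.clamp(max=0.0)                     # W-
    lo = v_lo @ W_pos.T + v_hi @ W_neg.T + b     # lower bound of W v + b
    hi = v_hi @ W_pos.T + v_lo @ W_neg.T + b     # upper bound of W v + b
    return lo, hi
```

The same split applies to a convolution by clamping the kernel and performing two convolutions, one on each input bound.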
With respect to monotonic functions, the processing system 310 is configured to denote νt = h(νt-1) as a monotonic (non-decreasing or non-increasing) function applied element-wise on νt-1. The processing system 310 expresses the upper and lower bounds of νt in terms of h and the upper and lower bounds of νt-1 as follows,

ν̄t = max{h(ν̲t-1), h(ν̄t-1)}  (9)

ν̲t = min{h(ν̲t-1), h(ν̄t-1)}  (10)
These bounds hold for monotonic activation functions, such as ReLU and sigmoid.
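The corresponding propagation step, sketched below under the same illustrative assumptions as the linear case, covers both non-decreasing and non-increasing element-wise functions by taking the element-wise extremes of the two endpoint images.

```python
# Interval propagation through an element-wise monotonic function
# (equations 9-10); works for ReLU and sigmoid (non-decreasing) as well as
# non-increasing functions, since min/max of the endpoint images covers both.
import torch

def interval_monotonic(h, v_lo, v_hi):
    a, b = h(v_lo), h(v_hi)
    return torch.minimum(a, b), torch.maximum(a, b)
```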
With respect to the ℓ2 norm squared, the processing system 310 is configured to obtain lower and upper bounds of the ℓ2 norm squared of ν by recognizing that there is an element-wise dependency on the lower and upper bounds of ν. As ∥ν∥2² = Σi=1n νi², where νi denotes the ith component of ν, the processing system 310 obtains the respective upper and lower bounds as a function of ν̲ and ν̄:

∥ν∥2² ≤ Σi max{ν̲i², ν̄i²}  (11)

∥ν∥2² ≥ Σi min{ν̲i², ν̄i²}·𝟙[0 ∉ [ν̲i, ν̄i]]  (12)

where the ith term of the lower bound is zero whenever the interval [ν̲i, ν̄i] contains zero.
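A sketch of these norm bounds, under the same illustrative assumptions as above:

```python
# Interval bounds on ||v||_2^2 (equations 11-12): v_i^2 is largest at the
# endpoint of larger magnitude, and is as small as zero whenever the interval
# [lo_i, hi_i] straddles zero.
import torch

def interval_norm_sq(v_lo, v_hi):
    hi = torch.maximum(v_lo.pow(2), v_hi.pow(2)).sum(dim=-1)
    lo_elem = torch.minimum(v_lo.pow(2), v_hi.pow(2))
    straddles = (v_lo <= 0.0) & (v_hi >= 0.0)
    lo = torch.where(straddles, torch.zeros_like(lo_elem), lo_elem).sum(dim=-1)
    return lo, hi
```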
After the preliminaries are performed, the processing system 310 implements the method 400 to optimize the robustness certificate obtained from the worst-case perturbation, for example, in terms of at least one log-likelihood function. The method 400 includes one or more iterations (or epochs). In this case, each iteration (or epoch) includes computing the robustness certificate through bound propagation (e.g., steps 402-412) and optimizing the robustness certificate (e.g., step 414), thereby providing robustly trained encoder and decoder networks (e.g., step 416). Referring to
At step 402, the processing system 310 is configured to obtain an input x and generate at least one bound on the input x. For example, the input x includes training data, such as X = {x1, . . . , xn}, where xi ∈ ℝM. For each xi ∈ X, the processing system 310 is configured to generate input bounding data on the input xi. The input bounding data includes upper bound data on the input xi and lower bound data on the input xi.
Also, given the first encoding component μθ(x) and the second encoding component σθ(x), the processing system 310 constructs the encoder networks as a succession of convolutional layers with ReLU activations, with at least one last layer being at least one fully connected linear layer. In addition, without requiring perturbations as input, the processing system 310 is operable to consider any admissible perturbed input xi + δ by defining the perturbation data as δ ∈ Δ(x) = {δ : ∥δ∥∞ ≤ εtrain}, which yields the input bounding data

x̲i = xi − εtrain·1  (13)

x̄i = xi + εtrain·1  (14)
At step 404, the processing system 310 is configured to generate bounds for outputs of the encoder network. These bounds include first bounding data and second bounding data of the VAE. The first and second bounding data relate to the respective outputs of the first and second encoding components of the encoder. More specifically, the processing system 310 is configured to generate first upper bound data and first lower bound data for the output of the first encoding component μθ(x). In addition, the processing system 310 is configured to generate second upper bound data and second lower bound data for the output of the second encoding component σθ(x). The processing system 310 is configured to generate the first bounding data of the first encoding component μθ(x) independently of the second bounding data of the second encoding component σθ(x). In this regard, the processing system 310 is configured to generate the first bounding data and the second bounding data at the same time or at different times.
With the propagation of the interval bounds for linear and convolution layers in equations 7-8 and for the activation functions in equations 9-10, the processing system 310 is configured to bound the outputs of the encoder network based on the IBP of x̲i and x̄i:

μ̲i = min{μθ(x̲i), μθ(x̄i)},  μ̄i = max{μθ(x̲i), μθ(x̄i)}

σ̲i = min{σθ(x̲i), σθ(x̄i)},  σ̄i = max{σθ(x̲i), σθ(x̄i)}

where μi = μθ(xi) and σi = σθ(xi) are the outputs of the encoder, the min and max are taken element-wise, and μθ(x̲i), μθ(x̄i), σθ(x̲i), and σθ(x̄i) denote the bounds obtained by propagating the input interval layer by layer through μθ and σθ via equations 7-10.
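The following sketch shows this layer-by-layer propagation for a hypothetical fully connected encoder head, reusing interval_linear and interval_monotonic from the sketches above; convolutional layers propagate the same way via equations 7-8.

```python
# Sketch of step 404: propagate the input interval of equations 13-14 through
# a stack of (weight, bias) layers, alternating the linear bounds of
# equations 7-8 with the monotonic activation bounds of equations 9-10.
import torch

def propagate(layers, act, lo, hi):
    for W, b in layers:
        lo, hi = interval_linear(W, b, lo, hi)
        lo, hi = interval_monotonic(act, lo, hi)
    return lo, hi

# Illustrative usage for the two encoder heads (mu_layers/sig_layers are
# hypothetical lists of (W, b) pairs; a positive activation such as
# torch.nn.functional.softplus keeps the sigma bounds strictly positive):
# mu_lo, mu_hi = propagate(mu_layers, torch.relu, x - eps_train, x + eps_train)
# sig_lo, sig_hi = propagate(sig_layers, torch.nn.functional.softplus,
#                            x - eps_train, x + eps_train)
```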
Given the bounds on the outputs of the encoder network, the processing system 310 is configured to bound the KL divergence between 𝒩(μi, σi²I) and 𝒩(0, I) via

KL(q(z|xi+δ) ∥ p(z)) ≤ ½ Σj [max{(μ̲i)j², (μ̄i)j²} + max{(σ̲i)j² − log (σ̲i)j², (σ̄i)j² − log (σ̄i)j²} − 1]

where (μi)j² and (σi)j² denote the jth component of the squared mean and covariance of the ith sample, as outputted by the encoder; the first max holds because (μi)j² is largest at the interval endpoint of greater magnitude, and the second holds because s − log s is convex in s and is therefore maximized at an endpoint of the interval. In addition, the processing system 310 is configured to continue from the bounds on μi and σi at an end portion of the encoder networks to enable IBP to be performed via the decoder network.
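A sketch of this KL upper bound from the interval bounds on the encoder outputs (assuming the sigma bounds are strictly positive, as produced by a positive activation):

```python
# Upper bound on KL(N(mu, sigma^2 I) || N(0, I)) from bounds on mu and sigma:
# mu_j^2 is maximized at the endpoint of larger magnitude, and the convex
# s - log(s) is maximized at an endpoint of [sigma_lo^2, sigma_hi^2].
import torch

def kl_upper(mu_lo, mu_hi, sig_lo, sig_hi):
    mu_sq = torch.maximum(mu_lo.pow(2), mu_hi.pow(2))
    s_lo, s_hi = sig_lo.pow(2), sig_hi.pow(2)
    s_term = torch.maximum(s_lo - s_lo.log(), s_hi - s_hi.log())
    return 0.5 * (mu_sq + s_term - 1.0).sum(dim=-1)
```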
At step 406, the processing system 310 is configured to draw a sample, denoted as “ε,” and compute bounds on the latent variable, denoted as “z.” For example, the processing system 310 is configured to obtain bounds (or third bounding data of the VAE) on the latent variable via a reparameterization technique. More specifically, with the reparameterization technique, the bound on the latent variable follows from the bound for linear operators in equations 7-8, as the reparameterization is a linear operator.
For example, the processing system 310 is configured to process a sample ε ∼ 𝒩(0, I), with ε⁺ = max(ε, 0) and ε⁻ = min(ε, 0) such that ε = ε⁺ + ε⁻, where 0 represents the mean and I represents the identity matrix of the covariance. This reparameterization technique decouples the randomness from the encoder by expressing the latent variable as zi = μi + σi·ε. After using the reparameterization technique, the processing system 310 is configured to bound the latent variable zi (e.g., generate the third bounding data), which is represented as

z̄i = μ̄i + σ̄i·ε⁺ + σ̲i·ε⁻

z̲i = μ̲i + σ̲i·ε⁺ + σ̄i·ε⁻

where the sign split of ε plays the role of the positive and negative parts of the linear operator in equations 7-8, and σ̲i ≥ 0.
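A sketch of this latent-variable bound for one fixed draw of ε (shapes and the non-negativity of the sigma bounds are assumptions carried over from the sketches above):

```python
# Bounds on z = mu + sigma * eps for a fixed draw eps: the reparameterization
# is linear in (mu, sigma), so equations 7-8 apply with eps split into its
# positive and negative parts.
import torch

def latent_bounds(mu_lo, mu_hi, sig_lo, sig_hi, eps):
    eps_pos = eps.clamp(min=0.0)                 # eps+
    eps_neg = eps.clamp(max=0.0)                 # eps-
    z_hi = mu_hi + sig_hi * eps_pos + sig_lo * eps_neg
    z_lo = mu_lo + sig_lo * eps_pos + sig_hi * eps_neg
    return z_lo, z_hi
```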
At step 408, the processing system 310 is configured to generate bounds for the output of the decoder network gθ(z). In this regard, the processing system 310 is configured to generate fourth bounding data of the VAE, which includes fourth upper bound data and fourth lower bound data of the decoder network gθ(z). For example, at step 408, the bounds on the latent variable zi are propagated through the decoder network gθ, which includes linear and convolutional layers (e.g., linear operators whose bounds can be propagated with equations 7-8) with ReLU and sigmoid activations (e.g., monotonic activation functions whose bounds can be propagated with equations 9-10). Accordingly, the processing system 310 is configured to provide bounds on the output of the decoder network as a function of the bounds on the latent vector zi, as with the encoder networks.
In addition, the processing system 310 addresses the problem of bounding the conditional log-likelihood log p(xi|zi). To do so, the processing system 310 fixes the diagonal covariance σ0²I in p(xi|zi) = 𝒩(xi; gθ(zi), σ0²I). The processing system 310 thus reduces the problem of bounding the conditional log-likelihood to a problem of bounding ∥xi − gθ(zi)∥2². Applying equations 11-12 to the residual, which lies in the interval [xi − ḡθ(zi), xi − g̲θ(zi)], the processing system 310 is configured to bound this function via

∥xi − gθ(zi)∥2² ≤ Σj max{((xi)j − (g̲θ(zi))j)², ((xi)j − (ḡθ(zi))j)²}

where the processing system 310 is configured to take the element-wise max and min of equations 11-12 and sum in j across the elements of x, with the corresponding lower bound obtained analogously from the element-wise min of equation 12.
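This step reuses the norm bounds of equations 11-12, as in the sketch below.

```python
# Upper and lower bounds on ||x - g_theta(z)||_2^2 from the decoder output
# interval [g_lo, g_hi]: the residual x - g(z) lies in [x - g_hi, x - g_lo],
# and interval_norm_sq (equations 11-12, sketched above) does the rest.
import torch

def recon_err_bounds(x, g_lo, g_hi):
    r_lo, r_hi = x - g_hi, x - g_lo              # interval of the residual
    return interval_norm_sq(r_lo, r_hi)
```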
At step 410, the processing system 310 is configured to generate a robustness certificate by generating at least one bound for the ELBO L, which serves as a loss function. More specifically, the processing system 310 is configured to generate lower bound data on the ELBO. For example, the processing system 310 combines the upper and lower bounds for the encoder and decoder networks, the associated lower bound on the conditional log-likelihood, and the upper bound on the KL divergence (as the ELBO takes into account the negative of the KL divergence), thereby obtaining lower bound data from the following lower bound:

L̲(xi) = −(1/(2σ0²)) Σj max{((xi)j − (g̲θ(zi))j)², ((xi)j − (ḡθ(zi))j)²} − K̄Li + c

where K̄Li denotes the upper bound on KL(q(z|xi+δ) ∥ p(z)) obtained above, and where the upper and lower bounds for the encoder networks are propagated through the reparameterization technique as

z̄i = μ̄i + σ̄i·ε⁺ + σ̲i·ε⁻  (25)

z̲i = μ̲i + σ̲i·ε⁺ + σ̄i·ε⁻  (26)

μ̄i = max{μθ(xi + εtrain·1), μθ(xi − εtrain·1)}  (27)

μ̲i = min{μθ(xi + εtrain·1), μθ(xi − εtrain·1)}  (28)

σ̄i = max{σθ(xi + εtrain·1), σθ(xi − εtrain·1)}  (29)

σ̲i = min{σθ(xi + εtrain·1), σθ(xi − εtrain·1)}  (30)
The resulting lower bound on the ELBO lower bounds the log-likelihood of a perturbed sample, log p(xi+δ), working as a robustness certificate for the perturbation. This means that if L̲(xi) ≥ α with input interval bounds xi − εtrain·1 and xi + εtrain·1 fed into the encoder (an ℓ∞ ball centered at xi of radius εtrain), then log p(xi+δ) ≥ α is guaranteed for all δ : ∥δ∥∞ ≤ εtrain.
The method 400 includes training the VAE by optimizing the lower bound. For example, the processing system 310 trains the provably robust deep generative model by optimizing the lower bound L̲ of the ELBO, corresponding to optimizing the robustness certificate, instead of optimizing the ELBO L directly.
At step 414, the processing system 310 is configured to update the encoder network and the decoder network to optimize the lower bound of the ELBO. For example, the processing system 310 is operable to update the parameters of the VAE so as to maximize the lower bound L̲ of the ELBO. In this case, the parameters θ include at least internal weights, which are associated with the encoder and decoder networks of the VAE.
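Pulling the pieces together, the following is a minimal end-to-end sketch of one iteration of method 400, reusing propagate, latent_bounds, kl_upper, and recon_err_bounds from the earlier sketches; the layer lists, activations, and optimizer are illustrative assumptions rather than the exact architecture described herein.

```python
# One training iteration of method 400: compute the certified lower bound of
# the ELBO by interval bound propagation (steps 402-412), then maximize it
# (step 414) by descending the negated bound.
import torch
import torch.nn.functional as F

def provae_step(x, mu_layers, sig_layers, dec_layers, opt, eps_train,
                sigma0_sq=1.0):
    x_lo, x_hi = x - eps_train, x + eps_train                 # equations 13-14
    mu_lo, mu_hi = propagate(mu_layers, torch.relu, x_lo, x_hi)
    sig_lo, sig_hi = propagate(sig_layers, F.softplus, x_lo, x_hi)
    eps = torch.randn_like(mu_lo)                             # one draw of eps
    z_lo, z_hi = latent_bounds(mu_lo, mu_hi, sig_lo, sig_hi, eps)
    g_lo, g_hi = propagate(dec_layers, torch.sigmoid, z_lo, z_hi)
    _, err_hi = recon_err_bounds(x, g_lo, g_hi)               # worst-case error
    # Negated lower bound of the ELBO (up to the constant c)
    loss = (err_hi / (2.0 * sigma0_sq)
            + kl_upper(mu_lo, mu_hi, sig_lo, sig_hi)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return -loss.item()                                       # certified bound
```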
At step 416, the processing system 310 is configured to output robustly trained networks, which include at least the first encoder network μθ(x), the second encoder network σθ(x), and the decoder network gθ(z). Once the processing system 310 outputs the robustly trained networks, the processing system 310 is configured to deploy or transmit the PROVAE 200A for use. For example, once robustly trained, the PROVAE 200A is deployable in and/or employable by the system 100 of
Furthermore, the method 400 is not limited to the steps shown in
In example assessments, the VAE and the PROVAE 200A are evaluated based on the unperturbed samples of image data 500. When this unperturbed sample of image data 500 is presented as input data to a VAE, then the loss is represented as L = −28.28. As a comparison, for example, when the unperturbed sample of image data 500 is presented as input data to a PROVAE 200A with εtrain = 0.01, then the loss is represented as L = −31.10. As another comparison, for example, when the unperturbed sample of image data 500 is presented as input data to a PROVAE 200A with εtrain = 0.1, then the loss is represented as L = −41.31. As demonstrated by these assessments, there is not a significant difference in performance between the VAE and the PROVAE 200A. Moreover, as demonstrated by the loss values, the VAE and the PROVAE 200A are operable to identify and process the image data 500 correctly as being in-distribution data and within a range of handwritten digits.
In other example assessments, the VAE and the PROVAE 200A are evaluated based on the perturbed samples of input data in which the image data 500 (
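In such assessments, the perturbations are constructed by an adversary that perturbs an in-distribution input to lower its likelihood under the model. The following is a minimal sketch of such an out-of-distribution attack via projected gradient descent on the ELBO, reusing the elbo sketch from earlier; the step count and step size are illustrative assumptions.

```python
# Sketch of an out-of-distribution attack on a VAE: projected gradient descent
# on the ELBO over an l-infinity ball of radius eps_atk, driving the perturbed
# input's likelihood bound as low as possible.
import torch

def ood_attack(x, encoder, decoder, eps_atk, steps=40, step_size=0.01):
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = elbo(x + delta, encoder, decoder)   # elbo sketched earlier
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= step_size * grad.sign()       # lower the ELBO
            delta.clamp_(-eps_atk, eps_atk)        # project onto the ball
    return (x + delta).detach()
```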
As described herein, the embodiments include a number of advantageous features and benefits. For example, the embodiments relate to training and generating provably robust generative models, which are based on defining a robust lower bound on the variational lower bound of the likelihood (i.e., the ELBO) and optimizing this lower bound to train the provably robust generative models. These embodiments introduce provable defenses against adversarial attacks in the domain of generative models, namely out-of-distribution attacks, in which a sample within the distribution of the model is perturbed to lower its likelihood.
In addition,
Also, the embodiments are advantageous in providing technical solutions to the technical problems associated with the susceptibility of machine learning systems (e.g., deep generative models) to adversarial attacks. Such adversarial attacks have been known to make imperceptible changes to input data that may lead to drastic changes in likelihood functions, thereby providing incorrect output data. In addressing this technical issue, the embodiments, as disclosed herein, provide provably robust generative models in which these small changes (e.g., perturbations) to the inputs of machine learning systems do not cause drastic changes in the likelihood functions of the machine learning systems. Accordingly, as discussed above, the embodiments described herein are advantageous in providing generative models with defensive solutions to adversarial attacks.
That is, the above description is intended to be illustrative, and not restrictive, and provided in the context of a particular application and its requirements. Those skilled in the art can appreciate from the foregoing description that the present invention may be implemented in a variety of forms, and that the various embodiments may be implemented alone or in combination. Therefore, while the embodiments of the present invention have been described in connection with particular examples thereof, the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the described embodiments, and the true scope of the embodiments and/or methods of the present invention are not limited to the embodiments shown and described, since various modifications will become apparent to the skilled practitioner upon a study of the drawings, specification, and following claims. For example, components and functionality may be separated or combined differently than in the manner of the various described embodiments, and may be described using different terminology. These and other variations, modifications, additions, and improvements may fall within the scope of the disclosure as defined in the claims that follow.