The present disclosure generally relates to neural network security, and in particular, to a system and associated method for a full-stack neural network obfuscation tool to mitigate neural architecture theft.
The architecture information of a Deep Neural Network (DNN) model is highly sensitive and should never be exposed. It is valuable Intellectual Property (IP) that costs companies significant time and resources to develop. Knowledge of the exact architecture allows an adversary to build a more precise substitute model and use that model to launch devastating adversarial attacks. For instance, it has been shown that accurate architecture information improves the success rate of input adversarial attacks by almost 3 times.
It is with these observations in mind, among others, that various aspects of the present disclosure were conceived and developed.
Corresponding reference characters indicate corresponding elements among the views of the drawings. The headings used in the figures do not limit the scope of the claims.
Side-channel-based deep neural network (DNN) architecture theft has been reported in several prior works. An outsider can extract the DNN architecture through side-channel information leakage, as shown in
Previous efforts on preventing DNN architecture stealing have focused on hardware modifications that eliminate information leakage. Oblivious Random Access Memory (ORAM) technology prevents memory-access leakage by encrypting memory addresses. Miss Status Holding Registers (MSHR) were redesigned in one prior art example to obfuscate GPU memory access and add a layer of randomness. Though hardware modifications are effective countermeasures, they do not benefit existing devices and incur high performance overhead. Recently, one such prior work includes a decision-tree-based detection method against spy applications on GPUs. However, it suffers from a high false-positive rate and is not practical. Tensor Virtual Machine (TVM) has also been proposed as a potential countermeasure. Nevertheless, as shown by experiments reported herein (
The present disclosure provides a framework 100 (
The framework 100 uses a genetic algorithm to search for the best combination of sequence and dimension obfuscations that achieve strong obfuscation for a given user-defined time budget. The obfuscation strength of the framework 100 can be measured by Layer Error Rate (LER) which represents normalized editing distance of extracted layer sequence given the ground-truth layer sequence in sequence obfuscation, and Dimension Error Rate (DER) which represents the normalized error of extracted dimension parameters in a layer in dimension obfuscation. The contributions of the framework 100 can be summarized as follows:
This is the first work on mitigating NN architecture stealing attacks with pure-software obfuscation. This disclosure describes a total of 8 obfuscating knobs that can be applied by the framework 100 across an entire DNN execution stack to achieve sequence and dimension obfuscations, and demonstrates the performance on state-of-the-art GPUs.
The framework 100 is an obfuscation tool backed by a genetic algorithm that searches for the best combination of obfuscations to obfuscate any neural network architecture within a user-defined inference latency budget.
For sequence obfuscation, the framework 100 can obfuscate a ResNet-18 architecture to have a 2.44 LER (which translates to a 44-layer editing distance) against state-of-the-art LSTM-based sequence predictors with only a 2% increase in overall latency.
For dimension obfuscation, the present disclosure shows how a convolution layer with 64 input and 128 output channels can be obfuscated so that it is extracted as a layer with 207 input and 93 output channels, with only a 2% increase in layer-level latency.
Table I summarizes neural network notation that is used throughout this disclosure. In particular, the present disclosure focuses on the Conv2D operator, represented in 4D by (k1, k2, c, j).
Generally, an NN architecture is a topology of neural network layers with non-linear functions.
The first step of the typical process 10 is scripting (coding) of a DNN architecture in Python using popular frameworks such as PyTorch or TensorFlow. The scripting step transforms the raw design into a high-level dataflow graph (a.k.a. computational graph). Next, the high-level graph is ported to TVM for further optimization. One can also directly use TorchScript or TensorFlow XLA for graph optimization. For instance, in TVM, the graph optimization process is handled by the Relay module, which provides handy options such as: 1) “FoldConstant ( )”, which evaluates expressions that involve only constants; 2) “EliminateCommonSubexpr ( )”, which creates a shared variable for multiple expressions with the same output to avoid the same expression being evaluated multiple times; and 3) “FuseOps ( )”, which fuses multiple expressions together. Users can specify which optimizations to enable.
The last step in the typical process 10 is scheduling, which optimizes the execution of operators on a given device. In the TVM compiler framework, a machine-learning-based scheduler called “AutoTVM” is used to generate optimized code. For each operator in the optimized low-level graph, the AutoTVM module uses XGBoost to search for the best schedule within a predefined search space. Point (c) in
Extracting the architecture sequence is not trivial. Since neural network execution goes through several steps of optimization as shown in
Some prior works use Long Short-Term Memory (LSTM) models to predict the layer sequence. First, massive profiling of randomly generated DNNs on the target devices is done offline. After proper labeling (an example is shown in
Dimension extraction is done for each identified layer operator once its time-step (position) and class (layer type) are known. This is considered to be simpler than sequence extraction. Note that, dimension extraction can be done either manually or automatically.
In summary, existing architecture stealing attacks rely heavily on the run-time trace; to mitigate such stealing attacks, the obfuscating tool disclosed herein therefore changes the run-time trace as much as possible.
The present disclosure considers architecture stealing on applications running on common GPU devices. For other devices such as FPGAs, CPUs and ASICs, the obfuscation methods used by the framework 100 disclosed herein can also be used. The present disclosure considers NN applications running in both remote and local settings.
In the remote setting, assume that the owner runs the NN application on a third-party cloud computing platform and the attacker acts as a normal user (without system privilege) on the same machine. Specifically, the attacker can perform a “driver downgrading attack” to access the profiling API and thus conduct GPU profiling on target neural network applications at run-time.
In the local setting, the present disclosure assumes that the device is off-the-shelf and that the attacker can do profiling on an identical device to train a predictor model. While the target application is running, the attacker can access the run-time hardware traces of the target neural network applications through side-channel attacks.
Depending on the attack scenario and capability, the attacker can be categorized with respect to an extent of information leakage. Table II describes three cases (from weakest to strongest):
In all cases, the attacker does massive profiling of the DNN model's run-time trace to steal the architecture.
Neural network architecture stealing is possible because typical neural network execution processes are deterministic, as shown in
Many of the function-preserving transformations that have been successfully used in evolutionary NAS can be used in obfuscation. More specifically, the framework 100 uses the layer widening, layer branching, layer deepening, layer skipping, kernel widening and dummy addition obfuscating knobs in this phase. Note that while many of these operators have been introduced before in the context of architecture evolution, the framework 100 is the first to use them as countermeasures for side-channel attacks. Layer branching is redesigned, and dummy addition is added for dimension obfuscation.
1) Layer Widening: Layer widening increases the number of output channels j of a Conv2D layer or a linear layer. The weights of the added output channels are duplicates of the weights of existing output channels. The framework 100 allows the widening operator to take fractional numbers. For example, if the weight Wk
To preserve the functionality, next layer's weights need to be adjusted accordingly. In this example, the next layer Wk
Purpose: Layer widening increases memory accesses for the current and the next layer by around (N-1) times for a widening factor of N. This results in an increased number of input/output channels and affects dimension extraction.
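As an illustration only (not the framework's implementation), the function-preserving property of widening can be sketched in NumPy on a pair of linear layers; the Conv2D case is analogous along the channel axis:

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)
rng = np.random.default_rng(0)

x = rng.standard_normal(4)
W1 = rng.standard_normal((3, 4))   # layer i: 3 output units
W2 = rng.standard_normal((2, 3))   # layer i+1
y = W2 @ relu(W1 @ x)              # original two-layer result

# Widen layer i by duplicating output unit 0 (widening factor 4/3).
W1w = np.vstack([W1, W1[0:1]])
# Preserve the function: split the next layer's weight for the
# duplicated unit in half between the original and the copy.
W2w = np.hstack([W2, W2[:, 0:1]])
W2w[:, 0] *= 0.5
W2w[:, 3] *= 0.5

yw = W2w @ relu(W1w @ x)
assert np.allclose(y, yw)          # functionality preserved
```

Duplicating a channel and halving the downstream weights works through any elementwise activation, since the copy carries exactly the same activation value as the original.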
2) Layer Branching: Layer branching breaks a single NN layer operator into smaller ones. For example, a Conv2D operator Wk
Concate(Uk
While some previous works only consider output-wise branching of Conv2D/linear layers, the framework 100 also considers layer branching in the input channel dimension, referred to as input-wise branching. Here a Conv2D layer of weight Wk
Add(Uk
Note that the activation input needs to be sliced into two to match the halved input channel dimension of two smaller convolutions. Various branching methods are feasible, for example, one can also separate it into more than two parts or even do unbalanced branching. Here the present disclosure considers balanced branching into two or four parts, for both input-wise and output-wise branching.
Purpose: Layer branching increases the number of layer operators and changes the data volume that needs to be accessed for each operator. For input-wise branching, the input activation and weight volume are halved for each small kernel, and for output-wise branching, input activation is the same but weight and output activation volumes are halved. This knob can be used for both sequence and dimension obfuscation.
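The equivalence behind both branching styles can be sketched with a linear operator in NumPy (a hypothetical toy standing in for the Conv2D case described above):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(6)
W = rng.standard_normal((4, 6))          # one linear operator

y = W @ x                                # original result

# Output-wise branching: two half-width operators, outputs concatenated.
U1, U2 = W[:2], W[2:]
y_out = np.concatenate([U1 @ x, U2 @ x])

# Input-wise branching: slice the input, add the partial results.
V1, V2 = W[:, :3], W[:, 3:]
y_in = V1 @ x[:3] + V2 @ x[3:]

assert np.allclose(y, y_out) and np.allclose(y, y_in)
```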
3) Dummy Addition: Dummy addition is simply adding zero to the activation results. The framework 100 creates a zero matrix of the same shape as the activation output X of current layer.
D(i)b,j,ho,wo = Ob,j,ho,wo (3)
A dummy addition factor of N means that the dummy matrix is created and added to the output repeatedly, N times.
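In effect, the knob repeatedly computes the identity below; a minimal NumPy sketch (illustrative only):

```python
import numpy as np

O = np.arange(12.0).reshape(3, 4)    # activation output of the current layer
D = np.zeros_like(O)                 # dummy zero matrix of the same shape

out = O
for _ in range(3):                   # dummy addition factor N = 3
    out = out + D                    # each add is an extra operator in the trace

assert np.allclose(out, O)           # functionality unchanged
```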
Purpose: Addition operators are “fused” into previous layer operators in the fusion step in graph optimization (refer to
4) Layer Deepening: Layer deepening inserts an extra computational layer at the end of current layer's activation function. The insertion of a deepening layer U(i) does not change the original result.
φ(U(i)*φ(W(i)*X(i)))=φ(W(i)*X(i)) (4)
For linear layers, the deepening layer U(i) is simply an identity matrix of the same size as its input. For a Conv2D layer, layer U(i) of size (k1, k2, j, j) needs to be initialized as:
The framework 100 favors a kernel size of k1=k2=1, which avoids too much extra computation. Notice that the correctness of Eq. (4) also depends on whether the activation function φ(·) is idempotent, i.e., whether its result stays the same when it is stacked: φ(·)=φ(φ(·)). Fortunately, the most popular activation, ReLU, satisfies this property. The same property does not hold for batch normalization, so the deepening layer must be added before batch normalization, as shown in
Purpose: Add an extra computational layer to the layer extraction result. This can be used for sequence obfuscation.
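Eq. (4) can be checked directly with a linear layer and ReLU in NumPy (a toy sketch; the framework inserts 1×1 Conv2D layers instead):

```python
import numpy as np

relu = lambda z: np.maximum(z, 0.0)      # idempotent: relu(relu(z)) == relu(z)

rng = np.random.default_rng(2)
x = rng.standard_normal(5)
W = rng.standard_normal((4, 5))
U = np.eye(4)                            # deepening layer: identity weights

# Eq. (4): the inserted layer does not change the result.
assert np.allclose(relu(U @ relu(W @ x)), relu(W @ x))
```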
5) Layer Skipping: Layer skipping inserts an extra computational layer as illustrated in
For an activation of size (b,j, ho, wo), the skipping layer can be a Conv2D layer U(i) that has a shape of (k1,k2,j,j) and all entries are zero. The output of the skipping layer is:
U(i)*Xb,j,ho,wo + Xb,j,ho,wo = Xb,j,ho,wo (6)
Purpose: Add an extra computational layer to the layer extraction result. This can be used for sequence obfuscation.
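Eq. (6) reduces to a zero-weight operator plus a residual connection; a short NumPy check (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 4))          # activation
U = np.zeros((4, 4))                     # skipping layer: all-zero weights

assert np.allclose(U @ X + X, X)         # Eq. (6): output is unchanged
```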
6) Kernel Widening: Kernel widening increases the kernel size of a Conv2D layer. It is done by padding zeros to both the input and the convolution kernels. A kernel widening of “+1” to a Conv2D layer of shape (k1, k2, c, j) results in a new weight of shape (k1+2, k2+2, c, j) and an input of shape (b, c, hi+2, wi+2). This is useful in particular for Conv2D layers that have a kernel size of 1×1. These small 1×1 kernels transform to 3×3 kernels after widening.
Purpose: Change kernel size of the Conv2D operator, resulting in a completely different trace. This affects dimension extraction.
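The zero-padding argument can be verified with a naive 2-D convolution in NumPy (the `conv2d` helper below is hypothetical scaffolding, not part of the framework):

```python
import numpy as np

def conv2d(x, k):
    """Valid cross-correlation of a 2-D input x with kernel k."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(4)
x = rng.standard_normal((5, 5))
k1 = rng.standard_normal((1, 1))         # original 1x1 kernel

# Kernel widening "+1": zero-pad the kernel to 3x3 and the input by 1.
k3 = np.zeros((3, 3))
k3[1, 1] = k1[0, 0]
x_pad = np.pad(x, 1)

assert np.allclose(conv2d(x, k1), conv2d(x_pad, k3))
```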
Fusion is an important graph optimization technique in the TVM Relay module. It fuses subsequent injective operators (scaling or addition) into complex layer operators, such as Conv2D, linear and max-pooling, and transforms the shape of the inputs completely. Fusion ensures execution efficiency as it improves data reuse and avoids context-switching overhead. As shown in
7) Selective Fusion: Selective fusion is a controllable version of the generic fusion. While the generic fusion fuses successive injective operators greedily, selective fusion allows N successive operators to fuse and forbids any more from fusing. For example, by setting N to zero for a Conv2D operator shown in
Purpose: Increase the number of operators. Setting N to a small value decreases the memory access and latency of a layer operator, and affects both sequence and dimension extraction.
In the backend, AutoTVM handles the compilation and generates optimized code for a given device. It provides options such as the number of trials for tuning; as such, AutoTVM was investigated to determine whether these options can be used to generate randomness in the final result and thereby help in obfuscation. In particular, 3 rounds were tried with different numbers of trials using the default XGBoost (XGB) tuner in AutoTVM for a Conv2D operator. All these trials generated the same schedule, which is understandable because tuning is designed to optimize latency. The profiling results in
8) Schedule Modification: To generate schedules via AutoTVM with different outcomes, the search space has to be modified. However, changing the search space requires a time-consuming tuning (searching) process each time, and so the framework 100 employs a simple approach that directly modifies the derived schedules with a small sacrifice in the operator's performance. For example, the schedule derived by XGBoost in
Purpose: Derive different schedules for the same operator that present differences in latency, DRAM access and cache performance. This affects dimension extraction.
A tool flow of the framework 100 includes two key steps: 1) sequence obfuscation which obfuscates the layer sequence including layer type and topology, and 2) dimension obfuscation which obfuscates the dimensions of individual layer operators. The roles of each obfuscating knob of the framework 100 are summarized in
This disclosure investigates sequence and dimension obfuscation separately since their purposes are orthogonal. This separation also helps comparison with prior work that focuses only on sequence obfuscation. In addition, to reduce search time, the framework 100 reduces the search space as follows.
Knob Partition. First, the framework 100 partitions the set of knobs, as shown in
Limited Obfuscation Knob Option. Each obfuscation knob of the framework 100 comes with a list of options where the i-th entry of the list denotes a specific obfuscation choice for the i-th layer operator. The available options for each entry are limited to reduce the search time. For example, the framework 100 limits the layer deepening and layer skipping to at most 1, which means at most one deepening layer and one skipping layer can be applied to each layer.
Restricted Search Space. The framework 100 restricts the search space by keeping the number of entries (length of the list) for each knob fixed based on the vanilla architecture. Otherwise, knobs such as branching, deepening and skipping add extra computational layers and can result in the search space exploding if they are applied recursively.
A combinatorial optimization problem is modeled to derive the best set of obfuscating knobs for sequence obfuscation, which can be solved using a genetic algorithm. The framework 100 finds the set of sequence obfuscating knobs such that the obfuscated NN achieves strong obfuscation and can be executed within a given time budget. The obfuscation metric is given by layer prediction error rate or LER and the time budget is a small fraction of the inference latency. The overview of the obfuscation framework is given in
The inputs to the framework 100 are a vanilla neural network model (e.g., an unmodified neural network architecture to be obfuscated by the framework 100) and time budget (steps 1-3). A computing device 200 (
1) Evaluator: LSTM Predictor Testbed: To evaluate the obfuscation effect, a testbed is provided that performs stealing attack on the obfuscated architecture based on existing stealing methods.
Dataset Generation. To mimic the attacker, massive profiling first has to be done on the user's device. A random neural network architecture generator was built for this purpose, which is used as input to the profiling toolset. It first fixes the depth of the network (the number of computational layers) and, at each step, randomly inserts a neural network convolution layer with random dimension parameters (input channel size and output channel size), ResNet and MobileNet computing blocks, and pooling/batch normalization (BN) layers. Linear layers with a random number of neurons are added only after all the Conv2D layers. The classification layer (a linear layer with the number of neurons equal to the number of classes) and the softmax layer are added at the end. 6,000 different neural network architectures are generated for an input size of [3, 32, 32] and number of classes equal to 10, to match the CIFAR-10 dataset setting. Another 6,000 architectures are generated for an input size of [3, 224, 224] and number of classes equal to 1,000 to match the ImageNet dataset setting. Because the BN/ReLU operators are normally fused with complex layer operators (Conv2D, linear, etc.), only the complex operators are labeled. An example of a randomly generated architecture is shown in
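The generation procedure above might be sketched as follows; the names and layer choices here are hypothetical simplifications, and the real generator emits runnable networks rather than layer-type sequences:

```python
import random

def random_arch(depth, n_classes=10):
    """Toy stand-in for the random architecture generator described above."""
    layers, c = [], 3                      # start from RGB input channels
    for _ in range(depth):
        kind = random.choice(["conv2d", "resnet_block", "mobilenet_block",
                              "pool", "bn"])
        if kind == "conv2d":
            j = random.choice([16, 32, 64, 128, 256])  # random output channels
            layers.append(("conv2d", c, j))
            c = j
        else:
            layers.append((kind,))
    # Linear layers only after all Conv2D layers, then classifier + softmax.
    layers.append(("linear", random.randint(64, 1024)))
    layers.append(("linear", n_classes))
    layers.append(("softmax",))
    return layers
```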
Run-Time Profiling. Both the offline and run-time profiling are done using Nsight Compute (a tool for CUDA kernel profiling, similar to NVPROF), which uses “kernel replay” for accurate trace generation. This tool is used to simulate the three cases (Cases A, B and C) of attack described in Section III. Two contemporary NVIDIA GPUs are used for profiling: a Turing GPU (GTX-1660) to profile models on the CIFAR-10 dataset and an Ampere GPU (RTX-3090) to profile models on the ImageNet dataset. A number of cycles, DRAM and cache performance metrics are collected for each issued operator of the model running in inference mode. The metric guide from Nsight Compute is followed to select proper features for the three cases. In practice, the attacker gets noisy trace information through side-channels. To study the worst case (i.e., the strongest attack), it is assumed that the attacker can obtain an accurate trace.
LER metric. The LSTM-based predictor for the testbed is a single-layer LSTM-RNN model with a Connectionist Temporal Classification (CTC) decoder as adopted in DeepSniffer. The Layer prediction Error Rate (LER) is used to quantitatively measure the performance of a trained predictor. The LER has the form:
LER = ED(L, L*) / |L*|
where L is the predicted sequence and L* is the ground truth, ED denotes editing distance (Levenshtein distance [13]) and |·| denotes the length.
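A straightforward implementation of the metric (editing distance via dynamic programming) is sketched below; the layer alphabet is a hypothetical toy:

```python
def edit_distance(a, b):
    """Levenshtein distance via a single-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,           # deletion
                                     dp[j - 1] + 1,       # insertion
                                     prev + (ca != cb))   # substitution
    return dp[len(b)]

def ler(pred, truth):
    return edit_distance(pred, truth) / len(truth)

# Toy layer sequences: C = Conv2D, P = pooling, L = linear.
truth = "CCPCCPL"
pred = "CCPCPL"                 # predictor missed one Conv2D layer
assert edit_distance(pred, truth) == 1
```

Under this definition, the 2.44 LER reported later for an obfuscated ResNet-18 means the editing distance is 2.44 times the length of the ground-truth layer sequence.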
Three sets of LSTM predictors are derived, one for each attack case. For each case, the number of hidden units of the LSTM network is set to 64, 96, 128, 256 and 512, resulting in a total of 3×5=15 LSTM predictors. Each dataset is split 4:1 into training and validation subsets and trained for 150 epochs. The final validation LERs for all the LSTM predictors are shown in Table III. An excellent layer sequence extraction performance is observed in Case-C, where all the latency, DRAM and cache features are considered for each time-step, and a comparatively poor performance in Case-A, where only the latency feature is considered.
Predictor Training. The Evaluator 104 of the framework 100 uses the bagging approach and provides the average LER of LSTM predictors for different input sizes, where the input sizes are chosen to match that of CIFAR-10 and ImageNet datasets. The Evaluator 104 is shown in
2) GA-based Obfuscator: The next goal is to maximize the obfuscation given a user-defined latency budget, B using the Obfuscator 102. For instance, B=0.1 means that the user can afford up to 10% extra inference latency. Then the optimization problem can be set up as a constrained discrete optimization problem that maximizes the average LER given the latency budget:
where, N is the number of predictors in bagging, S denotes the set of obfuscation options, T is the latency with obfuscation and T* is the clean latency without obfuscation.
Genetic Algorithm. The genetic algorithm (GA) is selected to solve the discrete optimization problem. Since the optimization with constraints in Eq. (8) cannot be directly used in a GA, the reward R (a.k.a. fitness score) for the GA is designed as follows:
The constraints in Eq. (8) are replaced with a penalty term, which penalizes the reward when latency T deviates from the total latency (1+B)T*. This deviation is normalized and squared and a small offset term ϵ is added to avoid zero proximity. The block diagram of the Obfuscator 102 is shown in
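One plausible form of such a reward, with the squared, normalized latency deviation as the penalty, is sketched below; the exact expression used by the framework may differ, and the function name and ε default are assumptions:

```python
def reward(avg_ler, T, T_star, B, eps=1e-2):
    """Hypothetical GA fitness: the average LER of the bagged predictors,
    discounted as latency T deviates from the target (1 + B) * T_star.
    eps offsets the denominator to avoid division near zero."""
    deviation = (T - (1.0 + B) * T_star) / T_star
    return avg_ler / (deviation ** 2 + eps)
```

With a 2% budget (B = 0.02) and a clean latency T* = 100 ms, a candidate running at 102 ms keeps its full (scaled) LER, while one at 110 ms is penalized.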
The initial value of each obfuscating knob is randomly generated based on the search space provided in step in
For dimension obfuscation, the present disclosure focuses on the obfuscating knobs, such as layer widening, kernel widening, dummy addition and schedule modification, that affect the dimension parameters the most. Obfuscation is described herein on standard Conv2D operators as they appear most frequently in the DNN architectures that were tested.
DER metric. To evaluate the prediction error of a layer's dimension parameters, the Dimension parameter prediction Error Rate (DER) is used as a measure of the obfuscation effect, similar to the LER metric. Let the number of input/output channels of a Conv2D operator be (c,j). The DER for a given prediction (c,j) on layer i is defined as:
DERi = |c - c*|/c* + |j - j*|/j* (10)
where c* and j* represent the original (without obfuscation) input and output channels, respectively.
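The definition can be checked numerically against the extraction example reported later in this disclosure ((64, 128) extracted as (207, 93)); the helper below is illustrative only:

```python
def der(c_pred, j_pred, c_true, j_true):
    """Dimension Error Rate: normalized absolute error of the extracted
    input (c) and output (j) channel counts, summed over the two."""
    return abs(c_pred - c_true) / c_true + abs(j_pred - j_true) / j_true

# (c*, j*) = (64, 128) extracted as (c, j) = (207, 93):
# |207-64|/64 + |93-128|/128 = 2.234 + 0.273, i.e., about 2.51
```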
Predictor Training. The Random Forest (RF) model is adopted as a bagging version of decision trees for the dimension parameter extraction testbed. Around 50,000 traces are collected for Conv2D operators with different input channel and output channel parameters (c,j), which are the two most important dimension parameters. Note that stride, kernel size and padding features are a lesser focus of this disclosure because they rarely change. In particular, an RF regression model is trained with different numbers of trees (30, 50, 100 and 200) to predict c and j separately. The training-to-validation ratio is set to 4:1. The average DER of the validation dataset (20% of the data) is recorded for ImageNet and CIFAR-10 for the three attack cases. The results in Table IV show that the dimension extraction has negligible error for Cases B and C and comparatively high error for Case-A, because the latter has only the latency feature. Furthermore, the number of trees does not affect the prediction performance much.
The dimension obfuscation framework is similar to that of sequence obfuscation of the framework 100 shown in
The performance of the obfuscation tool was evaluated on a series of standard models. Specifically, VGG-11, VGG-13, ResNet-20 and ResNet-32 models were selected on the CIFAR-10 dataset running on a Turing GPU (GTX-1660). VGG-19, ResNet-18 and MobileNet-V2 models were selected on the ImageNet dataset running on an Ampere GPU (RTX-3090). For the GA, the population size was set to 16 and the search was run until the fitness score stabilized, which occurred around 20 generations. The standard deviation σ for the mutation step was set to a high value (i.e., σ=8.0) at the beginning and was halved after every 4 generations. To eliminate randomness, each data point reported here is the average of 3 runs.
Effect of Individual Knobs. First, this disclosure investigates the effect of individual knobs on stand-alone Conv2D operators with different dimension parameters. The latency overhead in each case is listed in Table V. Layer branching introduces extra operators at a low latency cost. For example, output-wise layer branching by 4 adds 3 extra Conv2D operators and 1 concatenate operator with at most a 49% latency increase. Selective fusion increases latency by around 15% but introduces only one extra ReLU operator and one BN operator. In contrast, the deepening layer and skipping layer introduce an extra Conv2D operator at a much higher latency cost and are thus less effective.
Since the latency overhead due to application of an obfuscation knob on a single operator is large, the obfuscation knobs have to be applied selectively to only certain layers. Next, this disclosure demonstrates the contribution of individual knobs on a full model using the GA-based Obfuscator 102. Only one obfuscating knob was made available at a time during the GA search, with a budget of B=0.02. The results are shown in
NeurObfuscator—Sequence Obfuscation. This disclosure demonstrates the performance of the framework 100 on CIFAR-10 and ImageNet datasets. For VGG-11, VGG-13 and ResNet-32 running on CIFAR-10, bagging of all 15 LSTM predictors was used.
The LER results under different latency budgets are shown in
For VGG-19, ResNet-18 and MobileNet-V2 on ImageNet dataset, Case-A and Case-B LSTM predictors struggle to get good extraction performance, i.e., provide low LER for the baseline architecture. So, bagging of three “elite” LSTM predictors was used (number of units of 128, 256, 512 LSTM predictors in Case-C), which have near-zero clean LER. The results are shown in
Summary 1: The present disclosure demonstrates the performance of four knobs of the framework 100, namely, layer deepening, layer skipping, layer branching and selective fusion, on sequence obfuscation. While layer branching and selective fusion have relatively strong performance, combination of all four knobs by GA in the framework 100 results in the strongest performance. The framework 100 was evaluated on multiple models taking CIFAR-10 and ImageNet datasets as input data. On a ResNet-18 ImageNet model, a 2.44 LER was achieved (translates to 44 layers' difference) with a mere 2% inference latency overhead.
For dimension obfuscation, the RF regression testbed and the DER metric (Eq. (10)) were used to evaluate the effect of obfuscation. As an example, a Conv2D layer (C2) with a 3×3 kernel, 64 input channels and 128 output channels was selected from the VGG-19 network.
Layer Widening. A grid search was used, applying widening factors from 1 to 1.5 (3/2) for C1 and C2. As shown in
Kernel Widening. Kernel widening affects both types of Conv2D operator. However, as shown in
Dummy Addition. Dummy addition does not affect the dimension parameters of C2, because the dummy operator is issued after “winograd kernel2” and will not be fused into kernel1. However, for a standard Conv2D such as C1, dummy addition has a dramatic effect. As shown in
Schedule Modification. For the schedule modification knob, the schedules of two templates were targeted (plain-Conv2D and winograd-Conv2D), with a total of 13 distinct tunable parameters. Since the search space is very large, 100 trials of random choices were performed.
NeurObfuscator—Dimension Obfuscation. The performance of the framework 100 was evaluated on dimension parameter obfuscation. Using the same GA setting as in sequence obfuscation, and replacing the LER with the DER, the results shown in Table VI were obtained. Note that the final results are significantly better than when individual obfuscation knobs are used. A high DER of 2.51 is achieved with only a 0.02 latency budget, i.e., a 2% increase in inference latency. This corresponds to the case where (c,j)=(64,128) is extracted as (c,j)=(207,93).
Summary 2: This disclosure demonstrates the performance of four dimension-obfuscating knobs, namely, layer widening, kernel widening, dummy addition and schedule modification. While schedule modification has the strongest performance among all four, the framework 100 achieves the best dimension obfuscation, as expected. On an example Conv2D layer with 64 input channels and 128 output channels, RF regression-based dimension extraction achieves 2.05 DER and 2.51 DER under 1% and 2% inference latency overhead, respectively.
The effectiveness of the proposed obfuscation techniques employed by the framework 100 was tested against various types of adversarial attacks. In particular, a methodology was adopted in which a hypothetical attacker uses the extracted model (instead of an ensemble) to craft adversarial samples and uses them as inputs to the target model. FGSM, PGD and targeted-PGD attacks (in which the attacker chooses the label) were performed multiple times. The average Attack Success Rate (ASR) (e.g., the percentage of samples that transfer successfully) is reported to show the attacking performance.
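For reference, a single FGSM step on a toy logistic “substitute model” is sketched below in NumPy; the actual evaluation used full VGG-11 variants and a framework autograd, so every name and value here is a stand-in:

```python
import numpy as np

rng = np.random.default_rng(5)
w, b = rng.standard_normal(8), 0.1       # toy substitute model parameters
x, y = rng.standard_normal(8), 1.0       # input and its true label

p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # predicted probability
grad_x = (p - y) * w                     # gradient of the BCE loss w.r.t. x

eps = 0.03
x_adv = x + eps * np.sign(grad_x)        # FGSM: one signed-gradient step
```

The crafted x_adv is then fed to the target model; the closer the substitute architecture is to the target, the more often such samples transfer, which is what the ASR in Table VII measures.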
Three models were selected based on VGG-11 on the CIFAR-10 dataset. Model-A is the original VGG-11 architecture, Model-B is Model-A with randomly selected sequence obfuscations, and Model-C is Model-B with an additional set of dimension obfuscations. The results are shown in Table VII. Note that while all three models have comparable accuracies, models with more obfuscation result in worse attack performance. Thus, Model-A has the highest ASR, followed by Model-B, then Model-C.
To mitigate neural architecture stealing on GPU devices, the present disclosure describes the framework 100, which is an NN obfuscating tool that provides both sequence obfuscation and dimension obfuscation. The framework 100 uses a total of eight obfuscating knobs across the scripting, optimization and scheduling phases of neural network model execution. Application of these knobs affects the number of computations, the latency and the number of memory accesses, thus altering the execution trace. To achieve the best obfuscation performance for a user-defined latency overhead, a genetic algorithm is leveraged to identify the best combination of obfuscation knobs. For instance, on a ResNet-18 ImageNet model, sequence obfuscation helps achieve a 2.44 LER (which translates to a 44-layer difference) with a mere 2% latency overhead. Similarly, dimension obfuscation with a 2% latency overhead for Conv2D can result in (input channel c, output channel j)=(64, 128) getting extracted as (c,j)=(207,93). Thus, the framework 100 successfully hides the DNN model and provides a mechanism to prevent architecture stealing.
Computing device 200 comprises one or more network interfaces 210 (e.g., wired, wireless, PLC, etc.), at least one processor 220, and a memory 240 interconnected by a system bus 250, as well as a power supply 260 (e.g., battery, plug-in, etc.).
Network interface(s) 210 include the mechanical, electrical, and signaling circuitry for communicating data over the communication links coupled to a communication network. Network interfaces 210 are configured to transmit and/or receive data using a variety of different communication protocols. As illustrated, the box representing network interfaces 210 is shown for simplicity, and it is appreciated that such interfaces may represent different types of network connections such as wireless and wired (physical) connections. Network interfaces 210 are shown separately from power supply 260, however it is appreciated that the interfaces that support PLC protocols may communicate through power supply 260 and/or may be an integral component coupled to power supply 260.
Memory 240 includes a plurality of storage locations that are addressable by processor 220 and network interfaces 210 for storing software programs and data structures associated with the embodiments described herein. In some embodiments, device 200 may have limited memory or no memory (e.g., no memory for storage other than for programs/processes operating on the device and associated caches).
Processor 220 comprises hardware elements or logic adapted to execute the software programs (e.g., instructions) and manipulate data structures 245. An operating system 242, portions of which are typically resident in memory 240 and executed by the processor, functionally organizes device 200 by, inter alia, invoking operations in support of software processes and/or services executing on the device. These software processes and/or services may include neural network obfuscation processes/services 290 which can include a set of instructions within the memory 240 that cause the processor 220 to implement aspects of framework 100 upon execution by the processor 220. Note that while neural network obfuscation processes/services 290 is illustrated in centralized memory 240, alternative embodiments provide for the process to be operated within the network interfaces 210, such as a component of a MAC layer, and/or as part of a distributed computing network environment.
It will be apparent to those skilled in the art that other processor and memory types, including various computer-readable media, may be used to store and execute program instructions pertaining to the techniques described herein. Also, while the description illustrates various processes, it is expressly contemplated that various processes may be embodied as modules or engines configured to operate in accordance with the techniques herein (e.g., according to the functionality of a similar process). In this context, the terms module and engine may be interchangeable. In general, the term module or engine refers to a model or an organization of interrelated software components/functions. Further, while the neural network obfuscation processes/services 290 is shown as a standalone process, those skilled in the art will appreciate that this process may be executed as a routine or module within other processes.
It should be understood from the foregoing that, while particular embodiments have been illustrated and described, various modifications can be made thereto without departing from the spirit and scope of the invention as will be apparent to those skilled in the art. Such changes and modifications are within the scope and teachings of this invention as defined in the claims appended hereto.
The present document is a non-provisional patent application that claims benefit to U.S. provisional application Ser. No. 63/350,765, filed on Jun. 9, 2022, which is incorporated herein by reference in its entirety.