This application claims priority from EP 22386002.4, filed on Jan. 21, 2022, the contents of which are incorporated by reference herein in their entirety.
Embodiments of the present invention relate to a quantization method to improve the fidelity of rule extraction algorithms for use with artificial neural networks.
There is a growing need for interpretable artificial intelligence (AI) models, both in industry and academia. Many critical areas where machine learning is applied, for example healthcare, autonomous driving and finance, require the model to explain its decisions in a human-interpretable way in order to be trusted. For example, explaining the decision-making process of a neural network may assist doctors in trusting the output classification of a neural network, allowing them to make a better judgement of patient condition and mitigating human errors. Moreover, the way a network training dataset is collected may introduce hidden biases or spurious correlations between different classes, which a model being trained could learn as shortcuts, making accurate predictions for the wrong reasons. Explaining black-box models may help domain experts to identify errors in, or underlying biases of, those models. After diagnosing these errors/biases the models may be re-trained to produce a trusted model.
Methods for extracting rules to explain the decision-making process of a neural network essentially take as input the quantized filter/neuron activations from a subset of layers and measure the association with the target output. Many post-hoc rule extraction algorithms for CNNs rely on aggregating filter activations into a single value, since a filter is composed of many neurons. Thresholding is then carried out to determine whether a filter may be considered active (a process known as “quantization” - see J. Townsend, T. Chaton, J. M. Monteiro, “Extracting relational explanations from deep neural networks: A survey from a neural-symbolic perspective”, IEEE Transactions on Neural Networks and Learning Systems 31 (2020) 3456-3470), detecting a specific pattern across images. For feed forward neural networks (FFNNs), rule extraction methods directly threshold neuron activations, so there is no need to aggregate.
As discussed in H. Jacobsson, “Rule extraction from recurrent neural networks: A taxonomy and review”, Neural Computation 17 (2005) 1223-1263, and elsewhere, a number of different techniques are known in the art to determine whether a filter/neuron is considered active. For example, in CNNs that use as non-linearity a sigmoid activation function s(x) = 1/(1 + e^(-x)),
a neuron is commonly considered active if s(x) > 0.5. Other techniques involve using discriminant hyperplanes and a stairway activation function (quantization of the sigmoid function) for thresholding neurons, using the per sample mean for each filter, dynamically thresholding above a neuron-specific percentile, k-means clustering, hierarchical clustering after neural learning for quantization in recurrent neural networks (RNNs), or using a Kohonen self-organizing map with star topology of neurons to quantize recurrent network states.
By associating filters/neurons with literals, rules are formed to explain the classification output of the CNN, where each atom used in the explanations corresponds to an active filter. Rule extraction may be applied to general ANNs by replacing filters with neurons.
However, if the fidelity, i.e. the accuracy of the extracted program with respect to the original model (e.g. a measure of how often the output of the logic program is the same as the model predicted) is not very high, such post-hoc rule extraction models will essentially remain a black box. In this respect, existing approaches relying on sample statistics are not always appropriate because each filter activation may follow a different distribution, and thresholding with respect to sample statistics may overestimate or underestimate when a filter/neuron is considered active. In order for the output rules to approximate the original model more faithfully, the quantization step, that converts a float representation (or a matrix of float values in the case of convolutional neural networks (CNNs)) into a binary value, should result in minimal information loss.
Therefore, it is desirable to develop an improved quantization approach in order to reach higher fidelity during rule extraction.
An embodiment according to a first aspect may provide a computer-implemented method comprising: recording node activations for each node in a layer of a trained ANN and predictions of the ANN, in respect of each item of training data used to train the ANN; taking as input the recorded node activations and as targets the recorded predictions of the ANN, creating at least one decision tree, where each decision tree is trained to approximate the ANN and optimize a defined criterion, for each node of the decision tree a threshold value for the defined criterion being calculated to determine for which node of the ANN the input activations should be split between branches of the decision tree; recording the threshold values associated with respective nodes of the ANN; obtaining threshold value combinations, each combination comprising one of the threshold values obtained for respective nodes of the ANN, and, for each of the threshold value combinations, performing a selected rule extraction algorithm using the combination of threshold values to extract from the ANN at least one rule for explaining the output of the layer of the ANN, and obtaining a fidelity metric for the at least one rule using that combination of threshold values, the fidelity metric indicating the accuracy of the rule with respect to the predictions of the ANN; determining which of the combinations of threshold values yields the best fidelity metric; and using the selected rule extraction algorithm with the combination of threshold values determined to yield the best fidelity metric to extract at least one rule for explaining the output of the layer of the ANN.
An embodiment according to a second aspect may provide a computer program comprising instructions which, when executed by a computer, cause the computer to carry out a method embodying the first aspect.
An embodiment according to a third aspect may provide apparatus comprising: at least one computer processor, and at least one memory connected to the at least one computer processor to store: node activations for each node in a layer of a trained artificial neural network, predictions of the ANN, in respect of each item of training data used to train the ANN, and instructions to cause the processor to: taking as input the recorded node activations and as targets the recorded predictions of the ANN, create at least one decision tree, where each decision tree is trained to approximate the ANN and optimize a defined criterion, for each node of the decision tree a threshold value for the defined criterion being calculated to determine for which node of the ANN the input activations should be split between branches of the decision tree; cause the threshold values associated with respective nodes of the ANN to be recorded; obtain threshold value combinations, each combination comprising one of the threshold values obtained for respective nodes of the ANN, and, for each of the threshold value combinations, perform a selected rule extraction algorithm using the combination of threshold values to extract from the ANN at least one rule for explaining the output of the layer of the ANN, and obtain a fidelity metric for the at least one rule using that combination of threshold values, the fidelity metric indicating the accuracy of the rule with respect to the predictions of the ANN; determine which of the combinations of threshold values yields the best fidelity metric; and use the selected rule extraction algorithm with the combination of threshold values determined to yield the best fidelity metric to extract at least one rule for explaining the output of the layer of the ANN.
Embodiments provide a quantization method that aims to find threshold values for each filter in a CNN (or, more generally, for each neuron in an artificial neural network, ANN) that more accurately capture when a filter (or neuron) is truly active in detecting a specific pattern in an input image. Embodiments find threshold values that result in minimum information loss during the quantization process and assist subsequent rule extraction algorithms to approximate more faithfully complex black-box models. According to an embodiment this is accomplished by training various decision trees that approximate the CNN/ANN behavior and are built by taking as input filter/neuron activations (not quantized) and as output the CNN/ANN prediction. The threshold values used in splitting the nodes of each tree are stored in a list Tj for each filter fj separately. Afterwards rule extraction is performed for combinations of threshold values (for example, each possible combination of threshold values), and the threshold values of the combination that results in highest fidelity, i.e. the threshold values that result in a logic program that more faithfully approximates the behavior of the original model, are chosen.
Performing a search in the space of threshold values found by the decision trees (e.g. a random forest) that approximate the ANN and choosing the ones that result in higher fidelity may result in an extracted program that approximates the model more faithfully. Embodiments may find more appropriate threshold values for each filter, that more accurately capture when a filter is active, because the threshold values used in splitting the nodes of the decision trees are incentivized to reduce the information loss in order to optimize fidelity.
As optimal threshold values are found by optimizing a user-defined criterion and building decision trees where each tree approximates the neural network instead of relying on sample statistics, embodiments are highly versatile and do not rely on any knowledge about the distribution of filter activations.
According to the first or second aspect, recording the threshold values may include ranking the threshold values, for each node of the ANN, according to occurrence frequency and average depth of appearance in the decision tree, and performing the selected rule extraction algorithm for each combination of threshold values includes performing the selected rule extraction algorithm first on that combination of threshold values which includes the threshold values occurring with the highest frequencies.
According to the first or second aspect, obtaining threshold value combinations may comprise one of: obtaining all possible combinations of the threshold values; obtaining combinations using only a preset number of the most frequently-appearing threshold values for each node of the ANN; obtaining combinations using only a random subset of the threshold values for each node of the ANN; obtaining combinations of only threshold values for each node of the ANN which meet a user-defined metric.
According to the first or second aspect, the defined criterion to be optimized may be entropy or Gini index.
According to the first or second aspect, recording the threshold values associated with respective nodes of the ANN may comprise, when there is no threshold value associated with a particular node of the ANN, recording as a threshold value for the node the per sample mean activation of the node.
According to the first or second aspect, creating at least one decision tree may comprise using a random forest generation algorithm to build a plurality of diverse decision trees.
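By way of illustration only, the alternative ways of obtaining threshold value combinations mentioned above may be sketched in Python as follows. The sketch assumes that the threshold list of each node is already ranked with the most frequent value first; the function names, the example values and the `seed` parameter are hypothetical and are not part of the described method.

```python
import itertools
import random

def all_combinations(lists):
    """Every possible combination: one threshold per node (exhaustive grid)."""
    return list(itertools.product(*lists))

def top_k_combinations(lists, k):
    """Use only the k most frequent thresholds per node (lists ranked best-first)."""
    return list(itertools.product(*(lst[:k] for lst in lists)))

def random_subset_combinations(lists, size, seed=0):
    """Use only a random subset of at most `size` thresholds per node."""
    rng = random.Random(seed)
    subsets = [rng.sample(lst, min(size, len(lst))) for lst in lists]
    return list(itertools.product(*subsets))

# Hypothetical ranked threshold lists for three nodes.
lists = [[0.19, 0.25, 0.40], [0.07], [0.31, 0.52]]
grid = all_combinations(lists)                  # 3 * 1 * 2 = 6 combinations
top1 = top_k_combinations(lists, 1)             # one combination
subset = random_subset_combinations(lists, 2)   # 2 * 1 * 2 = 4 combinations
```

Restricting the search to the top-ranked or randomly sampled values trades some coverage of the threshold space for a smaller number of rule extraction runs.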
A method, program or apparatus according to an embodiment may be used to extract at least one rule for an ANN for use with one of an autonomous driving algorithm and a healthcare algorithm.
A method, program or apparatus according to an embodiment may be used to either: (i) extract the at least one rule for a CNN used in the control of an autonomous driving vehicle; or (ii) determine, using the extracted at least one rule, that the ANN is functioning correctly.
Embodiments may boost the fidelity of post-hoc rule extraction algorithms that try to explain the decision of black box models by providing extracted rules which explain more faithfully the decisions of the models in an interpretable language that humans understand. Moreover, extraction programs with high fidelity may be used to detect hidden biases or errors in the original model, which may then be re-trained to get rid of those issues and obtain a more robust model.
Many consumers may be more attracted by black box models that are highly accurate and are accompanied by a rule extraction algorithm that explains faithfully the decisions of the model to the consumers. Accurately interpreting the decisions of black-box models is crucial for applying deep learning in critical areas, such as healthcare, fully autonomous driving, criminal justice and finance.
Finally, the proposed method may contribute to XAI research and assist in neuro-symbolic integration by reducing the gap between neural and symbolic representations. This happens because after applying the proposed method symbols may more accurately capture the state of the model due to reduced approximation loss during quantization.
Rule extraction algorithms take as input the activations of filters from a subset of layers and measure the association with the target output. Embodiments may be applied to any rule extraction algorithm that relies on neuron/filter activations to distil the knowledge from a neural network and explain its decisions. For example, embodiments may be combined with the post-hoc rule extraction program proposed by J. Townsend, T. Kasioumis, H. Inakoshi, in “ERIC: extracting relations inferred from convolutions” (15th Asian Conference on Computer Vision, Kyoto, Japan, Revised Selected Papers, Part III, volume 12624 of Lecture Notes in Computer Science, Springer, Nov. 30 - Dec. 4, 2020, pp. 206-222) to achieve better and more interpretable rules. However, any other existing or future method which maps filters/neurons to literals and generates rules over those literals may also be used.
Reference will now be made, by way of example, to the accompanying drawings, in which:
Without loss of generality the proposed method is described below for convolutional neural networks, but it should be noted that everything described in this section may be generalized trivially to ANNs by replacing filters with neurons or nodes in the terminology.
The fidelity of rule extraction programs that rely on quantization of filter activations to explain the output classification of a trained CNN heavily depends on the quantization procedure. Quantization may be regarded as a 2-step process: a) applying an aggregation function (like mean, sum, max, etc.) and b) thresholding. The goal of quantization is to convert the output feature map produced by convolutional kernel multiplication into a binary value in order to determine whether a kernel is considered active, i.e. detecting a pattern in the input image. The aggregation function is needed in order to convert the multidimensional feature map into a single value that represents the strength of kernel activation, whereas the thresholding is necessary to determine the value θ above which a kernel is considered active. In order for the output rules to approximate the original CNN faithfully (i.e. have high fidelity) the quantization step, which converts a float representation (or a matrix of float values in the case of CNNs) into a binary value, should result in minimal information loss. If the quantization step introduces a high approximation error ε, then rule extraction is unable to reach its highest potential and the fidelity of the extracted program will be upper bounded by 1-ε. Hence, developing more principled quantization approaches is key to reaching higher fidelity during rule extraction.
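By way of illustration only, the 2-step quantization described above (aggregation followed by thresholding) may be sketched in Python as follows. The function names, the toy 3×3 feature map and the threshold values are hypothetical and are chosen only to show the mechanics.

```python
def aggregate(feature_map, how="mean"):
    """Step a): collapse a 2D feature map into a single activation value."""
    values = [v for row in feature_map for v in row]
    if how == "mean":
        return sum(values) / len(values)
    if how == "sum":
        return sum(values)
    if how == "max":
        return max(values)
    raise ValueError(f"unknown aggregation function: {how}")

def is_active(feature_map, theta, how="mean"):
    """Step b): the kernel is considered active if the aggregate exceeds theta."""
    return aggregate(feature_map, how) > theta

# Toy feature map with a single localized peak (e.g. one small detected pattern).
feature_map = [
    [0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0],
]
low_threshold = is_active(feature_map, theta=0.05)          # spatial mean is about 0.1
high_threshold = is_active(feature_map, theta=0.2)          # same map, stricter theta
max_aggregate = is_active(feature_map, theta=0.2, how="max")
```

The same feature map is quantized to a different binary value depending on both the aggregation function and θ, which is why the choice of threshold affects how much information survives quantization.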
As mentioned above, many approaches in the literature use sample statistics to determine whether a filter is considered active, detecting a specific pattern in the input image. For example, for CNNs different images from a dataset are passed through the network and filter activations are recorded; if
are images in a dataset then for each filter
at layer (l) the activations
are stored in a list
In approaches relying on sample statistics a filter
is considered active on an image Xs if its activation
is either above
However, such threshold values in practice fail to determine whether a filter is truly active. One reason for this inefficiency is that such threshold values only make sense if the distribution of each filter’s activations is Gaussian-like, an assumption that might be strongly violated in practice if the distribution has heavy tails, or is highly skewed, or is multimodal for example.
Another issue with using the mean as a threshold is that it is highly affected by extremal values. For example, consider a dataset of images where a subset of them has many traffic signs (maybe these were captured from a very busy spot in a town or from a training driving course), the majority of the images contain one traffic sign and the rest have zero traffic signs (e.g. captured in a forest). On images with multiple traffic signs present, we expect the magnitude of the feature map of filters detecting traffic signs to be very high (because many traffic signs are detected in multiple locations). As shown in
Replacing the mean with the mode of the distribution (since the mode is not affected by extremal values) solves the previous issue, but it will introduce another problem. Consider a dataset where the majority of the images were captured in a town and contain multiple traffic signs. On those images the activation magnitude of filters that detect traffic signs will be very high because there will be many peaks in the feature map. Suppose there exist some images with only 1 traffic sign in them which is far away (hence appearing small in the image). This means that the activation magnitude is small but higher than zero. Also suppose there exist a few images with no traffic sign in them. This dataset could result in an activation distribution of filters detecting traffic signs as shown in
Another reason that threshold values relying on sample statistics may fail in practice has to do with the size of the pattern that is being detected by a filter. Objects that are far away from the camera look smaller in the image and hence will occupy a smaller region in the feature map in the layer of interest after many convolutions and maxpooling operations. For example, suppose that a filter fires in response to traffic signs and an image contains a traffic sign that is far away from the camera. Then, after convolution and maxpooling operations, the corresponding traffic sign activation region will get much smaller. As a consequence, the magnitude of activation will also be small, due to many aggregations with regions of zero activation around the traffic sign. This means that the activation magnitude of the filter may be lower than the mean activation threshold over different images, and the filter will be falsely deemed inactive even though the image contains a traffic sign.
It is evident from the aforementioned examples that threshold values depending on the mean or mode do not generalize well to arbitrary distributions of filter activations. These types of threshold values will introduce high approximation errors during the quantization process, and this will have an impact on the rule extraction fidelity.
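By way of illustration only, the sensitivity of a mean-based threshold to extremal values may be shown with a small numerical sketch in Python. The activation values and image counts below are hypothetical and do not come from any experiment.

```python
from statistics import fmean

# Hypothetical aggregated activations of a filter that detects traffic signs:
# 20 images with no sign, 70 images with one sign, 10 images with many signs.
activations = [0.0] * 20 + [0.2] * 70 + [5.0] * 10

# Threshold taken as the per sample mean, as in approaches based on sample statistics.
mean_threshold = fmean(activations)

# Images that contain a sign (non-zero activation) but fall at or below the threshold.
missed = sum(1 for a in activations if 0.0 < a <= mean_threshold)
```

In this toy case the few images with many signs pull the mean up to about 0.64, so all 70 single-sign images are quantized as inactive even though the filter did respond to a sign.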
The present proposal introduces a quantization method that aims to find threshold values which more accurately capture when a filter is truly active and that maximize the fidelity of subsequent rule extraction programs. This is accomplished by training a random forest where each tree approximates the neural network behavior and is built greedily (i.e. according to a greedy algorithm) to optimize a user-defined criterion. Each tree takes as input the filter activations (not quantized) from a subset of layers of the CNN and as target the CNN predictions. The threshold values used in splitting the nodes of the trees in the random forest are stored in a list Tf for each filter f separately. Afterwards, rule extraction is performed for each possible combination of threshold values (an exhaustive grid search may be performed to find the threshold values that result in highest fidelity, but other approaches are possible, for example random search or Bayesian approaches, i.e. guided search), and the combination that results in the highest fidelity is chosen, i.e. the threshold values that result in a logic program that approximates the behavior of the original CNN most faithfully. By choosing a better policy for quantizing the kernels, information loss may be reduced and the fidelity of rule extraction algorithms that explain the decisions of neural networks may be increased.
Embodiments are distinguished over the prior art by one or more of the following:
Before explaining a method according to an embodiment, some necessary preliminaries will be introduced and the notation will be fixed.
Assume that activations of J filters
in layer l of a CNN are to be quantized and optimal threshold values {θ1,..., θJ} to improve the fidelity of subsequent rule extraction algorithms that approximate the behavior of the CNN are to be found. Without loss of generality, the method is described for one layer, but similar quantization may be done on multiple layers of the CNN.
Let C denote the total number of classes for the classification problem at hand and let ci denote the ground truth label/class for an image Xi in the dataset. Given a training dataset {X1,...,XN} of images, let
stand for the feature map (also known as activations) output of the l-th layer for the i-th image. Each
is a 2D matrix of activations that is defined as the convolution of the feature map of layer l - 1 with the j-th filter for i-th image in the batch, i.e.,
where ∗ stands for the convolution operator followed by ReLU (and in some cases also by maxpooling, depending on the architecture) and
is the input image. This is illustrated in
Definition: A filter
is said to be ‘active’ for an image Xi if its activation
surpasses a filter-specific threshold θj (to be determined with the proposed method) i.e., if
As mentioned previously, many previously-proposed approaches use sample statistics to determine when a filter is active on a given image. The presently-proposed method differs in that a random forest is used to find filter-specific threshold values, where each tree approximates the original CNN. After finding a list of plausible threshold values θj for each filter
a search through that space of threshold values is performed to find the optimal ones which result in the highest fidelity for the subsequent rule extraction algorithm of choice.
In the rest of this section the steps of the proposed quantization method are discussed with reference to a CNN on image data, but the described method may be generalized to any ANN using non-pictorial data by replacing filter activations (spatial average of feature map) with neuron activations.
A flow diagram of the method is shown in
Quantization process (numbering corresponds to steps in
1) Obtain training images.
2) Forward pass images {X1,...,XN} through the CNN (module 61), get predictions {
for each filter
For example, create a table with columns the filter activations and predictions of CNN across all image data, i.e. the first column stores
the second column stores
etc. up to the J-th column
and the (J + 1)-th column stores the predictions {
3) Build one or more decision trees, for example using the random forest method (module 62) which generates multiple decision trees with some random variation each time. The variation in the process for generating trees does not necessarily need to be random (i.e. a deterministic process might be used), but random variation is likely to yield better results. In the present example a random forest is built that takes as input the first J columns of the above table, which contain the activations, and as targets the predictions of the CNN. Each tree approximates the CNN’s predictions (knowledge distillation - see Hinton G., Vinyals O., Dean J., “Distilling the Knowledge in a Neural Network”, NIPS Deep Learning and Representation Learning Workshop, 2015, http://arxiv.org/abs/1503.02531) and optimizes a user-defined criterion. At each node of the decision tree the defined criterion is used to determine for which feature input activations should be split, the feature with the minimum (or, depending on the criterion, the maximum) value for the defined criterion being selected as the feature used for splitting the activations. This value, which corresponds to the value at which the node of the CNN is considered to be active for the feature, is the node’s “threshold value”. In this example, feature subsample (e.g. 75%) and bootstrap (e.g. 75%) are used to create diverse trees, i.e. to avoid the creation of trees that use the same filters and threshold values. Methods other than bootstrap or feature subsampling may be used to vary the trees.
4) Store the threshold values (module 62) used in splitting the nodes of each tree in a list Tj and the location/depth at which they appear in each tree (i.e. the number of nodes traversed from the root node) in a list Lj for each filter fj separately. This means that multiple values of the same threshold might occur in the list. For each threshold in the list Tj, determine its frequency, i.e. count how many times it appears in that list (no distinction is made between threshold values that match up to a given precision, i.e. consider them as the same threshold values in the count). The higher the frequency, the more important the threshold. Finally, order the threshold values with respect to that frequency (if ordered in descending order, a higher frequency threshold will appear at a lower index in the list; if ordered in ascending order, a higher frequency threshold will appear at a higher index in the list). It should be noted that, although in practice it may be beneficial to store the threshold values in a ranked list, it is not essential to do so.
If two threshold values have the same frequency, then order them according to the average depth of decision nodes that they appear when splitting the trees in random forest. For each filter fj the average depth that it appears is just the mean of the values in Lj.
Therefore, after building the random forest, each filter will be associated with a list of threshold values: fj←→Tj = [θj1, θj2,...,θjkj], where kj denotes the number of unique threshold values for each filter and the list is sorted in descending order with respect to their occurrence frequency when splitting the nodes and average depth of appearance.
Remark 1: If a list Tk is empty, a situation which happens when filter fk does not appear in any tree of the random forest while splitting the nodes, then append to the list Tk the per sample mean activation µk of that filter, i.e. Tk = [µk]. In the corresponding list Lk append a depth of 1, i.e., Lk = [1].
Remark 2: In order to make the list of threshold values more manageable, additional quantization/binarization or clustering algorithms may be used to group similar values together in a bin, and then a representative for them (e.g. the mean threshold) may be chosen to replace them with that value (however, this binarization procedure may introduce approximation errors, and it is recommended only if the threshold list is quite large (e.g. more than 4-5 threshold values for each kernel)).
The reasoning behind steps 3) and 4) is threefold:
5) (steps 5 to 11 of
Remark: Instead of an exhaustive grid search through the different threshold values, different strategies may be used for increased speed:
6) (step 12 of
An algorithm for carrying out the method described above is set out below: Quantization Algorithm
1: Initialization: Layer l to apply quantization, dictionary T = {} to store the threshold values used for splitting the nodes of the trees in the random forest for each filter.
2: Build random forest that approximates the CNN: (preferably using feature subsample and bootstrap as mentioned in step (3) above)
3: For each tree in the random forest:
4: - Train the tree to approximate the CNN, using as inputs the activation of filters from layer l and targets the CNN predictions.
5: - After building the tree, store the threshold values used in splitting the tree nodes in a list T[f] = [threshold values for filter f].
7: End For
8: Measure the fidelity of the extracted program on every combination of threshold values in the list T[f] for each filter f (grid search). If there are n filters and kj threshold values for the j-th filter, then there will be a total of k1 × k2 × ... × kn different combinations (this step may be sped up by doing a guided search or random search instead of grid search).
9: Return the threshold values that yielded the best fidelity and the extracted program.
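By way of illustration only, the Quantization Algorithm above may be sketched end to end in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: the trees are small hand-written greedy Gini trees rather than a library random forest, the rule extraction algorithm is replaced by a stand-in (a lookup table mapping each binary activation pattern to the majority prediction) and is not ERIC, and all function names, default parameters and the toy data are hypothetical.

```python
import itertools
import random
from collections import Counter, defaultdict

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feats):
    """Greedy search for the (feature, threshold) pair minimising weighted Gini."""
    best = None
    for f in feats:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [c for row, c in zip(X, y) if row[f] <= t]
            right = [c for row, c in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow(X, y, feats, T, depth=1, max_depth=2):
    """Grow one tree, appending each split threshold to T[filter index]."""
    if depth > max_depth or len(set(y)) < 2:
        return
    split = best_split(X, y, feats)
    if split is None:
        return
    _, f, t = split
    T[f].append(round(t, 4))  # thresholds equal up to this precision are merged
    for side in (True, False):
        idx = [i for i, row in enumerate(X) if (row[f] <= t) == side]
        grow([X[i] for i in idx], [y[i] for i in idx], feats, T, depth + 1, max_depth)

def collect_thresholds(X, y, n_trees=5, frac=0.75, seed=0):
    """Steps 2-7: bootstrap and feature subsample, then gather T[f] per filter."""
    rng = random.Random(seed)
    n, J = len(X), len(X[0])
    T = defaultdict(list)
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(int(frac * n))]
        feats = rng.sample(range(J), max(1, int(frac * J)))
        grow([X[i] for i in rows], [y[i] for i in rows], feats, T)
    # Filters never used for a split fall back to their per sample mean activation.
    return [sorted(set(T[j])) or [sum(r[j] for r in X) / n] for j in range(J)]

def fidelity(X, y, combo):
    """Stand-in rule extractor (assumption): majority prediction per binary pattern."""
    patterns = [tuple(v > t for v, t in zip(row, combo)) for row in X]
    votes = defaultdict(Counter)
    for p, c in zip(patterns, y):
        votes[p][c] += 1
    rules = {p: cnt.most_common(1)[0][0] for p, cnt in votes.items()}
    return sum(rules[p] == c for p, c in zip(patterns, y)) / len(y)

def quantize(X, y):
    """Steps 8-9: grid search over all threshold combinations."""
    lists = collect_thresholds(X, y)
    best = max(itertools.product(*lists), key=lambda combo: fidelity(X, y, combo))
    return best, fidelity(X, y, best)

# Toy data: 3 filters; the hypothetical 'CNN prediction' depends on filter 0 only.
rng = random.Random(1)
X = [[rng.random() for _ in range(3)] for _ in range(60)]
y = [int(row[0] > 0.3) for row in X]
thresholds, fid = quantize(X, y)
```

In practice a library random forest and the chosen rule extraction algorithm would take the place of `grow` and `fidelity`; the structure of the search over threshold combinations stays the same.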
Experiments were conducted using the above-described quantization policy on a toy subset of the Places365 dataset that contains 3 categories: ‘forest road’, ‘highway’, ‘street’. Hereafter this 3-class dataset will be referred to as the ‘road’ dataset for convenience. The training-validation-test split that was used was 10,445 - 1,500 - 3,055 with 500 images per class for validation and roughly 1018 images per class for testing. This dataset was chosen because scenes could be described through sub-objects and topics present within them, making it a good candidate for rule extraction.
Classifying scenes may be particularly useful in an autonomous driving scenario where a neural network is used to make decisions about steering, braking, accelerating etc. It is well known that neural networks are subject to adversarial attacks. An example from real life is someone tricking a vehicle autopilot into believing that a 35 mph speed sign was an 85 mph speed sign by using adhesive tape on the 35 mph sign, as shown in
Firstly, a network for multiclass classification was trained on the road dataset. The VGG-16 network was initialized with weights pretrained on ImageNet, and preprocessing and augmentations were applied as done by A. Krizhevsky, I. Sutskever and G. E. Hinton (“Imagenet classification with deep convolutional neural networks”, Advances in Neural Information Processing Systems, volume 25, Curran Associates, Inc., 2012). The last dense layer was finetuned for 50 epochs using the Adam optimizer (described in D. P. Kingma, J. Ba, “Adam: A method for stochastic optimization”, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015) with β1 = 0.9, β2 = 0.999, ε = 1e-08, ridge regularization parameter 0.005 and learning rate 5×10^-5 using an NVIDIA GPU 1080 Ti 12GB and TensorFlow™ framework. Afterwards, training of all layers was resumed for 100 epochs with learning rate 10^-6. The conv13 layer was selected because filters in deeper layers in CNNs tend to represent object parts and more semantic concepts than earlier layers.
After finetuning VGG-16 on the ‘road’ dataset, rule extraction was performed using the ERIC technique (J. Townsend, T. Kasioumis, H. Inakoshi, “ERIC: extracting relations inferred from convolutions”, 15th Asian Conference on Computer Vision, Kyoto, Japan, Revised Selected Papers, Part III, volume 12624 of Lecture Notes in Computer Science, Springer, Nov. 30 - Dec. 4, 2020, pp. 206-222), using the per sample mean activation for each kernel as a baseline, and the fidelity of the extracted program was measured. The results are shown in the 2nd column of Table 1.
To measure its effectiveness, the proposed quantization policy was applied after finetuning VGG-16 on the ‘road’ dataset. Firstly, all images of the road dataset were passed through the CNN, and the filter activations in layer conv13 of VGG-16 were recorded. For each image Xi, 1 ≤ i ≤ N in the ‘road’ dataset, a 512-dimensional array
of filter activations was stored (for the training, validation and test datasets, i.e., N=10,445, N=1,500 and N=3,055 respectively). An example of such a 512-dimensional array of filter activations is (0.05635, 0.36253, 0.36554, ...., 0.00932, 0.019682, 0.043455).
After passing all images through the CNN, three tables (for the training, validation and test datasets respectively) were created, with the i-th row of each table consisting of a vector of the activations
and the predictions
of the CNN on the i-th image.
Then, a random forest was trained on the training dataset (10,445 rows) using feature subsample and bootstrap for diversified trees.
of the CNN model. In this Figure, prediction 0 stands for ‘forest road’, prediction 1 for ‘highway’ and prediction 2 for ‘street’, and f232 > 0.19302, for example, means “if the activation of the 232nd filter exceeds the threshold 0.19302, then traverse to the left branch, otherwise take the right branch”.
For each tree, the threshold values used to split its nodes were stored in a list Tj, and the locations/depths at which they appear in the tree were stored in a list Lj, separately for each filter fj.
After training the random forest, a list of threshold values for each filter was obtained:
and
If a list is empty (i.e. the filter does not appear as a splitting node anywhere in the random forest), then the per-sample mean of that filter’s activations is appended to the list.
Then, the filters fj were ranked by the number of threshold values in their lists Tj, with filters having longer lists being considered more important:
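The bookkeeping described above can be sketched as follows. This is a minimal illustration only: the nested-dictionary tree format, the function names and the toy numbers are assumptions made for the example and are not taken from the embodiment, in which the trees would come from the trained random forest.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy tree format (an assumption for this sketch): an internal
# node is a dict with keys "filter", "threshold", "left", "right"; a leaf is
# an int class label.


def collect_thresholds(node, thresholds, depths, depth=0):
    """Walk one tree, recording each split threshold and its depth per filter."""
    if not isinstance(node, dict):  # leaf: nothing to record
        return
    thresholds[node["filter"]].append(node["threshold"])
    depths[node["filter"]].append(depth)
    collect_thresholds(node["left"], thresholds, depths, depth + 1)
    collect_thresholds(node["right"], thresholds, depths, depth + 1)


def thresholds_per_filter(forest, activations):
    """Return (T, L): threshold lists and depth lists keyed by filter index.

    activations holds one row per image and one column per filter. A filter
    that never splits a node falls back to its per-sample mean activation.
    """
    n_filters = len(activations[0])
    T, L = defaultdict(list), defaultdict(list)
    for tree in forest:
        collect_thresholds(tree, T, L)
    for j in range(n_filters):
        if not T[j]:
            T[j].append(mean(row[j] for row in activations))
    return ({j: T[j] for j in range(n_filters)},
            {j: L[j] for j in range(n_filters)})


# Toy data: two small trees over three filters; filter 2 is never used.
forest = [
    {"filter": 0, "threshold": 0.2,
     "left": {"filter": 1, "threshold": 0.5, "left": 0, "right": 1},
     "right": 2},
    {"filter": 0, "threshold": 0.3, "left": 1, "right": 2},
]
activations = [[0.1, 0.4, 0.2], [0.3, 0.6, 0.4], [0.5, 0.8, 0.6]]
T, L = thresholds_per_filter(forest, activations)
print(T)
print(L)
```

In this toy run filter 0 collects two thresholds at depth 0, filter 1 collects one threshold at depth 1, and filter 2 receives only its mean activation as a fallback.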
Afterwards, rule extraction was performed using the ERIC technique on every combination of threshold values (θi1, θi2,..., θin) ∈ T1 × T2 × ... × Tn, and the fidelity of the program was measured. The combination of threshold values that yielded the best fidelity was chosen. In this example the best combination of threshold values used in splitting the nodes in the random forest was:
Results of rule extraction using the above best combination of threshold values are shown in the 3rd column of Table 1. A 3% improvement in fidelity was observed when using the threshold values obtained from the proposed quantization method, compared with the baseline per-sample mean filter threshold values.
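The ranking and the search over threshold combinations may be sketched as below. Note the assumptions: the lookup-table “program” is a deliberately simple, hypothetical stand-in for a rule extractor and is not the ERIC technique, and the function names and toy data are invented for the example. Fidelity is computed as the fraction of samples on which the extracted program agrees with the CNN prediction.

```python
from collections import Counter, defaultdict
from itertools import product


def rank_filters(T):
    """Order filter indices by how many thresholds they collected (most first)."""
    return sorted(T, key=lambda j: len(T[j]), reverse=True)


def quantize(row, thetas):
    """Binarize one activation row: 1 where the filter exceeds its threshold."""
    return tuple(int(a > t) for a, t in zip(row, thetas))


def fit_lookup_program(rows, cnn_preds, thetas):
    """Hypothetical stand-in for a rule extractor (NOT ERIC): map each binary
    pattern to the CNN prediction most often observed with that pattern."""
    votes = defaultdict(Counter)
    for row, pred in zip(rows, cnn_preds):
        votes[quantize(row, thetas)][pred] += 1
    default = Counter(cnn_preds).most_common(1)[0][0]
    table = {pat: c.most_common(1)[0][0] for pat, c in votes.items()}
    return lambda row: table.get(quantize(row, thetas), default)


def fidelity(program, rows, cnn_preds):
    """Fraction of samples on which the program agrees with the CNN."""
    return sum(program(r) == p for r, p in zip(rows, cnn_preds)) / len(rows)


def best_thresholds(T, train, train_preds, val, val_preds):
    """Try every combination in T[0] x T[1] x ... and keep the most faithful."""
    best_score, best_thetas = -1.0, None
    for thetas in product(*(T[j] for j in sorted(T))):
        program = fit_lookup_program(train, train_preds, thetas)
        score = fidelity(program, val, val_preds)
        if score > best_score:
            best_score, best_thetas = score, thetas
    return best_score, best_thetas


# Toy data: two filters, candidate thresholds per filter, CNN predictions.
T = {0: [0.2, 0.5], 1: [0.5]}
train = [[0.1, 0.3], [0.25, 0.7], [0.4, 0.3], [0.6, 0.7]]
train_preds = [0, 1, 1, 1]
val = [[0.15, 0.3], [0.3, 0.3]]
val_preds = [0, 1]

ranking = rank_filters(T)
score, thetas = best_thresholds(T, train, train_preds, val, val_preds)
print(ranking, score, thetas)
```

An exhaustive product is only practical when the lists are short; with many filters, one option (an assumption, not stated in the text) is to restrict the search to the top-ranked filters and keep a fixed threshold for the rest.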
In a medical scenario the input images might be medical images, for example one or more of X-rays, CT-scans, etc., with the CNN being trained to confirm the presence of one or more features in an input image. On the basis of this information, combined with the explanation as to why the CNN made that determination, a medical expert might be able to confirm the presence of an abnormality, make a diagnosis or determine a course of treatment. For example, if the input images are X-rays of lungs, the y labels in
Reaching high levels of fidelity while maintaining interpretability is very beneficial for using highly accurate black-box models in critical domains such as:
In autonomous driving there is a great need for interpretable models because deep neural networks may be responsible for steering, braking, stopping and accelerating the vehicle, and for other actions, based on input images from the environment. Rule extraction algorithms may be used to explain the decisions of black-box models in a human-interpretable way in order to build trust in these models. The proposed method finds more appropriate threshold values that boost rule extraction fidelity, meaning that the extracted program more faithfully represents the black-box model.
Moreover, if the model relied on spurious correlations and biases during training to reach high accuracy, then the extracted program may cause the user to ignore the model’s prediction (assuming that the extracted program has high enough fidelity) or try to debug it. Afterwards, the model may be re-trained to mitigate these errors and biases.
Extracted rules may also be useful for audit purposes, where an insurance agent might be interested in knowing why a car behaved incorrectly, what caused a crash and who should take responsibility. Furthermore, extracted rules may be useful for taxi, haulage and other transport companies auditing their own cars/employees, for example if a customer or witness makes a complaint that is not serious enough for an insurance or legal claim, but is still serious enough to harm a company’s reputation.
Hence, logic programs with high fidelity are highly desirable, since otherwise the logic programs may not faithfully represent the black-box model or explain its decisions in a human-interpretable manner. The proposed quantization method helps in that direction by choosing more appropriate threshold values to create faithful logic programs.
In healthcare, state-of-the-art deep neural networks may reach high accuracy. However, their predictions may not be trusted by doctors, especially in life-critical situations, if it is not clear what drives their decisions. Rule extraction algorithms may approximate these models in a human-interpretable way and assist doctors in understanding the predictions of black-box models. Since the fidelity of the extracted rules measures how well the output logic program approximates the black-box model, reaching high fidelity is key to understanding those models, and the proposed method assists in this direction. As in autonomous driving, these rules may be used to indicate bugs or errors of the model to the doctor, who may then either ignore the model’s decisions or cause the model to be retrained to correct them.
Highly faithful rule extraction algorithms may also be very beneficial in finance applications, for example to explain the decisions of black-box models that are used to make automated loan decisions and/or to predict whether a person will default, and in criminal justice applications, for example where a neural network is used to recommend a sentence to a judge who makes the final decision and who wishes to understand how the sentence is derived from facts about the crime and the defendant.
The computing device 10 comprises a processor 993 and memory 994. Optionally, the computing device also includes a network interface 997 for communication with other such computing devices, for example with other computing devices of invention embodiments. Optionally, the computing device also includes one or more input mechanisms such as keyboard and mouse 996, and a display unit such as one or more monitors 995. The components are connectable to one another via a bus 992.
The memory 994 may include a computer readable medium, which term may refer to a single medium or multiple media (e.g., a centralized or distributed database and/or associated caches and servers) configured to carry computer-executable instructions. Computer-executable instructions may include, for example, instructions and data accessible by and causing a general purpose computer, special purpose computer, or special purpose processing device (e.g., one or more processors) to perform one or more functions or operations. For example, the computer-executable instructions may include those instructions for implementing the method of
The processor 993 is configured to control the computing device and execute processing operations, for example executing computer program code stored in the memory 994 to implement some or all of the method steps described herein. For example, the code may implement only step 2 or step 3 or steps 5 to 12 of the method of
The display unit 995 may display a representation of data stored by the computing device, such as images input into the CNN, the rules extracted from the CNN, and/or any other input/output described above, and may also display a cursor and dialog boxes and screens enabling interaction between a user and the programs and data stored on the computing device. The input mechanisms 996 may enable a user to input data and instructions to the computing device, such as selecting a rule extraction algorithm to be used, a criterion to be optimized, and/or a dataset to be used for training.
The network interface (network I/F) 997 may be connected to a network, such as the Internet, and is connectable to other such computing devices via the network. The network I/F 997 may control data input/output from/to other apparatus via the network. Other peripheral devices such as a microphone, speakers, printer, power supply unit, fan, case, scanner, trackerball, etc. may be included in the computing device.
Methods embodying the present invention may be carried out on a computing device/apparatus 10 such as that illustrated in
A method embodying the present invention may be carried out by a plurality of computing devices operating in cooperation with one another. One or more of the plurality of computing devices may be a data storage server storing at least a portion of the data.
The invention can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. The invention can be implemented as a computer program or computer program product, i.e., a computer program tangibly embodied in a non-transitory information carrier, e.g., in a machine-readable storage device, or in a propagated signal, for execution by, or to control the operation of, one or more hardware modules.
A computer program can be in the form of a stand-alone program, a computer program portion or more than one computer program and can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a data processing environment. A computer program can be deployed to be executed on one module or on multiple modules at one site or distributed across multiple sites and interconnected by a communication network.
Method steps of the invention can be performed by one or more programmable processors executing a computer program to perform functions of the invention by operating on input data and generating output. Apparatus of the invention can be implemented as programmed hardware or as special purpose logic circuitry, including e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions coupled to one or more memory devices for storing instructions and data.
The above-described embodiments of the present invention may advantageously be used independently of any other of the embodiments or in any feasible combination with one or more others of the embodiments.
where W(l) and B(l) are the weights and biases for the l-th layer, σ is the non-linearity (ReLU, tanh, sigmoid, etc.), |·| stands for the element-wise absolute value and
is the output activation of previous layer for sample
denotes the activation of the j-th neuron for the i-th sample at the l-th layer, i.e. the j-th component on
. In the case of ReLU there is no need to take absolute values since the output is already non-negative. Note that in this definition the activation of a filter is non-negative, which avoids having to distinguish between filters that are positively and negatively correlated with the input image. The sign information may be stored in a separate variable
which is the sign of the j-th neuron activation (before taking absolute values).
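As a minimal sketch of this definition for a single dense layer and a single sample, the snippet below applies the non-linearity, stores the sign of each neuron output and keeps the absolute value as the activation. The function name, the choice of tanh and the toy weights are assumptions made for illustration.

```python
import math


def dense_activation(weights, biases, prev, sigma=math.tanh):
    """Return (magnitudes, signs) of sigma(W @ prev + b) for one sample.

    weights is a list of rows (one per neuron), biases has one entry per
    neuron, and prev is the output activation of the previous layer.
    """
    magnitudes, signs = [], []
    for w_row, b in zip(weights, biases):
        z = sum(w * a for w, a in zip(w_row, prev)) + b
        out = sigma(z)
        magnitudes.append(abs(out))          # non-negative activation
        signs.append((out > 0) - (out < 0))  # -1, 0 or +1, stored separately
    return magnitudes, signs


# Toy layer with two neurons and a two-dimensional input.
weights = [[1.0, -2.0], [0.5, 0.5]]
biases = [0.0, 0.0]
prev = [1.0, 1.0]
magnitudes, signs = dense_activation(weights, biases, prev)
print(magnitudes, signs)
```

With a ReLU non-linearity the sign list would contain only 0 and +1, which matches the remark that no absolute value is needed in that case.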
Activation
of filter
for the i-th image in the batch at layer l is defined as the spatial average of activations of
(after non-linearity and maxpooling if present)
where (·)rs denotes the entry at spatial coordinates (r, s). The activation of a feature map is defined here as the average of its activations, but the definition may be naturally extended to other metrics. For example, the activation may instead be defined in terms of the Lp or Lp,q norm of each feature map.
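The aggregation of a feature map into a single filter activation may be sketched as follows. The function names and the nested-list representation of a feature map are assumptions for the example; the Lp option illustrates the alternative metric mentioned above.

```python
def filter_activation(feature_map, p=None):
    """Aggregate one H x W feature map into a single activation value.

    p=None gives the spatial average over all (r, s) positions; a number
    p >= 1 gives the Lp norm of the feature map instead.
    """
    values = [v for row in feature_map for v in row]
    if p is None:
        return sum(values) / len(values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def image_activations(feature_maps, p=None):
    """Return one activation per filter for a single image."""
    return [filter_activation(fm, p) for fm in feature_maps]


# Toy example: two filters, each with a 2 x 2 feature map
# (values taken after the non-linearity, so they are non-negative).
feature_maps = [
    [[0.0, 1.0], [2.0, 1.0]],
    [[3.0, 0.0], [0.0, 4.0]],
]
mean_acts = image_activations(feature_maps)
l2_acts = image_activations(feature_maps, p=2)
print(mean_acts, l2_acts)
```

Applied to the 512 feature maps of a layer such as conv13, the same aggregation would yield one 512-dimensional activation array per image.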
Number | Date | Country | Kind |
---|---|---|---|
22386002.4 | Jan 2022 | EP | regional |