The present invention relates to: methods for identifying whether a classification system is configured to use a specific machine-learning classification model; methods for generating a test set for use in such a method for identifying whether a classification system is using a specific machine-learning classification model; and apparatus and computer programs for carrying out such methods.
Artificial intelligence (AI) or machine learning (ML) techniques are being used in ever more deployment scenarios, particularly as the processing capabilities of devices/systems continue to improve. These techniques involve creating and training a model - the trained model may then be used for whatever purpose the AI/ML is intended. The training often uses a substantial amount of training data (or training samples), which can be costly and time-consuming to acquire and maintain, and may be commercially sensitive. The training process itself can take a substantial amount of time to complete. As is well-known, there are many different types of ML algorithm for creating/building a model - these different types of ML algorithm result in differently structured models that are trained and that operate in different ways, and that have different representations. For example, ML algorithms such as artificial neural networks, decision trees, support vector machines, etc. are all well-known, and may be implemented in many different ways - the trained models implemented/generated by such techniques can be used to generate an output (or a prediction or a decision) based on an input, the model having “learned” how to do this from the set of training data/samples. As ML algorithms, their models, representations, implementations and training are very well-known, they shall not be described in more detail herein except where necessary for understanding. More information on ML and ML models can be found at https://en.wikipedia.org/wiki/Machine_learning#Models, the entire contents of which are incorporated herein by reference.
Once deployed, the trained ML model may be susceptible to unauthorized use or modification. For example, an entity may have licensed some specific use of the trained ML model, but may then use it in an unauthorized manner (e.g. outside the boundaries of permitted use specified by the licence). Similarly, an entity may acquire the trained ML model via an unauthorized channel, and may therefore benefit from the use of the trained ML model without having to have expended the time and cost for: acquiring training data; selecting/configuring/designing the architecture for the ML algorithm and model; and performing training and testing to obtain the resultant model. It would, therefore, be desirable to be able to confirm the identity of an ML model, e.g. so that the originator/generator of the trained ML model can check whether an entity may be using their trained ML model in an unauthorized manner, i.e. whether a “suspect” trained ML model that an entity is using is actually the trained ML model developed by the originator/generator. Indeed, it would be desirable to be able to do this in a manner where access to the “suspect” ML model is restricted, in the sense that internal weights/values/results etc. of the “suspect” ML model may not be accessible but only the resulting outputs of the “suspect” ML model are accessible, e.g. if the deployed “suspect” ML model is operating behind an interface, such as a webpage, and the only interaction with the “suspect” ML model is via the interface (e.g. by providing an input and receiving a final result from the “suspect” ML model without being able to access intermediate weights/values/results).
Similarly, once deployed, it is possible that a modification may be made to an ML model. Such modifications could be made inadvertently, e.g. due to an error/malfunction of software, hardware, memory, etc. Alternatively, such modifications could be malicious (e.g. an attacker deliberately attempting to change the ML model so as to induce erroneous results). Such modifications could have serious consequences - e.g. an ML model may have been trained to identify road signs, with a view to guiding an autonomous vehicle, and a modification to such a trained ML model could result in incorrect vehicle operation, potentially leading to an accident. It would, therefore, be desirable to be able to verify that the ML model that is being used in a particular deployment is, indeed, the intended/desired ML model or, put another way, it would be desirable to be able to verify the integrity of a deployed ML model. Again, depending on the deployment scenario, such verification may, or may not, have access to intermediate weights/values/results of the deployed ML model.
According to a first aspect of the invention, there is provided a method for identifying whether a classification system is configured to use a specific machine-learning classification model, the method comprising: using the classification system to generate, for each test sample in a predetermined test set that comprises a plurality of test samples, a corresponding classification result; and identifying either (i) that the classification system is using the specific machine-learning classification model if, for each test sample in the test set, the corresponding classification result matches a classification result produced for that test sample using the specific machine-learning classification model or (ii) that the classification system is not using the specific machine-learning classification model if there is a test sample in the test set for which the corresponding classification result does not match the classification result produced for that test sample using the specific machine-learning classification model; wherein the test set is associated with the specific machine-learning classification model and, for each test sample in the test set, there is a corresponding small modification for that test sample that causes a change in the classification result produced for that test sample using the specific machine-learning classification model.
According to a second aspect of the invention, there is provided a method of generating a test set for use in the method for identifying whether a classification system is using a specific machine-learning classification model according to the above-mentioned first aspect, the test set associated with the specific machine-learning classification model, wherein the test set comprises a plurality of test samples and, for each test sample in the test set, there is a corresponding small modification for that test sample that causes a change in the classification result produced for that test sample using the specific machine-learning classification model, wherein the method comprises: obtaining a first set that comprises a plurality of candidate samples applicable to the specific machine-learning classification model; and updating the first set, said updating comprising, for each candidate sample, performing a corresponding sequence of one or more update steps, wherein each update step comprises: generating a second candidate sample based on said candidate sample; generating, for each of said candidate sample and the second candidate sample, a corresponding classification measure using the specific machine-learning classification model; and assessing the generated classification measures, wherein said assessing comprises: based on a comparison of the generated classification measures, performing one or more of: (a) terminating the sequence of one or more update steps; (b) setting said candidate sample to be the second candidate sample if the comparison indicates that the second candidate sample is more likely than the said candidate sample to have a corresponding small modification that causes a change in classification result produced using the specific machine-learning classification model; wherein the test set comprises some or all of the updated first set.
In some embodiments of the second aspect, said generating a second candidate sample based on said candidate sample comprises generating the second candidate sample by performing a random change to said candidate sample.
In some embodiments of the second aspect, for each candidate sample, for each update step of a first subsequence of the corresponding sequence of one or more update steps: for each of said candidate sample and the second candidate sample, the corresponding classification measure is a score generated by: using the specific machine-learning classification model to generate a corresponding plurality of values, each value indicative of that sample belonging to a corresponding class; and using a score function to generate the score for that sample based on the corresponding plurality of values, the score indicative of a likelihood that there is a small modification for that sample that causes a change in the classification result produced for that sample using the specific machine-learning classification model; and assessing the generated classification measures comprises: if the classification measure for the second candidate sample is indicative of a higher likelihood than the classification measure for said candidate sample, setting said candidate sample to be the second candidate sample. In some such embodiments, for each candidate sample, for each update step of a second subsequence of the corresponding sequence of one or more update steps after the first subsequence: for each of said candidate sample and the second candidate sample, the corresponding classification measure is an identification of the class for that sample generated using the specific machine-learning classification model; and assessing the generated classification measures comprises: if the classification measure for the second candidate sample is the same as the classification measure for said candidate sample, terminating the second subsequence; if the classification measure for the second candidate sample is not the same as the classification measure for said candidate sample: for each of said candidate sample and the second candidate sample: using the specific machine-learning classification model to generate a corresponding plurality of values, each value indicative of that sample belonging to a corresponding class; and using a score function to generate the score for that sample based on the corresponding plurality of values, the score indicative of a likelihood that there is a small modification for that sample that causes a change in the classification result produced for that sample using the specific machine-learning classification model; and if the score for the second candidate sample is indicative of a higher likelihood than the score for said candidate sample, setting said candidate sample to be the second candidate sample.
In some embodiments of the second aspect, each value represents a probability that the corresponding sample belongs to the corresponding class. In some such embodiments, for each of said candidate sample and the second candidate sample, the corresponding plurality of values are normalized to have a predetermined total.
In some embodiments of the second aspect, for each candidate sample, for each update step of a first subsequence of the corresponding sequence of one or more update steps: for each of said candidate sample and the second candidate sample, the corresponding classification measure is an identification of class for that sample generated using the specific machine-learning classification model; assessing the generated classification measures comprises: if the classification measure for the second candidate sample is the same as the classification measure for said candidate sample, terminating the first subsequence if a termination condition is met; if the classification measure for the second candidate sample is not the same as the classification measure for said candidate sample: setting said candidate sample to be the second candidate sample; and reducing the size of the random change to be applied to the candidate sample when generating a second candidate sample at the next update step.
In some embodiments of the first or second aspect, one or more of the test samples in the test set are generated as adversarial examples for the specific machine-learning classification model.
According to a third aspect of the invention, there is provided an apparatus arranged to carry out a method according to the first or second aspect or any embodiment thereof.
According to a fourth aspect of the invention, there is provided a computer program which, when executed by one or more processors, causes the one or more processors to carry out a method according to the first or second aspect or any embodiment thereof. The computer program may be stored on a computer-readable medium.
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
In the description that follows and in the figures, certain embodiments of the invention are described. However, it will be appreciated that the invention is not limited to the embodiments that are described and that some embodiments may not include all of the features that are described below. It will be evident, however, that various modifications and changes may be made herein without departing from the broader spirit and scope of the invention as set forth in the appended claims.
The storage medium 104 may be any form of non-volatile data storage device such as one or more of a hard disk drive, a magnetic disc, a solid-state-storage device, an optical disc, a ROM, etc. The storage medium 104 may store an operating system for the processor 108 to execute in order for the computer 102 to function. The storage medium 104 may also store one or more computer programs (or software or instructions or code).
The memory 106 may be any random access memory (storage unit or volatile storage medium) suitable for storing data and/or computer programs (or software or instructions or code).
The processor 108 may be any data processing unit suitable for executing one or more computer programs (such as those stored on the storage medium 104 and/or in the memory 106), some of which may be computer programs according to embodiments of the invention or computer programs that, when executed by the processor 108, cause the processor 108 to carry out a method according to an embodiment of the invention and configure the system 100 to be a system according to an embodiment of the invention. The processor 108 may comprise a single data processing unit or multiple data processing units operating in parallel, separately or in cooperation with each other. The processor 108, in carrying out data processing operations for embodiments of the invention, may store data to and/or read data from the storage medium 104 and/or the memory 106.
The interface 110 may be any unit for providing an interface to a device 122 external to, or removable from, the computer 102. The device 122 may be a data storage device, for example, one or more of an optical disc, a magnetic disc, a solid-state-storage device, etc. The device 122 may have processing capabilities – for example, the device 122 may be a smart card. The interface 110 may therefore access data from, or provide data to, or interface with, the device 122 in accordance with one or more commands that it receives from the processor 108.
The user input interface 114 is arranged to receive input from a user, or operator, of the system 100. The user may provide this input via one or more input devices of the system 100, such as a mouse (or other pointing device) 126 and/or a keyboard 124, that are connected to, or in communication with, the user input interface 114. However, it will be appreciated that the user may provide input to the computer 102 via one or more additional or alternative input devices (such as a touch screen). The computer 102 may store the input received from the input devices via the user input interface 114 in the memory 106 for the processor 108 to subsequently access and process, or may pass it straight to the processor 108, so that the processor 108 can respond to the user input accordingly.
The user output interface 112 is arranged to provide a graphical/visual and/or audio output to a user, or operator, of the system 100. As such, the processor 108 may be arranged to instruct the user output interface 112 to form an image/video signal representing a desired graphical output, and to provide this signal to a monitor (or screen or display unit) 120 of the system 100 that is connected to the user output interface 112. Additionally or alternatively, the processor 108 may be arranged to instruct the user output interface 112 to form an audio signal representing a desired audio output, and to provide this signal to one or more speakers 121 of the system 100 that are connected to the user output interface 112.
Finally, the network interface 116 provides functionality for the computer 102 to download data from and/or upload data to one or more data communication networks.
It will be appreciated that the architecture of the system 100 illustrated in
The classification system 200 comprises an input (or interface) 202 for receiving (or obtaining) an input sample 208 (or an amount of input data). The classification system 200 uses, or executes, an ML classification model (or a trained ML classification algorithm) 204 to perform classification on the input sample 208. For example, the ML classification model 204 may be arranged to identify that the input sample 208 corresponds to one class Cj from a plurality (or set) of NC known classes or categories {Ck : k = 1, ... , NC}. Alternatively, the ML classification model 204 may be arranged to identify, for each class Cj of a plurality (or set) of NC known classes {Ck : k = 1, ... , NC}, a corresponding weight or value representing a likelihood or probability that the input sample 208 belongs to that class Cj. The ML classification system 200 comprises an output (or interface) 206 for providing a result (or output) 210 for the classification performed for the input sample 208 using the ML classification model 204. For example, the result 210 may comprise an indication of a class Cj to which, according to the ML classification model 204, the input sample 208 belongs or corresponds. Such an indication may be accompanied by an indication of confidence, or probability, that the input sample 208 does indeed belong to that class Cj.
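By way of a purely illustrative sketch, the two forms of output described above (a single identified class, or a per-class likelihood value) could be modelled as the following interface; the names ClassificationModel, class_probabilities and classify are hypothetical placeholders rather than part of any particular classification system 200:

```python
from abc import ABC, abstractmethod
from typing import Sequence


class ClassificationModel(ABC):
    """Illustrative interface for the ML classification model 204 (names are hypothetical)."""

    @abstractmethod
    def class_probabilities(self, sample: Sequence[float]) -> list[float]:
        """Return, for each of the NC known classes, a value indicating the
        likelihood that the input sample 208 belongs to that class."""

    def classify(self, sample: Sequence[float]) -> int:
        """Return the index of the class to which the sample most likely belongs."""
        probs = self.class_probabilities(sample)
        return max(range(len(probs)), key=probs.__getitem__)
```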
The classification system 200 may, of course, be a component of (or a sub-system of) a larger system – the larger system may be arranged to provide the input sample 208 to the classification system 200 and use the result 210 generated by the classification system 200. For example: the larger system could be an engine; the input sample 208 may comprise measurements from sensors around/within the engine; the classification system 200 may be arranged to use its ML classification model 204 to classify the input sample 208 as indicating either a normal status for the engine or a malfunctioning status for the engine; if the result 210 provided by the classification system 200 to the engine indicates that the input sample 208 corresponds to a malfunctioning status, then the engine may be arranged to shut down or raise an alarm.
The input interface 202 may be the same as, or may be different from, the output interface 206. The input interface 202 and/or the output interface 206 may, for example, comprise one or more webpages or user interfaces. Additionally or alternatively, the input interface 202 and/or the output interface 206 may simply be arranged, respectively, to access the input sample 208 from, or store the result 210 to, a storage device or medium of the classification system 200. It will be appreciated that the input interface 202 and the output interface 206 may take other forms, depending on the specific deployment scenario for the classification system 200.
Of course, the nature of the input sample 208 and the classes {Ck : k = 1, ... , NC} may vary widely depending on the application/use of the classification system 200. For example, the classification system 200 may be part of an autonomous vehicle - the input sample 208 may comprise one or more images captured by one or more cameras of the vehicle, and the ML classification model 204 may be arranged to classify the input sample 208 with a view to the result 210 being used to help control the driving of the vehicle. The classes {Ck : k = 1, ... , NC} may then correspond to: images that contain or depict a road sign; images that contain or depict an obstacle (e.g. another vehicle or a pedestrian etc.); and so on. Similarly, the classification system 200 may be arranged to identify a language being spoken in an input audio sample 208 (e.g. audio captured by a microphone), with a view to the result 210 identifying one or more likely languages being spoken within that input audio sample 208. The classes {Ck : k = 1, ... , NC} may then correspond to the different languages being spoken. As another example, the classification system 200 may be arranged to identify whether or not an email input sample 208 is spam. The classes {Ck : k = 1, ... , NC} may then correspond to spam and non-spam emails.
As classification systems are well-known, they shall not be described in more detail herein. Further information on ML classification can be found, for example, at https://en.wikipedia.org/wiki/Statistical_classification, the entire contents of which are incorporated herein by reference.
A modification to the ML classification model 204 would change at least some of the boundaries between some of the classes {Ck : k = 1, ... , NC}. Put another way, a modification to the ML classification model 204 would change the result 210 for at least some samples 208 of the sample space 250. Likewise, a different ML classification model, even if it used the same classes {Ck : k = 1, ... , NC}, would have different boundaries between those classes in the sample space 250 and therefore would, for at least some samples 208 of the sample space 250, produce a different result 210. Indeed, even two ML classification models, trained on the same training data using the same underlying ML algorithm, may end up being different from each other, producing a different result 210 for one or more input samples 208.
As can be seen, some input samples 208 (represented, for example, by points 252 in
Now, one way to identify whether a “suspect” ML classification model corresponds to a specific (or particular/target) ML classification model would be to obtain, for each of a plurality of test samples 208, a first classification result 210 using the “suspect” ML classification model and a second classification result 210 using the specific ML classification model. If, for any of the test samples 208, the corresponding first and second classification results differ, then the “suspect” ML classification model does not match the specific ML classification model. On the other hand, if, for each of the plurality of test samples 208, the corresponding first and second classification results are the same, then one may conclude/infer that the “suspect” ML classification model matches the specific ML classification model – however, without testing all possible input samples 208, this conclusion cannot be ensured. As discussed above, and as can be seen from
The classification system 310 may be a classification system 200 as discussed above with reference to
In summary, the test system 330 is arranged to try to identify whether the ML classification model 312 being used by the classification system 310 is a specific ML classification model 322. As discussed above, this may be achieved by the test system 330 supplying a plurality, or a set, of test samples to the classification system 310 and obtaining corresponding first classification results from the classification system 310 based on the classification model 312 used by the classification system 310. The test system 330 may, for example, provide the test samples to the classification system 310, and receive the corresponding first classification results back from the classification system 310, via the network 350. The test system 330 may compare these first classification results with second classification results for the set of test samples, where the second classification results are, or have been, generated using the specific ML classification model 322. Based on this comparison, the test system 330 may decide/determine whether or not the ML classification model 312 being used by the classification system 310 is the specific ML classification model 322. The purpose of the test set creation system 320 is to generate the set of test samples corresponding to the specific ML classification model 322 based on the principle of “sensitive samples” discussed above with reference to
The test set creation system 320 and the test system 330 may be implemented using one or more computer systems 100 as discussed with reference to
The test set creation system 320 comprises a test set creation module 326. The test set creation module 326 may be implemented as software executing on, say, a processor 108 of a computer system 100 that is part of (or that forms) the test set creation system 320, or as a hardware module of the test set creation system 320, or as a combination of both software and hardware. In operation, the test set creation module 326 is arranged to use the specific ML classification model 322 to generate a test set 324 that comprises a plurality of test samples – thus, the test set 324 is associated with, or corresponds to and has been generated for, the specific ML classification model 322. Of course, the test samples of the test set 324 are samples that are valid inputs for the specific ML classification model 322 (or, to use the analogy with
The test set creation system 320 may already be in possession of the specific ML classification model 322 – e.g. the test set creation system 320 may have generated (or created) the specific ML classification model 322, or may have previously received/obtained the specific ML classification model 322. Alternatively, the test set creation system 320 may receive/obtain the specific ML classification model 322 from a third party (for example, via the network 302).
The test system 330 comprises a test module 336. The test module 336 may be implemented as software executing on, say, a processor 108 of a computer system 100 that is part of (or that forms) the test system 330, or as a hardware module of the test system 330, or as a combination of both software and hardware. In operation, the test module 336 is arranged to use the test set 324 (generated by the test set creation system 320) that is associated with the specific ML classification model 322, to determine whether the ML classification model 312 used by the classification system 310 is the specific ML classification model 322 to which the test set 324 relates.
In some embodiments, the specific ML classification model 322 is provided to the test set creation system 320 in advance (e.g. as part of a registration process), so that the test set creation system 320 may generate the test set 324 in advance, ready for when it is needed. Thus, the test set 324 may be stored in a storage (not shown in
As discussed above, the test set creation system 320 uses the specific ML classification model 322 to generate the test set 324. The test set creation system 320 may also use the specific ML classification model 322 to generate the classification results (the above-mentioned second classification results) for the test samples in the test set 324. These classification results may then be stored (e.g. in a storage, not shown in
In some embodiments, the test set creation system 320 and/or the test system 330 may have access to “raw” data output/generated by the specific ML classification model 322, e.g. intermediate results/values generated by the specific ML classification model 322 or weights used/generated by the specific ML classification model 322. In other embodiments, the test set creation system 320 and/or the test system 330 does not have access to such “raw” data output/generated by the specific ML classification model 322, i.e. the test set creation system 320 and/or the test system 330 may only have access to a final classification result 210 generated as an output using the specific ML classification model 322. For example, a first entity that created/generated the specific ML classification model 322 may engage a second entity that is operating the system 340 to determine whether or not the ML classification model 312 being used by the classification system 310 is their specific ML classification model 322 – however, the first entity may not wish to provide the second entity with full access to/details of their specific ML classification model 322 and may, instead, only provide the second entity with access to an input interface 202 and an output interface 206 of a classification system 200 that uses their specific ML classification model 322, so that the second entity may generate classification results 210 for test samples 208 using the specific ML classification model 322.
The network 350 may be any data communication network via which the test system 330 may communicate with the classification system 310. For example, the network 350 may comprise one or more of: a local area network, a wide area network, a metropolitan area network, the Internet, a wireless communication network, a wired or cable communication network, etc.
At a step 402, the test set creation system 320 (or the test set creation module 326 thereof) generates the test set 324. As discussed above, the test set 324 is associated with, or corresponds to, the specific ML classification model 322 – therefore, the test set 324, once generated, is predetermined for the specific ML classification model 322. The test set 324 comprises a plurality of test samples. The test samples are “sensitive” samples, in that, for each test sample in the test set 324, there is a corresponding small modification for that test sample that causes a change in the classification result produced for that test sample using the specific ML classification model 322. Methods of generating the test set 324 shall be described shortly. Let there be NS test samples in the test set 324 (for some integer NS > 1), with the test samples being Tj for j = 1, ..., NS.
At a step 404, the test system 330 (or the test module 336 thereof) uses the classification system 310 to generate, for each test sample in the test set 324, a corresponding first classification result. Thus, the first classification results are classification results generated using the ML classification model 312 for each of the test samples in the test set 324. Thus, for each test sample Tj for j = 1, ..., NS, a corresponding first classification result R1,j for j = 1, ..., NS is generated according to the ML classification model 312.
At a step 406, for each test sample in the test set 324, a corresponding second classification result is produced, or generated, for that test sample using the specific ML classification model 322. As discussed above, the step 406 may be performed by the test set creation system 320 (e.g. by the test set creation module 326 thereof) or by the test system 330 (e.g. by the test module 336 thereof). Thus, for each test sample Tj for j = 1, ..., NS, a corresponding second classification result R2,j for j = 1, ..., NS is generated according to the specific ML classification model 322.
At a step 408, the test system 330 (or the test module 336 thereof) determines whether the first and second classification results match for the test samples.
If, at the step 408, it is determined that the first classification result matches the second classification result for each test sample in the test set 324 (i.e. that R1,j = R2,j for j = 1, ... ,NS), then the test system 330 identifies, and therefore generates a conclusion/decision 410, that the classification system 310 is using the specific ML classification model 322 (i.e. that the ML classification model 312 matches, or equals, the specific ML classification model 322). Appropriate measures may then be taken as necessary, for example: (a) if the purpose of the check/investigation of the classification system 310 is to check the integrity of the ML classification model 312, then it may be concluded that the integrity of the ML classification model 312 has not been compromised and a log of this may be made accordingly; (b) if the purpose of the check/investigation of the classification system 310 is to check whether the classification system 310 is using the specific ML classification model 322 in an unauthorized manner, then measures may be taken to try to stop the classification system 310 from operating or, at least, from using the specific ML classification model 322.
If, on the other hand, at the step 408 it is determined that there is a test sample in the test set 324 for which the first classification result does not match the second classification result (i.e. that there exists some k ∈ {1, ... , NS} for which R1,k ≠ R2,k), then the test system 330 identifies, and therefore generates a conclusion/decision 412, that the classification system 310 is not using the specific ML classification model 322 (i.e. that the ML classification model 312 does not match, or equal, the specific ML classification model 322). Appropriate measures may then be taken as necessary, for example: (a) if the purpose of the check/investigation of the classification system 310 is to check the integrity of the ML classification model 312, then it may be concluded that the integrity of the ML classification model 312 has been compromised and a suitable action may be taken, such as preventing the classification system 310 from using the compromised model, raising an alert, etc.; (b) if the purpose of the check/investigation of the classification system 310 is to check whether the classification system 310 is using the specific ML classification model 322 in an unauthorized manner, then no measures need to be taken at this stage.
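A minimal sketch of the steps 404-408 is given below, assuming that the classification system 310 and the specific ML classification model 322 are each exposed as a callable mapping a test sample to its final classification result; the names identify_model, suspect_classify and specific_classify are illustrative only:

```python
def identify_model(test_set, suspect_classify, specific_classify):
    """Sketch of the method 400 (steps 404-408): compare, for each test sample,
    the classification result generated using the classification system 310
    with the result produced using the specific ML classification model 322."""
    for sample in test_set:
        r1 = suspect_classify(sample)   # first classification result (model 312)
        r2 = specific_classify(sample)  # second classification result (model 322)
        if r1 != r2:
            return False  # conclusion 412: not using the specific model
    return True           # conclusion 410: using the specific model
```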
It will be appreciated that the test set creation system 320 may generate a test set 324 that has more test samples than the test system 330 actually needs to use at the steps 406 and 408, in which case the test system 330 may use a subset of the test set 324 (although this may still be viewed as the test set creation system 320 having generated that subset).
The method 400 could be used without the specifically generated test set 324 but instead with another test set of “regular” test samples that have not been generated so as to be “sensitive” (e.g. just random samples from the sample space 250). However, by using the specifically generated test set 324 of “sensitive” samples, embodiments of the invention enable more accurate testing, using fewer test samples to provide a more confident test result. Example experimental results for this are set out later.
The method 500 begins with a step 502, at which the test set creation system 320 obtains a first set 510 that comprises a plurality of candidate samples. As shown in
The value of NB may be predetermined. NB may, for example, be of the order of around 10 to 20 although, of course, the higher the value of NB, the more accurate the analysis and the conclusions of the method 400 will be. Example experimental results for this are set out later, illustrating the improvements available as NB varies.
The initial candidate samples Bj (j = 1, ... , NB) may be randomly generated samples from the sample space 250 for the specific ML classification model 322 – for example, with reference to the example of
At a step 504, the test set creation system 320 updates the first set 510. This involves, for each candidate sample Bj (j = 1, ... , NB), performing a corresponding sequence 520j of one or more update steps. The procedure to implement the sequence of one or more update steps may be the same for each candidate sample Bj (j = 1, ... , NB), but the actual sequence of one or more update steps may vary from one candidate sample to another (e.g. if the sequence of one or more update steps 520j for the candidate sample Bj (j = 1, ... , NB) depends on the value(s) assumed by the candidate sample Bj). In some embodiments, however, the procedure to implement the sequence of one or more update steps may be different for two or more of the candidate samples Bj (j = 1, ... ,NB). The nature of the sequence 520j of one or more update steps shall be described shortly with reference to
As shown in the accompanying figure, the updating of the step 504 results in a second set 530 that comprises a plurality of updated candidate samples B*j (j = 1, ..., N*B) for an integer N*B ≥ NB. The second set 530 is, therefore, an updated version of the first set 510. In some embodiments, performing the sequence 520j of one or more update steps for the candidate sample Bj (j = 1, ... , NB) results in just one corresponding updated candidate sample B*j (so that N*B = NB), but this need not always be the case. The test set 324 comprises some or all of the updated first set 530. In some embodiments, the test set 324 comprises all of the updated candidate samples B*j (j = 1, ..., N*B); in some embodiments, the test set 324 comprises a subset (selected according to one or more selection criteria) of the updated candidate samples B*j (j = 1, ..., N*B).
The purpose of the sequence 520j of one or more update steps for the candidate sample Bj (j = 1, ..., NB) is to refine, or attempt to optimize or improve, the candidate sample Bj so as to arrive at one or more test samples for the test set 324 that are more “sensitive” than the candidate sample Bj (or that at least, on the evidence available, appear to be more “sensitive” than the candidate sample Bj).
The sequence 520j of one or more update steps for the candidate sample Bj (j = 1, ..., NB) may be implemented in a variety of ways, examples of which are set out below. As mentioned above, in some embodiments, the test set creation system 320 and/or the test system 330 may have access to “raw” data output/generated by the specific ML classification model 322, e.g. intermediate results/values generated by the specific ML classification model 322 or weights used/generated by the specific ML classification model 322. Such “raw” data may be used as part of the sequence 520j of one or more update steps. With such embodiments, any statistical method that can optimize an objective function (such as particle swarm optimization, genetic algorithms, simulated annealing, stochastic gradient descent, etc.) could be used. However, in other embodiments, the test set creation system 320 and/or the test system 330 does not have access to such “raw” data output/generated by the specific ML classification model 322, i.e. the test set creation system 320 and/or the test system 330 may only have access to a final classification result 210 generated as an output using the specific ML classification model 322. With such embodiments, any searching/refining technique (such as a binary search) could be used. Specific examples are set out later.
At the step 602, the test set creation system 320 generates a second candidate sample Qj based on the current candidate sample Pj. The second candidate sample Qj is a sample on which the specific ML classification model 322 is intended to operate/process (i.e. is a valid sample for the specific ML classification model 322) or, by analogy with
In some embodiments, the test set creation system 320 generates the second candidate sample Qj by performing a random change to the current candidate sample Pj. The random change may be implemented, for example, by performing a random modification or perturbation to one or more components of the current candidate sample Pj – for example, if the current candidate sample Pj is a vector or collection of values, then one or more of those values could be randomly perturbed. It will, of course, be appreciated that such random changes could be implemented in a variety of ways, depending on the nature of the samples themselves. In some embodiments, the size of the random change may be at most a predetermined threshold, e.g. one or more thresholds may be placed on the random change so as to impose a limit on the distance between the current candidate sample Pj and the second candidate sample Qj in the sample space 250.
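As a purely illustrative sketch of such a random change (assuming samples that are vectors of numerical values, and a hypothetical per-component threshold max_step that caps the size of the change):

```python
import random


def perturb(sample, max_step):
    """Sketch: generate a second candidate sample Qj by randomly perturbing the
    components of the current candidate sample Pj, with the size of the change
    capped at max_step per component (an assumed threshold)."""
    return [value + random.uniform(-max_step, max_step) for value in sample]
```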
As shall be discussed later, however, in some embodiments the test set creation system 320 generates the second candidate sample Qj based on both the current candidate sample Pj and some other information (such as another point in the sample space 250, e.g. as a midpoint between the current candidate sample Pj and this other point). Hence, the generation of the second candidate sample Qj based on the current candidate sample Pj may be more deterministic.
At the step 604b, the test set creation system 320 generates, for the second candidate sample Qj, a corresponding classification measure using the specific ML classification model 322. In some embodiments, the classification measure may simply be an indication of the particular class Cj from the plurality of NC known classes {Ck : k = 1, ..., NC} to which, according to the specific ML classification model 322, the second candidate sample Qj belongs. Alternatively, in some embodiments, the test set creation system 320 may be arranged to obtain from the ML classification model 322, for each class Cj of the plurality of NC known classes {Ck : k = 1, ..., NC}, a corresponding weight or value representing a likelihood (or probability or confidence) that the second candidate sample Qj belongs to that class Cj – the classification measure may then be a score generated based on a function (or metric) of those weights or values.
The step 604a is the same as the step 604b, except that it comprises the test set creation system 320 generating, for the current candidate sample Pj, a corresponding classification measure using the specific ML classification model 322. This is performed in the same way as for the second candidate sample Qj. Now, as mentioned above, some update steps may result in updating the candidate sample Pj – thus, in the subsequent update step, the classification measure for this now-updated candidate sample Pj needs to be generated (and therefore the step 604a is performed in the subsequent update step). However, some update steps may not result in updating the candidate sample Pj – thus, in the subsequent update step, the classification measure for the candidate sample Pj is already known and does not need to be regenerated (and therefore the step 604a may be omitted from the subsequent update step).
At the step 606, the test set creation system 320 assesses the classification measure generated for the current candidate sample Pj and the classification measure generated for the second candidate sample Qj. This assessment comprises: comparing, at a step 608, the classification measure generated for the current candidate sample Pj with the classification measure generated for the second candidate sample Qj and, based on this comparison, performing one or more of a step 610 and a step 612, as set out below.
The step 610 terminates the sequence of one or more update steps 520j for the candidate sample Bj (j = 1, ... , NB) if at least one termination condition (or criterion) is met. In some embodiments, at least one termination condition may be based on the comparison of the classification measure generated for the current candidate sample Pj and the classification measure generated for the second candidate sample Qj, e.g. if a difference between the classification measure generated for the current candidate sample Pj and the classification measure generated for the second candidate sample Qj is less than a predetermined threshold (or has been less than this predetermined threshold for a predetermined number of update steps) – i.e. the sequence of one or more update steps 520j may be terminated if the test set creation system 320 assesses that there is unlikely to be further substantial improvement over the current candidate sample Pj. Additionally or alternatively, in some embodiments, at least one of the termination conditions is that the sequence of one or more update steps 520j for the candidate sample Bj (j = 1, ... , NB) is terminated if the current update step is the Vth update step in the sequence for some non-negative integer V – i.e. the sequence of one or more update steps 520j for the candidate sample Bj (j = 1, ..., NB) may be limited to at most V update steps so as to ensure that the process of generating the test set 324 terminates (or at least terminates within a certain period of time).
Assuming that a termination condition is not met at the step 610, processing continues to the next update step, in which the candidate sample to be processed will be either the current (non-updated) candidate sample Pj from the current update step (i.e. if the step 612 did not set Pj to be Qj) or will be the second candidate sample Qj (i.e. if the step 612 did set Pj to be Qj).
Below are some examples of how the sequence of one or more update steps 520j for the candidate sample Bj (j = 1, ... , NB) may be performed. It will, however, be appreciated that other implementations/approaches are possible.
In this example, the second candidate sample Qj is generated at the step 602 by implementing a (possibly random) change to the current candidate sample Pj.
In this example, the specific ML classification model 322 generates, for each class Cj of the plurality of NC known classes {Ck : k = 1, ... , NC}, a corresponding value (or weight) pj representing a likelihood (or probability or confidence) that a given input sample 208 belongs to that class Cj. Thus, the value pj is indicative of that sample belonging to the corresponding class Cj. In this example, it is assumed that the test set creation system 320 has access to these values pk (k = 1, ... , NC), so that the test set creation system 320 can use the specific ML classification model 322 to generate the plurality of values pk (k = 1, ... , NC) for any given input sample 208.
Without loss of generality, we assume in the following that these values pk (k = 1, ... , NC) are normalized to have a predetermined total, i.e. p1 + p2 + ... + pNC = W for some predetermined constant W. Preferably, W = 1, which is what will be assumed in the following, although it will be appreciated that this is merely an example.
Generating the classification measure at the step 604a for the current candidate sample Pj may therefore involve the test set creation system 320 using the specific ML classification model 322 to generate the plurality of values pk (k = 1, ... , NC) for the current candidate sample Pj, and using a score function to generate a score (which will be the classification measure) for that sample based on the plurality of values pk (k = 1, ... , NC). The score function is arranged so that the score is indicative of a likelihood that there is a small modification for that sample that causes a change in the classification result produced for that sample using the specific ML classification model 322. The score function is, therefore, arranged so that the score is indicative of a degree of how “sensitive” the sample is likely to be.
The step 604b may then use the same score function to generate a score for the second candidate sample Qj in the same way as at the step 604a.
The test set creation system 320 may then, as part of assessing the classification measures at the step 606, compare the scores generated at the steps 604a and 604b and, at the step 612, if the classification measure for the second candidate sample Qj is indicative of a higher likelihood than the classification measure for the current candidate sample Pj, the step 612 sets the current candidate sample Pj to be the second candidate sample Qj.
As an example, suppose that there are two classes, i.e. NC = 2, then the score function may be score = 1 - |p1 - p2|. Thus, if the likelihood of a sample belonging to class C1 is similar to the likelihood of a sample belonging to class C2, then the score for that sample will be high; if the likelihood of a sample belonging to class C1 is substantially different from the likelihood of a sample belonging to class C2, then the score for that sample will be low. This may be viewed another way: if the sample is close to the boundary between class C1 and class C2 (so that the sample is “sensitive”) then the values of p1 and p2 are likely to be similar, resulting in a high score; if the sample is far from the boundary between class C1 and class C2 (so that the sample is not “sensitive”) then the values of p1 and p2 are likely to be very different, resulting in a low score.
For example: if p1 = 0.9 and p2 = 0.1, then score = 1 - |0.9 - 0.1| = 0.2, indicating a sample that is unlikely to be “sensitive”; whereas if p1 = 0.55 and p2 = 0.45, then score = 1 - |0.55 - 0.45| = 0.9, indicating a sample that is likely to be “sensitive”.
It will be appreciated that other score functions could be used. For example, with NC = 2, the score function could be score = 1 - (p1 - p2)2.
As a further example, for a sample with normalized values pk (k = 1, ... , NC) such that p1 + p2 + ... + pNC = 1, let µ be the mean of the normalized values pk (k = 1, ... , NC), i.e. µ = (p1 + p2 + ... + pNC)/NC (which, with the above normalization, equals 1/NC). Then the score function used could be score = 1 - ((p1 - µ)2 + (p2 - µ)2 + ... + (pNC - µ)2), so that a sample whose values pk all lie close to the mean (i.e. a sample for which no single class strongly dominates) receives a high score. As a further example, consider a sample with normalized values pk (k = 1, ... , NC) such that p1 + p2 + ... + pNC = 1. Let P = {p1, p2, ... , pNC}, and let pmax and psecond be the largest and second-largest values in P; the score function used could then be score = 1 - (pmax - psecond), so that a small margin between the two most likely classes (indicating proximity to a class boundary) results in a high score.
As another example, given that there are NC known classes {Ck : k = 1, ... ,NC}, a score function may be based on values pk for just a subset of the classes. For example, there are ½NC(NC - 1) unordered pairs of classes – e.g. if NC = 3 then there are 3 unordered pairs of classes, namely (C1, C2), (C1, C3) and (C2, C3). For one such pair of classes, (Ci, Cj), any of the above example score functions could be used for a sample based just on the values pi and pj for those two classes (so as to help update that sample towards a boundary between those two classes). This could be done likewise for samples for some or all of the other pairs of classes. Additionally, this may be done as well as obtaining an updated candidate sample using a score function that operates on more than just two of the values pk (k = 1, ... ,NC).
In some embodiments, the score function is arranged to provide higher scores for samples 208 that have one or more of the following characteristics: (a) a high, but nearly equal probability, of belonging to two or more classes; (b) a low probability of membership of all classes; (c) a low but non-zero probability of membership in a (proper) subset of classes.
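The following sketch illustrates score functions of the kinds discussed above; it assumes the input p is the list of normalized values pk with total 1, and is consistent with the example formulas above rather than being a prescribed implementation:

```python
def score_two_class(p):
    """Score for NC = 2: high when the two class values are similar,
    i.e. when the sample is likely close to the class boundary."""
    p1, p2 = p
    return 1.0 - abs(p1 - p2)


def score_variance(p):
    """Score based on the spread of the normalized values around their mean
    (one of the further examples above): a low spread gives a high score."""
    mu = sum(p) / len(p)
    return 1.0 - sum((pk - mu) ** 2 for pk in p)


def score_margin(p):
    """Score based on the margin between the two largest values: a small
    margin between the two most likely classes gives a high score."""
    top_two = sorted(p, reverse=True)[:2]
    return 1.0 - (top_two[0] - top_two[1])
```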
Thus, in this example, the method for generating the test set 324 may involve: (1) obtaining the first set 510 of candidate samples Bj (j = 1, ..., NB); and (2) for each candidate sample Bj, performing a sequence of update steps in which: (a) a second candidate sample Qj is generated by performing a (possibly random) change to the current candidate sample Pj; (b) the specific ML classification model 322 is used to generate the values pk (k = 1, ..., NC) for each of the current candidate sample Pj and the second candidate sample Qj; (c) the score function is used to generate a score for each of the current candidate sample Pj and the second candidate sample Qj based on those values; (d) if the score for the second candidate sample Qj is indicative of a higher likelihood than the score for the current candidate sample Pj, the current candidate sample Pj is set to be the second candidate sample Qj; and (e) steps (a)-(d) are repeated until a termination condition is met (e.g. a maximum number of update steps has been performed).
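A minimal sketch of this Example 1 procedure is given below, assuming model is a callable returning the normalized values pk for a sample, score_fn is one of the score functions sketched above, perturb is a one-argument callable implementing the (possibly random) change, and the termination condition is simply a maximum number of update steps:

```python
def refine_example1(b, model, score_fn, perturb, max_steps=100):
    """Sketch of Example 1: hill-climb a candidate sample towards a class
    boundary by keeping whichever of Pj/Qj the score function rates as more
    likely to be "sensitive"."""
    p_j = b
    p_score = score_fn(model(p_j))
    for _ in range(max_steps):       # termination: at most max_steps update steps
        q_j = perturb(p_j)           # step 602: generate second candidate sample Qj
        q_score = score_fn(model(q_j))
        if q_score > p_score:        # step 612: Qj more likely to be sensitive
            p_j, p_score = q_j, q_score
    return p_j
```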
Example 2 builds on Example 1. In particular, the initial techniques of Example 1 may be viewed as steps for a first subsequence of the sequence of one or more update steps 520j for the candidate sample Bj (j = 1, ... , NB) (i.e. a first phase), with Example 2 then adding steps for a second subsequence of the sequence of one or more update steps 520j after the first subsequence (i.e. a second phase). The second phase may be viewed as a refinement of the first phase.
In particular, during the second phase, the classification measures for the current candidate sample Pj and for the second candidate sample Qj are a respective identification of the class for that sample generated using the specific ML classification model 322, i.e. the class to which, according to the specific ML classification model 322, that sample belongs. Assessing these generated classification measures at the step 606 then comprises: if the classification measure for the second candidate sample Qj is the same as the classification measure for the current candidate sample Pj (i.e. the two samples are determined to be in the same class), terminating the second subsequence at the step 610; whereas if the classification measure for the second candidate sample Qj is not the same as the classification measure for the current candidate sample Pj (i.e. the two samples are determined to be in different classes), then: for each of the current candidate sample Pj and the second candidate sample Qj, generating a respective score for that sample as discussed above with reference to Example 1 and, if the score for the second candidate sample Qj is indicative of a higher likelihood than the score for the current candidate sample Pj, setting (at the step 612) the current candidate sample Pj to be the second candidate sample Qj.
Thus, in this example, the method for generating the test set 324 involves: (1) obtaining the first set 510 of candidate samples Bj (j = 1, ..., NB); and (2) for each candidate sample Bj, performing a sequence of update steps in which: (a) a second candidate sample Qj is generated by performing a (possibly random) change to the current candidate sample Pj; (b) the specific ML classification model 322 is used to generate the values pk (k = 1, ..., NC) for each of the current candidate sample Pj and the second candidate sample Qj; (c) the score function is used to generate a score for each of the current candidate sample Pj and the second candidate sample Qj; (d) if the score for the second candidate sample Qj is indicative of a higher likelihood than the score for the current candidate sample Pj, the current candidate sample Pj is set to be the second candidate sample Qj; (e) steps (a)-(d) are repeated until a termination condition for the first phase is met; then (f) a second candidate sample Qj is generated by performing a (possibly random) change to the current candidate sample Pj; (g) the specific ML classification model 322 is used to identify the class for each of the current candidate sample Pj and the second candidate sample Qj; (h) if the two samples are identified as belonging to the same class, the sequence is terminated; and (i) if the two samples are identified as belonging to different classes, respective scores are generated for the current candidate sample Pj and the second candidate sample Qj as in the first phase and, if the score for the second candidate sample Qj is indicative of a higher likelihood, the current candidate sample Pj is set to be the second candidate sample Qj, with processing then returning to step (f).
In the above, steps (2)(a)-(e) form the first phase and steps (2)(f)-(i) form the second phase.
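The second phase could be sketched as follows (again purely illustratively, with model_class returning the identified class for a sample, model_probs returning the values pk, and max_steps an assumed cap on the number of update steps):

```python
def refine_example2_phase2(p_j, model_class, model_probs, score_fn, perturb, max_steps=100):
    """Sketch of the second phase of Example 2: only accept Qj when it falls in
    a different class from Pj and its score is higher; stop once Pj and Qj
    are identified as belonging to the same class."""
    for _ in range(max_steps):
        q_j = perturb(p_j)
        if model_class(q_j) == model_class(p_j):
            break  # same class: terminate the second subsequence (step 610)
        if score_fn(model_probs(q_j)) > score_fn(model_probs(p_j)):
            p_j = q_j  # step 612: Qj is more likely to be "sensitive"
    return p_j
```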
In this example, the second candidate sample Qj is generated at the step 602 by implementing a (possibly random) change to the current candidate sample Pj. The classification measures for the current candidate sample Pj and for the second candidate sample Qj are a respective identification of the class for that sample generated using the specific ML classification model 322, i.e. the class to which, according to the specific ML classification model 322, that sample belongs. Assessing these generated classification measures at the step 606 then comprises: at the step 610, if the classification measure for the second candidate sample Qj is the same as the classification measure for the current candidate sample Pj (i.e. the two samples are determined to be in the same class), terminating the sequence of one or more update steps 520j for the candidate sample Bj (j = 1, ... ,NB); if, however, the classification measure for the second candidate sample Qj is not the same as the classification measure for the current candidate sample Pj (i.e. the two samples are determined to be in different classes), then, at the step 612, setting the current candidate sample Pj to be the second candidate sample Qj, and also reducing the size of the change to be applied to the candidate sample Pj when generating a second candidate sample Qj at the step 602 for the next update step. Thus, the size of the change progressively gets smaller until both the current candidate sample Pj and the second candidate sample Qj are found to be in the same class. The size of the change may be viewed as a distance between the current candidate sample Pj and the second candidate sample Qj using a distance metric for the sample space 250. Due to this progressively smaller size for the change implemented at the step 602, the identification at the step 608 that the classification measure for the second candidate sample Qj is not the same as the classification measure for the current candidate sample Pj is indicative that the second candidate sample Qj is more likely to be “sensitive” than the current candidate sample Pj (as the second candidate sample Qj was arrived at using a smaller change than that used for the current candidate sample Pj, and so is more likely to be closer to a class boundary).
Thus, in this example, the method for generating the test set 324 involves: (1) obtaining the first set 510 of candidate samples Bj (j = 1, ..., NB); and (2) for each candidate sample Bj, performing a sequence of update steps in which: (a) a second candidate sample Qj is generated by performing a change of the current size to the current candidate sample Pj; (b) the specific ML classification model 322 is used to identify the class for each of the current candidate sample Pj and the second candidate sample Qj; (c) if the two samples are identified as belonging to the same class, the sequence is terminated; (d) if the two samples are identified as belonging to different classes, the current candidate sample Pj is set to be the second candidate sample Qj and the size of the change to be applied at the next update step is reduced; and (e) steps (a)-(d) are repeated until termination.
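A minimal sketch of this Example 3 procedure, using only final class identifications, is given below; the initial step size, the shrink factor and the cap max_steps are assumptions:

```python
import random


def refine_example3(b, model_class, step=1.0, shrink=0.5, max_steps=100):
    """Sketch of Example 3 (label-only access): shrink the perturbation
    whenever Pj and Qj straddle a class boundary, so that Pj converges
    towards the boundary. model_class(sample) returns only the final class."""
    p_j = list(b)
    for _ in range(max_steps):
        q_j = [v + random.uniform(-step, step) for v in p_j]  # step 602
        if model_class(q_j) == model_class(p_j):
            break          # step 610: same class, terminate
        p_j = q_j          # step 612: different class, adopt Qj ...
        step *= shrink     # ... and reduce the size of the next change
    return p_j
```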
As will be appreciated, this third example is useful for when the test set creation system 320 does not have access to the values pk (k = 1, ... , NC) that the specific ML classification model 322 would generate. It will, however, be appreciated that this third example could still be used in situations in which the test set creation system 320 does indeed have access to the values pk (k = 1, ...,NC) that the specific ML classification model 322 would generate.
It will be appreciated that, for this third example, the sequence of one or more update steps 520j for the candidate sample Bj (j = 1, ... , NB) may involve additional steps (e.g. before or after the steps (2)(a)-(e) discussed above), so that the steps (2)(a)-(e) may be viewed as a subsequence of steps in a larger sequence of one or more update steps 520j for the candidate sample Bj (j = 1, ... , NB). Indeed, for example, whilst Example 2 above added steps to the end of Example 1 as a second phase, the steps (2)(a)-(e) of Example 3 could be used as the second phase instead of the second-phase steps discussed in Example 2.
One could use any method of generating a so-called “adversarial example” for the model, and treat the adversarial example as a test sample of the test set 324. Adversarial examples and their methods of generation (both with and without full access to the model and the intermediate values it generates) are well-known - see, for example, https://openai.com/blog/adversarial-example-research/ and https://towardsdatascience.com/getting-to-know-a-black-box-model-374e180589ce, the entire disclosures of which are incorporated herein by reference in their entireties.
An example experiment to illustrate the efficacy and effectiveness of embodiments of the invention is set out below. In this experiment, the well-known “two-spirals” classification task is used, namely: given the x and y coordinates of a point, the task aims to determine if the point belongs to class (spiral) 1 or class (spiral) 2. Given a coordinate pair, it is assumed that different models may assign it with different classes, whereas identical models always assign the same class to the coordinate pair.
First, a standard two-spiral dataset was created. Next, a noise value was added to the x and y coordinates for each of the points.
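A sketch of such a dataset construction is given below; the exact spiral parameterization, dataset size and noise level used in the experiment are not stated above, so the values here are assumptions:

```python
import numpy as np


def two_spirals(n_per_class=500, noise=0.5, seed=0):
    """Sketch: a standard two-spirals dataset with Gaussian noise added to the
    x and y coordinates of each point (parameters are assumptions)."""
    rng = np.random.default_rng(seed)
    theta = np.sqrt(rng.uniform(0, 1, n_per_class)) * 4 * np.pi  # angle along spiral
    spiral1 = np.column_stack((theta * np.cos(theta), theta * np.sin(theta)))
    spiral2 = -spiral1  # second spiral: the first rotated by 180 degrees
    X = np.vstack((spiral1, spiral2)) + rng.normal(0, noise, (2 * n_per_class, 2))
    y = np.concatenate((np.zeros(n_per_class), np.ones(n_per_class)))
    return X, y
```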
The multilayer perceptron (MLP) model was used. As is well-known, the MLP is a class of feedforward artificial neural network. Throughout the experiments, the MLP models that were used consisted of the following architecture:
The number of hidden layers used ranged from 1 to 4. Nine different numbers of epochs were used: 48, 49, 50, 98, 99, 100, 148, 149, 150. The batch size used was 10. The total number of models trained was therefore 36 (4 x 9). Next, based on model verification/validation using the validation set, models with reasonable performance (i.e. max(false positive rate, false negative rate) < 5%) were selected. This resulted in 18 models, leading to 153 (18 x 17 / 2) model pairs. Each model pair contains two different models – the aim being that, for each pair, a test could be performed with one of the models being (or acting as) the specific ML classification model 322 and the other being (or acting as) the ML classification model 312.
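Purely as an illustration of this grid of models (the experiment's ML framework and the number of units per hidden layer are not stated above, so scikit-learn and 16 units are assumptions), the 36 configurations could be constructed as follows:

```python
from sklearn.neural_network import MLPClassifier

# Sketch of the 36-model grid (4 hidden-layer counts x 9 epoch counts); each
# model would then be fitted on the noisy two-spirals training set.
models = {}
for n_layers in (1, 2, 3, 4):
    for epochs in (48, 49, 50, 98, 99, 100, 148, 149, 150):
        models[(n_layers, epochs)] = MLPClassifier(
            hidden_layer_sizes=(16,) * n_layers,  # 16 units per layer is an assumption
            batch_size=10,
            max_iter=epochs,  # treating one scikit-learn iteration as one epoch
        )
```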
Throughout the experiments, the models were trained with the generated training set as described above.
The following technique was used to generate the test set 324:
In this experiment, P = N = 10, and M took values of 1.0 or 0.5. For this experiment, 28 test samples were generated for each model’s test set 324.
The above may be viewed as an implementation of Example 3 above. In particular: the sample (xA, xA × M) may be taken to be the candidate sample Bj; the second candidate sample Qj is the point (xMID, xMID × M); and the updated candidate samples B*j are (xA, xA × M) and (xB, xB × M). This could, equally, be viewed as performing the sequence of one or more update steps for two candidate samples in parallel. In particular: the samples (xA, xA × M) and (xB, xB × M) may be taken to be two initial candidate samples Bj; the second candidate sample Qj for both is the point (xMID, xMID × M); and the updated candidate samples B*j are (xA, xA × M) and (xB, xB × M) respectively.
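This binary search along the line y = M × x could be sketched as follows, assuming model_class returns the final class for a point, that the points for x_a and x_b initially lie in different classes, and that the number of bisection steps corresponds to N:

```python
def bisect_boundary(x_a, x_b, m, model_class, iterations=10):
    """Sketch: repeatedly replace whichever endpoint shares the midpoint's
    class, driving both points (x, x*m) towards the class boundary along
    the line y = m*x."""
    for _ in range(iterations):
        x_mid = (x_a + x_b) / 2.0
        if model_class((x_mid, x_mid * m)) == model_class((x_a, x_a * m)):
            x_a = x_mid  # midpoint on A's side: move A towards the boundary
        else:
            x_b = x_mid  # midpoint on B's side: move B towards the boundary
    return (x_a, x_a * m), (x_b, x_b * m)
```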
An example of a single test involves: selecting a test sample from the test set 324 generated for model A; using model B to generate a classification result for that test sample; and comparing that classification result with the classification result generated for that test sample using model A – if the two classification results differ, then models A and B are identified as different models.
The sample sizes used ranged from 1 to 20. For each of the sample sizes, first the “sensitive” samples were randomly selected from the test set 324 for model A, and then “regular” samples were randomly selected from the whole sample space 250. The above-described test steps were then executed on each of the selected samples. For each of the sample sizes, the above process was repeated 10 times. Indeed, when testing using “regular” samples, a larger range of sample sizes was needed (from 1 to 1000) to enable comparisons with embodiments of the invention that use “sensitive” samples.
Error rate was used to show how likely a chosen method (using either “sensitive” samples or “regular” samples) is to detect two different models as identical. This metric is defined as: error rate = (number of tests in which two different models were detected as identical) / (total number of tests performed).
The lower the error rate, the better the method differentiates two different models.
Consider a “1 to N” scenario as follows: given one base model and a list of N models, the task is to detect whether any of the N models is identical to the base model. To achieve a perfect performance (an error rate of 0), the number of queries required for the baseline is 691 x N; the number for the proposed method is 80 + 11 x N (where, in this experiment, there were 80 queries to generate the test set 324). For example, with N = 10, the baseline requires 6910 queries, whereas the proposed method requires only 190. As shown in
It will be appreciated that the methods described have been shown as individual steps carried out in a specific order. However, the skilled person will appreciate that these steps may be combined or carried out in a different order whilst still achieving the desired result.
It will be appreciated that embodiments of the invention may be implemented using a variety of different information processing systems. In particular, although the figures and the discussion thereof provide an exemplary computing system and methods, these are presented merely to provide a useful reference in discussing various aspects of the invention. Embodiments of the invention may be carried out on any suitable data processing device, such as a personal computer, laptop, personal digital assistant, mobile telephone, set top box, television, server computer, etc. Of course, the description of the systems and methods has been simplified for purposes of discussion, and they are just one of many different types of system and method that may be used for embodiments of the invention. It will be appreciated that the boundaries between logic blocks are merely illustrative and that alternative embodiments may merge logic blocks or elements, or may impose an alternate decomposition of functionality upon various logic blocks or elements.
It will be appreciated that the above-mentioned functionality may be implemented as one or more corresponding modules as hardware and/or software. For example, the above-mentioned functionality may be implemented as one or more software components for execution by a processor of the system. Alternatively, the above-mentioned functionality may be implemented as hardware, such as on one or more field-programmable-gate-arrays (FPGAs), and/or one or more application-specific-integrated-circuits (ASICs), and/or one or more digital-signal-processors (DSPs), and/or one or more graphical processing units (GPUs), and/or other hardware arrangements. Method steps implemented in flowcharts contained herein, or as described above, may each be implemented by corresponding respective modules; multiple method steps implemented in flowcharts contained herein, or as described above, may be implemented together by a single module.
It will be appreciated that, insofar as embodiments of the invention are implemented by a computer program, then one or more storage media and/or one or more transmission media storing or carrying the computer program form aspects of the invention. The computer program may have one or more program instructions, or program code, which, when executed by one or more processors (or one or more computers), carries out an embodiment of the invention. The term “program” as used herein, may be a sequence of instructions designed for execution on a computer system, and may include a subroutine, a function, a procedure, a module, an object method, an object implementation, an executable application, an applet, a servlet, source code, object code, byte code, a shared library, a dynamic linked library, and/or other sequences of instructions designed for execution on a computer system. The storage medium may be a magnetic disc (such as a hard drive or a floppy disc), an optical disc (such as a CD-ROM, a DVD-ROM or a BluRay disc), or a memory (such as a ROM, a RAM, EEPROM, EPROM, Flash memory or a portable/removable memory device), etc. The transmission medium may be a communications signal, a data broadcast, a communications link between two or more computers, etc.