Embodiments of the present invention generally relate to machine learning (ML) models. More particularly, at least some embodiments of the invention relate to systems, hardware, software, computer-readable media, and methods, for identifying drift in machine learning model performance, and identifying a replacement for the drifting machine learning model.
Machine learning models, especially those used in production, are trained on data drawn from distributions that can change over time and in different ways. This phenomenon is known as drift. Consequently, models may become unreliable in the presence of drifted data, requiring constant monitoring to signal drift. Many drift detection approaches are performance-based. Such performance-based approaches may involve observing the quality of the results obtained by a deployed model to determine whether the distribution of the data, and/or the relation between features and output, has changed beyond a threshold that is acceptable in the domain. A typical consequence of detecting drift is to trigger the retraining of the model. If alternative models to the drifting model are available, one of those readily available models may be used instead.
In order to describe the manner in which at least some of the advantages and features of the invention may be obtained, a more particular description of embodiments of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered to be limiting of its scope, embodiments of the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings.
Embodiments of the present invention generally relate to machine learning models. More particularly, at least some embodiments of the invention relate to systems, hardware, software, computer-readable media, and methods, for identifying drift in machine learning model performance, and identifying a replacement for the drifting machine learning model.
In general, an embodiment of the invention may address a particular form of drift detection in which a deployed machine learning (ML) model may, over time, become less accurate than alternative models that were not fine-tuned to the domain in which the drifting model is deployed. An embodiment of the invention may comprise an offline stage, and an online stage, described below. Note that ML models may be referred to herein simply as a ‘model’ or ‘models.’
In general, the offline stage according to one embodiment may obtain baseline behavior of a reference model MA, that is, the model to be deployed for inference and decision-making purposes in a particular domain, environment A in this example. The offline stage may also determine a minimal set of relevant alternative models to the reference model. As well, the offline stage may obtain parameters of the performance of those alternative models, which may number in the hundreds or thousands for example, over the data used to train the reference model, which may be leveraged in the online stage.
The online stage, according to one embodiment, may determine if/when a particular model, among the restricted set of candidate models selected in the offline stage, should be deployed in shadow mode. In an embodiment, the shadow model may supplant the reference model MA, or may be considered in tandem with the reference model MA. If no adequate alternative models are found, retraining of the reference model MA may be triggered.
Embodiments of the invention, such as the examples disclosed herein, may be beneficial in a variety of respects. For example, and as will be apparent from the present disclosure, one or more embodiments of the invention may provide one or more advantageous and unexpected effects, in any combination, some examples of which are set forth below. It should be noted that such effects are neither intended, nor should be construed, to limit the scope of the claimed invention in any way. It should further be noted that nothing herein should be construed as constituting an essential or indispensable element of any invention or embodiment. Rather, various aspects of the disclosed embodiments may be combined in a variety of ways so as to define yet further embodiments. For example, any element(s) of any embodiment may be combined with any element(s) of any other embodiment, to define still further embodiments. Such further embodiments are considered as being within the scope of this disclosure. As well, none of the embodiments embraced within the scope of this disclosure should be construed as resolving, or being limited to the resolution of, any particular problem(s). Nor should any such embodiments be construed to implement, or be limited to implementation of, any particular technical effect(s) or solution(s). Finally, it is not required that any embodiment implement any of the advantageous and unexpected effects disclosed herein.
In particular, one advantageous aspect of an embodiment of the invention is that unacceptable drift in the performance of a model may be detected, and one or more alternative models identified that are able to provide acceptable performance in the relevant domain. An embodiment may avoid the need to retrain a drifting model. An embodiment may provide for a relatively fast hand-off from a drifting model to a different model that has better expected performance than the drifting model. Various other advantages of some example embodiments will be apparent from this disclosure.
It is noted that embodiments of the invention, whether claimed or not, cannot be performed, practically or otherwise, in the mind of a human. Accordingly, nothing herein should be construed as teaching or suggesting that any aspect of any embodiment of the invention could or would be performed, practically or otherwise, in the mind of a human. Further, and unless explicitly indicated otherwise herein, the disclosed methods, processes, and operations, are contemplated as being implemented by computing systems that may comprise hardware and/or software. That is, such methods, processes, and operations, are defined as being computer-implemented.
An embodiment of the invention may be employed in a scenario in which multiple alternative versions of a reference model are available, each likely trained under distinct conditions. It may not be practical to have a large number of models deployed and operating at the same time in an edge environment, especially if many instances of the models are deployed. For example, an event detection model according to an embodiment may be deployed to each of various mobile devices, such as Autonomous Mobile Robots (AMRs) operating in an environment such as a warehouse. These devices may have a limited capacity for computation, and for communication with a near edge infrastructure, and, as such, may not be able to support the operation of many instances of models. Accordingly, an embodiment of the invention may operate to determine a minimal set of feasible models to be applied concurrently at one or more edge devices.
An embodiment of the invention may provide for alternative model selection under drift of a reference model. In this scenario, upon detecting drift in the reference model, it may be preferable to replace the reference model with one of the alternative models, since the alternative models may be readily available, while retraining of the reference model may be expensive and/or time-consuming. A naïve, or simplistic, approach for this scenario would be to deploy all the alternative models and simultaneously assess their respective performance over time. However, operating many models in parallel may be undesirable given resource constraints at the devices where the models are employed. Furthermore, some orchestration may be needed to assess the performance of the models over time and determine which model outputs are to be used, and a delay between assessment and actuation may mean that any model swapping takes place too late. Thus, an embodiment of the invention may comprise a method for dealing with multiple alternative models when the reference model is subject to drift.
With particular attention now to
An embodiment of the invention may comprise an offline stage and an online stage. In brief, the offline stage may comprise various functions, including, but not limited to: [1] obtain baseline behavior of a reference model MA, that is, the model to be deployed for inference and decision-making purposes at environment A; [2] determine a minimal set of relevant alternative models to the reference model, so as to obtain respective parameters of the performance of those alternative models over the same data that was used to train the reference model—which may be leveraged in an online stage. In an embodiment, the online stage may comprise various functions, including, but not limited to, determining if/when a particular model, among the restricted set of candidate models selected in the offline stage, should be deployed in shadow mode.
With the various alternative models available locally in the environment, or domain, A, which may comprise one or more edge devices for example, an embodiment may leverage the fact that the reference model may outperform the alternative ones until a significant drift occurs. The intuitive notion of an embodiment is that, typically, the model MA trained with local data will perform better at the domain A until significant drift occurs. From that point on, one or more of the alternative models may turn out to be the appropriate, that is, better performing, model.
More formally, an embodiment may use θ(Mi) to refer to the distribution of the data used for training model Mi, while θ(Dj) refers to the distribution of dataset Dj. So, for model MA for example, the distribution of the data used for training that model MA may be referred to as θ(MA). By definition, θ(MA)≡θ(DA) at first, but an embodiment may consider that the domain A may, over time, be subject to changes in the data distribution, which characterizes the concept of drift. That is, with D denoting the data at inference time, θ(D) may drift from θ(MA) and become closer to one of θ(MB), θ(MC) . . . .
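By way of a purely illustrative, non-limiting aside, the following minimal sketch shows one way the notational idea above could be exercised: a two-sample Kolmogorov-Smirnov test comparing a sample drawn from θ(DA) with current inference-time data D. The synthetic data, the single feature, and the 0.05 significance level are assumptions made only for this example; the embodiments themselves rely on performance-based signals rather than on direct distribution tests.

```python
# Illustrative sketch only: flag a shift between theta(D_A) and theta(D)
# with a two-sample Kolmogorov-Smirnov test. Data are synthetic and the
# 0.05 significance level is an assumption made for this example.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(seed=0)
d_a = rng.normal(loc=0.0, scale=1.0, size=5000)    # sample of training data D_A
d_now = rng.normal(loc=0.6, scale=1.2, size=1000)  # drifted inference-time data D

result = ks_2samp(d_a, d_now)
print(f"KS statistic={result.statistic:.3f}, p-value={result.pvalue:.3g}, "
      f"drift suspected={result.pvalue < 0.05}")
```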
A method according to one embodiment may be used to effectively determine a minimal set alt of models, such that the models Mi∈alt are both: [1] distinct from MA with respect to their training data, that is, θ(MA)≠θ(Mi); and [2] likely to be accurate and useful for decision making from then on, that is, θ(Mi)˜θ(D).
An embodiment may further determine which one, or more, of alt is/are most likely to perform better than MA under the current drift conditions, as established by the change in performance of MA. Such model(s) may then be tested, for a limited period, operating in “shadow mode” with respect to MA. With one or more models operating in shadow mode, any number of orchestration approaches may take place, depending on the domain. Both the offline stage and online stage, and respective associated operations, are described in further detail hereafter.
Reference is now made to
C.2.1 Training Reference Model MA with DA and Obtaining Alternative Models
The historical data may be used to train 202 a model, which may be referred to as a reference model and may be deployed in the environment A. In an embodiment, this model may comprise an event detection model, but the scope of the invention is not limited to event detection models and, more generally, any other type of model, such as a model that uses an unsupervised learning approach for example, may be employed in other embodiments. An example of the deployed model is denoted MA herein. As shown in
With continued attention to
In an embodiment, and as shown at 306, all of the models M∈alt may be instances of the same structure, with input I and output O.
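As a non-limiting sketch, the snippet below illustrates what such a shared structure might look like when the models are autoencoders, as in the example embodiment discussed later: every environment would instantiate the same architecture and train it on its own data, so the instances are interchangeable at inference time. The layer sizes, feature count, and the PyTorch framework are assumptions made only for this illustration.

```python
# Sketch of a shared model structure (assumed sizes and framework): all
# models M_A, M_B, ... M_Z are instances of the same autoencoder, with
# the same input I and output O, differing only in their learned weights.
import torch
from torch import nn

class SharedAutoencoder(nn.Module):
    def __init__(self, n_features: int = 16, latent_dim: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 8), nn.ReLU(), nn.Linear(8, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 8), nn.ReLU(), nn.Linear(8, n_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output O has the same shape as input I, enabling reconstruction error.
        return self.decoder(self.encoder(x))

# One instance per environment; each would be trained on its local dataset.
models = {name: SharedAutoencoder() for name in ("M_A", "M_B", "M_C")}
```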
One example scenario according to an embodiment may be as follows:
An embodiment may be applied to other kinds of ML models, though one particular embodiment may be directed to an autoencoder model for event detection, due to the desirable characteristics of autoencoder models. For example, autoencoders may enable an embodiment of the invention to assess the quality of the models in a straightforward and immediate manner, without the need for supervised data, such as by computing the reconstruction errors of the autoencoder models with regard to the input samples they receive.
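A minimal sketch of that unsupervised assessment follows, assuming an autoencoder with the hypothetical shared structure sketched above; the mean-squared-error metric and the batch shapes are illustrative choices, not requirements of the embodiments.

```python
# Sketch: score an autoencoder without labels via its per-sample
# reconstruction error E. `model` is assumed to be an autoencoder whose
# output has the same shape as its input (e.g., the SharedAutoencoder above).
import torch

def reconstruction_errors(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Return the mean squared reconstruction error of each sample in x."""
    model.eval()
    with torch.no_grad():
        x_hat = model(x)
    return ((x - x_hat) ** 2).mean(dim=1)

# Example (hypothetical data): errors for a batch of 256 samples, 16 features.
# errors = reconstruction_errors(models["M_A"], torch.randn(256, 16))
```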
An embodiment may consider that the presumed datasets DB, DC . . . DZ, respectively used for training models MB, MC . . . MZ, may not be able to be communicated to the computational structure at the domain 300, such as due to privacy concerns or communication costs, or because those datasets are not readily available, as may be the case if they are discarded after the training of the respective models with which they are associated.
The models MB, MC . . . MZ, that is, the models in set alt 304, may, on the other hand, be reasonably communicated to the domain 300, since those models may likely be available in production at environments B, C . . . Z, respectively. This is disclosed in
With these models in set alt 304 locally available, that is, at the domain 300, an embodiment may operate to obtain baseline performances of the models under the conditions in the domain 300. This may comprise, as shown in
Note that the scores E in the
Note that these performances, as reflected in the scores E, are relative to the historical data at the environment—that is, the data DA that were used to train model MA. An embodiment may ultimately consider a situation of drift—that is, the underlying distribution of those data will change in the domain during online operation. Hence, a difference in performance from the baseline performance of the reference model may be used to determine alternative models likely to perform better under a drifted condition.
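This baseline step might be sketched as follows, assuming the hypothetical models dictionary and reconstruction-error metric from the earlier sketches: every locally available model, including the reference model MA, is evaluated over the historical data DA, and its per-sample scores E are retained for later comparison. This is only an illustration; the actual evaluation metric and storage format are left open by the embodiments.

```python
# Sketch of the offline baseline evaluation: score the reference model and
# each alternative model over the historical data D_A and keep the scores E.
# The autoencoder-style error metric is an illustrative assumption.
import torch

def baseline_scores(models: dict, d_a: torch.Tensor) -> dict:
    """Map each model name to its tensor of per-sample errors E over D_A."""
    scores = {}
    for name, model in models.items():
        model.eval()
        with torch.no_grad():
            scores[name] = ((d_a - model(d_a)) ** 2).mean(dim=1)
    return scores

# Example (hypothetical historical data with 16 features):
# baselines = baseline_scores(models, torch.randn(5000, 16))
```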
With continued reference to
Notice that this reflects the example case in which the performance of the models is given by an error score E and, therefore, the threshold may represent a maximum permissible reconstruction error. In alternative embodiments, if the evaluation of the models is given by a positive score E, the threshold ki may be adapted to represent a minimum allowed score.
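One possible sketch of the threshold derivation is given below. Using a high percentile of the baseline errors, or a low percentile of a positive score, is an assumption chosen only for illustration, since the embodiments merely require that some maximum permissible error, or minimum allowed score, be fixed per model.

```python
# Sketch: derive a per-model threshold k_i from its baseline scores.
# The percentile values are illustrative assumptions.
import numpy as np

def error_threshold(baseline_errors: np.ndarray, q: float = 95.0) -> float:
    """Maximum permissible reconstruction error (lower scores are better)."""
    return float(np.percentile(baseline_errors, q))

def score_threshold(baseline_scores: np.ndarray, q: float = 5.0) -> float:
    """Minimum allowed score (higher scores are better)."""
    return float(np.percentile(baseline_scores, q))

# Example with the hypothetical baselines computed earlier:
# k_a = error_threshold(baselines["M_A"].numpy())
```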
Further, for each alternative model Mi∈alt, an embodiment may determine 212 (see
At 214 (see
An embodiment may identify, based on the analysis of the performance of the reference model, which alternative models are likely to be applicable in a particular domain or domains, that is, a set of candidate models. This step may be important to reduce the number of alternative models in alt, so that the online orchestration operation can be implemented with minimal computing resource overhead.
In an embodiment, two criteria may be used to discard 210 (see
The underlying reasoning is that such models are most likely trained under conditions similar to those of the reference model, and are therefore likely to be subject to the same performance decay under similar drift conditions. More generally, any alternative model Mi∈alt whose baseline performance is too similar to that of another alternative model Mj may also be discarded. The reasoning is similar to that for the case of MB, but applies to two alternative models that are likely redundant with respect to each other.
One approach for this determination may comprise obtaining the distributions of the performance metrics E for each model, such as are represented by the histograms 504 in
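A sketch of such a filter follows. Comparing the baseline error distributions with the Wasserstein distance, and the particular similarity cut-off, are assumptions made only for illustration; any suitable measure of similarity between the distributions, or their histograms, could be used instead.

```python
# Sketch of the redundancy filter: keep only alternative models whose
# baseline error distribution is sufficiently different from that of the
# reference model and from those of the candidates already kept.
# The distance measure and the cut-off are illustrative assumptions.
import numpy as np
from scipy.stats import wasserstein_distance

def filter_redundant(reference_errors: np.ndarray,
                     candidate_errors: dict,
                     min_distance: float = 0.05) -> list:
    kept_names = []
    kept_errors = [reference_errors]
    for name, errors in candidate_errors.items():
        if all(wasserstein_distance(errors, other) >= min_distance
               for other in kept_errors):
            kept_names.append(name)      # distinct enough: keep as candidate
            kept_errors.append(errors)
    return kept_names
```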
An example online stage according to one embodiment is disclosed in
An embodiment of the method 600 may begin with initializations of control structures and variables, namely, in this example: [1] a series A of performance evaluations for inferences of the reference model with current data D; [2] a reference point p, later used for forecasting drift changes via interpolation, initially set 604 at the origin of the series—particularly, the reference point p may be initialized as an average performance of MA, so as to serve as a starting point for an interpolation process, examples of which are discussed below in connection with
Particularly,
Note that in the example method 600 of
In particular,
That is, in an embodiment, when the counter reaches z, a forecasting process may be triggered, relying on a simple interpolation 618 of p and the z last evaluations. One example approach for interpolation is disclosed in the graph 900 of
In an embodiment, the interpolation approach may comprise a simple linear interpolation between p and the last of the z points 901. This is the case shown by the line denoted at 902 in the
In general, an interpolation process, as exemplified in
Regardless of the interpolation function employed, an embodiment may determine a forecast evaluation E⃗. This value may represent an expected evaluation of the reference model in the future if the ongoing drift continues to build. Therefore, an embodiment may compare 622 (see
As such, an embodiment may deploy 624 (see
The shadow model(s) may be employed in various ways. For example, a shadow model Mi may supplant MA completely. In this case, the reference model may be removed, and the inferences provided by the alternative model may be considered for decision making purposes, possibly with the caveat that the results may be unreliable, since the model Mi is not necessarily guaranteed to perform accurately.
In another approach, the shadow model(s) may be considered in tandem with the reference model, such as in a weighted ensemble of models, in which each model may be weighted, for example, according to the evaluation(s) of that model. In this case, the edge nodes at which the model(s) operate may be required to possess sufficient processing capabilities to hold multiple models in operation.
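As a sketch of the tandem option, the snippet below combines the outputs of the reference model and one or more shadow models into a weighted average; weighting each model by the inverse of a recent error evaluation is one plausible choice, assumed here only for illustration, since the embodiments allow any weighting derived from the models' evaluations.

```python
# Sketch of a weighted ensemble of the reference model and shadow model(s).
# Inverse-error weighting is an illustrative assumption.
import numpy as np

def ensemble_predict(outputs: dict, recent_errors: dict) -> np.ndarray:
    """Weighted average of model outputs; lower recent error => larger weight."""
    weights = {name: 1.0 / (recent_errors[name] + 1e-9) for name in outputs}
    total = sum(weights.values())
    return sum(outputs[name] * (weight / total)
               for name, weight in weights.items())

# Example with hypothetical outputs and recent evaluations:
# y = ensemble_predict({"M_A": np.array([0.2]), "M_C": np.array([0.6])},
#                      {"M_A": 0.08, "M_C": 0.03})
```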
In a final example approach, it may be determined that none of the shadow models will provide acceptable performance, and an embodiment may thus trigger the re-training and re-deployment of a new model, such as a modified version of MA, in the case in which MA degrades but no alternative models prove to be useful. The re-training and re-deployment may be applied in combination with the supplanting approach or the ensemble approach, since the triggering of a new training round is a separate concern from the usage of the models for inferencing.
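Pulling the online stage together, the sketch below tracks evaluations of the reference model against its threshold kA, counts consecutive violations, and, once the counter reaches z, forecasts the near-future evaluation by a simple linear interpolation over the reference point p and the last z evaluations. A candidate is nominated for shadow mode when its baseline evaluation is better than the forecast, and retraining is signalled when no candidate qualifies. The specific values of z, the thresholds, and the selection rule are assumptions for this example only.

```python
# Sketch of the online stage: threshold-violation counting, linear
# interpolation forecast, and nomination of a shadow model (or a
# retraining signal). Parameter values are illustrative assumptions.
import numpy as np

def forecast_evaluation(p: float, recent: list, horizon: int = 1) -> float:
    """Fit a line through the reference point p and the z most recent
    evaluations, then extrapolate `horizon` steps ahead."""
    ys = np.array([p] + list(recent), dtype=float)
    xs = np.arange(len(ys), dtype=float)
    slope, intercept = np.polyfit(xs, ys, deg=1)
    return float(slope * (xs[-1] + horizon) + intercept)

def monitor(evaluations, k_a: float, z: int, p: float,
            candidate_baselines: dict):
    """Consume a stream of evaluations E of the reference model M_A.
    Returns ('shadow', name) when a candidate should run in shadow mode,
    ('retrain', None) when drift is forecast but no candidate qualifies,
    or (None, None) if no action is needed."""
    series, recent, counter = [], [], 0
    for e in evaluations:
        if e > k_a:                       # threshold violation
            counter += 1
            recent = (recent + [e])[-z:]
        else:                             # acceptable evaluation: reset
            series.append(e)
            counter, recent = 0, []
        if counter >= z:                  # sustained violations: forecast drift
            e_hat = forecast_evaluation(p, recent)
            best = min(candidate_baselines, key=candidate_baselines.get)
            if candidate_baselines[best] < e_hat:
                return "shadow", best
            return "retrain", None
    return None, None

# Example with hypothetical values:
# action, model = monitor([0.02, 0.09, 0.11, 0.12], k_a=0.05, z=3, p=0.02,
#                         candidate_baselines={"M_B": 0.06, "M_C": 0.04})
```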
As will be apparent from this disclosure, one or more example embodiments may possess various useful features and advantages. A non-exhaustive list of such features and advantages follows.
An embodiment may implement the orchestration of alternative models for efficient determination of one or more shadow models, which may be based on the baseline evaluation of the model performances over an available historical dataset. An embodiment may provide for consideration of a non-redundant set of alternative models in an online fashion, thus leveraging an online process for determining the alternative model most likely to be accurate, which may be deployed as a shadow model. This determination may be based, for example, on a comparison of baseline performances and a forecast of the evaluation of the reference model in the near future.
It is noted with respect to the disclosed methods, including the example methods of
In an embodiment, the methods 200 and/or 600 may be performed by a central node that is configured to communicate with one or more edge nodes. Briefly, for example, a central node may evaluate and/or modify one or more models, and then deploy the models to one or more edge nodes. The central node and the edge nodes may each comprise hardware and/or software. In an embodiment, an edge node may comprise a sensor configured to obtain information about a physical operating environment, such as a warehouse for example. An edge node may comprise an autonomous vehicle. No particular configuration is required to implement any embodiment however, and the foregoing are provided only by way of example, and not limitation of the scope of the invention.
Following are some further example embodiments of the invention. These are presented only by way of example and are not intended to limit the scope of the invention in any way.
Embodiment 1. A method, comprising: obtaining, by a central node, an evaluation of a performance of a reference model deployed at an edge node; determining if the evaluation exceeds a threshold associated with the reference model, and incrementing a counter when the evaluation exceeds the threshold; when a counter value equals or exceeds a specified limit, performing an interpolation process to identify a new model having better expected performance than performance of the reference model; and deploying the new model in a shadow mode at the edge node.
Embodiment 2. The method as recited in any preceding embodiment, wherein the reference model is configured to perform anomaly detection with respect to operation of an edge device that comprises the edge node.
Embodiment 3. The method as recited in any preceding embodiment, wherein the reference model and the new model were trained with different respective domain-specific datasets.
Embodiment 4. The method as recited in any preceding embodiment, wherein a threshold associated with the new model is different from the threshold associated with the reference model.
Embodiment 5. The method as recited in any preceding embodiment, wherein deploying the new model in shadow mode comprises running the new model together with the reference model at the edge node.
Embodiment 6. The method as recited in any preceding embodiment, wherein a group of alternative models reside at the edge node, and the new model is taken from the group of alternative models.
Embodiment 7. The method as recited in any preceding embodiment, wherein, when the evaluation does not exceed the threshold, a series is updated to include the evaluation, and the counter is set to zero.
Embodiment 8. The method as recited in any preceding embodiment, wherein the interpolation process comprises interpolating from a reference point p over a number z of most recent evaluations that have exceeded the threshold.
Embodiment 9. The method as recited in any preceding embodiment, wherein the interpolation process generates a forecast of an evaluation that is above the threshold associated with the reference model.
Embodiment 10. The method as recited in any preceding embodiment, wherein drift in the reference model is indicated when the counter value equals or exceeds the specified limit.
Embodiment 11. A system, comprising hardware and/or software, operable to perform any of the operations, methods, or processes, or any portion of any of these, disclosed herein.
Embodiment 12. A non-transitory storage medium having stored therein instructions that are executable by one or more hardware processors to perform operations comprising the operations of any one or more of embodiments 1-10.
The embodiments disclosed herein may include the use of a special purpose or general-purpose computer including various computer hardware or software modules, as discussed in greater detail below. A computer may include a processor and computer storage media carrying instructions that, when executed by the processor and/or caused to be executed by the processor, perform any one or more of the methods disclosed herein, or any part(s) of any method disclosed.
As indicated above, embodiments within the scope of the present invention also include computer storage media, which are physical media for carrying or having computer-executable instructions or data structures stored thereon. Such computer storage media may be any available physical media that may be accessed by a general purpose or special purpose computer.
By way of example, and not limitation, such computer storage media may comprise hardware storage such as solid state disk/device (SSD), RAM, ROM, EEPROM, CD-ROM, flash memory, phase-change memory (“PCM”), or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other hardware storage devices which may be used to store program code in the form of computer-executable instructions or data structures, which may be accessed and executed by a general-purpose or special-purpose computer system to implement the disclosed functionality of the invention. Combinations of the above should also be included within the scope of computer storage media. Such media are also examples of non-transitory storage media, and non-transitory storage media also embraces cloud-based storage systems and structures, although the scope of the invention is not limited to these examples of non-transitory storage media.
Computer-executable instructions comprise, for example, instructions and data which, when executed, cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. As such, some embodiments of the invention may be downloadable to one or more systems or devices, for example, from a website, mesh topology, or other source. As well, the scope of the invention embraces any hardware system or device that comprises an instance of an application that comprises the disclosed executable instructions.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts disclosed herein are disclosed as example forms of implementing the claims.
As used herein, the term ‘module’ or ‘component’ may refer to software objects or routines that execute on the computing system. The different components, modules, engines, and services described herein may be implemented as objects or processes that execute on the computing system, for example, as separate threads. While the system and methods described herein may be implemented in software, implementations in hardware or a combination of software and hardware are also possible and contemplated. In the present disclosure, a ‘computing entity’ may be any computing system as previously defined herein, or any module or combination of modules running on a computing system.
In at least some instances, a hardware processor is provided that is operable to carry out executable instructions for performing a method or process, such as the methods and processes disclosed herein. The hardware processor may or may not comprise an element of other hardware, such as the computing devices and systems disclosed herein.
In terms of computing environments, embodiments of the invention may be performed in client-server environments, whether network or local environments, or in any other suitable environment. Suitable operating environments for at least some embodiments of the invention include cloud computing environments where one or more of a client, server, or other machine may reside and operate in a cloud environment.
With reference briefly now to
In the example of
Such executable instructions may take various forms including, for example, instructions executable to perform any method or portion thereof disclosed herein, and/or executable by/at any of a storage site, whether on-premises at an enterprise, or a cloud computing site, client, datacenter, data protection site including a cloud storage site, or backup server, to perform any of the functions disclosed herein. As well, such instructions may be executable to perform any of the other operations and methods, and any portions thereof, disclosed herein.
The present invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.