Embodiments of the present disclosure relate generally to the field of computers, and more specifically, to a data processing method, a model training method, an electronic device, a computer-readable storage medium, and a computer program product.
Anomaly detection aims at detecting exceptional data instances that significantly deviate from the distribution of normal data. Anomaly detection has been widely used in medical diagnosis, fraud detection, structural defect detection, and many other fields. Since supervised anomaly detection models require a large amount of labeled training data and are therefore costly, commonly used anomaly detection models are currently obtained in an unsupervised, semi-supervised, or weakly supervised manner.
However, current anomaly detection models detect a lot of normal data as anomalous, and detect some real but complex anomalous data as normal. Current anomaly detection models therefore suffer from a low recall rate. In particular, when samples of anomalous data are scarce, the recall rate of an anomaly detection model may be even lower, which is undesirable.
Exemplary embodiments of the present disclosure provide a solution for data processing, where a trained anomaly detection model can be used to determine whether data to be detected is anomalous.
According to a first aspect of the present disclosure, there is provided a data processing method, comprising: obtaining data to be detected; and determining an attribute of the data to be detected using a trained anomaly detection model, the attribute indicating whether the data to be detected is anomalous data, wherein the anomaly detection model is trained based on a difference between a reconstructed data item and a normal data item and a difference between a first output data item and the reconstructed data item, wherein during training, the normal data item is input into a generative sub-model of the anomaly detection model to obtain the reconstructed data item, and the reconstructed data item is input into the generative sub-model to obtain the first output data item.
According to a second aspect of the present disclosure, there is provided a method of training an anomaly detection model, comprising: inputting a normal data item in a training set into a generative sub-model of the anomaly detection model to obtain a reconstructed data item; inputting the reconstructed data item into the generative sub-model to obtain a first output data item; and training the anomaly detection model based on a difference between the reconstructed data item and the normal data item and a difference between the first output data item and the reconstructed data item.
According to a third aspect of the present disclosure, there is provided an electronic device, comprising: at least one processing unit; and at least one memory being coupled to the at least one processing unit and configured to store instructions for being executed by the at least one processing unit, the instructions, when executed by the at least one processing unit, causing the device to perform actions comprising: obtaining data to be detected; and determining an attribute of the data to be detected using a trained anomaly detection model, the attribute indicating whether the data to be detected is anomalous data, wherein the anomaly detection model is trained based on a difference between a reconstructed data item and a normal data item and a difference between a first output data item and the reconstructed data item, wherein during training, the normal data item is input into a generative sub-model of the anomaly detection model to obtain the reconstructed data item, and the reconstructed data item is input into the generative sub-model to obtain the first output data item.
According to a fourth aspect of the present disclosure, there is provided an electronic device, comprising: at least one processing unit; and at least one memory being coupled to the at least one processing unit and configured to store instructions for being executed by the at least one processing unit, the instructions, when executed by the at least one processing unit, causing the device to perform actions comprising: inputting a normal data item in a training set into a generative sub-model of the anomaly detection model to obtain a reconstructed data item; inputting the reconstructed data item into the generative sub-model to obtain a first output data item; and training the anomaly detection model based on a difference between the reconstructed data item and the normal data item and a difference between the first output data item and the reconstructed data item.
According to a fifth aspect of the present disclosure, there is provided an electronic device, comprising: a memory and a processor; wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to perform the method described according to the first or second aspect of the present disclosure.
According to a sixth aspect of the present disclosure, there is provided a computer readable storage medium having machine-executable instructions stored thereon, the machine-executable instructions, when executed by a device, cause the device to perform the method described according to the first or second aspect of the present disclosure.
According to a seventh aspect of the present disclosure, there is provided a computer program product comprising computer-executable instructions, the computer-executable instructions, when executed by a processor, implement the method described according to the first or second aspect of the present disclosure.
According to an eighth aspect of the present disclosure, there is provided an electronic device, comprising a processing circuitry apparatus configured to perform the method described according to the first or second aspect of the present disclosure.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described in the Detailed Description. This Summary is not intended to identify key features or essential features of the present disclosure, nor is it intended to be used to limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
The above and other features, advantages and aspects of the present disclosure will become more apparent through the detailed description below with reference to the accompanying drawings. Throughout the drawings, same or similar reference numerals represent same or similar elements, wherein:
Hereinafter, embodiments of the present disclosure will be described in more detail with reference to the accompanying drawings. Although the drawings illustrate some embodiments of the present disclosure, it is to be understood that the present disclosure can be implemented in various ways and should not be construed as being limited to the embodiments set forth herein. On the contrary, these embodiments are provided to enable a more thorough and complete understanding of the present disclosure. It is to be appreciated that the drawings and embodiments of the present disclosure are only used for exemplary purposes, and are not intended to limit the protection scope of the present disclosure.
As used herein, the term “comprises” and its equivalents are to be read as open terms that mean “comprises, but is not limited to.” The term “based on” is to be read as “based at least in part on.” The terms “one embodiment” and “the embodiment” are to be read as “at least one example embodiment.” The terms “first,” “second,” and the like may refer to different or the same objects. Other definitions, either explicit or implicit, may be included below.
Various methods and processes described in the embodiments of the present disclosure may also be applied to various kinds of electronic devices, e.g., terminal devices, network devices, etc. The embodiments of the present disclosure may also be executed in a test device, such as a signal generator, a signal analyzer, a spectrum analyzer, a network analyzer, a test terminal device, a test network device, and a channel simulator, etc.
The term “circuitry” used herein may refer to hardware circuits and/or combinations of hardware circuits and software. For example, the circuitry may be a combination of analog and/or digital hardware circuits with software/firmware. As an alternative example, the circuitry may be any portions of hardware processors with software including digital signal processor(s), software, and memory(ies) that work together to cause an apparatus, such as a computing device and the like, to perform various functions. In a still further example, the circuitry may be hardware circuits or processors, such as a microprocessor or a portion of a microprocessor, that requires software/firmware for operation, but the software may not be present when it is not needed for operation. As used herein, the term “circuitry” also covers implementation of merely a hardware circuit or processor(s), or a fraction of a hardware circuit or processor(s) in conjunction with the software and/or firmware affixed thereto.
Anomaly detection, also known as outlier detection, novelty detection, out-of-distribution detection, noise detection, deviation detection, or exception detection, is an important technical branch of machine learning and is widely used in various applications involving artificial intelligence (AI), such as computer vision, data mining, and natural language processing. Anomaly detection may be understood as a technique for identifying anomalous situations and mining illogical data, which aims to detect exceptional data instances that significantly deviate from the distribution of normal data.
Anomaly detection has been widely used in many fields such as medical diagnosis, fraud detection, and structural defect detection. For example, a doctor may be assisted in diagnosis and treatment by detecting whether medical images are anomalous data. As another example, telecommunications fraud may be identified by detecting whether the data corresponding to a bank card transaction is anomalous data, and a driver's violation of traffic rules may be determined by detecting anomalous data in traffic surveillance video. Algorithms for anomaly detection generally include supervised anomaly detection methods and unsupervised anomaly detection methods.
Supervised anomaly detection is mostly formulated as an imbalanced classification problem addressed through different classification approaches and sampling strategies. The training set on which a supervised anomaly detection method is based comprises labeled data items. However, considering the lack of labels or the pollution of some data items, semi-supervised anomaly detection methods have also been proposed to handle anomaly detection with little labeled data or with polluted data. For example, Deep Semi-supervised Anomaly Detection (Deep-SAD) proposes a two-stage training with an information-theoretic framework.
Due to the scarcity and variety of anomalous data, unsupervised anomaly detection methods have gradually become dominant in anomaly detection. For example, the algorithm of Unsupervised Anomaly Detection with Generative Adversarial Networks (AnoGAN) uses a Generative Adversarial Network (GAN) for anomaly detection. The algorithm uses the GAN to learn the distribution of normal data and attempts to reconstruct the most similar images by iteratively optimizing a latent noise vector.
However, current anomaly detection methods have the problem of a low recall rate: they detect a lot of normal data as anomalous, and detect some real but complex anomalous data as normal.
In view of this, embodiments of the present disclosure provide a data processing solution to solve one or more of the above-mentioned problems and/or other potential problems. In this solution, normal data and/or anomalous data may be used to obtain a trained anomaly detection model through reconstruction-based training. The model can generate contextual adversarial data based on normal data, and can perform supervised learning based on anomalous data. Thus, the model can be used for anomaly detection with a high recall rate.
As illustrated in FIG. 1, the computing device 110 may be configured to acquire data to be detected 120 and to output a detection result 140. A determination of the detection result 140 can be implemented by a trained anomaly detection model 130.
The data to be detected 120 may be input by a user, or may be acquired from a storage device, which is not limited in the present disclosure.
The data to be detected 120 may be determined based on actual needs, and the data to be detected 120 may be of various types, which is not limited in the present disclosure. Exemplarily, the data to be detected 120 may belong to any of the following categories: audio data, electrocardiogram (ECG) data, electroencephalogram (EEG) data, image data, video data, point cloud data, or volume (volumetric) data. Optionally, the volume data may be, for example, Computed Tomography (CT) data or Optical Coherence Tomography (OCT) data.
As another understanding, the data to be detected 120 may be one-dimensional data, such as audio or bioelectric signals (e.g., ECG data or EEG data). The data to be detected 120 may be two-dimensional data, such as an image, or 2.5-dimensional data, such as video. The data to be detected 120 may also be three-dimensional data, such as video or volume data (e.g., CT and OCT data), and the like. It may be understood that the description of the type of the data to be detected 120 in the present disclosure is only for illustration; in actual scenarios, the data to be detected 120 may also be of any other type, which is not limited in the present disclosure.
The detection result 140 may represent an attribute of the data to be detected 120, specifically may indicate whether the data to be detected 120 is anomalous data.
In some examples, the embodiments of the present disclosure may be applied to various fields. For example, the embodiments of the present disclosure may be applied in the medical field, and the data to be detected 120 may be ECG data, EEG data, CT data, OCT data, etc. It should be understood that the scenarios listed here are for illustrative purposes only and are not intended to limit the scope of the present disclosure in any way. Embodiments of the present disclosure may be applied to various fields with similar problems, which will not be listed herein. In addition, “detection” in the embodiments of the present disclosure may also be referred to as “recognition”, etc., which is not limited in the present disclosure.
In some embodiments, the anomaly detection model 130 may be trained before implementing the process described above. It should be understood that the anomaly detection model 130 may be trained by the computing device 110 or by any other suitable device external to the computing device 110. The trained anomaly detection model 130 may be deployed in the computing device 110 or may be deployed external to the computing device 110. An example training process will be described below with reference to FIG. 2.
At block 210, a normal data item in a training set is input into a generative sub-model of an anomaly detection model to obtain a reconstructed data item.
At block 220, the reconstructed data item is input into the generative sub-model to obtain a first output data item.
At block 230, an anomaly detection model is trained based on a difference between the reconstructed data item and the normal data item and a difference between the first output data item and the reconstructed data item.
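For illustration only, the following is a minimal Python sketch of how blocks 210 to 230 could be realized with PyTorch; the generator G, the weighting coefficients, and all helper names are assumptions introduced here, not part of the present disclosure.

```python
import torch

def training_step(G, x_n, optimizer, w_con=1.0, w_adcon=1.0):
    """Hedged sketch of blocks 210-230 for a batch of normal data items x_n."""
    x_rec = G(x_n)    # block 210: reconstruct the normal data item
    x_out = G(x_rec)  # block 220: reconstruct the reconstructed data item again
    # block 230: the difference between x_rec and x_n should shrink, while the
    # difference between x_out and x_rec should grow (opposite objectives),
    # hence the minus sign on the second term.
    loss = w_con * torch.abs(x_n - x_rec).mean() \
           - w_adcon * torch.abs(x_rec - x_out).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```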
It may be understood that before block 210 shown in FIG. 2 is performed, a training set for the anomaly detection model may be acquired.
As an example, the training set may be denoted as X, and any data item in the training set may be denoted as x, then X = {x: x~px}. Optionally, in some examples, each data item in the training set is a normal data item. Optionally, in some examples, each data item in the training set is an anomalous data item. Optionally, in some examples, a part of the data items in the training set are normal data items, and a remaining part of the data items are anomalous data items. It should be appreciated that the term “data item” in the embodiments of the present disclosure may be replaced with “data” in some scenarios.
In embodiments of the present disclosure, a set in which the data items are all normal data items may be represented as a normal training set Xn = {xn: xn~pxn}, and a set in which the data items are all anomalous data items may be represented as an anomalous training set Xa = {xa: xa~pxa}. That is, the training set at block 210 may be denoted as X and include Xn and/or Xa.
Optionally, in some examples, Xn comprises a plurality of (e.g., N1) normal data items, Xa comprises a plurality of (e.g., N2) anomalous data items, N1 and N2 are positive integers, and generally N1 is much larger than N2, e.g., N1 is more than ten thousand times N2. It should be appreciated that the depiction of N1 and N2 here is only illustrative, for example, in some scenarios, N1 is smaller than N2. This is not limited in the present disclosure.
It may be appreciated that the embodiments of the present disclosure do not limit the types of data items in the training set. For example, training may be performed respectively for different types of training sets to obtain anomaly detection models that can be applied to different types of data.
As an example, the data items in the training set may be ECG data. Then the anomaly detection model obtained through the training set may be used to detect whether the input into the model is normal ECG data.
Exemplarily, in the embodiments of the present disclosure, a first loss function may be constructed based on a difference between the reconstructed data item and the normal data item and a difference between the first output data item and the reconstructed data item, where the first loss function comprises a first sub-function and a second sub-function, the first sub-function is obtained based on the difference between the reconstructed data item and the normal data item, and the second sub-function is obtained based on the difference between the first output data item and the reconstructed data item; and the anomaly detection model is trained based on the first loss function, where training objectives of the first sub-function and the second sub-function are opposite.
In some embodiments of the present disclosure, the anomaly detection model may comprise a generative sub-model and a discriminative sub-model, where the generative sub-model may be used to reconstruct based on the input data, and the discriminative sub-model may be used to determine whether the data reconstructed by the generative sub-model is true. That is, the discriminative sub-model may be used to determine whether the reconstructed data item obtained by the generative sub-model at block 210 is true or false.
Optionally, the generative sub-model may also be referred to as a generator, denoted as G for example. Optionally, the discriminative sub-model may also be referred to as a discriminator, denoted as D for example.
Exemplarily, the first sub-function may be obtained based on the difference between the reconstructed data item and the normal data item, for example, the first sub-function is represented as Fdist(Xn, G(Xn)); and the second sub-function may be obtained based on the difference between the first output data item and the reconstructed data item, for example, the second sub-function is represented as Fdist(G(Xn), G(G(Xn))).
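As a non-limiting illustration, Fdist could be instantiated as a mean L1 distance; the function name and the choice of reduction below are assumptions.

```python
import torch

def f_dist(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # One possible instantiation of Fdist: the mean absolute (L1) distance.
    return torch.abs(a - b).mean()

# First sub-function (to be minimized):  f_dist(x_n, G(x_n))
# Second sub-function (to be maximized): f_dist(G(x_n), G(G(x_n)))
```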
Moreover, during the training process, the training objectives of the first sub-function and the second sub-function are opposite: the first sub-function may be expected to be a minimum (min), while the second sub-function may be expected to be a maximum (max), and the parameters of the model may be obtained by learning on the basis of these training objectives, as denoted by the following Equations (1) and (2):

min Fdist(Xn, G(Xn)) Λ max Fdist(G(Xn), G(G(Xn)))    (1)

min_θG max_θD E_{x~px}[log D(x) + log(1 - D(G(x)))]    (2)

In Equations (1) and (2), Λ represents “and”, log represents a natural logarithm, θG represents the parameters of the generative sub-model G in the model, and θD represents the parameters of the discriminative sub-model D in the model. Furthermore, it may be understood that in Equation (2), min_θG max_θD means training the generative sub-model G and the discriminative sub-model D in an adversarial manner; for example, the generative sub-model G may be fixed to train the discriminative sub-model D, and the discriminative sub-model D may be fixed to train the generative sub-model G.
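A minimal sketch of this alternation follows, assuming a discriminator D that outputs a probability in [0, 1]; the optimizers and label handling are illustrative assumptions.

```python
import torch

bce = torch.nn.BCELoss()

def adversarial_step(G, D, x_n, opt_g, opt_d):
    real = torch.ones(x_n.size(0), 1)
    fake = torch.zeros(x_n.size(0), 1)

    # Fix G, train D: detach the reconstruction so gradients stop at G.
    d_loss = bce(D(x_n), real) + bce(D(G(x_n).detach()), fake)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Fix D, train G: G tries to make its reconstruction look real to D.
    g_loss = bce(D(G(x_n)), real)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```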
As shown in FIG. 3, in the process of training based on normal data items, the normal data item 310 is input into the generative sub-model to obtain the reconstructed data item 320, and the reconstructed data item 320 is input into the generative sub-model again to obtain the first output data item 330.

Optionally, the generative sub-model may comprise an encoder and a decoder. As shown in FIG. 3, the encoder may encode the input data item into a latent representation, and the decoder may generate a data item from the latent representation.
The anomaly detection model may be trained based on the training set of normal data items and based on the first loss function, for example, the first loss function may be represented as Ln. Exemplarily, a contextual loss function represented as Lcon may be determined based on the first sub-function, so as to make the reconstructed data item closer to the input normal data item, that is, try not to lose contextual information of the normal data item. Exemplarily, a contextual adversarial loss function represented as Ladcon may be determined based on the second sub-function.
In some embodiments, in order to ensure that the reconstructed data item generated by the generative sub-model G is true, an adversarial loss function may also be determined, denoted as Ladv, thereby increasing robustness. Additionally, a latent loss function, denoted as Llat, may also be determined to ensure the faithful reconstruction of latent representations.
Exemplarily, the first loss function Ln may be represented as a weighted sum of the contextual loss function Lcon, the contextual adversarial loss function Ladcon, the adversarial loss function Ladv, and the latent loss function Llat, as shown in the following Equations (3) to (7):

Lcon = E_{x~px} ||x - G(x)||_1    (3)

Ladv = E_{x~px} ||f(x) - f(G(x))||_2    (4)

Llat = E_{x~px} ||z - ẑ||_2    (5)

Ladcon = -E_{x~px} ||G(x) - G(G(x))||_1    (6)

Ln = wcon·Lcon + wadcon·Ladcon + wadv·Ladv + wlat·Llat    (7)

In Equations (3) to (7), f(·) may represent features extracted by the discriminative sub-model D, z and ẑ may represent the latent representations of the input data item and the reconstructed data item, respectively, and wcon, wadcon, wadv, and wlat are weighting coefficients.

It may be understood that in the process of training based on normal data items, x~px in Equations (3) to (6) is xn~pxn.

It may be appreciated that in Equation (6), the training objective is represented by a minus sign (i.e., “-”): referring to FIG. 3, minimizing Ladcon corresponds to maximizing the difference between the first output data item 330 and the reconstructed data item 320.
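By way of a non-authoritative illustration, the first loss Ln could be computed as follows; D.features and encode are hypothetical helpers (assumed here) standing in for a discriminator feature extractor and the generator's encoder, and the exact forms of the adversarial and latent terms are likewise assumptions.

```python
import torch
import torch.nn.functional as F

def loss_normal(G, D, encode, x, w_con=1.0, w_adv=1.0, w_lat=1.0, w_adcon=1.0):
    """Hedged sketch of Equations (3) to (7) for a batch of normal items x."""
    x_rec = G(x)
    x_out = G(x_rec)
    l_con = torch.abs(x - x_rec).mean()                    # Eq. (3)
    l_adv = F.mse_loss(D.features(x), D.features(x_rec))   # Eq. (4), assumed form
    l_lat = F.mse_loss(encode(x), encode(x_rec))           # Eq. (5), assumed form
    l_adcon = -torch.abs(x_rec - x_out).mean()             # Eq. (6)
    return (w_con * l_con + w_adv * l_adv
            + w_lat * l_lat + w_adcon * l_adcon)           # Eq. (7)
```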
In other words, in the training process of the present disclosure, it is expected that the difference between the reconstructed data item 320 and the normal data item 310 is as small as possible, whereas the difference between the first output data item 330 and the reconstructed data item 320 is as large as possible.
For example, the difference between the reconstructed data item 320 and the normal data item 310 may be smaller than a first threshold, and the difference between the first output data item 330 and the reconstructed data item 320 may be greater than a second threshold. Exemplarily, the difference may be represented as a distance; for example, when the data items are images, the difference may be a Euclidean distance between the two images. Optionally, the second threshold is greater than the first threshold; for example, the second threshold may be a predetermined multiple of the first threshold, such as 10 times, 100 times, or another value.
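The thresholds can be checked directly on the measured distances; the following sketch uses the Euclidean distance mentioned above, with t1 and t2 as assumed names for the first and second thresholds.

```python
import torch

def differences_acceptable(x_n, x_rec, x_out, t1, t2):
    d_rec = torch.dist(x_n, x_rec, p=2)    # should stay below the first threshold
    d_out = torch.dist(x_rec, x_out, p=2)  # should exceed the second threshold (e.g., t2 = 10 * t1)
    return bool(d_rec < t1) and bool(d_out > t2)
```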
In this way, referring to the process described with reference to FIG. 3, the training of the anomaly detection model may be implemented based on the normal data items, where contextual adversarial data may be generated based on the normal data items.
Optionally, in some embodiments of the present disclosure, the anomaly detection model may be further trained based on the set (Xa) of anomalous data items. Specifically, an anomalous data item in the training set may be input into the generative sub-model to obtain a second output data item, and the anomaly detection model is trained based on a second loss function, where the second loss function comprises a third sub-function, the training objective of the third sub-function is consistent with the training objective of the second sub-function, and the third sub-function is obtained based on a difference between the second output data item and the anomalous data item.
Exemplarily, the third sub-function may be obtained based on the difference between the second output data item and the anomalous data item, for example, the third sub-function is denoted as Fdist(Xa, G(Xa)). Based on the above description about the second sub-function, in the training process, the third sub-function is also expected to be a maximum (max), and the parameters of the model may be obtained by learning on the basis of this training objective, as shown in the following Equations (8) and (9):

max Fdist(Xa, G(Xa))    (8)

min_θG max_θD E_{x~pxa}[log D(x) + log(1 - D(G(x)))]    (9)
As shown in FIG. 4, in the process of training based on anomalous data items, the anomalous data item is input into the generative sub-model to obtain the second output data item.

Optionally, the generative sub-model may include an encoder and a decoder. As shown in FIG. 4, the encoder may encode the anomalous data item into a latent representation, and the decoder may generate the second output data item from the latent representation.
The anomaly detection model may be trained based on the training set of anomalous data items and based on the second loss function, for example, the second loss function may be represented as La. Exemplarily, a contextual adversarial loss function may be determined based on the third sub-function, and denoted as Ladcon, as shown in the above Equation (6).
Exemplarily, the second loss function La may be represented as a weighted sum of the contextual adversarial loss function Ladcon, the adversarial loss function Ladv, and the latent loss function Llat, as shown in the following Equation (10):

La = wadcon·Ladcon + wadv·Ladv + wlat·Llat    (10)
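Mirroring the earlier sketch of Ln, a hedged sketch of Equation (10) for anomalous items follows; D.features and encode are again hypothetical helpers, and the forms of the adversarial and latent terms remain assumptions.

```python
import torch
import torch.nn.functional as F

def loss_anomalous(G, D, encode, x_a, w_adcon=1.0, w_adv=1.0, w_lat=1.0):
    """Hedged sketch of Equation (10) for a batch of anomalous items x_a."""
    x_out = G(x_a)  # the second output data item
    l_adcon = -torch.abs(x_a - x_out).mean()                 # third sub-function, maximized
    l_adv = F.mse_loss(D.features(x_a), D.features(x_out))   # assumed form
    l_lat = F.mse_loss(encode(x_a), encode(x_out))           # assumed form
    return w_adcon * l_adcon + w_adv * l_adv + w_lat * l_lat  # Eq. (10)
```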
Furthermore, it may be appreciated that in the training process based on the anomalous data items, x~px in Equations (4) to (6) is xa~pxa.
In this way, the training of the anomaly detection model may be implemented in a supervised manner based on the anomalous data items, with reference to the process described with reference to FIG. 4.
It should be appreciated that the loss functions shown in the above Equations (3) to (7) and Equation (10) are only illustrative, and in practical applications, various modifications may be made to the expression of the loss function. For example, Equation (6) may be expressed as W(d(X)), where X represents the input data of the generative sub-model G, d(X) represents a distance between the output and the input of the generative sub-model G, and W satisfies that the larger d(X) is, the smaller W(d(X)) is. Optionally, the distance between the output and the input of the generative sub-model G may be expressed as the L1 distance in Equation (6), as a higher-order distance, or as a Structural Similarity Index Measure (SSIM). This is not limited in the present disclosure.
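To make the pluggable-distance remark concrete, here is a small sketch of one decreasing W and two interchangeable distances; the exponential form of W is an assumption, chosen only because it satisfies the stated monotonicity.

```python
import torch

def w_of_d(d: torch.Tensor) -> torch.Tensor:
    # One decreasing function: the larger d is, the smaller W(d) is.
    return torch.exp(-d)

def d_l1(x, gx):
    return torch.abs(x - gx).mean()  # the L1 distance used in Equation (6)

def d_l2(x, gx):
    return torch.dist(x, gx, p=2)    # a higher-order (Euclidean) alternative

# An SSIM-based distance (e.g., 1 - SSIM) could be substituted here as well.
```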
Exemplarily, the above first loss function for normal data items and the second loss function for anomalous data items may be collectively represented as a total loss function L, represented as the following Equation (11):

L = (1 - y)·Ln + y·La    (11)

In Equation (11), y is a coefficient, and y ∈ {0, 1}. It may be understood that if the input data item is a normal data item, then y = 0; otherwise, y = 1.
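Combining the two sketches above, Equation (11) reduces to a label-controlled selection; loss_normal and loss_anomalous refer to the hypothetical helpers sketched earlier.

```python
def total_loss(G, D, encode, x, y: int):
    # Eq. (11): y = 0 for a normal data item, y = 1 for an anomalous one,
    # which is equivalent to (1 - y) * Ln + y * La for y in {0, 1}.
    if y == 1:
        return loss_anomalous(G, D, encode, x)
    return loss_normal(G, D, encode, x)
```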
As an illustration, Table 1 below shows computer pseudocode for training the anomaly detection model 130.
In Table 1, the anomaly detection model is represented as fθ, and Algorithm 1 for training the anomaly detection model is referred to as adversarial training of Adversarial Generative Anomaly Detection (AGAD).
In order to train, a set of inputs (Require) may first be acquired, comprising: a training set S; a model fθ parameterized by θ; and a threshold δ used to reset the parameter θd, where the training set comprises a set Sn of normal data items and a set Sa of anomalous data items. Furthermore, it is assumed that the data items in the training set are all in an image format.
In the pseudocode of Table 1, rows 2 to 4 represent the input data items and the definitions of the data items at various stages. Rows 5 to 8 represent the processing of normal data items, rows 9 to 12 represent the processing of anomalous data items, and rows 12 to 13 represent the iteration of parameters. In this way, the supervised and semi-supervised anomaly detection solutions can be unified, so that fewer anomalous data items may be used to improve the performance of the anomaly detection model.
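Since Table 1 itself is not reproduced here, the following hedged Python sketch only mirrors the row structure described above (inputs, normal-item processing, anomalous-item processing, parameter iteration); every name is an assumption, the generator and discriminator are updated jointly for brevity (in practice they would be alternated as in the earlier adversarial sketch), and the reset of θd against the threshold δ is omitted because its exact criterion is not given in the available text.

```python
import torch

def train_agad(G, D, encode, loader_n, loader_a, epochs=10, lr=2e-4):
    """Hedged sketch of Algorithm 1 (AGAD adversarial training)."""
    opt = torch.optim.Adam(list(G.parameters()) + list(D.parameters()), lr=lr)
    for _ in range(epochs):
        # Inputs: draw batches of normal and anomalous items from Sn and Sa.
        for x_n, x_a in zip(loader_n, loader_a):
            loss = loss_normal(G, D, encode, x_n)            # normal items
            loss = loss + loss_anomalous(G, D, encode, x_a)  # anomalous items
            opt.zero_grad(); loss.backward(); opt.step()     # parameter iteration
    return G, D
```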
In this way, in the embodiments of the present disclosure, the anomaly detection model may be obtained based on the set of normal data items and/or the set of anomalous data items.
In this way, the trained anomaly detection model is obtained through the training in the embodiments of the present disclosure. Furthermore, during training, the reconstructed data items generated from the normal data items are reconstructed again, and pseudo-anomaly features of the reconstructed data items are taken into account, so that the re-reconstruction tends to fail. In this way, the training process can learn contextual adversarial information, so that the trained anomaly detection model has higher precision and a higher recall rate.
In this way, in the training process in the embodiments of the present disclosure, contextual adversarial information (e.g., Ladcon) is introduced to generate pseudo-anomaly data in an adversarial manner, thereby better learning discriminative features between normal data items and anomalous data items. An anomaly detection model with higher performance can be effectively obtained even when anomalous data items account for no more than 5% of the training data.
An example training process for the anomaly detection model 130 is described above with reference to FIGS. 2 to 4. An example process of performing detection using the trained anomaly detection model 130 will be described below with reference to FIG. 5.
At block 510, data to be detected is obtained.
At block 520, an attribute of the data to be detected is determined using the trained anomaly detection model, where the attribute indicates whether the data to be detected is anomalous data.
Optionally, the process shown in FIG. 5 may be performed by the computing device 110.
In the embodiments of the present disclosure, the data to be detected may be input by a user, or may be acquired from a storage device. The data to be detected may belong to any of the following categories: audio data, ECG data, EEG data, image data, video data, point cloud data, or volume data. Optionally, the volume data may be, for example, CT data or OCT data.
It may be appreciated that the trained anomaly detection model may be obtained by training through the training process as shown in FIG. 2, which is not repeated here.
Exemplarily, at block 520, a score value of the data to be detected may be obtained by using the trained anomaly detection model, and the attribute of the data to be detected is then determined based on the score value. Specifically, the score value may represent a difference between the data to be detected and data obtained by the anomaly detection model by reconstructing the data to be detected. If the score value is not higher than (that is, less than or equal to) a preset threshold, a first attribute of the data to be detected is determined, the first attribute indicating that the data to be detected is normal data. If the score value is higher than the preset threshold, a second attribute of the data to be detected is determined, the second attribute indicating that the data to be detected is anomalous data.
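A minimal sketch of block 520's score-and-threshold logic follows, assuming the score is the mean L1 difference between the input and its reconstruction; the helper names are illustrative.

```python
import torch

def detect(G, x, preset_threshold: float) -> str:
    """Return the attribute of the data to be detected x."""
    with torch.no_grad():
        x_rec = G(x)  # reconstruct the data to be detected
    score = torch.abs(x - x_rec).mean().item()
    # Not higher than the preset threshold -> first attribute (normal data);
    # higher than the preset threshold -> second attribute (anomalous data).
    return "normal" if score <= preset_threshold else "anomalous"
```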
The preset threshold may be preset based on at least one of the following factors: detection accuracy, data type, and the like.
Optionally, in some examples, the detection result may comprise the score value, thereby indirectly indicating the attribute of the data to be detected. In some examples, the detection result may comprise indication information indicating whether the data to be detected is anomalous data.
In addition, the solution provided by the embodiments of the present disclosure has significant advantages over existing anomaly detection models. For example, AnoGAN is compared with the solution provided by the embodiments of the present disclosure on the public dataset MNIST. Taking an Area Under the Curve (AUC) as a comparative measure, the average AUC obtained by AnoGAN is 93.7%, while the average AUC obtained by the solution provided by the embodiments of the present disclosure is 99.1%. It may be seen that the solution provided by the embodiments of the present disclosure can achieve a better result.
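For reference, an AUC of this kind can be computed from per-item score values with scikit-learn; the arrays below are illustrative placeholders, not the reported experiment.

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]                     # 1 marks an anomalous test item
y_score = [0.02, 0.10, 0.85, 0.40, 0.05, 0.95]  # score values from the model
print(roc_auc_score(y_true, y_score))           # AUC in [0, 1]; higher is better
```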
In some embodiments, a computing device comprises a circuit configured to perform the following operations: obtaining data to be detected; and determining an attribute of the data to be detected using a trained anomaly detection model, the attribute indicating whether the data to be detected is anomalous data, wherein the anomaly detection model is trained based on a difference between a reconstructed data item and a normal data item and a difference between a first output data item and the reconstructed data item, wherein during training, the normal data item is input into a generative sub-model of the anomaly detection model to obtain the reconstructed data item, and the reconstructed data item is input into the generative sub-model to obtain the first output data item.
In some embodiments, the anomaly detection model is trained based on a first loss function, wherein the first loss function is constructed based on the difference between the reconstructed data item and the normal data item and the difference between the first output data item and the reconstructed data item, wherein the first loss function comprises a first sub-function and a second sub-function, the first sub-function is obtained based on the difference between the reconstructed data item and the normal data item, and the second sub-function is obtained based on the difference between the first output data item and the reconstructed data item, wherein training objectives of the first sub-function and the second sub-function are opposite.
In some embodiments, the anomaly detection model is further trained based on a second loss function, wherein the second loss function comprises a third sub-function, a training objective of the third sub-function is consistent with a training objective of the second sub-function, the third sub-function is obtained based on a difference between a second output data item and an anomalous data item in the training set, and the second output data item is obtained by inputting the anomalous data item in the training set into the generative sub-model.
In some embodiments, the trained anomaly detection model further comprises a discriminative sub-model for determining whether the reconstructed data item is true or false.
In some embodiments, the computing device comprises a circuit configured to perform the following operations: determining a score value of the data to be detected by using the trained anomaly detection model, the score value representing a difference between data obtained by the anomaly detection model by reconstructing the data to be detected and the data to be detected; if the score value is not higher than a preset threshold, determining a first attribute of the data to be detected, the first attribute indicating that the data to be detected is normal data; if the score value is higher than the preset threshold, determining a second attribute of the data to be detected, the second attribute indicating that the data to be detected is anomalous data.
In some embodiments, the data to be detected belongs to any of the following categories: audio data, electrocardiogram data, electroencephalogram data, image data, video data, point cloud data, or volume data.
In some embodiments, the computing device comprises a circuit configured to perform the following operations: inputting a normal data item in the training set into a generative sub-model of the anomaly detection model to obtain a reconstructed data item; inputting the reconstructed data item into the generative sub-model to obtain a first output data item; and training the anomaly detection model based on a difference between the reconstructed data item and the normal data item and a difference between the first output data item and the reconstructed data item.
In some embodiments, the computing device comprises a circuit configured to perform the following operations: constructing a first loss function based on the difference between the reconstructed data item and the normal data item and the difference between the first output data item and the reconstructed data item, wherein the first loss function comprises a first sub-function and a second sub-function, the first sub-function is obtained based on the difference between the reconstructed data item and the normal data item, and the second sub-function is obtained based on the difference between the first output data item and the reconstructed data item; and training the anomaly detection model based on the first loss function, wherein training objectives of the first sub-function and the second sub-function are opposite.
In some embodiments, the difference between the reconstructed data item and the normal data item is smaller than a first threshold, and the difference between the first output data item and the reconstructed data item is larger than a second threshold.
In some embodiments, the computing device comprises a circuit configured to perform the following operations: inputting the anomalous data item in the training set into a generative sub-model to obtain a second output data item; and training an anomaly detection model based on a second loss function, wherein the second loss function comprises a third sub-function, a training objective of the third sub-function is consistent with a training objective of the second sub-function, and the third sub-function is obtained based on a difference between the second output data item and the anomalous data item.
In some embodiments, the anomaly detection model further comprises a discriminative sub-model for determining whether the reconstructed data item is true or false.
In some embodiments, the computing device comprises a circuit configured to perform the following operation: training the generative sub-model and the discriminative sub-model in an adversarial manner based on the first loss function.
Various components in the device 800 are connected to the I/O interface 805, including: an input unit 806 such as a keyboard, a mouse, and the like; an output unit 807 such as various types of displays and loudspeakers; a memory unit 808 such as a magnetic disk, an optical disk, and the like; and a communication unit 809 such as a network card, a modem, and a wireless communication transceiver. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the Internet and/or various types of telecommunications networks.
The processing unit 801 may be implemented by one or more processing circuits. The processing unit 801 may be configured to perform various processes and processing described above. For example, in some embodiments, the process described above may be implemented as a computer software program that is tangibly embodied on a machine readable medium, e.g., the memory unit 808. In some embodiments, part or all of the computer program may be loaded and/or mounted onto the device 800 via ROM 802 and/or communication unit 809. When the computer program is loaded to the RAM 803 and executed by the CPU 801, one or more steps of the process as described above may be executed.
The present disclosure may be implemented as a system, a method, and/or a computer program product. The computer program product may comprise a computer-readable storage medium on which computer-readable program instructions for executing various aspects of the present disclosure are loaded.
The computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium comprises the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions may execute entirely on the user’s computer, partly on the user’s computer, as a stand-alone software package, partly on the user’s computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user’s computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform various aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It is also to be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.