This U.S. patent application claims priority under 35 U.S.C. § 119 to: Indian Patent Application No. 202221011587, filed on Mar. 3, 2022. The entire contents of the aforementioned application are incorporated herein by reference.
The disclosure herein generally relates to congeniality-preserving Generative Adversarial Networks (cpGAN), and, more particularly, to congeniality-preserving Generative Adversarial Networks (cpGAN) for imputing low-dimensional multivariate industrial time-series data.
Complex industrial units are big-data digital behemoths; their sensors typically record plant operation data on the scale of several gigabytes. These sensory observations are often incomplete due to sensor failure, among various other reasons, and are therefore of reduced utility. The quality of the data is of high priority for deploying data-driven artificial intelligence (AI) algorithms for process control, optimization, and the like, as inaccurate decision-making impacts product quality and the safety of plants/industrial units. Imputation methods are in general classified as either discriminative or generative. Recent advances in missing data imputation through generative adversarial network (GAN) architectures suffer from inherent limitations in preserving the relationship among the input feature variables and the target variable, as well as the temporal relations between observations spanning across timeframes, because of which it is also challenging to reconcile missing data for any downstream tasks.
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one aspect, there is provided a processor implemented method for imputing low-dimensional incomplete multivariate industrial time-series data using a congeniality-preserving Generative Adversarial Network (cpGAN). The method comprises obtaining, via one or more hardware processors, an input training dataset, a cluster independent random noise and one or more associated cluster labels corresponding to the input training dataset; transforming, via the one or more hardware processors, the cluster independent random noise by using the one or more associated cluster labels corresponding to the input training dataset to obtain a cluster dependent random noise; generating, via the one or more hardware processors, an imputed synthetic noise based on (i) a mask variable, (ii) one or more feature embeddings of the training dataset, (iii) a flipped mask variable, and (iv) the obtained cluster dependent random noise; generating, via the one or more hardware processors, one or more imputed high-dimensional feature embeddings using the generated imputed synthetic noise; predicting, via the one or more hardware processors, one or more imputed high-dimensional target feature embeddings of the training dataset using the one or more imputed high-dimensional feature embeddings; generating, via the one or more hardware processors, one or more single-step ahead imputed high-dimensional feature embeddings using the one or more imputed high-dimensional feature embeddings; and generating, via the one or more hardware processors, imputed training data using the one or more single-step ahead imputed high-dimensional feature embeddings.
In an embodiment, the cluster dependent random noise is obtained from the cluster independent random noise that is sampled from a Gaussian distribution and the one or more associated cluster labels corresponding to the input training dataset.
In an embodiment, the flipped mask variable is obtained based on a difference between a pre-defined value and the mask variable.
In an embodiment, the method further comprises minimizing a difference between one or more target feature embeddings of the input training dataset and the one or more predicted imputed high-dimensional target feature embeddings.
In an embodiment, the method further comprises validating the imputed training data based on a comparison of the imputed training data and the input training dataset.
In an embodiment, the method further comprises classifying the one or more imputed high-dimensional feature embeddings into at least one class type.
In another aspect, there is provided a processor implemented system for imputing low-dimensional incomplete multivariate industrial time-series data using a congeniality-preserving Generative Adversarial Network (cpGAN). The system comprises a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to: obtain an input training dataset, a cluster independent random noise and one or more associated cluster labels corresponding to the input training dataset; transform the cluster independent random noise by using the one or more associated cluster labels corresponding to the input training dataset to obtain a cluster dependent random noise; generate an imputed synthetic noise based on (i) a mask variable, (ii) one or more feature embeddings of the training dataset, (iii) a flipped mask variable, and (iv) the obtained cluster dependent random noise; generate, by using a generator module comprised in the cpGAN, one or more imputed high-dimensional feature embeddings using the generated imputed synthetic noise; predict, by using a critic module comprised in the cpGAN, one or more imputed high-dimensional target feature embeddings of the training dataset using the one or more imputed high-dimensional feature embeddings; generate, by using a supervisor module comprised in the cpGAN, one or more single-step ahead imputed high-dimensional feature embeddings using the one or more imputed high-dimensional feature embeddings; and generate, by using a recovery module comprised in the cpGAN, imputed training data using the one or more single-step ahead imputed high-dimensional feature embeddings.
In an embodiment, the cluster dependent random noise is obtained from the cluster independent random noise that is sampled from a Gaussian distribution and the one or more associated cluster labels corresponding to the input training dataset.
In an embodiment, the flipped mask variable is obtained based on a difference between a pre-defined value and the mask variable.
In an embodiment, the one or more hardware processors are further configured by the instructions to minimize a difference between one or more target feature embeddings of the input training dataset and the one or more predicted imputed high-dimensional target feature embeddings.
In an embodiment, the one or more hardware processors are further configured by the instructions to validate, by using a discriminator comprised in the cpGAN, the imputed training data based on a comparison of the imputed training data and the input training dataset.
In an embodiment, the one or more hardware processors are further configured by the instructions to classify, by using a discriminator comprised in the cpGAN, the one or more imputed high-dimensional feature embeddings into at least one class type.
In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors cause imputing low-dimensional incomplete multivariate industrial time-series data using a congeniality-preserving Generative Adversarial Network (cpGAN) by: obtaining an input training dataset, a cluster independent random noise and one or more associated cluster labels corresponding to the input training dataset; transforming the cluster independent random noise by using the one or more associated cluster labels corresponding to the input training dataset to obtain a cluster dependent random noise; generating an imputed synthetic noise based on (i) a mask variable, (ii) one or more feature embeddings of the training dataset, (iii) a flipped mask variable, and (iv) the obtained cluster dependent random noise; generating one or more imputed high-dimensional feature embeddings using the generated imputed synthetic noise; predicting one or more imputed high-dimensional target feature embeddings of the training dataset using the one or more imputed high-dimensional feature embeddings; generating one or more single-step ahead imputed high-dimensional feature embeddings using the one or more imputed high-dimensional feature embeddings; and generating imputed training data using the one or more single-step ahead imputed high-dimensional feature embeddings.
In an embodiment, the cluster dependent random noise is obtained from the cluster independent random noise that is sampled from a Gaussian distribution and the one or more associated cluster labels corresponding to the input training dataset.
In an embodiment, the flipped mask variable is obtained based on a difference between a pre-defined value and the mask variable.
In an embodiment, the one or more instructions which when executed by the one or more hardware processors further cause minimizing a difference between one or more target feature embeddings of the input training dataset and the one or more predicted imputed high-dimensional target feature embeddings.
In an embodiment, the one or more instructions which when executed by the one or more hardware processors further cause validating the imputed training data based on a comparison of the imputed training data and the input training dataset.
In an embodiment, the one or more instructions which when executed by the one or more hardware processors further cause classifying the one or more imputed high-dimensional feature embeddings into at least one class type.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
As mentioned earlier, conventional imputation methods are in general classified as either discriminative or generative. Recent advances in missing data imputation through generative adversarial network (GAN) architectures suffer from inherent limitations in preserving the relationship among the input feature variables and the target variable, as well as the temporal relations between observations spanning across timeframes, because of which it is also challenging to reconcile missing data for any downstream tasks. To overcome these drawbacks, embodiments of the present disclosure provide a system and method that implement a Congeniality-Preserving Generative Adversarial Network (cpGAN) that enables reconciling missing data by preserving the temporal dependencies and the probability distributions of the original data and retaining its utility for any downstream tasks.
More specifically, the system and method of the present disclosure implement an implicit probabilistic model-based congeniality-preserving Generative Adversarial Network (cpGAN) for imputing low-dimensional incomplete multivariate industrial time-series data with an adversarially trained generator neural network. The cpGAN architecture is presented as an alternative paradigm of research for numerical modeling of continuum mechanics and transport phenomena based on imputation techniques. The cpGAN architecture as implemented by the system and method of the present disclosure is established on a two-player, non-cooperative, zero-sum adversarial game based on game theory and a minimax optimization approach. The system and method of the present disclosure also leverage artificial intelligence systems to demonstrate the random missing data imputation-utility efficacy tradeoff for downstream tasks on the open-source industrial benchmark datasets.
In other words, the congeniality-preserving Generative Adversarial Network (cpGAN) is an architecture comprising embedding, recovery, critic, supervisor, generator, and discriminator modules and is implemented by the present application for imputing low-dimensional incomplete multivariate industrial time-series data. The method described herein minimizes a rubric based on information theory for Machine Learning (ML) between the empirical probability distributions of the reconciled data and the non-linear original data, so as to preserve the temporal dependencies and retain the relationship between the input feature attributes and the target variable as well as the probability distributions of the original data.
Referring now to the drawings, and more particularly to
The I/O interface device(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface device(s) can include one or more ports for connecting a number of devices to one another or to another server.
The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random-access memory (SRAM) and dynamic random-access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, a database 108 is comprised in the memory 102, wherein the database 108 comprises information such as the training dataset, associated cluster independent random noise and cluster labels, associated cluster dependent random noise, mask variable, associated flipped mask variable, imputed synthetic noise, one or more imputed high-dimensional feature embeddings, one or more imputed target feature embeddings, one or more single-step ahead imputed feature embeddings, imputed training data, the Gaussian distribution of the cluster independent random noise corresponding to the input training dataset, validated imputed training data, and the like. The database 108 further comprises information related to error minimization between the imputed training data and the input training data, the class type associated with the one or more imputed high-dimensional feature embeddings, and the like. The memory 102 further comprises (or may further comprise) information pertaining to input(s)/output(s) of each step performed by the systems and methods of the present disclosure. In other words, input(s) fed at each step and output(s) generated at each step are comprised in the memory 102 and can be utilized in further processing and analysis.
The complex industrial units are big-data digital behemoths. The sensory observations are incomplete due to sensor failure and thus are of less utility. The quality of the data is of high priority for deploying data-driven Artificial Intelligence (AI) algorithms for process control, optimization, etc., as inaccurate decision-making impacts product quality and plant safety. Imputation methods are in general classified as either discriminative or generative. Recent advances in missing data imputation through GAN architectures have suffered from inherent limitations in preserving the relationship among the input feature variables and the target variable and the temporal relations between observations spanning across timeframes. To overcome these drawbacks, embodiments of the present disclosure implement a congeniality-preserving Generative Adversarial Network (cpGAN) framework to reconcile missing data by preserving the temporal dependencies and probability distributions of the original data and retaining its utility for downstream tasks. To put it briefly, the congeniality-preserving Generative Adversarial Network (cpGAN) is implemented for imputing low-dimensional incomplete multivariate industrial time-series data. The algorithmic approach minimizes a rubric based on information theory for Machine Learning between the empirical probability distributions of the reconciled data and the non-linear original data to preserve the temporal dependencies, retain the relationship between the input feature variables and the target variable, and retain the probability distributions of the original data. The architecture as depicted in the accompanying figure(s) is described in detail below.
Consider representing an $f(\in\mathbb{N})$-dimensional Euclidean real space (the totality of $f$-space) as $\mathbb{I}^{(f)}=\prod_{i=1}^{f}\mathbb{I}_i=\mathbb{I}_1\times\dots\times\mathbb{I}_f$. Assume that $I$ is a continuous stochastic real-valued variable modeled by a finite-dimensional space, $\mathbb{I}$. $D$ is termed as the realizations of $I$, and $D$ is referred to as the set of the $f$-tuples with the alphabet $\mathbb{I}$, of length $T_n\times 2N$, sampled from the domain $\mathbb{I}^{(f)}$. The Euclidean probability distribution of $D$ is described by $P(D)$. The dataset, $D\in\mathbb{I}^{(T_n\times 2N)\times f}$, is described by,

$$D=\{(D_1(t),\dots,D_f(t))\},\ \forall t\in\{1,2,\dots,T_n\times 2N\}\qquad(1)$$

$D(t)\in\mathbb{I}^{(f)}$ consists of the observations of the $f$ continuous feature variables at the $t$-th time point. $D(t)=\{(D_1(t),\dots,D_f(t))\}$ represents the data vector. $D_j(t)$ denotes the observed value of the $j$-th $(\in f)$ feature variable at the $t$-th time point. $M\in\{0,1\}^{(T_n\times 2N)\times f}$ denotes the random mask matrix and is described by,

$$M=\{(M_1(t),\dots,M_f(t))\},\ \forall t\in\{1,2,\dots,T_n\times 2N\}\qquad(2)$$

The distribution of the mask variable conditioned on the data vector is described by,

$$M(t)\sim P\big(M(t)\mid D(t)\big)\qquad(3)$$

$D_j(t)\mid M_j(t)=1$ describes the observed values of $D_j$ for the $j$-th feature variable at the $t$-th time point. $D_j(t)$ is observed if $M_j(t)=1$; otherwise ($M_j(t)=0$), $D_j(t)$ is absent in the recorded data. The observed data of $D$ is described by,

$$D_{obs}=\{(D_1(t)\odot M_1(t),\dots,D_f(t)\odot M_f(t))\},\ \forall t\in\{1,2,\dots,T_n\times 2N\}\qquad(4)$$

The unobserved data of $D$ is described by,

$$D_{mis}=\{(D_1(t)\odot(1-M_1(t)),\dots,D_f(t)\odot(1-M_f(t)))\},\ \forall t\in\{1,2,\dots,T_n\times 2N\}\qquad(5)$$

The incomplete dataset $D$ is rearranged as,

$$\tilde{D}_{n,1:T_n}\in\mathbb{I}^{T_n\times f},\ \forall n\in\{1,2,\dots,2N\}\qquad(6)$$

$|2N|$ denotes the dataset cardinality. It is to be considered that, for $n=1$, $\tilde{D}_{1,1:T_1}$ denotes the first rearranged data sequence. The missingness mask $M$ is also rearranged as,

$$M_{n,1:T_n}\in\{0,1\}^{T_n\times f},\ \forall n\in\{1,2,\dots,2N\}\qquad(7)$$
The traditional generative adversarial missing-data imputation networks consisted of generator and discriminator modules which are trained simultaneously in a competing minimax game to generate imputed samples having the same distribution as that of the fully observed data, $D$. Embodiments of the present disclosure implement a cpGAN algorithmic architecture to overcome the inherent limitations of the generative imputation networks and to preserve the characteristics of multidimensional fully observed time-series data, such as the joint distributions, the temporal dynamics, and the relationship between the independent variables and the dependent target variable in the imputed data, by operating on the rearranged data, $\tilde{D}_{n,1:T_n}$, $\forall n\in\{1,2,\dots,N\}$; the imputed data is denoted by $\hat{D}_{n,1:T_n}$, $\forall n\in\{1,2,\dots,N\}$. Given $\tilde{D}_{n,1:T_n}$, the imputation network of the cpGAN framework learns a density $\hat{P}(\hat{D}_{n,1:T_n})$ such that it minimizes the weighted sum of the Kullback-Leibler (KL) divergence and the Wasserstein distance ($\mathcal{W}$) of order 1 between the original data, $\tilde{D}_{n,1:T_n}$, and the imputed data, $\hat{D}_{n,1:T_n}$. The imputed data $\hat{D}_{n,1:T_n}$ must also preserve the temporal dynamics of the original data for the imputed data to be of substantial utility in downstream forecasting tasks. The objective of the missing data imputation generative neural network is also to preserve the relationship between the independent feature variables, $f_c\subset f$, and the target variable, $f_T\in f$, of the observed data, $\tilde{D}_{n,1:T_n}$. The unbiased imputed data, $\hat{D}_{n,1:T_n}$, thereby retains its utility for any downstream tasks.
Referring to the steps of the method of the present disclosure, in an embodiment, at step 302, the one or more hardware processors 104 obtain an input training dataset $\tilde{D}_{n,1:T_n}$, a cluster independent random noise $Z_{n,1:T_n}$, and one or more associated cluster labels corresponding to the input training dataset (e.g., low-dimensional multivariate time-series data as known in the art).

In an embodiment, at step 304 of the present disclosure, the one or more hardware processors 104 transform the cluster independent random noise $Z_{n,1:T_n}$ by using the one or more associated cluster labels corresponding to the input training dataset $\tilde{D}_{n,1:T_n}$ to obtain a cluster dependent random noise. It is assumed by the present disclosure and its system and method that $Z_{n,1:T_n}$ is sampled from a Gaussian distribution. The cluster labels are determined by an iterative distance-based algorithm that partitions the unlabeled dataset $\tilde{D}_{n,1:T_n}$ into $k$ predetermined, distinct, non-overlapping, non-homogeneous clusters. Label embedding vectors, $e_c\in\mathbb{R}^{f'},\ \forall c\in\{1,\dots,k\}$, are obtained from the learnable label embedding matrix, $W\in\mathbb{R}^{k\times f'}$, based on the labels. $f'$ is the characteristic dimension (a hyperparameter) of the embedding matrix $W$. The label embedding vectors $e_c$ corresponding to the labels are concatenated to obtain the label matrix $L_{n,1:T_n}$. The cluster labels corresponding to the fully observed dataset $\tilde{D}_{n,1:T_n}$ are obtained by performing vector quantization through the unsupervised learning technique. It is to be understood by a person having ordinary skill in the art that the matrix-matrix product of $Z_{n,1:T_n}$ and the label matrix $L_{n,1:T_n}$ yields the cluster dependent random noise.
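By way of a non-limiting illustration, the following Python sketch shows one plausible realization of the cluster dependent noise: k-means is used as an instance of the iterative distance-based partitioning algorithm, a learnable embedding matrix maps the labels to the label matrix, and a matrix product with the label matrix conditions the Gaussian noise. The shapes, the normalization by T, and the exact composition of the matrix-matrix product are illustrative assumptions rather than the disclosed implementation.

```python
import torch
from sklearn.cluster import KMeans

# Illustrative dimensions (assumptions): T time steps, f features,
# f_prime label-embedding width, k predetermined clusters.
T, f, f_prime, k = 24, 7, 16, 4

D = torch.randn(T, f)                     # unlabeled training sequence

# Iterative distance-based partitioning into k non-overlapping clusters
# (k-means as one instance of such an algorithm / vector quantization).
labels = KMeans(n_clusters=k, n_init=10).fit_predict(D.numpy())
labels = torch.as_tensor(labels, dtype=torch.long)      # (T,)

# Learnable label embedding matrix W in R^{k x f'}; the label matrix L
# concatenates the per-time-point label embedding vectors e_c.
W = torch.nn.Embedding(k, f_prime)
L = W(labels)                                           # (T, f_prime)

# Cluster independent Gaussian noise.
Z = torch.randn(T, f_prime)

# Cluster dependent noise via matrix products with the label matrix:
# time points sharing a cluster receive correlated (label-conditioned)
# noise. The (L L^T) Z composition is an assumption.
Z_c = (L @ L.T) @ Z / T                                 # (T, f_prime)
```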
Referring again to the steps of the method, in an embodiment, at step 306 of the present disclosure, the one or more hardware processors 104 generate an imputed synthetic noise based on (i) a mask variable $M_{n,1:T_n}$, (ii) one or more feature embeddings $H_{n,1:T_n}$ of the training dataset, (iii) a flipped mask variable $(1-M_{n,1:T_n})$, and (iv) the obtained cluster dependent random noise. In an embodiment, at step 308 of the present disclosure, the one or more hardware processors 104 generate one or more imputed high-dimensional feature embeddings $\hat{H}_{n,1:T_n}$ using the generated imputed synthetic noise. The above steps 306 and 308, which describe the generation of the imputed synthetic noise and of the one or more imputed high-dimensional feature embeddings $\hat{H}_{n,1:T_n}$, may be better understood by way of the following description. The generator module 202 comprised in the cpGAN takes as input the realizations of the imputed synthetic noise and outputs a high-dimensional latent variable, $H$, and the same is expressed by way of the equation below:

$$\hat{H}_{n,1:T_n}=G_{cpGAN}\Big(W_\theta\big(M_{n,1:T_n}\odot H_{n,1:T_n}+(\mathbf{1}-M_{n,1:T_n})\odot Z_{n,1:T_n}\big)\Big)$$

The generative imputation neural network function can also be viewed as a mapping $G_{cpGAN}:\mathbb{I}^{(T_n\times f)}\to\mathbb{H}$, where $\mathbb{H}$ denotes the latent embedding space. $W_\theta$ denotes the trainable parameter and is shared across the sequences, $n,\ \forall n\in\{1,2,\dots,N\}$. $\odot$ denotes the Hadamard product and $\mathbf{1}\in\mathbb{R}^{T_n\times f}$ denotes the all-ones matrix.
It is to be understood by a person having ordinary skill in the art that the generator architecture as described above shall not be construed as limiting the scope of the present disclosure.
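A minimal, non-limiting Python sketch of steps 306 and 308 follows, assuming per-sequence tensors of shape (T, f'): the imputed synthetic noise is formed by the linear transformation W_theta of the mask-weighted mixture of the feature embeddings and the cluster dependent noise, and the generator maps it to imputed high-dimensional feature embeddings. The single-layer LSTM generator and all widths are illustrative assumptions.

```python
import torch
import torch.nn as nn

T, f_prime = 24, 16                                # illustrative sizes

H   = torch.randn(T, f_prime)                      # feature embeddings of training data
Z_c = torch.randn(T, f_prime)                      # cluster dependent random noise
M   = (torch.rand(T, f_prime) > 0.2).float()       # mask variable (1 = observed)
M_flipped = 1.0 - M                                # flipped mask: pre-defined value 1 minus mask

# Linear transformation W_theta (shared across sequences) over the
# mask-weighted sum -> imputed synthetic noise.
W_theta = nn.Linear(f_prime, f_prime)
Z_tilde = W_theta(M * H + M_flipped * Z_c)

class Generator(nn.Module):
    """G_cpGAN: imputed synthetic noise -> imputed feature embeddings."""
    def __init__(self, width: int):
        super().__init__()
        self.rnn = nn.LSTM(width, width, batch_first=True)
        self.out = nn.Linear(width, width)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(z.unsqueeze(0))            # add a batch dimension
        return self.out(h).squeeze(0)

G = Generator(f_prime)
H_hat = G(Z_tilde)                                 # imputed high-dimensional embeddings
```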
In an embodiment, at step 310 of the present disclosure, the one or more hardware processors 104 predict one or more imputed high-dimensional target feature embeddings of the training dataset using the one or more imputed high-dimensional feature embeddings $\hat{H}_{n,1:T_n}$. In the present disclosure, the critic module comprised in the cpGAN takes as input the realizations of the imputed high-dimensional feature embeddings and outputs the predicted target feature embeddings. The variable subset selection includes the feature attributes from the set $\{1,\dots,f-1\}\subset f$ in $H_{n,1:T_n}$ as the independent feature embeddings for predicting the target feature embedding. The critic module preserves the relationship between the independent feature columns and the target variable in the real dataset during the adversarial training of $G_{cpGAN}$ to generate the relationship-preserving synthetic data, $\hat{D}_{n,1:T_n}$.
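The critic's variable-subset selection may be sketched, in a non-limiting manner, as follows: the first f-1 columns of the embeddings act as the independent feature embeddings and a regression head predicts the target feature embedding. The single linear head and the mean-squared-error form of the loss are illustrative assumptions.

```python
import torch
import torch.nn as nn

T, f = 24, 7
H_hat = torch.randn(T, f)              # imputed high-dimensional feature embeddings

critic = nn.Linear(f - 1, 1)           # F_cpGAN head (assumed architecture)

H_independent = H_hat[:, : f - 1]      # independent feature embeddings {1, ..., f-1}
H_target      = H_hat[:, f - 1 :]      # target feature embedding (f-th column)

H_target_pred = critic(H_independent)  # predicted imputed target feature embedding

# Supervised critic loss: difference between the target feature embedding
# and its prediction (mean-squared error assumed).
loss_F = torch.mean((H_target - H_target_pred) ** 2)
```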
In an embodiment, at step 312 of the present disclosure, the one or more hardware processors 104 generate one or more single-step ahead imputed high-dimensional feature embeddings $H^{*}_{n,1:T_n}$ using the one or more imputed high-dimensional feature embeddings. In an embodiment, at step 314 of the present disclosure, the one or more hardware processors 104 generate, via a recovery module 208, imputed training data $D^{*}_{n,1:T_n}$ using the one or more single-step ahead imputed high-dimensional feature embeddings. The supervised loss of the supervisor module is described by,

$$\mathcal{L}_S=\sum_{n=1}^{N}\sum_{t}\big\lVert H^{*}_{n,t}-S_{cpGAN}\big(H^{*}_{n,1:t-1}\big)\big\rVert_2\qquad(18)$$

The $G_{cpGAN}$, by operating in the closed loop, receives the ground truth $H_{n,1:T_n}$ from the embedding module comprised in the cpGAN. It minimizes $\mathcal{L}_S$ by forcing the $\hat{H}_{n,1:T_n}$ to follow the single-step ahead temporal dynamics of the original data.
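A non-limiting Python sketch of the supervisor loss of equation (18): an LSTM supervisor predicts each latent embedding from the embeddings up to the previous time step, and the 2-norm errors are accumulated. The recurrent architecture and widths are illustrative assumptions.

```python
import torch
import torch.nn as nn

T, f_prime = 24, 16
H = torch.randn(1, T, f_prime)         # latent feature embeddings (batch of one sequence)

class Supervisor(nn.Module):
    """S_cpGAN: single-step ahead prediction in the latent space."""
    def __init__(self, width: int):
        super().__init__()
        self.rnn = nn.LSTM(width, width, batch_first=True)
        self.out = nn.Linear(width, width)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        y, _ = self.rnn(h)
        return self.out(y)

S = Supervisor(f_prime)

# Predict H[t] from H[1:t-1], then accumulate the 2-norm error (eq. 18).
H_pred = S(H[:, :-1, :])                               # predictions for steps 2..T
loss_S = torch.norm(H[:, 1:, :] - H_pred, p=2, dim=-1).sum()
```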
The embedding module 210 comprised in the cpGAN takes as input the realizations of the training dataset $\tilde{D}_{n,1:T_n}$ and outputs the feature embeddings $H_{n,1:T_n}\in\mathbb{H}$, where $\mathbb{H}$ denotes the latent embedding space, $\prod_{j=1}^{f}\mathbb{H}_j$. The supervisor module 206 takes as input the temporal latent feature embeddings $H_{n,1:T_n}$ and generates the single-step ahead latent feature embeddings,

$$S_{cpGAN}:H_{n,1:T_n}\in\prod_t\prod_{j=1}^{f}\mathbb{H}_j\to H^{*}_{n,1:T_n}$$
The recovery module 208 (also referred to as the recovery function and interchangeably used herein) takes as input the high-dimensional latent embeddings, $H_{n,1:T_n}$ for the original variables or $\hat{H}_{n,1:T_n}$ for the synthetic imputed variables, and maps them back to the data space,

$$R_{cpGAN}:H^{*}_{n,1:T_n}\to D^{*}_{n,1:T_n}$$

The learnable parameters of the embedding and recovery modules are transformed by the joint training of the modules in the supervised-learning approach of reconstructing the input fully observed temporal data, $\tilde{D}_{n,1:T_n}$, by minimizing a supervised reconstruction loss between the input data and its reconstruction.
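As a non-limiting sketch, the joint supervised training of the embedding and recovery modules can be viewed as a sequence autoencoder: the text above specifies 3-layer unidirectional LSTM stacks with a feed-forward layer, while the widths, the optimizer, and the mean-squared form of the reconstruction loss here are illustrative assumptions.

```python
import torch
import torch.nn as nn

T, f, f_prime = 24, 7, 16

class SeqBlock(nn.Module):
    """3-layer unidirectional LSTM stack followed by a feed-forward layer."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.rnn = nn.LSTM(d_in, d_out, num_layers=3, batch_first=True)
        self.ff  = nn.Linear(d_out, d_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y, _ = self.rnn(x)
        return self.ff(y)

E = SeqBlock(f, f_prime)               # E_cpGAN: data -> latent embeddings
R = SeqBlock(f_prime, f)               # R_cpGAN: latent embeddings -> data

D_tilde = torch.randn(1, T, f)         # fully observed training sequence
optim = torch.optim.Adam(list(E.parameters()) + list(R.parameters()), lr=1e-3)

optim.zero_grad()
H = E(D_tilde)                         # high-dimensional feature embeddings
D_recon = R(H)                         # reconstructed training data
loss_recon = torch.mean((D_tilde - D_recon) ** 2)   # supervised reconstruction loss
loss_recon.backward()
optim.step()
```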
In the joint training of the generator module 202, the supervisor module 206, and the recovery module 208 in an unsupervised learning approach, the first-moment difference, $|\bar{D}_1-\bar{D}_2|$, and the second-order moment difference, $\big|\sqrt{\hat{\sigma}_1^{2}}-\sqrt{\hat{\sigma}_2^{2}}\big|$, defined between the original data $\tilde{D}_{n,1:T_n}$ and the imputed data $\hat{D}_{n,1:T_n}$, are utilized as additional loss terms. The sample means for the fully observed data, $\bar{D}_1$, and the imputed data, $\bar{D}_2\in\mathbb{I}^{(f)}$, are evaluated by,

$$\bar{D}=\frac{1}{T_n}\sum_{t=1}^{T_n}D(t)$$

The sample variances, $\hat{\sigma}_1^{2},\hat{\sigma}_2^{2}\in\mathbb{I}^{(f)}$, are evaluated by,

$$\hat{\sigma}^{2}=\frac{1}{T_n-1}\sum_{t=1}^{T_n}\big(D(t)-\bar{D}\big)^{2}$$

The goal of the generator module 202, the supervisor module 206, and the recovery module 208 is to minimize the first- and second-order moment differences between the fully observed input data $\tilde{D}_{n,1:T_n}$ and the imputed temporal data $\hat{D}_{n,1:T_n}$. Each of the embedding and recovery modules is realized by leveraging a sequential operation on a 3-layer stack of neural-network architectures comprising a uni-directional Long Short-Term Memory (LSTM) and a feed-forward neural network layer.
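The moment-matching terms admit a direct, non-limiting sketch with unbiased sample statistics; the batch shapes are illustrative assumptions.

```python
import torch

T, f = 24, 7
D_real = torch.randn(T, f)             # fully observed input data
D_imp  = torch.randn(T, f)             # imputed temporal data

mean1, mean2 = D_real.mean(dim=0), D_imp.mean(dim=0)   # sample means
var1 = D_real.var(dim=0, unbiased=True)                # sample variances
var2 = D_imp.var(dim=0, unbiased=True)

# First-moment difference |mean1 - mean2| and second-order moment
# difference |sqrt(var1) - sqrt(var2)|, summed over the f feature columns.
loss_moments = torch.abs(mean1 - mean2).sum() \
             + torch.abs(torch.sqrt(var1) - torch.sqrt(var2)).sum()
```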
The one or more hardware processors 104 further validate the imputed training data based on a comparison of the imputed training data and the input training dataset. The validation is described above as performed by the recovery module 208. The validation outcome, which is referred to as a validation dataset, is utilized for the hyper-parameter tuning of the cpGAN/system 100.
The one or more hardware processors 104 further classify the one or more imputed high-dimensional feature embeddings into at least one class type. In the present disclosure, the system and method invoke a discriminator (also referred to as a discriminator module 212 or a discriminator network and interchangeably used herein) comprised in the cpGAN for performing the classification. The above step of classification may be better understood by way of the following description. The objective of the discriminator network, $D_{cpGAN}$, in the cpGAN architecture is to distinguish the observed and imputed values in $\hat{H}_{n,1:T_n}$. $D_{cpGAN}$ takes as input the realizations of $\hat{H}_{n,1:T_n}$ and $H_{n,1:T_n}$ and outputs the corresponding predicted probabilities for the real, $\gamma_{n,1:T_n}$, or synthetic, $\hat{p}_{n,1:T_n}$, sequences. The binary adversarial cross-entropy loss for the classification of the sequence observations as real or fake is described by,

$$\mathcal{L}_U=-\sum_{n=1}^{N}\sum_{t}\big[\log\gamma_{n,t}+\log\big(1-\hat{p}_{n,t}\big)\big]$$

$\gamma_{n,1:T_n}$ is the predicted probability of the sequence being real and $(1-\hat{p}_{n,1:T_n})$ is the predicted probability of the sequence being fake. $D_{cpGAN}$ tries to minimize $\mathcal{L}_U$. The $G_{cpGAN}$ tries to minimize $-\mathcal{L}_U$, which helps to learn $\hat{P}(\hat{D}_{n,1:T_n})$ approximating the original data distribution. For the imputation task,
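A minimal, non-limiting sketch of the adversarial cross-entropy term, using logits and the standard binary cross-entropy so that the sign conventions above hold (D_cpGAN minimizes L_U, G_cpGAN minimizes -L_U); the shapes and the logit-based formulation are illustrative assumptions.

```python
import torch

T = 24
logits_real = torch.randn(T)           # discriminator logits on real embeddings
logits_fake = torch.randn(T)           # discriminator logits on imputed embeddings

bce = torch.nn.BCEWithLogitsLoss()

# L_U: negative log-likelihood of classifying real as real and fake as fake.
loss_U = bce(logits_real, torch.ones(T)) + bce(logits_fake, torch.zeros(T))

loss_G_adv = -loss_U                   # generator's adversarial objective (-L_U)
```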
the cross-entropy loss in binary classification for predicting the input random mask matrix is described by,

$$\mathcal{L}_M=-\sum_{n=1}^{N}\sum_{t}\sum_{j}\Big[M_{n,t,j}\log\hat{M}_{n,t,j}+\big(1-M_{n,t,j}\big)\log\big(1-\hat{M}_{n,t,j}\big)\Big]$$

where $\hat{M}_{n,t,j}$ denotes the predicted probability that the $(t,j)$-th entry of sequence $n$ is observed. The $D_{cpGAN}$ attempts to maximize the probability of accurately predicting $M_{n,1:T_n}$, whereas the $G_{cpGAN}$ attempts to minimize the probability of $D_{cpGAN}$ accurately predicting the mask, such that the imputed values are generated with unbiased estimates as given by $(1-M_{n,1:T_n})\odot\hat{D}_{n,1:T_n}$.
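The mask-prediction term may be sketched, in a non-limiting manner, as an element-wise binary cross-entropy between the discriminator's per-entry observed-probability logits and the true mask matrix; the logit formulation and shapes are illustrative assumptions.

```python
import torch

T, f = 24, 7
M = (torch.rand(T, f) > 0.2).float()   # true random mask matrix
M_logits = torch.randn(T, f)           # discriminator's per-entry mask logits

# L_M: cross-entropy for predicting the input random mask matrix.
loss_M = torch.nn.functional.binary_cross_entropy_with_logits(M_logits, M)
```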
A cluster-label prediction loss, $\mathcal{L}_P$, is further employed, where $k$ denotes the number of predetermined cluster labels. $\gamma^{m}_{n,t}$ is the predicted probability that a real observation at time point $t$ of a data sequence $n$ belongs to cluster label $m$, and $\hat{p}^{m}_{n,t}$ is the corresponding predicted probability for the imputed observation. The $D_{cpGAN}$ tries to minimize $\mathcal{L}_P$, whereas the $G_{cpGAN}$ tries to minimize the disagreement between the cluster labels predicted by the neural-network architecture in comparison with the ground truth corresponding to the real data, $\tilde{D}_{n,1:T_n}$, and the cluster labels predicted for the imputed temporal data, $\hat{D}_{n,1:T_n}$, such that $\mathcal{L}_P$ is also minimized. The Wasserstein loss, $\mathcal{W}$, is described by,

$$\mathcal{W}=\inf_{\pi\in\Pi(P,\hat{P})}\ \mathbb{E}_{(x,y)\sim\pi}\big[\lVert x-y\rVert\big]$$

where $\Pi(P,\hat{P})$ is the set of all possible joint probability distributions between $P(\tilde{D}_{n,1:T_n})$ and $\hat{P}(\hat{D}_{n,1:T_n})$. The $D_{cpGAN}$ tries to maximize $\mathcal{W}$, whereas the $G_{cpGAN}$ tries to minimize $\mathcal{W}$. The discriminator module is realized by leveraging a sequential operation on a stack of neural-network architectures comprising a unidirectional LSTM and a feed-forward neural network layer.
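In practice, the order-1 Wasserstein distance defined above is commonly estimated through its Kantorovich-Rubinstein dual, i.e., the difference of critic scores on real and imputed sequences (the standard WGAN estimator). The following non-limiting sketch uses that estimator as a stand-in; the scores are illustrative placeholders.

```python
import torch

d_real = torch.randn(24)               # discriminator scores on real sequences
d_fake = torch.randn(24)               # discriminator scores on imputed sequences

# Dual (critic-based) estimate of the order-1 Wasserstein distance.
W_est = d_real.mean() - d_fake.mean()

loss_D_w = -W_est                      # D_cpGAN maximizes W
loss_G_w =  W_est                      # G_cpGAN minimizes W
```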
Training of the cpGAN/system of the present disclosure is performed as follows. The embedding module ($E_{cpGAN}$) and the recovery module ($R_{cpGAN}$) were jointly trained on the task of reconstructing the fully observed temporal data, $\tilde{D}_{n,1:T_n}$. Initially, the supervisor module ($S_{cpGAN}$) was trained in the supervised-learning approach on the single-step ahead prediction task of the fully observed temporal latent variable, $H_{n,1:T_n}$, $\forall n\in\{1,2,\dots,N\}$, by operating in the latent space. In the beginning, the critic module ($F_{cpGAN}$) was trained on the original fully observed data to minimize the target variable prediction loss, $\mathcal{L}_F$; here, the objective is to minimize $\min_{\Phi_F}\mathcal{L}_F$, where $\Phi_F$ denotes the critic parameters. Here, $\alpha\in\mathbb{R}^{+}$ is a hyper-parameter; in the experiments conducted by the present disclosure, $\alpha=100$ and $\gamma=10$. $D_{cpGAN}$ was trained by four distinct loss functions, $\mathcal{L}_U$, $\mathcal{W}$, $\mathcal{L}_M$, and $\mathcal{L}_P$, and was trained to minimize the weighted sum of these loss functions. $G_{cpGAN}$ and $D_{cpGAN}$ were trained adversarially with deceptive input, i.e., $\min_{G_{cpGAN}}\max_{D_{cpGAN}}$ over the combined adversarial objective. In conclusion, the cpGAN architecture was trained with both the supervised and the unsupervised losses. The performance of the cpGAN was evaluated on the held-out test data and is reported herein accordingly. The unbiased imputed data, $\hat{D}_{n,1:T_n}$, thereby retains its utility for the downstream tasks.
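One plausible, non-limiting composition of the discriminator objective as a weighted sum of its four loss terms is sketched below; the disclosure reports alpha = 100 and gamma = 10, but exactly how the weights attach to each term is an assumption here (the placeholder loss values are arbitrary).

```python
import torch

alpha, gamma = 100.0, 10.0             # hyper-parameters reported in the text

loss_U = torch.tensor(0.7)             # adversarial real/fake cross-entropy
loss_M = torch.tensor(0.4)             # mask-prediction cross-entropy
loss_P = torch.tensor(0.3)             # cluster-label prediction cross-entropy
W_est  = torch.tensor(0.2)             # Wasserstein estimate (D maximizes it)

# Assumed weighting: gamma scales the Wasserstein term; alpha is reported
# for the generator-side supervised term and is unused here.
loss_D = loss_U + loss_M + loss_P - gamma * W_est
```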
The Electricity Transformer Temperature (ETT) datasets used during the experiments contained two years of electricity transformer usage data, as reported in "Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting". Each dataset (ETTm1 and ETTm2) contained 70,080 temporal observations (2 years × 365 days × 24 hours × 4 observations per hour, at 15-minute intervals). In addition, the hour-wise recorded datasets (ETTh1, ETTh2) were leveraged to evaluate the performance of the method of the present disclosure. The train/validation/test splits are 60/20/20%. Table 1 reports the Root Mean Square Error (RMSE) in imputing the ETT test dataset by the cpGAN architecture. The cumulative imputation error increases with an accretion of the random missing percentage for each column feature attribute on all the datasets. As reported in Table 2, an LSTM neural network is utilized as a standard benchmark model, trained on the original training data for the supervised-learning-driven downstream tasks of target variable prediction ("oil temperature" (° C.)) and one-step-ahead prediction, with distinct learnable parameters respectively. The evaluation of the prediction models is performed on the original test dataset and the imputed test datasets, and the performance is reported in terms of the RMSE metric. As observed in Table 2, the first column reports the prediction error on the original test dataset, and the subsequent columns report the prediction error on the imputed test datasets. The adversarial training of the generator imputation network (or the generator) for imputing missing data with random missing percentages of each column attribute in the range [2.5%, 20%] resulted in on-par performance of the prediction model on the imputed test datasets in comparison with the original test dataset. For the single-step ahead prediction, as shown in Table 3, the forecasting error rises as the missing percentage of feature attributes increases. In Table 4, the experimental results demonstrate the efficacy of the cpGAN algorithmic framework (or cpGAN as implemented by the present disclosure) in the downstream application task of target variable prognostics; it outperforms several strong baselines in the literature, as reflected in the lower prediction error. The results reported in Tables 1, 2, 3, and 4 on the respective tasks are obtained from the arithmetic mean of five computational experimental runs, and the deviation is at most 5% from the statistical mean values reported in Tables 2, 3, and 4. The system and method of the present disclosure leveraged an NVIDIA® T4 GPU for the training of the deep learning models built upon the PyTorch framework.
As described above, the cpGAN comprises embedding, critic, supervisor, generator, discriminator, and recovery neural network functions (also referred to as modules and interchangeably used herein) to tackle the temporal non-uniformity of incompletely observed multidimensional continuous-variable time-series data. The input to the cpGAN is the multidimensional continuous-variable time-series data, the random mask variable, and the cluster-independent random noise. The input data preprocessing involves feature scaling by transforming the scale of the continuous feature variables using the min-max normalization technique(s) known in the art, to obtain the preprocessed data, as sketched below. The preprocessed data is split into training, validation, and test datasets respectively. Training of the cpGAN consisted of two phases. In the first phase, the embedding, recovery, critic, and supervisor modules of the imputation network were trained by utilizing the training dataset. More specifically, the training dataset is fed as an input to the embedding module to obtain high-dimensional feature embeddings. The high-dimensional feature embeddings are fed as input to the recovery module to reconstruct the training dataset. The embedding and recovery modules are trained jointly in a supervised learning approach to reconstruct the input training dataset. The high-dimensional feature embeddings are fed as input to the supervisor module to obtain single-step ahead predictions of the feature embeddings. The supervisor module is trained in a supervised learning approach as a forecasting model to minimize the forecasting error predictions on the training dataset. The high-dimensional feature embeddings are fed as input to the critic module. The high-dimensional feature embeddings consist of independent feature embeddings and a dependent target feature embedding. The critic module operates on the independent feature embeddings to predict the target feature embedding. The critic module is trained in a supervised learning approach as a prediction model to minimize the target variable prediction error on the training dataset.
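A non-limiting sketch of the preprocessing described above: min-max scaling of the continuous feature columns followed by a chronological 60/20/20 train/validation/test split (the ratio reported in the experiments); the array shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(70080, 7))     # raw multivariate sensor series (T x f)

# Min-max normalization of each continuous feature column.
d_min, d_max = data.min(axis=0), data.max(axis=0)
data_scaled = (data - d_min) / (d_max - d_min + 1e-8)

# Chronological 60/20/20 split into train/validation/test.
t1, t2 = int(0.6 * len(data)), int(0.8 * len(data))
train, val, test = data_scaled[:t1], data_scaled[t1:t2], data_scaled[t2:]
```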
In the second phase, the generator, discriminator, recovery, critic, and supervisor modules of the imputation network/cpGAN are jointly trained by utilizing the training dataset, the cluster-independent random noise, and the mask variable. For instance, the cluster-dependent random noise is obtained by transforming the cluster-independent random noise using the cluster labels corresponding to the training dataset. A linear transformation is performed on the summation of the product of the mask variable with the feature embeddings of the training dataset and of the flipped mask variable with the cluster-dependent random noise, to obtain the imputed synthetic noise. The imputed synthetic noise is fed as input to the generator module to output the imputed high-dimensional feature embeddings. The generator is trained to fool the discriminator into classifying the imputed high-dimensional feature embeddings as real. The imputed high-dimensional feature embeddings are fed as input to the critic module to predict the imputed target feature embeddings. The critic module is trained to minimize the difference between the target feature embedding of the training dataset and the predicted imputed target feature embeddings in a supervised learning approach. The imputed feature embeddings are fed to the discriminator to assign a label as real or fake, wherein the discriminator tries to classify the imputed high-dimensional feature embeddings as fake. The imputed high-dimensional feature embeddings are fed as input to the supervisor module to generate the single-step ahead predictions of the imputed feature embeddings. The supervisor module is trained to minimize the difference between the single-step ahead predictions of the imputed feature embeddings and the feature embeddings of the training dataset in a supervised learning approach. The single-step ahead imputed feature embeddings are fed to the recovery module to obtain the imputed training data. The recovery module is trained in an unsupervised learning approach to minimize the difference between the imputed training data and the input training data. Though the experimental results are depicted for a specific use scenario (e.g., Electricity Transformer Temperature (ETT)) or application, such a use scenario or application shall not be construed as limiting the scope of the present disclosure. For instance, the cpGAN system 100 used for imputing low-dimensional multivariate industrial time-series data may be used in Digital Twins, in the simulation of industrial plants/machines/sensors (or sensor data), and in production units and/or manufacturing units.
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.