The present invention relates to an unsupervised disaggregation apparatus, an unsupervised disaggregation method, and a computer-readable medium.
Disaggregation technology is used to estimate a state (e.g., an operation state) of an individual electric appliance from an aggregate (composite) signal, such as the power consumption of a plurality of electric appliances (hereinafter termed "appliances"), acquired by non-intrusive load monitoring (NILM).
In NILM (Non-Intrusive Load Monitoring), the power consumption or the like of each appliance in, for example, a household or factory is estimated from a current waveform measured at a power distribution board. For example, using the current measured at one location, the power consumption of each electric appliance connected downstream of that location can be obtained without measuring the individual appliances.
A disaggregation system performs disaggregation of the aggregate signal into an individual signal of each appliance. For example, in the case of supervised disaggregation, pattern matching is performed against learned models of the waveform data of each appliance.
Unsupervised disaggregation of the electric current waveform of a single appliance separates it into individual units of the appliance. The units of an appliance may refer to internal parts of the appliance, which mainly consist of resistors, inductors, capacitors, thyristors, and like components. A working electric appliance contains combinations of all of these internal parts. Disaggregation of these combinational units of an electric appliance from a measured source current waveform with an unsupervised approach is disclosed in this application. The disaggregation can be applied to any electric facility, such as a home, building, or factory, though it is not limited thereto.
As an algorithm for disaggregation, Factorial Hidden Markov Model (FHMM), Combinatorial Optimization, Blind Source Separation and so forth may be utilized.
For example, NPTL 1 discloses an NILM technique using a Factorial Hidden Markov Model (FHMM). In FHMM-based (supervised) disaggregation, a state model structure with a fixed number of nodes (states) and a fixed number of edges is usually adopted. In a simple case of FHMM, one appliance corresponds to one factor, where each factor represents a state model structure.
A Latent Feature Model (LFM) is a direct generalization of a mixture model in which each observation is an additive combination of several latent features. In an LFM, each instance is generated not from a single latent class but from a combination of latent features, and each instance has an associated latent binary feature incidence vector (binary vector) indicating the presence or absence of each feature. Models used in unsupervised learning differ in how expressive their representations of the data are.
The simplest representation, used in mixture models, associates each object with a single latent class. This approach is suitable when objects can be partitioned into relatively homogeneous subsets, as in clustering methods. However, the properties of many objects are better captured by representing each object using multiple latent features. For example, each latent feature indicator can be a binary vector whose entries indicate the presence or absence of each internal-unit waveform, representing the data in a latent space.
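As a toy illustration (the unit names and the vector values below are hypothetical, not taken from the embodiments), a binary incidence vector over three internal units marks which unit waveforms are present in one observation:

```python
# Hypothetical internal units of an appliance and a latent binary
# incidence vector marking which unit waveforms are present.
units = ["resistive heater", "motor winding", "thyristor stage"]
z = [1, 0, 1]  # units 1 and 3 active, unit 2 absent

active = [u for u, bit in zip(units, z) if bit == 1]
print(active)  # ['resistive heater', 'thyristor stage']
```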
Unsupervised learning recovers a latent structure responsible for generating observed properties or attributes of a set of objects. In latent feature modeling, one or more attributes of each object can be represented by an unobserved vector of latent features.
Disaggregation of an aggregate waveform signal into individual waveform signals of individual internal units is a combinatorial problem or combinatorial optimization problem. Therefore, recovering latent features from the aggregate waveform signal is a computationally complex problem. The latent features estimated are converted back to recover an individual waveform signal of an individual internal unit.
When disaggregating the aggregate waveform into individual waveform signals of individual internal units based on the LFM, there are cases where the recovered waveform signal of an individual internal unit is inappropriate due to incorrect estimation of the latent features.
One cause of the incorrect estimation is that the model optimization falls into an incorrect local minimum, in which the electric current signal recovered from the latent features is out of phase with the measured signal.
Accordingly, it is an object of the present invention to provide an apparatus, a method, and a program recording medium, each making it possible to perform automatic unsupervised disaggregation of an aggregate waveform into appropriate individual waveform signal of internal units using LFM.
According to an aspect of the present invention, there is provided an unsupervised disaggregation apparatus comprising a processor and a memory, coupled to the processor, storing program instructions to be executed by the processor. The processor executes a process comprising:
creating an observation matrix X including N number of D-dimensional observation vectors, each composed of a measured aggregate waveform that is a sum of a plurality of individual waveform signals of a plurality of internal units;
estimating, by using a latent feature model approach, a binary matrix Z with N rows and K columns and a latent feature matrix W with K rows and D columns, from the observation matrix X with N rows and D columns, where N, D, and K are predetermined positive integers;
calculating a dot product of the latent feature matrix W and a D-dimensional vector x, the dot product of which with each row of the latent feature matrix W is assumed to give a positive value;
repeating, for i=1 to K, checking whether or not a result of the dot product for the i-th row of the latent feature matrix W with the D dimensional vector x is negative, and if the result of the dot product is negative, discarding the i-th row from the latent feature matrix W and discarding i-th column from the binary matrix Z;
checking whether or not there exists at least one discarded row in the latent feature matrix W, and as a result of the checking,
if there exists at least one discarded row in the latent feature matrix W,
using a new latent feature matrix Wnew, each row thereof being a row of the latent feature matrix W, the dot product of the row thereof with the D dimensional vector x being non-negative and not discarded, updating the latent feature matrix W, and using a new binary matrix Znew, each column thereof being a column of the binary matrix Z not discarded, updating the binary matrix Z; and
performing iteration from the estimation of the matrices Z and W from the observation matrix X using the updated matrices Z and W, until there is no discarded row in the latent feature matrix.
According to an aspect of the present invention, there is provided a computer-based disaggregation method comprising:
creating an observation matrix X including N number of D-dimensional observation vectors, each composed of a measured aggregate waveform that is a sum of a plurality of individual waveform signals of a plurality of internal units;
estimating, by using a latent feature model approach, a binary matrix Z with N rows and K columns and a latent feature matrix W with K rows and D columns, from the observation matrix X with N rows and D columns, where N, D, and K are predetermined positive integers;
calculating a dot product of the latent feature matrix W and a D dimensional vector x, a dot product of which with each row of the latent feature matrix W is assumed to give a positive value;
repeating, for i=1 to K, checking whether or not a result of the dot product for the i-th row of the latent feature matrix W with the D dimensional vector x is negative, and if the result of the dot product is negative, discarding the i-th row from the latent feature matrix W and discarding i-th column from the binary matrix Z;
checking whether or not there exists at least one discarded row in the latent feature matrix W, and as a result of the checking,
if there exists at least one discarded row in the latent feature matrix W,
using a new latent feature matrix Wnew, each row thereof being a row of the latent feature matrix W, the dot product of the row thereof with the D dimensional vector x being non-negative and not discarded, updating the latent feature matrix W, and using a new binary matrix Znew, each column thereof being a column of the binary matrix Z not discarded, updating the binary matrix Z; and
performing iteration from the estimation of the matrices Z and W from the observation matrix X using the updated matrices Z and W, until there is no discarded row in the latent feature matrix.
According to an aspect of the present invention, there is provided a (non-transitory) computer-readable recording medium storing therein a program causing a computer to execute processing comprising:
creating an observation matrix X including N number of D-dimensional observation vectors, each composed of a measured aggregate waveform that is a sum of a plurality of individual waveform signals of a plurality of internal units;
estimating, by using a latent feature model approach, a binary matrix Z with N rows and K columns and a latent feature matrix W with K rows and D columns, from the observation matrix X with N rows and D columns, where N, D, and K are predetermined positive integers;
calculating a dot product of the latent feature matrix W and a D dimensional vector x, a dot product of which with each row of the latent feature matrix W is assumed to give a positive value;
repeating, for i=1 to K, checking whether or not a result of the dot product for the i-th row of the latent feature matrix W with the D dimensional vector x is negative, and if the result of the dot product is negative, discarding the i-th row from the latent feature matrix W and discarding i-th column from the binary matrix Z;
checking whether or not there exists at least one discarded row in the latent feature matrix W, and as a result of the checking,
if there exists at least one discarded row in the latent feature matrix W,
using a new latent feature matrix Wnew, each row thereof being a row of the latent feature matrix W, the dot product of the row thereof with the D dimensional vector x being non-negative and not discarded, updating the latent feature matrix W, and using a new binary matrix Znew, each column thereof being a column of the binary matrix Z not discarded, updating the binary matrix Z; and
performing iteration from the estimation of the matrices Z and W from the observation matrix X using the updated matrices Z and W, until there is no discarded row in the latent feature matrix.
The recording medium may be a non-transitory computer-readable recording medium such as a semiconductor memory (Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable and Programmable Read Only Memory (EEPROM), flash memory, or the like), a Hard Disk Drive (HDD), a Solid State Drive (SSD), a Compact Disc, a Digital Versatile Disc, and so forth.
According to the present invention, it is made possible to perform disaggregation of an aggregate waveform into appropriate individual waveform signals of internal units of an electric appliance or any electric facilities using LFM.
The following describes example embodiments of the present invention.
The present invention provides an apparatus comprising an automatic optimization function for convergence of a latent feature model. That is, a latent feature model is employed to separate an observation matrix X (each row vector of which includes an observed waveform) into a latent feature matrix W (each row vector of which includes, as a latent feature, an estimated individual waveform of an internal unit) and a binary matrix Z (each row vector of which includes elements indicating the presence/absence of the corresponding latent features).
Each phase of an estimated waveform (row vector of the latent feature matrix W) is matched with the phase of an observed waveform (row vector of the observation matrix X). The matching of phase is important because the observed waveform phase is aligned with the phase of a voltage waveform, which generates a positive power value, while an out-of-phase estimated waveform generates a negative power value, which is an incorrect solution. This matching process contributes to convergence to a minimum disaggregation error. Simulation results (described later) show that the disclosed invention reduces the separation error (disaggregation error) and makes it possible to estimate the latent feature matrix W (e.g., individual waveforms of internal units) correctly, with accurate matching of phases.
Unsupervised learning recovers a latent structure responsible for generating observed properties or attributes of a set of objects. In latent feature modeling, one or more attributes of each object can be represented by an unobserved vector of latent features.
xi=zi·W+εi  (1)
where
xi∈RD: observation vector (D-dimensional i-th row vector of real number),
W∈RK×D: latent feature matrix, which is composed of K latent-feature row vectors of D dimensions,
zi∈{0,1}K: binary vector (K-dimensional i-th row vector, also termed as latent indicator),
εi∈RD: noise (D-dimensional i-th row vector).
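The generative model of equation (1) can be sketched with NumPy as follows. This is an illustrative simulation, not the claimed method: the sizes N, K, D and the internal-unit waveforms are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, D = 100, 3, 64          # hypothetical: observations, latent features, dimension
t = np.linspace(0, 2 * np.pi, D, endpoint=False)

# W: K latent features, each a D-dimensional waveform of one internal unit
W = np.stack([np.sin(t), 0.5 * np.sin(3 * t), 0.2 * np.cos(t)])

# Z: N x K binary matrix; row i indicates which internal units are active
Z = rng.integers(0, 2, size=(N, K))

# Equation (1): each observation x_i = z_i . W + noise
eps = 0.01 * rng.standard_normal((N, D))
X = Z @ W + eps
print(X.shape)  # (100, 64): the N x D observation matrix
```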
In the latent feature modeling, there are known approaches such as IBP (Indian Buffet Process) and matrix-factorization to compute latent feature matrix W that best approximates the observation XN×D.
It is assumed that the observations are generated from a distribution determined by latent feature values.
Whether the latent feature matrix W is estimated correctly or not can be judged by comparing phases of the observed waveform and the estimated waveform.
A mismatch in the phases of current waveforms may result in a negative value of the power (effective power). The effective power is calculated as the sum (integral) of the instantaneous power over, for example, one AC power supply cycle, where the instantaneous power is given by multiplying an instantaneous current value (an element of the current waveform) by the corresponding instantaneous voltage value (the corresponding element of the voltage waveform).
It is noted that an instantaneous power assumes a negative value when energy accumulated in a capacitor (condenser) in a load is returned to the power supply, or when energy is generated by a regenerative operation of the load, such as a motor. A positive instantaneous power corresponds to an operation in which energy is consumed in the load or accumulated in a capacitor, inductor (coil), or the like in the load. The effective power, however, should assume a non-negative value.
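For illustration, the effective power over one AC cycle can be computed as the mean of the instantaneous power; an in-phase current yields a positive value, while an inverted (out-of-phase) current yields a negative value. The waveforms below are hypothetical sinusoids, not measured signals:

```python
import numpy as np

D = 64                                   # samples per AC cycle (hypothetical)
t = np.linspace(0, 2 * np.pi, D, endpoint=False)
voltage = np.sin(t)                      # voltage waveform over one cycle
current = np.sin(t)                      # current in phase with the voltage
inverted = -current                      # inverted (out-of-phase) current

# Effective power: sum (integral) of instantaneous power over one cycle,
# here taken as the mean of the element-wise product
p_ok = np.mean(voltage * current)        # positive -> plausible latent feature
p_bad = np.mean(voltage * inverted)      # negative -> inverted waveform
print(p_ok > 0, p_bad < 0)               # True True
```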
The incorrect estimation of the matrix W can be found if a phase of a waveform estimated is different from a phase of the observed (measured) waveform. The incorrect estimation of the matrix W may include an inverted waveform, which is out-of-phase from the measured waveform.
The inverted waveform is incorrect because the estimated waveforms (row vectors of the latent feature matrix W) should have the same phase as that of the measured waveform (observation row vector of the observation matrix X).
Since the inverted waveform may generate a negative value of a power, the inverted waveform is not a suitable latent feature.
Some of the latent features may be inverted waveforms, all of them may be inverted waveforms, or none of them may be inverted waveforms.
Depending on the initial parameters, the solution might change due to the multiple-local-minima problem. The solution to the above described problem is to introduce a post-process step after a model estimation step (S201).
The post-process step includes an optimization loop that converges to non-inverted waveform(s) with a minimized-error solution.
The post-process step may include the following two steps:
Step 1. Check inverted waveform (S202); and
Step 2. Discard latent feature (S203).
Alternatively, the post-process step may include the following three steps:
Step 1. Check inverted waveform (S202);
Step 2. Discard latent feature (S203); and
Step 3. Residual fusion (S204).
A latent feature model estimation step (S201) estimates a latent feature matrix W and a binary matrix Z from the observation matrix X.
The check inverted waveform step (S202) checks if there exists any inverted waveform in the estimated matrix W. This step (S202) is located after the latent feature model estimation step (S201).
The check inverted waveform step (S202) is supplied with the estimated matrices W and Z and a vector x. The vector x is of length D; it may be a mean row vector of the N observation row vectors xi (i=1, . . . , N) of the observation matrix X, a voltage signal, a phase vector, or any vector such that a linear dot product of the matrix W with the vector x produces a positive value.
When the check inverted waveform step (S202) finds an inverted waveform in the estimated matrix W, the discard latent feature step (S203) discards the inverted waveform (row vector) from the latent feature matrix W, and updates the binary matrix Z.
In step S301, using the latent feature model, the latent feature matrix W and the binary matrix Z are estimated.
In step S302, the estimated matrices W and Z, and the vector x, denoted as Xmean, which is obtained as a mean vector of the N observation row vectors of the observation matrix X, are inputted:
Xmean=(1/N)Σi=1N xi  (2)
where xi is the i-th row vector of the matrix X.
The matrix W is of size K×D (K rows and D columns), the matrix Z is of size N×K (N rows and K columns) and the vector Xmean is of length D (D-dimensional row vector).
In step S303, a K-dimensional column vector P is obtained by the dot product of the matrix W and the vector Xmean,
P(K×1)=W(K×D)·XmeanT  (3)
where T is a transpose operator.
That is, the i-th element Pi of the column vector P is given as:
Pi=Σj=1D Wi,j·xj  (4)
where Wi,j (1≤i≤K, 1≤j≤D) is the (i, j) element of the K×D latent feature matrix W and xj is the j-th element of the D-dimensional row vector Xmean.
In step S304, a loop variable m is initialized to 1, and matrices Wnew (a new latent feature matrix) and Znew (a new binary matrix) are initialized to null.
In step S305, it is checked whether Pm (given by the equation (4) with an index i set to m, i.e., a value of the loop variable) is not less than zero (i.e., greater than or equal to zero).
If Pm is greater than or equal to zero (branch "Yes" of S305), the m-th column vector Zm of the binary matrix Z and the m-th row vector Wm of the latent feature matrix W are appended to the matrices Znew and Wnew, respectively (step S306). More specifically, the m-th column vector Zm is appended as a column next to the last column of the new binary matrix Znew; when the new binary matrix Znew is in the initialized state, i.e., null, the m-th column vector Zm is placed in the first column of the new binary matrix Znew. In the same way, the m-th row vector Wm is appended as a row next to the last row of the new latent feature matrix Wnew; when the new latent feature matrix Wnew is in the initialized state, i.e., null, the m-th row vector Wm is placed in the first row of the new latent feature matrix Wnew.
If Pm is less than zero (branch “No” of S305), m-th column vector Zm of the binary matrix Z and m-th row vector Wm of the latent feature matrix W are discarded. That is, m-th column vector Zm and m-th row vector Wm are not appended (stored) in the matrices Znew and Wnew.
The loop variable m is incremented by 1 (step S307). If the loop variable m is greater than K (S308), the loop is exited, otherwise, the loop is repeated. In steps S305 to S307, row vectors of the estimated matrix W that contribute to a non-negative power value and thus are not discarded are collected in the new latent feature matrix Wnew. Column vectors of the estimated binary matrix Z that are not discarded are collected in the new binary matrix Znew.
If the new latent feature matrix Wnew is equal to the estimated latent feature matrix W (branch “Yes” of S309), then the post process is ended, else (branch “No” of S309), the binary matrix Z is updated by the new binary matrix Znew and the latent feature matrix W is updated by the new latent feature matrix Wnew (S310). Then, the model estimation step (S301) is re-executed. That is, the latent feature model estimation step (S301) is performed to obtain a more appropriate solution: (Z, W), e.g., to re-estimate a latent feature matrix W′ and a binary matrix Z′ from the observation matrix X by using the updated matrices W and Z as initialization matrices. Regarding obtaining appropriate solution among equivalent solutions, reference may be made to NPTL 3.
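The iteration of steps S301 to S310 can be sketched in Python as follows. This is an illustrative sketch, not the claimed implementation: `estimate_lfm` is a hypothetical placeholder for any latent feature model estimator (e.g., an IBP-based sampler) that accepts optional initialization matrices.

```python
import numpy as np

def postprocess(X, estimate_lfm):
    """Iterate model estimation until no inverted waveform remains (S301-S310)."""
    Z, W = estimate_lfm(X, None, None)        # S301: initial estimation
    while True:
        x_mean = X.mean(axis=0)               # S302: vector x as mean of rows of X (eq. 2)
        P = W @ x_mean                        # S303: P = W . Xmean^T (eq. 3)
        keep = P >= 0                         # S305: non-inverted rows satisfy P_m >= 0
        Z_new, W_new = Z[:, keep], W[keep, :] # S306: collect non-discarded columns/rows
        if keep.all():                        # S309: Wnew equals W, nothing discarded
            return Z, W
        Z, W = estimate_lfm(X, Z_new, W_new)  # S310 + S301: re-estimate from the update
```

A trivial estimator that returns its initialization unchanged already shows the loop discarding an inverted row once and then terminating; a real sampler would refine W and Z on each re-estimation.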
The following describes the second example embodiment.
The residual latent features may be generated by any arbitrary process, method, statistical model, or the like. A basic method to generate residual latent features is as follows:
the residual matrix is generated by subtracting the estimated values from the measured values. Here, two cases are possible.
One is to calculate a residual after the model estimation.
R=X−(Z·W) (5)
The other is to calculate a residual after the discard latent feature step (S203).
R=X−(Z′·W′) (6)
In each case, the residual matrix R is a residue or a remaining part of the measured data matrix X which is composed of N residue row vectors of D-dimension, where i-th residue row vector is given as
ri=xi−zi·W, i=1, . . . , N.  (7)
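A sketch of the residual computation of equations (5) to (7), with hypothetical matrix sizes and randomly generated values standing in for estimated and measured data:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, D = 50, 2, 16                             # hypothetical sizes
W = rng.standard_normal((K, D))                 # estimated latent feature matrix
Z = rng.integers(0, 2, size=(N, K))             # estimated binary matrix
X = Z @ W + 0.1 * rng.standard_normal((N, D))   # measured data matrix

# Equation (5): residual after model estimation; row-wise this is
# equation (7), r_i = x_i - z_i . W
R = X - Z @ W
print(R.shape)  # (50, 16): same shape as the observation matrix
```

Equation (6) is identical in form, with the updated matrices Z′ and W′ (after the discard latent feature step) in place of Z and W.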
The N×D residual matrix R is utilized as new input data for any arbitrary model, and new information can be generated from this residual matrix R. For example, applying a clustering model to the residual matrix R will generate clusters. The residual matrix R can be represented by these clusters. The cluster number for each instance is estimated and transformed into a binary matrix ZR of size N×kR (N rows and kR columns).
A value of 1 in the j-th element (j=1, . . . , kR) of a row vector of the binary matrix ZR represents the presence of the relevant cluster (the j-th cluster), while a value of 0 represents its absence. It is noted that models such as histogram, clustering, combination, Gaussian mixture, classification, or the like can, as a matter of course, be used to generate the binary matrix ZR from the residual matrix R.
Then, the estimated binary matrix (Z or Z′, depending on the case) and the new binary matrix ZR are concatenated. The updated matrices W and Z are used as input to the model with changed parameters, and the optimization is re-executed.
In step S408, if the value of the loop variable m>K, the number of the column vectors in Znew is subtracted from the number of the column vectors in Z to obtain kR.
If kR is greater than or equal to 1 (branch "Yes" of S410), steps S411 to S413 are executed and then step S401 is re-executed. kR is a positive integer identifying the number of inverted waveforms present in the previous estimation of the latent feature matrix W and the binary matrix Z. kR may be used to determine the number of clusters, histogram bins, classification classes, models, or the like.
In step S411, the residual matrix R after the discard latent feature step is calculated.
In step S412, the residual matrix R is modelled by utilizing the parameter kR to generate the binary matrix ZR. The modelling of the matrix R is done to find kR clusters or groups present in the residual matrix R. For example, if the number of clusters is assumed to be kR, a transformed binary matrix is generated. Each column of the binary matrix ZR represents a cluster number, and the (i, j) element of the binary matrix ZR assumes a value of 1 to indicate the presence of cluster j, and 0 otherwise.
In step S413, the N×kR binary matrix ZR generated by modelling the N×D residual matrix R is concatenated column-wise with the N×(K−kR) binary matrix Znew, and the new N×K binary matrix Z is created as [Znew, ZR]. In the same manner, the kR×D feature matrix WR is concatenated row-wise with the (K−kR)×D feature matrix Wnew, and the new K×D feature matrix W is created as [Wnew, WR].
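Steps S411 to S413 can be sketched as follows, under stated assumptions: a single nearest-centroid assignment stands in for any full clustering model, and WR is taken as the per-cluster mean residual waveform, which is one plausible choice for illustration, not the embodiment's prescribed one.

```python
import numpy as np

def residual_fusion(R, Z_new, W_new, k_R, rng):
    """Steps S411-S413 sketch: model residual R into Z_R, W_R and concatenate."""
    N, D = R.shape
    # S412: group the residual rows into k_R clusters (one nearest-centroid
    # assignment step stands in for a full clustering run here)
    centroids = R[rng.choice(N, size=k_R, replace=False)]
    dist = np.linalg.norm(R[:, None, :] - centroids[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    Z_R = np.eye(k_R, dtype=int)[labels]               # N x k_R binary matrix
    # W_R: mean residual waveform of each cluster (k_R x D), an illustrative choice
    W_R = np.stack([R[labels == j].mean(axis=0) for j in range(k_R)])
    # S413: concatenate to recover an N x K binary matrix and a K x D feature matrix
    Z = np.concatenate([Z_new, Z_R], axis=1)
    W = np.concatenate([W_new, W_R], axis=0)
    return Z, W
```

Because each row of ZR is a one-hot cluster indicator, every instance contributes to exactly one residual feature, and the concatenated Z and W restore the original sizes N×K and K×D before re-estimation in step S401.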
In step S401, based on the updated N×K binary matrix Z and K×D feature matrix W, the latent feature modeling is performed to estimate the latent feature matrix W and the binary matrix Z. The step S401 may perform the same latent feature model estimation step as step S301 in
If kR=0 (Z==Znew) (branch “No” of the step S410), that is, if the latent feature matrix W is composed of K non-inverted waveforms, the processing is ended.
In step S501, the N×D residual matrix R and the integer value kR are inputted.
In step S502, the residual matrix R is modelled by using a clustering approach. The clustering result indicates which cluster each residual feature vector belongs to.
In step S503, the clustering result is transformed into the binary matrix indicating, with a value 1, presence of the cluster at that time instant.
In step S504, the N×kR binary matrix ZR is generated based on clustering result of the N×D residual matrix R.
The second example embodiment may be combined with the first example embodiment. According to the above described example embodiments, it is possible to automatically optimize convergence to non-inverted waveforms. The latent feature model is employed to separate the waveform into each internal unit's waveform. Visualization of each internal unit of any electric appliance or facility is possible without any extra information, such as labels for each waveform. The present invention is also applicable to monitoring the real-time status (ON or OFF state) of an electric appliance.
The following test cases were evaluated in simulation:
(A) [Gibbs Sampler] only, i.e., related art;
(B) [Gibbs Sampler]+[Check inverted waveform]+[Discard latent feature];
(C) [Gibbs Sampler]+[Residual fusion]; and
(D) [Gibbs Sampler]+[Check inverted waveform]+[Discard latent feature]+[Residual fusion].
The test cases according to the present invention outperform the related art. The result graph can be read as follows for each test case:
(A) Output of the sampling method, i.e., Related art;
(B) Solution guarantees that it falls in non-inverted waveform local minima and eventually decreases error;
(C) Solution falls in local minima and does not decrease error; and
(D) Solution falls in local minima and guarantees non-inverted waveform local minima with minimized error after 50 iterations.
The combination of the three methods decreased error as compared to test case (A).
The unsupervised disaggregation apparatus (or system) described in the above example embodiments may be implemented on a computer system such as a server system (or a cloud system).
The computer system 100 can connect to a network 106, such as a LAN (Local Area Network) and/or a WAN (Wide Area Network), via the communication unit 105, which may include one or more network interface controllers (cards) (NICs). A program (instructions and data) for executing the processing of the unsupervised disaggregation apparatus 100 may be stored in a memory and executed by a processor.
Each disclosure of the aforementioned NPTL 1 to NPTL 3 is incorporated by reference herein. The particular example embodiments or examples may be modified or adjusted within the scope of the entire disclosure of the present invention, inclusive of claims, based on the fundamental technical concept of the invention. In addition, a variety of combinations or selections of elements disclosed herein may be used within the concept of the claims. That is, the present invention may encompass a wide variety of modifications or corrections that may occur to those skilled in the art in accordance with the entire disclosure of the present invention, inclusive of claims and the technical concept of the present invention.
This application is a National Stage Entry of PCT/JP2018/043404 filed on Nov. 26, 2018, the contents of all of which are incorporated herein by reference, in their entirety.