Each of the references cited below is incorporated herein by reference.
The present disclosure is directed, in general, to state tracking systems and, more specifically, to a system and method for estimating the state of a system in a noisy measurement environment.
Niels Bohr is often quoted as saying, “Prediction is very difficult, especially about the future.” Prediction is the practice of extracting information from sets of data to identify patterns and predict future behavior. The Institute for Operations Research and the Management Sciences (INFORMS) defines several types of prediction which have been expanded here for clarity:
Table 1 (below) provides a sampling of the state of the art of machine learning methods, where each method has its own set of implementation requirements; it was presented by Anais Dotis-Georgiou at the Big Data and Artificial Intelligence Conference in 2019 in Addison, TX.
In the case of supervised learning, the designer is required to manually select features, choose the classifier method, and tune the hyperparameters. In the case of unsupervised learning, some algorithms (e.g., k-means, k-medoid, and fuzzy c-means) require the number of clusters to be selected a priori; principal component analysis requires the data to be scaled, assumes the data is orthogonal, and results in linear correlation; nonnegative matrix factorization requires normalization of the data; and factor analysis is subject to interpretation.
Deep learning brings with it its own set of demands. Enormous computing power through high-performance graphics processing units (GPUs) is needed to process big data, on the order of 10^5 to 10^6 points. Also, the data must be numerically tagged. Furthermore, it takes a long time to train a model. In the end, because of the depth of complexity, it is virtually impossible to understand how conclusions were reached.
The artificial neural network (ANN) architecture supporting machine/deep learning is supposedly inspired by the biological nervous system. The model learns through a process called backpropagation, an iterative gradient method that reduces the error between the model's output and the desired output. But humans do not back-propagate when learning, so the analogy is weak in that regard. Other drawbacks include the following:
A system is needed which improves upon machine learning and statistical methods such that the practitioner can perform real-time predictive and prescriptive analytics. The system should avoid the pitfalls of artificial neural networks with their arbitrary hidden layers, iterative feature and method selection, and hyperparameter tuning. Furthermore, the system should not require enormous computing power. Preferably, such a system will overcome state estimation challenges in a noisy measurement environment.
Deficiencies of the prior art are generally solved or avoided, and technical advantages are generally achieved, by advantageous embodiments of the present disclosure of a system and method for estimating a state of a system. The method includes making a first measurement of a value of a characteristic of a state of a system, and a second measurement of a value of the characteristic of the state of the system after the first measurement. While the second measurement is the last measurement in this example, there may be a plurality of measurements followed by the following steps that apply to the corresponding measurements. After the measurements have been taken, the method carries out a filtering process. The method also includes constructing a first filter measurement estimate after the second measurement coinciding with the first measurement including a first filter measurement covariance matrix describing an accuracy of the first filter measurement estimate, and constructing a first filter time estimate after the first filter measurement estimate including a first filter time covariance matrix describing an accuracy of the first filter time estimate employing a dynamic model of the state of the system. The method also includes constructing a second filter measurement estimate after the first filter time estimate coinciding with the second measurement including a second filter measurement covariance matrix describing an accuracy of the second filter measurement estimate, and constructing a second filter time estimate after the second filter measurement estimate including a second filter time covariance matrix describing an accuracy of the second filter time estimate employing the dynamic model of the state of the system.
After the filtering process, the method carries out a smoothing process. The method includes constructing a smoothing estimate from the first filter measurement estimate and the second filter measurement estimate. The smoothing estimate may be obtained by sweeping backward recursively from the second filter measurement estimate to the first filter measurement estimate. After the smoothing process, the method carries out a prediction. The method includes constructing a first prediction estimate after the smoothing estimate that provides a forecast of a value of the characteristic of the state of the system including a first prediction covariance matrix describing an accuracy of the first prediction estimate employing the dynamic model of the state of the system. Of course, the method can carry out a plurality of prediction estimates providing corresponding forecasts of a value of the characteristic of the state of the system.
The foregoing has outlined rather broadly the features and technical advantages of the present disclosure in order that the detailed description of the disclosure that follows may be better understood. Additional features and advantages of the disclosed embodiments will be described hereinafter, which form the subject matter of the claims. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures or processes for carrying out the same purposes of the present disclosure, and that such equivalent constructions do not depart from the spirit and scope of the disclosure as set forth in the appended claims.
For a more complete understanding of the present disclosure, reference is now made to the following detailed description taken in conjunction with the accompanying drawings, in which:
Corresponding numerals and symbols in the different figures generally refer to corresponding parts unless otherwise indicated and, in the interest of brevity, may not be described after the first instance.
The making and using of exemplary embodiments of the disclosed invention are discussed in detail below. It should be appreciated, however, that the general embodiments are provided to illustrate the inventive concepts that can be embodied in a wide variety of specific contexts, and the specific embodiments are merely illustrative of specific ways to make and use the systems, subsystems, and modules for estimating the state of a system in a real-time, noisy measurement, machine-learning environment. While the principles will be described in the environment of a linear system in a real-time machine-learning environment, any environment such as a nonlinear system, or a non-real-time machine-learning environment, is within the broad scope of the disclosed principles and claims.
Intelligent prediction is a system introduced herein that uniquely combines the three forms of optimal estimation (filtering, smoothing, and predicting) to provide utility for predictive and prescriptive analytics with applications to real-time sensor data. The results of this systematic approach outperform the best of the current statistical methods and, as such, outperform machine learning methods.
To perform predictive and prescriptive analytics in real-time and avoid the pitfalls of machine learning which utilizes artificial neural networks, the system architecture introduced herein is based on combining three types of estimation: filtering, smoothing, and predicting, which is an approach new to forecasting, and which it is believed has not been previously considered with regard to machine learning, as illustrated in Table 1.
With the following definitions:
x_{k+1} = ϕ_k x_k + w_k,
measurements are described by
z_k = H_k x_k + v_k,
and initial conditions given by
x̂_0^− = E[x_0]
P_0^− = E[(x_0 − x̂_0^−)(x_0 − x̂_0^−)^T].
The discrete-time Kalman filter recursive equations are given by
K_k = P_k^− H_k^T (H_k P_k^− H_k^T + R_k)^{−1}   Filtering gain
x̂_k = x̂_k^− + K_k (z_k − H_k x̂_k^−)   State measurement estimate
P_k = (I − K_k H_k) P_k^− (I − K_k H_k)^T + K_k R_k K_k^T   State measurement covariance
x̂_{k+1}^− = ϕ_k x̂_k   State time estimate at a next time step t_{k+1}
P_{k+1}^− = ϕ_k P_k ϕ_k^T + Q_k   State time covariance (also at the next time step)
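For illustration, the five recursive filter equations above can be sketched in Python with NumPy. The scalar random-walk model (ϕ_k = I, H_k = I) and the noise values below are assumptions chosen for demonstration and are not part of the disclosure; the Joseph-form covariance update matches the State measurement covariance equation above.

```python
import numpy as np

def kalman_filter(zs, phi, H, Q, R, x0, P0):
    """Run the discrete-time Kalman filter over the measurements zs.

    Returns the a posteriori state estimates x_hat[k] and covariances P[k].
    """
    n = x0.shape[0]
    x_pred, P_pred = x0.copy(), P0.copy()       # a priori x_0^-, P_0^-
    xs, Ps = [], []
    for z in zs:
        # Filtering gain: K_k = P_k^- H^T (H P_k^- H^T + R)^-1
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        # State measurement estimate: x_hat_k = x_k^- + K (z - H x_k^-)
        x = x_pred + K @ (z - H @ x_pred)
        # State measurement covariance in Joseph form (numerically robust)
        I_KH = np.eye(n) - K @ H
        P = I_KH @ P_pred @ I_KH.T + K @ R @ K.T
        xs.append(x); Ps.append(P)
        # Time update to t_{k+1}
        x_pred = phi @ x
        P_pred = phi @ P @ phi.T + Q
    return np.array(xs), np.array(Ps)

# Illustrative scalar random-walk example (assumed values)
rng = np.random.default_rng(0)
truth = np.cumsum(rng.normal(0, 0.1, 100))
zs = (truth + rng.normal(0, 1.0, 100)).reshape(-1, 1)
xs, Ps = kalman_filter(zs, phi=np.eye(1), H=np.eye(1),
                       Q=np.eye(1) * 0.01, R=np.eye(1) * 1.0,
                       x0=np.zeros(1), P0=np.eye(1) * 10.0)
```

On such data the filtered estimates track the slowly varying truth far more closely than the raw measurements do, which is the behavior the recursion is designed to produce.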
It is worth mentioning that C. F. van Loan's method is employed to compute ϕ_k and Q_k. As previously mentioned, the Kalman filter has numerous applications for guidance, navigation, and control of aerospace vehicles, e.g., aircraft, spacecraft, rockets, and missiles. However, the filter will be combined, as introduced herein, with smoothing and predicting with applications to (possibly real-time) predictive and prescriptive analytics.
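Van Loan's method computes ϕ_k and Q_k from a continuous-time model by taking the matrix exponential of a block matrix. A minimal sketch follows, assuming SciPy is available for the matrix exponential; the constant-velocity model used as a demonstration is an assumption, not a model from the disclosure.

```python
import numpy as np
from scipy.linalg import expm

def van_loan(F, Qc, dt):
    """Discretize x' = F x + w, with spectral density E[w w^T] = Qc,
    returning the transition matrix phi_k and process noise covariance Qk
    via C. F. van Loan's block-matrix-exponential method."""
    n = F.shape[0]
    M = np.zeros((2 * n, 2 * n))
    M[:n, :n] = -F
    M[:n, n:] = Qc
    M[n:, n:] = F.T
    E = expm(M * dt)
    phi = E[n:, n:].T          # state transition matrix phi_k
    Qk = phi @ E[:n, n:]       # discrete process noise covariance Q_k
    return phi, Qk

# Illustrative constant-velocity model (assumed): noise drives acceleration
F = np.array([[0.0, 1.0], [0.0, 0.0]])
Qc = np.array([[0.0, 0.0], [0.0, 0.5]])
phi, Qk = van_loan(F, Qc, dt=1.0)
```

For this model the method reproduces the well-known closed forms ϕ = [[1, Δt], [0, 1]] and Q_k = q [[Δt³/3, Δt²/2], [Δt²/2, Δt]], which provides a quick sanity check on the implementation.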
The second part of the three-part system is an implementation of discrete fixed-interval smoothing. The smoothing form of optimal estimation applies when an estimate falls within a span of measurement points. For the proposed system, the time interval of the measurements is fixed (hence the name) and optimal estimates of the (saved) states x̂_k are obtained.
With initial conditions given by the last a posteriori estimate and covariance from the filter
x_T^s = x̂_T
P_T^s = P_T,
the smoother sweeps backward recursively
C_k = P_k ϕ_k^T (P_{k+1}^−)^{−1}   Smoothing gain
x_k^s = x̂_k + C_k (x_{k+1}^s − ϕ_k x̂_k)   State smoothing estimate
P_k^s = P_k + C_k (P_{k+1}^s − P_{k+1}^−) C_k^T   State smoothing covariance
The third part of the three-part system is an implementation of a predictor. The predicting form of optimal estimation applies when an estimate falls beyond the last measurement point. The equations for the predictor are identical to those of the filter with three exceptions, the first being the initial conditions:
x̂_0^− = x_T^s
P_0^− = P_T^s.
Those skilled in the art know how to model the dynamic process and measurements described by x_{k+1} = ϕ_k x_k + w_k and z_k = H_k x_k + v_k, respectively. Thus, once initialized with x̂_0^− and P_0^−, the five-step Kalman filter iterates recursively until the set of data to be filtered is exhausted, resulting in state estimates x̂_k and state covariances P_k.
Upon saving the state estimates x̂_k and the state covariances P_k, and properly initializing x_T^s and P_T^s with the last entries x̂_k(T_f) and P_k(T_f), the three-step smoother iterates recursively with a backward sweep to an earlier time point, as illustrated in
The last state estimate and state covariance of the smoother are used to initialize the predictor. The predictor runs just like the five-step Kalman filter with two exceptions: (i) the covariance of the measurement noise R_k is set to an arbitrarily large value to indicate that the measurements are worthless, because there are not any, and (ii) the measurement z_k is fixed to its final value, because that is the last piece of information available.
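The predictor just described can be sketched as follows: the filter recursion runs forward with a very large R_k (so the gain is effectively zero) and z frozen at its final value, so the time update alone carries the forecast. The scalar values below are assumptions for demonstration, not values from the disclosure.

```python
import numpy as np

def predict_ahead(x_s, P_s, z_last, phi, H, Q, R_big, n_steps):
    """Forecast beyond the last measurement by running the filter recursion
    with measurement noise covariance set arbitrarily large (measurements
    carry essentially no weight) and z fixed at its last observed value."""
    x, P = x_s.copy(), P_s.copy()
    forecasts, covs = [], []
    for _ in range(n_steps):
        # Gain is driven toward zero by the huge R_big
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R_big)
        x = x + K @ (z_last - H @ x)            # negligible correction
        P = (np.eye(len(x)) - K @ H) @ P
        forecasts.append(x); covs.append(P)
        # The time update performs the actual forecasting
        x, P = phi @ x, phi @ P @ phi.T + Q
    return np.array(forecasts), np.array(covs)

# Illustrative scalar example (assumed values): initialize from a smoothed
# terminal state of 2.0 and forecast 24 periods ahead
phi, H = np.eye(1), np.eye(1)
fx, fP = predict_ahead(np.array([2.0]), np.eye(1) * 0.1,
                       np.array([2.0]), phi, H,
                       Q=np.eye(1) * 0.01, R_big=np.eye(1) * 1e9,
                       n_steps=24)
```

For a random-walk model the point forecast stays at the last smoothed estimate while the covariance grows by Q_k each period, reflecting increasing uncertainty further into the future.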
An application of the disclosed three-part system's implementation is shown in
Referring again to
Data used to perform the analysis was gathered from a live, operating Horizontal Pump System (HPS). Filtering and predicting was tested with eighteen data sets. Measurements were taken every hour so that one forecasting period represents one hour of elapsed time. The data measures various components of the HPS including bearing and winding temperatures in the motor, pump vibration and suction pressure, overall system health, and other attributes of the system.
Each data set includes noise which may vary with time. The different sources of data provide a mixture of different characteristics such as seasonality, trends, impulses, and randomness. For example, temperature data is affected by the day/night (diurnal) cycle which creates a (short) seasonal characteristic. Vibration data, however, is not affected by the day/night cycle and is not seasonal but does contain a significant portion of randomness.
A missed prediction occurs when an observed measurement exceeds a threshold value, but no forecast was produced which predicted the exception. Any forecast which predicted the exception within twelve periods leading up to the exception was not considered because such a short forecast is not useful. A prediction strategy should produce as few missed predictions as possible.
Turning now to Table 2 (below), illustrated are temperature and vibration results after filtering and predicting, showing sensitivities, forecast lengths, and average percent of missed predictions.
Table 2 compares each strategy (ETS versus filter/prediction) over 24 periods (1 day) and 336 periods (14 days). The filter sensitivity column refers to how closely the signal is being tracked. For instance, temperature changes slowly over time, so the filter/prediction combination is set to high sensitivity to track the slowly changing signal; whereas vibration, which contains high-frequency noise, is set to low sensitivity, as illustrated in
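One common way to realize such a sensitivity setting (an interpretation assumed here, not stated in the disclosure) is through the ratio of process noise Q to measurement noise R: a larger ratio yields a larger steady-state gain, so the filter tracks the signal closely, while a smaller ratio yields a small gain that rejects high-frequency noise.

```python
import numpy as np

def steady_state_gain(q, r, iters=500):
    """Steady-state Kalman gain for a scalar random walk (phi = H = 1).
    The ratio q/r sets how aggressively the filter tracks the signal."""
    P = 1.0
    for _ in range(iters):
        P_pred = P + q                 # time update
        K = P_pred / (P_pred + r)      # filtering gain
        P = (1 - K) * P_pred           # measurement update
    return K

low = steady_state_gain(q=0.001, r=1.0)   # low sensitivity: heavy smoothing
high = steady_state_gain(q=1.0, r=1.0)    # high sensitivity: fast tracking
```

The low-sensitivity gain is near zero (measurements barely move the estimate, suppressing vibration-like noise), while the high-sensitivity gain is large (the estimate follows the measurements closely, suiting a slowly drifting temperature signal).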
The filtering, smoothing, and predicting process introduced herein outperforms the ETS strategy, which was the basis of the performance assessment over machine learning strategies as shown in Table 1. These results appear to be independent of the sensitivity setting (low, medium, or high). Therefore, in general, a practitioner could use the filter/predictor strategy to avoid missed predictions. Furthermore, with the inclusion of smoothing, these results are improved upon as shown in
Turning now to
At a step or module 510, a first estimate of a state of a system is constructed at a first time including a first covariance matrix describing an accuracy of the first estimate.
At a step or module 520, a second estimate of the state of said system is constructed at a second time, after the first time, including a second covariance matrix describing an accuracy of the second estimate employing a dynamic model of the state of the system; the dynamic model comprises a matrix with coefficients that describes a temporal evolution of the state of the system.
At a step or module 530, a value of a characteristic of the state of the system is measured at the second time. Measuring the value of the characteristic can include making a plurality of independent measurements characterized by a diagonal measurement covariance matrix. At a step or module 540, the second estimate of the state of the system and the second covariance matrix are adjusted based on the value of the characteristic.
At a step or module 550, a third estimate of the state of the system is constructed at a third time, before the second time, including a third covariance matrix describing an accuracy of the third estimate employing the dynamic model of the state of the system.
At a step or module 560, a fourth estimate of the state of the system is constructed at a fourth time, after the second time, from the second estimate. In some embodiments, the fourth time is on a different time scale from the first, second and third times.
At a step or module 570, the dynamic model is altered in response to the value of the characteristic.
At a step or module 580, the state of the system is reported based on the fourth estimate.
At a step or module 590, a fifth estimate of the state of the system is constructed at a fifth time, after the second time, from the second estimate.
In certain embodiments, the dynamic model is a linear dynamic model with constant coefficients. In an embodiment, constructing the first estimate and constructing the second estimate are performed by a Kalman filter.
At a step or module 595, the state of the system is altered based on the fourth estimate.
The method 500 terminates at end step or module 598.
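The overall flow of the method — filtering forward through the measurements, smoothing backward, then predicting beyond the last measurement — can be sketched end to end. The scalar random-walk model and noise values are assumptions for demonstration; the prediction stage is implemented here as pure time updates, which is equivalent to running the filter with an arbitrarily large R_k driving the gain to zero.

```python
import numpy as np

def filter_smooth_predict(zs, phi, H, Q, R, x0, P0, n_forecast):
    """Three-part optimal estimation: forward Kalman filter, backward
    fixed-interval smoother, then prediction beyond the last measurement."""
    n = len(x0)
    # --- filtering (forward pass over the measurements) ---
    x, P = x0.copy(), P0.copy()
    xf_list, Pf_list = [], []
    for z in zs:
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (z - H @ x)
        P = (np.eye(n) - K @ H) @ P
        xf_list.append(x); Pf_list.append(P)
        x, P = phi @ x, phi @ P @ phi.T + Q
    xf, Pf = np.array(xf_list), np.array(Pf_list)
    # --- smoothing (backward sweep over the saved estimates) ---
    xs, Ps = xf.copy(), Pf.copy()
    for k in range(len(zs) - 2, -1, -1):
        P_pred = phi @ Pf[k] @ phi.T + Q
        C = Pf[k] @ phi.T @ np.linalg.inv(P_pred)
        xs[k] = xf[k] + C @ (xs[k + 1] - phi @ xf[k])
        Ps[k] = Pf[k] + C @ (Ps[k + 1] - P_pred) @ C.T
    # --- predicting (initialized from the last smoothed state) ---
    x, P = xs[-1].copy(), Ps[-1].copy()
    preds = []
    for _ in range(n_forecast):
        x, P = phi @ x, phi @ P @ phi.T + Q
        preds.append(x)
    return xf, xs, np.array(preds)

# Illustrative scalar demo (assumed values): 150 noisy samples, 24-period forecast
rng = np.random.default_rng(2)
truth = np.cumsum(rng.normal(0, 0.1, 150))
zs = (truth + rng.normal(0, 1.0, 150)).reshape(-1, 1)
xf, xs, preds = filter_smooth_predict(zs, np.eye(1), np.eye(1),
                                      np.eye(1) * 0.01, np.eye(1),
                                      np.zeros(1), np.eye(1) * 10.0, 24)
```

For the random-walk model the 24-period forecast is flat at the last smoothed estimate; with a richer dynamic model (e.g., one discretized by van Loan's method) the same skeleton produces trending forecasts.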
The impacts to implementation of predictive analysis of processes introduced herein cannot be overstated. Whereas machine learning approaches are directly dependent on a large and fully populated training corpus, purely statistical approaches, such as ETS and the novel filter/predictor strategy introduced herein, learn directly from the real-time signal without additional data or knowledge imposed. Based upon the findings indicated in Table 1, the established ETS approach already performs better than the more widely used machine learning techniques. The improvements and advantages of the process introduced herein over ETS (shown in Table 2) only solidify the merits of the new approach.
In short, an advantage of the novel filtering, smoothing, and predicting process is that it does not require a priori knowledge as machine learning techniques do. Because the system combines the optimal estimation techniques of filtering, smoothing, and predicting, there are no dependencies on artificial neural nets and their (shallow, greedy, brittle, and opaque) shortcomings.
Turning now to
The functionality of the apparatus 600 may be provided by the processor 610 executing instructions stored on a computer-readable medium, such as the memory 620 shown in
The processor 610 (or processors), which may be implemented with one or a plurality of processing devices, perform functions associated with its operation including, without limitation, performing the operations of estimating the state of a system, computing covariance matrices, and estimating a future state of the system. The processor 610 may be of any type suitable to the local application environment, and may include one or more of general-purpose computers, special purpose computers, microprocessors, digital signal processors (“DSPs”), field-programmable gate arrays (“FPGAs”), application-specific integrated circuits (“ASICs”), and processors based on a multi-core processor architecture, as non-limiting examples.
The processor 610 may include, without limitation, application processing circuitry. In some embodiments, the application processing circuitry may be on separate chipsets. In alternative embodiments, part or all of the application processing circuitry may be combined into one chipset, and other application circuitry may be on a separate chipset. In still alternative embodiments, part or all of the application processing circuitry may be on the same chipset, and other application processing circuitry may be on a separate chipset. In yet other alternative embodiments, part or all of the application processing circuitry may be combined in the same chipset.
The memory 620 (or memories) may be one or more memories and of any type suitable to the local application environment, and may be implemented using any suitable volatile or nonvolatile data storage technology such as a semiconductor-based memory device, a magnetic memory device and system, an optical memory device and system, fixed memory and removable memory. The programs stored in the memory 620 may include program instructions or computer program code that, when executed by an associated processor, enable the respective apparatus 600 to perform its intended tasks. Of course, the memory 620 may form a data buffer for data transmitted to and from the same. Exemplary embodiments of the system, subsystems, and modules as described herein may be implemented, at least in part, by computer software executable by the processor 610, or by hardware, or by combinations thereof.
The communication interface 630 modulates information for transmission by the respective apparatus 600 to another apparatus. The respective communication interface 630 is also configured to receive information from another processor for further processing. The communication interface 630 can support duplex operation for the respective apparatus 600.
In summary, the inventions disclosed herein combine three techniques of optimal estimation of the state of a system. The three techniques include filtering, smoothing, and predicting processes, and can be performed, without limitation, in a machine learning and/or a noisy measurement environment.
The filtering portion of optimal estimation is performed to construct a first estimate of a state vector x_k at a first time point t_k that coincides with a measurement of a value of a characteristic of the state of the system at the time point t_k. The filtering process employs a covariance matrix that describes the accuracy of the first estimate of the state vector x_k at the time point t_k. A second estimate of the state vector x_{k+1} at the time point t_{k+1} is then constructed by propagating the state of the system forward to a second time point t_{k+1}, the second time point being after the first time point. The propagating forward employs a dynamic model of the state of the system to produce the estimate of the state vector x_{k+1} at the second time point t_{k+1}. Constructing the first estimate of the state vector x_k and the second estimate of the state vector x_{k+1} can be performed by employing a Kalman filter.
The dynamic model can employ a matrix with coefficients that describes the temporal evolution of the state of the system. In certain embodiments, the dynamic model is a linear dynamic model with constant coefficients.
A value of a characteristic of the state of the system x_{k+1} is measured at the second time point t_{k+1}. The second estimate of the state of the system and the second covariance matrix are adjusted based on the measured value of the characteristic at the second time point t_{k+1}.
Measuring the value of the characteristic can include making a plurality of independent measurements characterized by a diagonal measurement covariance matrix.
The smoothing portion of optimal estimation is performed by constructing a third state estimate for a time point that is earlier than the time point tk+1. The earlier time point can fall within or before a span of current measurement points, e.g., between or before the time points tk and tk+1.
The predicting portion then propagates the state estimate forward for a forecast period of interest. The last state estimate and state covariance of the smoother can be used to initialize the predicting. The predictions may be at various time points in the future and over various time scales that are after the second time point. The measurement noise R_k can be set to an arbitrarily large value to accommodate the inherent absence of a state measurement at a future time point. The initial conditions for the prediction can be taken as the last a posteriori state estimate and the covariance of the smoother.
As described above, the exemplary embodiments provide both a method and corresponding apparatus consisting of various modules providing functionality for performing the steps of the method. The modules may be implemented as hardware (embodied in one or more chips including an integrated circuit such as an application specific integrated circuit), or may be implemented as software or firmware for execution by a processor. In particular, in the case of firmware or software, the exemplary embodiments can be provided as a computer program product including a computer readable storage medium embodying computer program code (i.e., software or firmware) thereon for execution by the computer processor. The computer readable storage medium may be non-transitory (e.g., magnetic disks; optical disks; read only memory; flash memory devices; phase-change memory) or transitory (e.g., electrical, optical, acoustical or other forms of propagated signals, such as carrier waves, infrared signals, digital signals, etc.). The coupling of a processor and other components is typically through one or more busses or bridges (also termed bus controllers). The storage device and signals carrying digital traffic respectively represent one or more non-transitory or transitory computer readable storage media. Thus, the storage device of a given electronic device typically stores code and/or data for execution on the set of one or more processors of that electronic device such as a controller.
Although the embodiments and their advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made herein without departing from the spirit and scope thereof as defined by the appended claims. For example, many of the features and functions discussed above can be implemented in software, hardware, or firmware, or a combination thereof. Also, many of the features, functions, and steps of operating the same may be reordered, omitted, added, etc., and still fall within the broad scope of the various embodiments.
Moreover, the scope of the various embodiments is not intended to be limited to the particular embodiments of the process, machine, manufacture, composition of matter, means, methods and steps described in the specification. As one of ordinary skill in the art will readily appreciate from the disclosure, processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed, that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized as well. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.
This application is a Continuation of U.S. patent application Ser. No. 16/674,848 entitled "System And Method For State Estimation In A Noisy Machine-Learning Environment" filed on Nov. 5, 2019, which claims the benefit of U.S. Provisional Patent Application No. 62/756,044, entitled "Hybrid AI," filed Nov. 5, 2018, which is incorporated herein by reference. This application is related to U.S. application Ser. No. 15/611,476 entitled "PREDICTIVE AND PRESCRIPTIVE ANALYTICS FOR SYSTEMS UNDER VARIABLE OPERATIONS," filed Jun. 1, 2017, which is incorporated herein by reference. (INC-026) This application is related to U.S. Provisional Application No. 62/627,644 entitled "DIGITAL TWINS, PAIRS, AND PLURALITIES," filed Feb. 7, 2018, converted to U.S. application Ser. No. 16/270,338 entitled "SYSTEM AND METHOD THAT CHARACTERIZES AN OBJECT EMPLOYING VIRTUAL REPRESENTATIONS THEREOF," filed Feb. 7, 2019, which are incorporated herein by reference. (INC-030) This application is related to U.S. application Ser. No. 16/674,885 (Attorney Docket No. INC-031B), entitled "SYSTEM AND METHOD FOR ADAPTIVE OPTIMIZATION," filed Nov. 5, 2019, U.S. application Ser. No. 16/674,942 (Attorney Docket No. INC-031C), entitled "SYSTEM AND METHOD FOR CONSTRUCTING A MATHEMATICAL MODEL OF A SYSTEM IN AN ARTIFICIAL INTELLIGENCE ENVIRONMENT," filed Nov. 5, 2019, and U.S. application Ser. No. 16/675,000 (Attorney Docket No. INC-031D), entitled "SYSTEM AND METHOD FOR VIGOROUS ARTIFICIAL INTELLIGENCE," filed Nov. 5, 2019, which are incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
62756044 | Nov 2018 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 16674848 | Nov 2019 | US |
Child | 18187860 | US |