The present invention relates generally to a system and method for detecting anomalous system behavior, for example in complex engineering assets including those found in maritime vessels.
Prognostics and Health Management (PHM) is an active and well-subscribed field of study. Companies managing complex distributed or safety-critical engineering assets want to understand the status of their portfolio in order to maximize efficiency and minimize downtime. Approaches such as health scores, anomaly detection, failure prediction and remaining useful life (RUL) calculation are applied across various domains in order to inform maintenance and repair schedules.
The published literature in this field typically uses synthetic data or carefully collected lab-based data to demonstrate the proficiency of a technique; the use of "real-world" applications is less common. The challenges of developing PHM algorithms to run in a real-world environment are discussed and postulated, but tangible examples are not widespread. For example, see as a reference "Prognostics and Health Management for Maintenance Practitioners-Review, Implementation and Tools Evaluation" by Atamuradov et al., published in the International Journal of Prognostics and Health Management in 2017.
Most known PHM applications apply to engineered systems operating in a well-defined single state, without consideration of prior usage and age. Especially, but not exclusively, in the maritime domain, engineering systems' operating envelopes are manifold by design and operating environment, and can include rapid, anomalous transient events (the term may be used interchangeably with "transients"). These transients can pose problems for classic machine learning (ML) algorithms, in that during these events a system may exhibit behavior consistent with impending failure before recovering to "healthy" behavior, leading to a false positive which results in wasted repair effort and a loss of operator confidence in the prognostic algorithm. This problem is further compounded by the fact that such transient events can cause low-level damage to a device, resulting in an increased risk of failure at a later date.
ML algorithms performing anomaly-based fault detection are rarely capable of estimating the remaining useful life (RUL) of a system. However, it is of enormous benefit to a maintainer to have such an estimate and a probability value attached to the estimate in order to permit maximum efficiency in task planning and supply chain optimization. Furthermore, existing non-deep learning ML anomaly-based detectors, especially in the maritime domain, tend to trigger far in advance of system failure, provoking excessively early intervention by a maintainer, denying the operator the economic benefits of extracting maximum use from an asset.
It is an example aim of the present invention to at least partially overcome or avoid one or more problems of the prior art, whether identified herein or elsewhere, or to at least provide an alternative to existing systems and related methodologies.
According to first and second aspects of the present invention, there are provided a method and a system as set out in the independent claims. Further and optional features are described in the dependent claims.
We describe a computer-implemented method for predicting failure of an engineering asset based on real-time data, the method comprising: receiving a data record comprising data on the engineering asset collected from a plurality of sensors at time t; generating, using a trained machine learning algorithm, a probability PF that the received data record indicates that the engineering asset is in a faulty state; determining what number of data records received in a look-back time Lt are indicative of the engineering asset being in a faulty state, wherein the look-back time Lt is a time period occurring before the time t at which the data record was collected; and predicting a probability of the engineering asset failing during a horizon time Ht, wherein the horizon time Ht is a time period after the time t at which the data record was collected; wherein the predicting step comprises implementing a Bayesian model to predict the probability of failure within the horizon time Ht based on the generated probability PF and the number of data records which were determined to be faulty within the look-back time Lt.
We also describe a system for predicting failure of an engineering asset based on real-time data, the system comprising: a plurality of sensors for measuring data on the engineering asset and a processor which is configured to: receive a data record comprising data on the engineering asset collected from the plurality of sensors at time t; generate, using a trained machine learning algorithm, a probability PF that the received data record indicates that the engineering asset is in a faulty state; determine what number of data records received in a look-back time Lt are indicative of the engineering asset being in a faulty state, wherein the look-back time Lt is a time period occurring before the time t at which the data record was collected; and predict a probability of the engineering asset failing during a horizon time Ht, wherein the horizon time Ht is a time period after the time t at which the data record was collected; wherein predicting comprises implementing a Bayesian model to predict the probability of failure within the horizon time Ht based on the generated probability PF and the number of data records which were determined to be faulty within the look-back time Lt.
The predicted probability of failure may be compared to a failure threshold. When the predicted probability exceeds the failure threshold, there is a high risk that the engineering asset will fail within the horizon time. In this instance, an alert may be output, for example to act as a decision aid to a user. Alternatively, an automated self-protection protocol within the asset may be triggered. The self-protection protocol may control one or more components within the asset to return the asset to normal behavior or may shut down the engineering asset. For example, where the engineering asset comprises a diesel engine, shaft-speed limitations may be introduced, or where the engineering asset comprises a pump, suction pressure, motor speed or discharge pressure within the pump may be adjusted or controlled. Merely as another example, where the engineering asset is a refrigeration plant, the asset may be shut down. When the predicted probability is below the failure threshold, a new data record may be received for evaluation.
The invention may be considered to be a prognostics and health management (PHM) application which provides a probability of failure forecast for users including operators and maintenance staff. The invention may be used for monitoring any complex engineering asset whose operation may be described by a data set including a plurality of physical parameters which are measured by the plurality of sensors. Merely as examples, such engineering assets may be within a marine vessel, for example a diesel engine, an electrical pump, a refrigeration plant, or a high voltage switchboard. Such engineering assets may be in other systems, e.g. railway rolling stock bogies, nuclear reactor coolant pumps, gas turbine engines for passenger aircraft or oil rig drill bearings.
As set out above, an output of the trained machine learning algorithm is the probability of the received data record being faulty (PF). The trained machine learning algorithm may also be used to generate the probability of each of the data records received within the look-back time being faulty. From these generated probabilities a binary fault value for each of the multiple data records (i.e. the received data record and the data records within the look-back time) may be determined. For example, the fault value for each of the multiple data records may be equal to 1 when a probability that the data record is faulty exceeds a probability threshold and 0 otherwise. The fault value Fi,t for each record i at time t may be defined as
Fi,t = 1 if PF ≥ Pth; Fi,t = 0 otherwise (equation 1)
where PF is the probability of the data record being faulty and Pth is the probability threshold.
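Merely as an illustration, equation 1 may be implemented in a few lines of code. The sketch below is written in Python and assumes that the probability PF has already been produced by the trained machine learning algorithm; the function and parameter names are illustrative only and are not part of the claimed method.

```python
# Illustrative sketch of equation 1: converting the classifier's output
# probability PF into a binary fault value given the probability threshold Pth.

def fault_value(p_fault: float, p_threshold: float) -> int:
    """Return Fi,t = 1 when PF equals or exceeds Pth, and 0 otherwise."""
    return 1 if p_fault >= p_threshold else 0
```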
The present invention uses real-time data from multiple sensors. Owing to the potential for transients (as defined in the background section above) to lead to an erroneous classification of the status of a data record, a Bayesian model is used to predict the probability of failure within the horizon time Ht based on the generated probability PF and the number of data records which were determined to be faulty within the look-back time Lt. By considering individual received data records together with previous faulty records, the likelihood of one anomalous data record erroneously triggering a failure prediction may be reduced. Thus, the method mitigates the risk of an incorrect diagnosis of failure of the engineering asset based on one real-time record.
This mitigation or smoothing of the effect of one data record may be expanded further by considering a rolling average of the faulty state value Fi,t over a period (or “envelope”) of time (duration eL). The method may thus comprise grouping the received data record with multiple previously received data records to form an envelope of data records. Determining what number of data records received in the look-back time are indicative of the engineering asset being in a faulty state may comprise determining what number of envelopes received in the look-back time are indicative of the engineering asset being in a faulty state.
Each envelope may have an envelope length (eL) and may contain a sequence of all data records recorded during a time interval which is equal in length to the envelope length eL. The time interval may be measured back in time from the recordal time t of the final (or most recent) data record of the sequence of data records. There may be the same or a different number of data records in each data envelope. When each of the data records is captured at a set, uniform frequency, there will be the same number of data records within each envelope. However, the data records may be recorded at irregular time intervals and in such an instance, the envelopes may contain different numbers of data records.
The use of a rolling average (i.e. envelopes) has the effect of “smoothing” out the impact that a rapid transient/anomalous record may have on the determination of whether the underlying monitored engineering asset truly is in a faulty or “non-faulty” state. This rolling average is hereafter referred to as the sick rate. The sick rate of each data envelope may be equal to the sum of each fault value for the multiple data records within the data envelope, divided by the number of data records within the data envelope. In other words, the sick rate may be a normalized sum of the fault values. The sick rate SR may be calculated as defined below:
SR = (Σ Fi,t) / N, the sum being taken over all records i recorded in the interval (t - eL, t] (equation 2)
where Fi,t is the binary fault value for each record i within the envelope and N is the number of records within the envelope.
From these generated sick rates, a binary sick value for each of the multiple data envelopes may be determined. A data envelope (recorded in the j-th position of a sequence of envelopes and terminating at a time t) may be determined to be indicative of running in a faulty state when the sick rate of the data envelope equals or exceeds a sick rate threshold, and a data envelope may be indicative of running in a non-faulty state when its sick rate is lower than the sick rate threshold. An envelope which is determined to be in a faulty state may be given a sick value of 1 and an envelope which is determined to be in a non-faulty state may be given a sick value of 0. The sick value Sj,t may be defined as:
Sj,t = 1 if SR ≥ Scrit; Sj,t = 0 otherwise (equation 3)
where SR is the sick rate and Scrit is the sick rate threshold.
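Merely as an illustration, equations 2 and 3 may be implemented as sketched below in Python, assuming that the records within an envelope have already been reduced to their binary fault values; the helper names are illustrative only.

```python
from typing import Sequence

def sick_rate(fault_values: Sequence[int]) -> float:
    """Equation 2: normalized sum of the binary fault values within one envelope."""
    n = len(fault_values)
    return sum(fault_values) / n if n else 0.0

def sick_value(rate: float, s_crit: float) -> int:
    """Equation 3: Sj,t = 1 when the sick rate equals or exceeds Scrit, and 0 otherwise."""
    return 1 if rate >= s_crit else 0
```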
As set out above, the Bayesian model may be implemented to predict the probability of failure within the horizon time Ht based on the generated probability PF and the number of data records which were determined to be faulty or “non-faulty” within the look-back time Lt. For example, the Bayesian model may be defined as:
P(FHt|nLt) = P(nLt|FHt) × P(FHt) / P(nLt) (equation 4)
where FHt is the failure of the engineering asset within the time period which follows the moment of evaluation t and which is equal in length to the horizon time Ht, nLt is the number of faulty records or envelopes within the look-back time Lt prior to the moment of evaluation t, P(FHt|nLt) is the probability of failure of the engineering asset within the horizon time Ht given n records or envelopes classed as faulty in time Lt, P(nLt|FHt) is the probability of n records or envelopes being classed as faulty in the look-back time Lt given that a failure is known to have occurred within the horizon time Ht, P(FHt) is the probability of failure of the engineering asset in any time period equivalent in length to the horizon time Ht and P(nLt) is the probability of n records or envelopes being classed as faulty in any time equivalent in length to the look-back time Lt. In other words, for any given record, a probability function which predicts the probability of the engineering asset failing completely in the horizon time from that record may be determined as a product of the probability of n records/envelopes being classed as faulty in the look-back time and the probability of the engineering asset failing in any time of length equivalent to the horizon time, divided by the probability of n records being classed as faulty in any time of length equivalent to the look-back time.
As set out above, the output of the trained machine learning algorithm is the probability (PF) of the received data record indicating that the engineering asset is in a faulty state. The number of data records/envelopes (nLt) in the look-back time Lt which are indicative of the engineering asset being in a faulty state is also determined. In addition to the probability (PF) and the determined number of data records/envelopes, the Bayesian model may also be built based on knowledge of how many faults the engineering asset may manifest per unit time and/or how many records (or envelopes) may be classified as pertaining to a faulty state per unit time. This information may be obtained from a look-up table or other database which stores this information. The various probabilities which are input into the Bayesian model, e.g. P(nLt|FHt), P(FHt) and P(nLt), may be calculated using known techniques from the probability (PF), the number of faulty data records/envelopes nLt, the faults per unit time and the number of faulty records per unit time. This may be done using any suitable technique; for example, a frequentist approach (i.e. consulting a stored historical database and determining the frequency of events) may be suitable. As an example, P(FHt) may be determined by calculating the total number of records for which there is a failure within the following horizon time, and dividing by the total number of records available.
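Merely as an illustration of the frequentist approach described above, the Python sketch below estimates P(FHt), P(nLt) and P(nLt|FHt) by counting events in a stored historical database and combines them using equation 4. The HistoricalRecord structure and its fields are assumptions made purely for the purpose of the example.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class HistoricalRecord:
    n_faulty_in_lookback: int   # nLt observed for this historical record
    failure_in_horizon: bool    # whether a failure occurred within Ht after it

def probability_of_failure(history: List[HistoricalRecord], n_observed: int) -> float:
    """Apply equation 4 using frequentist estimates drawn from the history."""
    if not history:
        return 0.0
    total = len(history)
    failed = [r for r in history if r.failure_in_horizon]
    p_fail = len(failed) / total                                               # P(FHt)
    p_n = sum(r.n_faulty_in_lookback == n_observed for r in history) / total   # P(nLt)
    if not failed or p_n == 0:
        return 0.0
    p_n_given_fail = (sum(r.n_faulty_in_lookback == n_observed for r in failed)
                      / len(failed))                                           # P(nLt|FHt)
    return p_n_given_fail * p_fail / p_n                                       # P(FHt|nLt)
```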
The predicted probability may be compared to a failure threshold and when the predicted probability is equal to or exceeds the failure threshold, an alert may be output to a user. The alert may be an audio or visual cue, or a combination of both.
The Bayesian forecasting model may comprise a plurality of sub-models, one for each of a plurality of different look-back times (L0, L1, L2, . . . , Lj). The method may comprise, for each sub-model: determining what number of data records (nL0, nL1, nL2, . . . , nLj) received in the corresponding look-back time (L0, L1, L2, . . . , Lj) are indicative of the engineering asset being in a faulty state.
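Merely as an illustration of how the per-sub-model counts might be obtained, the Python sketch below counts the faulty records falling within each of several look-back windows measured back from the evaluation time; the data layout (parallel lists of timestamps and fault values) is an assumption for the example.

```python
from typing import Sequence, Tuple

def counts_per_lookback(timestamps: Sequence[float],
                        fault_values: Sequence[int],
                        t_now: float,
                        lookback_times: Sequence[float]) -> Tuple[int, ...]:
    """Return (nL0, nL1, ..., nLj): the number of faulty records within each
    look-back window (L0, L1, ..., Lj) measured back from the evaluation time."""
    return tuple(
        sum(f for ts, f in zip(timestamps, fault_values) if t_now - lb < ts <= t_now)
        for lb in lookback_times
    )
```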
Before the implementation of the method and system described above, an arbitrary machine learning algorithm needs to be trained. The data records within the training data correspond to the data records which are received for prediction purposes. In other words, the training data comprises data records having data collected from at least the same set of sensors. The training data may be stored on a database which may be separate to the system which implements the prediction. Indeed, the training may be carried out on a separate system and the trained algorithm transferred to the prediction system when required.
The method may comprise training the machine learning algorithm by: receiving multiple data records for at least one engineering asset which corresponds to the engineering asset for which failure is to be predicted; and classifying each of the multiple data records as either indicative of the at least one engineering asset running in an acceptable state or running in a faulty state. Classifying an engineering asset as running in an "acceptable" running state may be considered as a "non-faulty" running state and may also be considered as an alternative to the known method of classification as a "healthy" state. The multiple data records may comprise data previously collected from the plurality of sensors at a sequence of times prior to failure of the at least one engineering asset, which may be identical to the engineering asset being monitored.
The training may be implemented separately to the prediction and thus, according to another aspect of the invention, there is provided a computer-implemented method for training a machine learning algorithm to predict failure of an engineering asset based on real-time data, the method of training comprising: receiving multiple data records for at least one engineering asset which corresponds to the engineering asset for which failure is to be predicted, wherein the multiple data records comprise data previously collected from the plurality of sensors at a sequence of times prior to failure of the at least one engineering asset; and classifying each of the multiple data records as either indicative of the at least one engineering asset running in an acceptable state or running in a faulty state.
Classifying each of the multiple data records as indicative of running in a faulty state may comprise obtaining a minimum remaining running time, MRRT, which is a time period before failure of the at least one engineering asset; and classifying each of the multiple data records which is within the minimum remaining running time as indicative of running in a faulty state. The MRRT may be obtained by an input from an operator who has knowledge of the engineering asset or may be input automatically by the management system. The MRRT may be determined based on cost/benefit calculations. Such a determination, in the context of a maritime vessel, might be performed by an insurer or maritime engineer. The MRRT may be any appropriate time period, e.g. measured in seconds, hours or even days, depending on the system.
Classifying each of the multiple data records as indicative of running in an acceptable state may comprise obtaining a minimum running time, MRT, which is a time period after the at least one similar engineering asset has been started; and classifying each of the multiple data records which is after the minimum running time and before the minimum remaining running time as indicative of running in an acceptable state. In other words, the “acceptable” running data is sampled from the data records between the MRT and the MRRT and thus this training uses acceptable running sampling (ARS). The MRT may be obtained by an input from an operator who has knowledge of the engineering asset or may be input automatically by the management system. The MRT may be any appropriate time period, e.g. measured in seconds, hours or even days, depending on the system. Although data received in the MRT may not be included in the training data, data received in the MRT for an engineering asset for which failure is being monitored is likely to be included when making the predictions.
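Merely as an illustration of acceptable running sampling, the Python sketch below labels a historical training record using the MRT and MRRT; the time representation (all times expressed as numbers on a common scale) and the label names are assumptions for the example.

```python
def label_training_record(record_time: float,
                          start_time: float,
                          failure_time: float,
                          mrt: float,
                          mrrt: float) -> str:
    """Label a record 'faulty' if it falls within the MRRT before failure,
    'acceptable' if it lies between the MRT after start-up and the MRRT before
    failure, and 'excluded' if it falls within the MRT after start-up."""
    if record_time >= failure_time - mrrt:
        return "faulty"
    if record_time >= start_time + mrt:
        return "acceptable"
    return "excluded"  # recorded too soon after start-up to be sampled for training
```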
As an alternative, or in addition, to training using the acceptable running sampling (ARS) described above, time envelopes (as described above) may also be used. The method may comprise training the machine learning algorithm by receiving multiple data records for at least one engineering asset which corresponds to the engineering asset for which failure is to be predicted, wherein the multiple data records comprise data previously collected from the plurality of sensors at a sequence of times prior to failure of the at least one engineering asset; grouping the multiple data records into a plurality of data envelopes, wherein each envelope has an envelope length (eL) and contains a sequence of all data records recorded during a time interval of length eL; and targeting correct classification of each data record within each data envelope as indicative of running in a faulty state or running in a not-faulty state.
This training may also be implemented separately to the prediction and thus, according to another aspect of the invention, there is provided a computer-implemented method for training a machine learning algorithm to predict failure of an engineering asset based on real-time data, the method of training comprising: receiving multiple data records for at least one engineering asset which corresponds to the engineering asset for which failure is to be predicted, wherein the multiple data records comprise data previously collected from the plurality of sensors at a sequence of times prior to failure of the at least one engineering asset; grouping the multiple data records into a plurality of data envelopes, wherein each envelope has an envelope length (eL) and contains a sequence of all data records recorded during a time interval of length eL; and targeting correct classification of each data record within each data envelope as indicative of running in a faulty state or running in a not-faulty state.
As set out above, there may be the same or a different number of data records in each data envelope. Targeting correct classification may comprise calculating a sick rate for each envelope, for example using equation 2 above. Targeting correct classification may comprise calculating a sick value for each envelope, for example using equation 3 above.
Thus, according to another aspect of the invention, there is provided a computer-implemented method for detecting anomalous behavior of an engineering asset based on real-time data, the method comprising: receiving a data record comprising data on the engineering asset collected from a plurality of sensors at time t; and classifying, using a trained machine learning algorithm, the data record as faulty or not-faulty, wherein the machine learning algorithm has been trained as described above using one or both of acceptable running sampling and time envelopes. Classifying may further comprise grouping the received data record with a set of previously received data records to form a data envelope, classifying the data envelope as faulty or not-faulty and applying the classification of the data envelope to the data record.
According to another aspect of the invention, there is provided a non-transitory computer-readable medium comprising processor control code which, when running on a system, causes the system to carry out the method described above.
The various aspects described above may be considered to include a novel three-fold approach using one or more of acceptable running sampling, time envelopes and Bayesian forecasting. The outputs may be presented as a decision aid in a graphical user interface (GUI) that allows the user to identify suitable maintenance actions. The application is designed such that the maximum acceptable use can be extracted from an engineering asset without excessively early intervention.
It will be appreciated that any one or more features described in relation to an aspect of the invention may replace, or be used in combination with, any one or more features described in relation to another aspect of the invention, unless such replacement or combination would be understood by the skilled person as mutually exclusive, after a reading of this disclosure. In particular, any features described in relation to apparatus-like aspects may be used in combination with, or in place of, any features described in relation to method-like aspects. They may also be carried out in any appropriate order.
For a better understanding of the invention, and to show how embodiments of the same may be carried into effect, reference will now be made, by way of example, to the accompanying diagrammatic figures in which:
The management system 10 is connected to a plurality of sensors, including a temperature sensor 50, a pressure sensor 52, an electrical current sensor 54, a voltage sensor 56 and other sensor(s) 58, e.g. a sensor to measure the mass of any fluids in the system. It will be appreciated that the depicted sensors are merely illustrative, and any suitable sensors may be included in the overall system. The combination of sensors may record a plurality of physical parameters, e.g. temperatures, pressures, salinity, chemical concentrations, fluid flow rates, electrical currents, voltages or any other physical parameter, which alone or together with digital parameters describing the configuration of the system, e.g. valve positions in a piping assembly, are indicative of whether the engineering asset is operating in a healthy or faulty state. For example, for a chilled water plant, the sensor data may include readings such as temperature and pressure of the water and the coolant, from before and after the pump, and pump mechanical power, current rate etc.
Some of the internal detail of the management system 10 is shown in
The processor 30 may process the received sensor data and training data in any appropriate way. As illustrated, the specific processing may be completed in dedicated modules, e.g. a machine learning engine (or module) 32 or a protection module 34. For example, the automated self-protection protocol module shown in
Returning to
Returning to
When the MRT and MRRT are obtained, as shown in
“Acceptable” running data may be considered as an alternative to including all the data records which are indicative of truly “healthy” running. This may alleviate issues with known classifiers in which “faulty” behavior is erroneously detected very early in a system's life cycle, prompting unnecessary human intervention, and provoking enormous error in RUL estimations (as with so much remaining life, the variations in the behavior of a vessel and the conditions a system is exposed to can be enormous). In the present proposal, it is assumed that a system will accrue wear and damage throughout its life yet continue to function and retain economic value.
For each data record i, if a probability (PF) of the record being faulty at a time t exceeds or equals a critical level (which may be referred to as a probability threshold, Pth), the record may be assigned a fault value (Fi,t) of one. Alternatively, if a probability (PF) of the record being faulty at a time t is lower than the probability threshold, Pth, the fault value is zero, as defined below by equation 1:
Fi,t = 1 if PF ≥ Pth; Fi,t = 0 otherwise (equation 1)
The notation above in which the fault values are assigned both i and t subscript values reflects the fact that any given record can be labelled based on its ordinal position in the sequence of available data (i) and the time t at which it is recorded.
For each data envelope, a value which is termed the sick rate SR may be calculated from the fault values for the data records. The sick rate SR may be calculated as the sum of the fault values Fi,t for each record within the envelope divided by the number of records (N) within the envelope, as defined by equation 2:
SR = (Σ Fi,t) / N, the sum being taken over all records recorded in the interval (t - eL, t] (equation 2)
where eL is the envelope length as defined above.
In other words, the fault values for each record within a period of time (eL) preceding the most recent record at time t are summed together. It should also be noted that in this formulation, the sick rate can be applied both in the case of regular data supply (i.e. data being recorded for all sensor channels simultaneously at a set frequency), and in the case of irregular data supply. This is because it does not assume that the time between records is consistent (allowing for N to fluctuate for a given value of eL).
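Merely as an illustration of how the sick rate accommodates irregular data supply, the Python sketch below gathers the fault values of all records whose timestamps fall within the envelope length eL preceding the most recent record, so that N may vary from envelope to envelope; the (timestamp, fault value) pairing is an assumption for the example. The resulting list can then be passed to the sick rate sketch given earlier.

```python
from typing import List, Tuple

def envelope_fault_values(records: List[Tuple[float, int]],
                          t: float,
                          envelope_length: float) -> List[int]:
    """Collect the fault values of all records recorded in the interval
    (t - eL, t]; the number of records N is simply the length of the result,
    so it may fluctuate when records arrive at irregular intervals."""
    return [f for ts, f in records if t - envelope_length < ts <= t]
```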
For each envelope j, if the sick rate equals or exceeds a critical threshold (which may be referred to as a sick rate threshold Scrit), the system may be determined to be faulty at the time t. An envelope which is determined to be in a faulty state may be given a sick value of 1 and an envelope which is determined to be in a non-faulty state may be given a sick value of 0. The sick value is determined by equation 3:
Sj,t = 1 if SR ≥ Scrit; Sj,t = 0 otherwise (equation 3)
The use of the probability and sick rate thresholds described above provides new options for training the underlying machine learning algorithm. For example, as shown in step S110, arbitrary values for the probability threshold (Pth) and the sick rate threshold (Scrit) may be set.
The next step S112 is then to train the machine learning algorithm to target correct classification of the envelopes and the records within the envelopes as faulty or not-faulty using the received training data (i.e. to generate values for PF which, via equations 1, 2 and 3, lead to correct classification of the state of the engineering asset). The training may be done in any standard way, for example by evaluating an appropriate loss function. Examples of suitable loss functions include the area under receiver operating characteristic or precision/recall curves. In this case, effort is directed primarily at optimizing the machine learning classification algorithm to produce values of PF which lead to correct classification of the state of a data envelope given set values for the probability threshold (Pth) and the sick rate threshold (Scrit).
Another option for training the machine learning algorithm may be to obtain, as in step S114, a pre-trained algorithm (e.g. one trained as in steps S110 and S112, or perhaps using a different training method). The next step S116 is then to train the machine learning algorithm to target the optimum values of the probability threshold (Pth) and the sick rate threshold (Scrit) which best fit the training data to correctly classify the envelopes, and hence the records within the envelopes, as faulty or not-faulty. Again, the training may be done in any standard way, for example by simple brute-force searching of the two-dimensional space described by these variables and evaluating an appropriate loss function across the search space. In this case, effort is directed primarily at optimizing the values for the probability threshold (Pth) and the sick rate threshold (Scrit) given a pre-trained machine learning classification algorithm.
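Merely as an illustration of the brute-force search described in step S116, the Python sketch below evaluates a user-supplied loss function over a grid of candidate (Pth, Scrit) pairs and returns the best pair. How the loss scores the pre-trained classifier's outputs against the training labels is an assumption left outside the sketch; for example, it might be one minus the area under the precision/recall curve computed over the training envelopes.

```python
import itertools
from typing import Callable, Sequence, Tuple

def search_thresholds(loss: Callable[[float, float], float],
                      p_th_grid: Sequence[float],
                      s_crit_grid: Sequence[float]) -> Tuple[float, float]:
    """Brute-force search of the two-dimensional (Pth, Scrit) space: evaluate
    the loss at every grid point and return the pair with the smallest loss."""
    return min(itertools.product(p_th_grid, s_crit_grid),
               key=lambda pair: loss(*pair))
```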
Both training options lead to a trained algorithm being output at step S118. The trained algorithm may be used as described in
The management system may forecast the probability of the asset (or system) failing within a prescribed time period. This prescribed time period may be termed the horizon time (Ht) and may be set (or simply obtained) in step S124.
In addition to obtaining a horizon time which looks into the future, a look-back time (Lt) may also be set in step S126 of
The management system may then use the trained algorithm at step S127 to determine a first probability (PF) as to whether the new data record (recorded at time t) indicates that the asset which is being managed is “faulty” or “not-faulty”. As shown at step S128, equation 1 may then be applied to determine if the data record is indicative of the engineering system being in a faulty state (Fi,t has a value 1) or a non-faulty state (Fi,t has a value 0).
Optionally, the new data record may then be grouped with at least some of the previously received data records in the look-back time (Lt) to form a data envelope (labelled j and ending at time t) as described above. Each data envelope includes all records available within the envelope length time eL prior to the time t. The envelope length time eL is normally shorter than the length of the look-back time (Lt) and other envelopes which occur before the envelope including the new data record may also be defined by grouping earlier data records. Once the records have been grouped, the sick rate (SR) of the (or each) envelope is calculated using equation 2 (step S132). The sick rate (SR) is used in equation 3 to determine the sick value (Sj,t) for the (or each) envelope. The sick value may be considered to be an overall classification of the system being in a faulty state (Sj,t has a value 1) or a non-faulty state (Sj,t has a value 0).
As shown in step S135, the number of faulty data records or envelopes (nLt) within the look-back time may then be determined. Using the look-back time (Lt), the horizon time (Ht) and the number of faulty records or envelopes (nLt) within the look-back time, the probability of the system failing within the horizon time (Ht) after the moment of calculation (t) may then be forecast as in step S136. The probability may be forecast by using a Bayes equation such as that shown in equation 4, which is repeated below:
P(FHt|nLt) = P(nLt|FHt) × P(FHt) / P(nLt) (equation 4)
where FHt is the failure of the system within the time period which follows the moment of evaluation t and which is equal in length to the horizon time Ht, nLt is the number of faulty records or envelopes within the look-back time Lt prior to the moment of evaluation t, P(FHt|nLt) is the probability of failure of the engineering asset within the horizon time Ht after the moment of evaluation t given n records/envelopes classed as faulty in the look-back time Lt prior to the moment of evaluation t, P(nLt|FHt) is the probability of n records/envelopes being classed as faulty in the look-back time Lt prior to the moment of evaluation t given that a failure is known to have occurred within time Ht after the moment of evaluation t, P(FHt) is the probability of failure in any time period equivalent in length to the horizon time Ht and P(nLt) is the probability of n records/envelopes being classed as faulty in any time period equivalent in length to the look-back time Lt.
In other words, for any given record, the probability of the system failing completely in the horizon time from that record is determined as a product of the probability of n records/envelopes being classed as faulty in the look-back time measured from the time of the new data record and the probability of failure happening in any time period of length equivalent to the horizon time, divided by the probability of n records/envelopes being classed as faulty in any time period of length equivalent to the look-back time. This formulation may permit the use of existing probabilistic computing techniques to determine the probability function, for example as described in the textbook "Practical Probabilistic Programming" by Avi Pfeffer, published by Manning Publications, or by using standard software packages such as Edward, PyMC3 and PyStan. It is also noted that a background to Bayesian optimization techniques is found in "Bayesian Approach to Global Optimization: Theory and Applications" by Jonas Mockus, published by Springer Science & Business Media.
Once the probability of failure has been determined, the management system may compare the calculated probability with a pre-set threshold determined by a user (step S138). If the calculated probability is greater than the threshold, an alert may be output to the user (step S140). For example, a visual alert may be output on the display screen or an audible alert may be output. The output may be presented as a decision aid in a graphical user interface on the display screen. Alternatively, or in addition to outputting the alert, the self-protection protocol may be triggered at step S140. The self-protection protocol may adjust one or more components within the engineering asset in an attempt to reduce the probability of system failure to below the threshold. In some instances, e.g. when the probability of system failure is significantly higher than the threshold, the self-protection protocol may determine that the engineering asset needs to be switched off completely. If the probability of system failure within the horizon time is below the threshold set by the user, the system continues to collect and evaluate more data.
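Merely as an illustration of how steps S127 to S140 might fit together for one newly received record, the Python sketch below re-uses the helper functions from the earlier sketches (fault_value, probability_of_failure) and assumes a model object exposing a predict_proba-style method returning the scalar PF, a buffer of (time, fault value) pairs, a simple configuration object and alert/self-protection callbacks; all of these are assumptions for illustration rather than a definitive implementation.

```python
def process_record(model, record, t, buffer, cfg, history, alert, self_protect):
    """One pass of steps S127 to S140 for a newly received data record."""
    p_f = model.predict_proba(record)               # S127: assumed to return the scalar PF
    f = fault_value(p_f, cfg.p_threshold)           # S128: equation 1 (earlier sketch)
    buffer.append((t, f))                           # retain (time, fault value) for counting

    # S135: count faulty records within the look-back time Lt (envelopes and
    # their sick values, per equations 2 and 3, could be counted here instead).
    n_lt = sum(fv for ts, fv in buffer if t - cfg.look_back < ts <= t)

    p_fail = probability_of_failure(history, n_lt)  # S136: equation 4 (earlier sketch)

    if p_fail >= cfg.failure_threshold:             # S138/S140: decision aid / self-protection
        alert(p_fail)
        self_protect(p_fail)
    return p_fail
```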
Equation 4 may be extended to produce a plurality of hierarchical models comprising sub-models in which the number of records/envelopes (nL0, nL1, nL2, . . . , nLj) classed as faulty through multiple look-back time windows (L0, L1, L2, . . . , Lj) of varying length are used to construct prior distributions. This may provide a more accurate prediction. Other adaptations may be made, e.g. to include more background or more detail. The adaptability of the Bayesian forecasting is a useful tool in improving the predictions.
Additional data may also be used to generate other sub-models. For example, total system run-time since last repair (survival analytics) may be used to generate sub-models. For marine vessels, metrics pertaining to the current state of the sea (or other water body) on which the marine vessel is operating may be used to generate sub-models. These additional sub-models may be brought into the overall hierarchical model to provide increasingly sophisticated forecasts for the probability of failure within the horizon time.
The three approaches described above (determining an acceptable running sample, use of time envelopes and Bayesian forecasting) may be combined to produce a fault detection algorithm with the ability to provide the user with an estimate of the probability of failure in the following horizon time. Appropriately trained and configured copies of the algorithm may work simultaneously on multiple systems and specific sub-systems. For example, a marine vessel may comprise a plurality of chilled water plants and four copies of the algorithm may be deployed, either having been trained on data accumulated on the specific chilled water plant they are to monitor, or having been trained on data pooled from several chilled water plants.
When training, the training data may have been sampled as "faulty" or "acceptable" using the MRT and MRRT described above and/or grouped in time envelopes. As set out above, the training may be done by evaluating an appropriate loss function. The applicant has also recognized that more elegant solutions, such as the application of Bayesian optimization functions, may also be used and may be highly beneficial in terms of reduced computational overhead.
Attention is directed to all papers and documents which are filed concurrently with or previous to this specification in connection with this application and which are open to public inspection with this specification, and the contents of all such papers and documents are incorporated herein by reference.
Although a few preferred embodiments have been shown and described, it will be appreciated by those skilled in the art that various changes and modifications might be made without departing from the scope of the invention, as defined in the appended claims. The invention is not restricted to the details of the foregoing embodiment(s). The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.
All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.
Number | Date | Country | Kind
21275151.5 | Oct 2021 | EP | regional
2115423.2 | Oct 2021 | GB | national

Filing Document | Filing Date | Country | Kind
PCT/GB2022/052736 | 10/27/2022 | WO |