The present application generally relates to analyzing measurement results of a target system.
This section illustrates useful background information without admission that any technique described herein is representative of the state of the art.
There are various automated measures that monitor operation of complex target systems, such as communications networks or industrial processes, in order to detect problems so that corrective actions can be taken.
For example, anomaly detection models may be used for analyzing the measurement results to identify anomalous measurement results or data points that stand out from the rest of the data. Anomaly detection refers to identification of data points, items, observations, events or other variables that do not conform to an expected pattern of a given data sample or data vector. Anomaly detection models can be trained to learn the structure of normal data samples. The models output an anomaly score for an analyzed sample, and the sample is classified as an anomaly if the anomaly score exceeds a predefined threshold. There are various unsupervised and semi-supervised learning models that can be used in anomaly detection. Such models include, for example, k nearest neighbors (kNN), local outlier factor (LOF), principal component analysis (PCA), kernel principal component analysis, independent component analysis (ICA), isolation forest, autoencoder, angle-based outlier detection (ABOD), and others. Different models represent different hypotheses about how anomalous points stand out from the rest of the data.
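As a non-limiting illustration of the score-and-threshold principle described above, a kNN-based anomaly score may be sketched as follows. The function name knn_anomaly_scores, the synthetic data and the threshold value are assumptions made for this sketch only.

```python
import numpy as np

def knn_anomaly_scores(train, samples, k=3):
    """Score each sample by its mean distance to the k nearest training points."""
    train = np.asarray(train, dtype=float)
    samples = np.asarray(samples, dtype=float)
    # Pairwise Euclidean distances, shape (n_samples, n_train).
    dists = np.linalg.norm(samples[:, None, :] - train[None, :, :], axis=2)
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)

# Hypothetical "normal" data clustered around the origin.
rng = np.random.default_rng(0)
normal_data = rng.normal(0.0, 1.0, size=(200, 2))
new_samples = np.array([[0.1, -0.2], [8.0, 8.0]])

scores = knn_anomaly_scores(normal_data, new_samples, k=3)
threshold = 2.0  # assumed value; in practice it would be tuned for the data at hand
is_anomaly = scores > threshold
```

In this sketch the sample close to the normal data receives a low score, whereas the distant sample exceeds the threshold and is classified as an anomaly.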
Now a new approach is provided for analyzing measurement results of a target system.
The appended claims define the scope of protection. Any examples and technical descriptions of apparatuses, products and/or methods in the description and/or drawings not covered by the claims are presented not as embodiments of the present disclosure but as background art or examples useful for understanding the present disclosure.
According to a first example aspect there is provided a computer implemented method for analyzing measurement results of a target system. The method comprises
In some example embodiments, the information derived from the fifth matrix comprises an aggregated score for each row of the fifth matrix.
In some example embodiments, the target system is a communications network. In an alternative embodiment, the target system is an industrial process.
In yet another embodiment, the target system is a life science application.
In some example embodiments, each row of the matrices relates to respective one or more properties.
In some example embodiments, the first and second matrices are accompanied by a property matrix comprising a combination of properties for each row of the first and second matrices, and the subset that matches the second matrix is selected based on the respective combinations of properties.
In some example embodiments, the properties comprise one or more of the following: time, location, device type, device identifier, logical element, event type, management system.
In some example embodiments, the target system is a communications network and the properties comprise one or more of the following: time, location, subscriber type, subscription type, network technology, cell type, cell identifier, device type, device identifier, logical element, event type, antenna type, roaming network, management system.
In some example embodiments, the first measurement results comprise measurement results for a 24 hour time period or a multiple thereof.
In some example embodiments, each row of the first and second matrices comprises measurement results aggregated over a 5-30 minute time period.
In some example embodiments, the second measurement results comprise measurement results for a 5-30 minute time period or a multiple thereof.
In some example embodiments, the first measurement results of the first matrix comprise measurement results of a previous day and the second measurement results of the second matrix comprise at least part of measurement results of a current day.
According to a second example aspect of the present disclosure, there is provided an apparatus comprising a processor and a memory including computer program code; the memory and the computer program code configured to, with the processor, cause the apparatus to perform the method of the first aspect or any related embodiment.
According to a third example aspect of the present disclosure, there is provided a computer program comprising computer executable program code which when executed by a processor causes an apparatus to perform the method of the first aspect or any related embodiment.
According to a fourth example aspect there is provided a computer program product comprising a non-transitory computer readable medium having the computer program of the third example aspect stored thereon.
According to a fifth example aspect there is provided an apparatus comprising means for performing the method of the first aspect or any related embodiment.
Any foregoing memory medium may comprise a digital data storage such as a data disc or diskette, optical storage, magnetic storage, holographic storage, opto-magnetic storage, phase-change memory, resistive random access memory, magnetic random access memory, solid-electrolyte memory, ferroelectric random access memory, organic memory or polymer memory. The memory medium may be formed into a device without other substantial functions than storing memory or it may be formed as part of a device with other functions, including but not limited to a memory of a computer, a chip set, and a sub assembly of an electronic device.
Different non-binding example aspects and embodiments have been illustrated in the foregoing. The embodiments in the foregoing are used merely to explain selected aspects or steps that may be utilized in different implementations. Some embodiments may be presented only with reference to certain example aspects. It should be appreciated that corresponding embodiments may apply to other example aspects as well.
Some example embodiments will be described with reference to the accompanying figures, in which:
In the following description, like reference signs denote like elements or steps.
A challenge in analyzing measurement results from complex target systems, such as communications networks, industrial processes or life science applications, is that the amount of data is huge, and therefore identifying the most relevant anomalous measurement results is not an easy task.
In the context of the present disclosure, measurement results of a target system may involve sensor data and/or performance data such as pressure, temperature, manufacturing time, yield of a production phase etc. of an industrial process, or sensor data and/or performance data such as key performance indicator values, signal level, number of users, number of dropped connections etc. from a communications network. Still further, the measurement results of a target system may involve patient test results and/or sensor data from sensors monitoring patients.
In an embodiment of the present disclosure the scenario of
The process in the automation system 111 may be manually or automatically triggered. Further, the process in the automation system 111 may be periodically or continuously repeated.
The apparatus 20 comprises a communication interface 25; a processor 21; a user interface 24; and a memory 22. The apparatus 20 further comprises software 23 stored in the memory 22 and operable to be loaded into and executed in the processor 21. The software 23 may comprise one or more software modules and can be in the form of a computer program product.
The processor 21 may comprise a central processing unit (CPU), a microprocessor, a digital signal processor (DSP), a graphics processing unit, or the like.
The user interface 24 is configured for providing interaction with a user of the apparatus. Additionally or alternatively, the user interaction may be implemented through the communication interface 25. The user interface 24 may comprise circuitry for receiving input from a user of the apparatus 20, e.g., via a keyboard, a graphical user interface shown on the display of the apparatus 20, speech recognition circuitry, or an accessory device, such as a headset, and for providing output to the user via, e.g., a graphical user interface or a loudspeaker.
The memory 22 may comprise for example a non-volatile or a volatile memory, such as a read-only memory (ROM), a programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), a random-access memory (RAM), a flash memory, a data disk, an optical storage, a magnetic storage, a smart card, or the like. The apparatus 20 may comprise a plurality of memories. The memory 22 may serve the sole purpose of storing data, or be constructed as a part of an apparatus 20 serving other purposes, such as processing data.
The communication interface 25 may comprise communication modules that implement data transmission to and from the apparatus 20. The communication modules may comprise a wireless interface module, a wired interface module, or both. The wireless interface may comprise, for example, a WLAN, Bluetooth, infrared (IR), radio frequency identification (RF ID), GSM/GPRS, CDMA, WCDMA, LTE (Long Term Evolution) or 5G radio module. The wired interface may comprise, for example, Ethernet or universal serial bus (USB). The communication interface 25 may support one or more different communication technologies. The apparatus 20 may additionally or alternatively comprise more than one of the communication interfaces 25.
A skilled person appreciates that in addition to the elements shown in
The example of
In an embodiment, the first matrix covers measurement results for a 24 hour time period or a multiple of 24 hour time periods. Each row of the first matrix may comprise aggregated measurement results over a 15 minute time period or over a 5-30 minute time period, but equally some other time period could be covered by each row of the matrix. The aggregation may be based on sum of values, mean of values or standard deviation of values. Additionally or alternatively, the values of the matrix may be centered so that every column of the matrix has zero mean and unit variance. Still further, the values of the matrix may be rounded.
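As a non-limiting illustration, the aggregation and centering described above may be sketched as follows. The helper names aggregate_rows and standardize, the one-minute sampling interval and the synthetic data are assumptions made for this sketch and are not taken from the disclosure.

```python
import numpy as np

def aggregate_rows(samples, samples_per_row, how="sum"):
    """Aggregate consecutive raw samples into one matrix row per time window."""
    samples = np.asarray(samples, dtype=float)
    n_rows = samples.shape[0] // samples_per_row
    # Drop an incomplete trailing window, then group the samples by window.
    windows = samples[: n_rows * samples_per_row].reshape(n_rows, samples_per_row, -1)
    reducers = {"sum": np.sum, "mean": np.mean, "std": np.std}
    return reducers[how](windows, axis=1)

def standardize(matrix):
    """Give every column zero mean and unit variance; also return the statistics."""
    mean = matrix.mean(axis=0)
    std = matrix.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return (matrix - mean) / std, mean, std

# Hypothetical data: 1440 one-minute samples of 3 variables (one 24 hour period).
rng = np.random.default_rng(0)
raw = rng.poisson(lam=5.0, size=(1440, 3))

aggregated = aggregate_rows(raw, samples_per_row=15, how="sum")  # 96 rows of 15 minutes
first_matrix, col_mean, col_std = standardize(aggregated)
```

Returning the column statistics together with the matrix is a design choice of the sketch: it allows later data to be scaled in the same way.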
For example low-rank and sparse decomposition algorithms, robust PCA (Principal Component Analysis) or robust autoencoders can be used for this purpose. Robust autoencoders are discussed for example in Pu, Jie, Yannis Panagakis, and Maja Pantic. “Learning low rank and sparse models via robust autoencoders.” ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019. Robust PCA is discussed for example in Candes, Emmanuel J., et al. “Robust principal component analysis?.” Journal of the ACM (JACM) 58.3 (2011): 1-37, and in Bouwmans, Thierry, et al. “On the applications of robust PCA in image and video processing.” Proceedings of the IEEE 106.8 (2018): 1427-1457.
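As a non-limiting illustration, a low-rank and sparse decomposition of the kind discussed in the cited robust PCA literature may be sketched with the principal component pursuit iteration described by Candes et al. This is a simplified sketch and not necessarily the implementation used in any embodiment. The function names, the default parameters and the synthetic input are assumptions; the low-rank output is taken here to play the role of the third matrix and the sparse output the role of the fourth matrix.

```python
import numpy as np

def shrink(x, tau):
    """Soft-thresholding: move every entry towards zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svd_threshold(x, tau):
    """Soft-threshold the singular values of x."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * shrink(s, tau)) @ vt

def robust_pca(m, max_iter=1000, tol=1e-7):
    """Split m into low_rank + sparse by principal component pursuit."""
    m = np.asarray(m, dtype=float)
    n_rows, n_cols = m.shape
    lam = 1.0 / np.sqrt(max(n_rows, n_cols))        # weight of the sparse term
    mu = n_rows * n_cols / (4.0 * np.abs(m).sum())  # penalty parameter from Candes et al.
    low_rank = np.zeros_like(m)
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    for _ in range(max_iter):
        low_rank = svd_threshold(m - sparse + dual / mu, 1.0 / mu)
        sparse = shrink(m - low_rank + dual / mu, lam / mu)
        residual = m - low_rank - sparse
        dual += mu * residual
        if np.linalg.norm(residual) <= tol * np.linalg.norm(m):
            break
    return low_rank, sparse

# Hypothetical first matrix: a rank-one daily pattern plus one injected spike.
pattern = np.sin(2 * np.pi * np.arange(48) / 48)
loadings = np.array([1.0, 0.5, 2.0, 1.5, 1.0, 0.8])
first_matrix = np.outer(pattern, loadings)
first_matrix[10, 2] += 10.0

third_matrix, fourth_matrix = robust_pca(first_matrix)
```

In this sketch the injected spike is expected to end up in the sparse component, while the regular daily pattern remains in the low-rank component.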
A straightforward solution might be to treat the fourth matrix 304 as a result of the analysis, but in the present disclosure this is not the case. Instead, in the present disclosure the result of the decomposition is used for analyzing later data.
The second matrix is obtained the same way as the first matrix, but over a different time period. That is, the first measurement results and the second measurement results relate to measurement of the same phenomena or the same target over different time periods. In an example embodiment, the first matrix covers measurements of the previous day and the second matrix covers at least part of measurements of the current day.
The information derived from the fifth matrix 308 may comprise, for example, an aggregated score for each row of the fifth matrix. For example, the sum over the values of the row may be used. The row with the highest score may then be considered the most anomalous result.
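As a non-limiting illustration, the aggregated score may be computed as a row-wise sum. The numeric values below are hypothetical and serve only to show the computation.

```python
import numpy as np

# Hypothetical fifth matrix: three rows of deviations over four variables.
fifth_matrix = np.array([
    [0.1, -0.2, 0.0, 0.1],
    [2.5, 1.8, 0.3, 3.1],
    [-0.4, 0.2, 0.1, 0.0],
])

row_scores = fifth_matrix.sum(axis=1)            # aggregated score per row
most_anomalous_row = int(np.argmax(row_scores))  # index of the highest score
ranking = np.argsort(row_scores)[::-1]           # rows ordered from highest score down
```

It may be noted that with a plain sum, positive and negative deviations on the same row can cancel each other. Whether some other aggregation would suit a given target system better is left open here.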
In general, each row of the different matrices relates to respective one or more properties. The properties define the operating context in which the respective measurement result is obtained. The following is a non-exclusive list of possible properties: time, location, device type, device identifier, logical element, event type, product type, production phase, production equipment, management system. In the context of a communications network, the following properties may be additionally or alternatively used: subscriber type, subscription type, network technology, cell type, cell identifier, antenna type, roaming network. Other properties may be used, too.
The subset 307 may be selected based on properties related to the rows of the second matrix 306. That is, the rows of the third matrix that have at least partially the same properties as the rows of the second matrix may be selected for the subset. In an example embodiment, the property is time and the subset 307 is selected by selecting from the third matrix 303 the rows whose time stamps correspond to the time stamps of the rows of the second matrix 306.
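As a non-limiting illustration, time-based selection of the subset may be sketched as follows. It is assumed here that "corresponding time stamp" means the same time of day, that each row covers 15 minutes and that row i of the third matrix covers slot i of the previous day. The helper slot_of and the example values are hypothetical.

```python
from datetime import datetime
import numpy as np

def slot_of(timestamp, minutes_per_row=15):
    """Index of the time-of-day slot that a timestamp falls into."""
    return (timestamp.hour * 60 + timestamp.minute) // minutes_per_row

# Hypothetical third matrix: 96 rows (one per 15 minute slot), 2 variables.
third_matrix = np.arange(96 * 2, dtype=float).reshape(96, 2)

# Hypothetical time stamps of the rows of the second matrix (current day).
second_timestamps = [
    datetime(2021, 11, 29, 8, 0),
    datetime(2021, 11, 29, 8, 15),
    datetime(2021, 11, 29, 8, 30),
]

rows = [slot_of(t) for t in second_timestamps]
subset = third_matrix[rows]  # one row of the third matrix per row of the second matrix
```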
By using the matrix decomposition methods as defined in the method of
The example of
The following is a non-exclusive list of possible properties: time, location, device type, device identifier, logical element, event type, product type, production phase, production equipment, management system. In the context of a communications network, the following properties may be additionally or alternatively used: subscriber type, subscription type, network technology, cell type, cell identifier, antenna type, roaming network. Other properties may be used, too.
It is to be noted that the number of possible incidents may be very large. For example, in the context of a communications network, there may be 40 000 different incidents substantially at the same time. If each incident is considered, for example, every 15 minutes, the amount of data increases quickly: 96 intervals per day for 40 000 incidents already gives 3 840 000 incident-interval combinations per day. Based on this, it is clear that the amount of measurement results to analyze may be very large.
The example of
received. The first matrix comprises first measurement results of the target system the same way as in phase 311 of
For the sake of clarity it is noted that the property matrix 401 applies to the third and fourth matrices as well. That is, the rows of the matrices 303 and 304 relate to the incidents defined by respective combination of properties of the property matrix.
The second property matrix 406 has a structure corresponding to that of the first property matrix 401.
By arranging the measurement results by the incidents, the analysis may improve, as the outcome of the analysis directly provides the combination of properties that may have a problem and should be considered for possible corrective actions.
In the following, a practical example is discussed. The example relates to a communications network. Measurement results that are considered in the example comprise the following variables: moc_drops, moc_answers, lu_failures, lu_attempts, call_setup_failures, mtc_drops, mtc_answers, and mtc_attempts, wherein moc=mobile originated call, lu=location update, and mtc=mobile terminated call. These variables are readily available from communications networks.
Table 1 shows a first matrix of first measurement results. The values of the first matrix are centered so that each column has zero mean and unit variance.
The first matrix is accompanied by a first property matrix shown in Table 2. The first property matrix comprises a combination of the following properties for each row of the first matrix: logical element, event type, cell id, roaming network, subscription type, network technology, and management system. Possible logical elements comprise RANAP (Radio Access Network Application Part), DTAP CC (Direct Transfer Application Part CC), and BSSMAP (Base Station System Management Application Part). Event type refers in this example to a release reason (final state of a signal). Management system in this example can be EMSS4 or EMSS5 (Element Management System).
Tables 3 and 4 show the result of the decomposition of the first matrix. It is to be noted that the rows and the property combinations of the property matrix of Table 2 are associated with the respective rows of the decomposition results, too.
Table 5 shows a second matrix of second measurement results. The second matrix comprises the same variables as the first matrix and the values of the second matrix are centered using the mean and standard deviation values of the columns of the first matrix.
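As a non-limiting illustration, centering the second matrix with the column statistics of the first matrix may be sketched as follows. The synthetic values are hypothetical and do not reproduce Tables 1 or 5.

```python
import numpy as np

rng = np.random.default_rng(1)
first_raw = rng.normal(loc=50.0, scale=5.0, size=(96, 3))  # hypothetical earlier data
second_raw = rng.normal(loc=50.0, scale=5.0, size=(4, 3))  # hypothetical later data

# Column statistics are computed from the first matrix only.
col_mean = first_raw.mean(axis=0)
col_std = first_raw.std(axis=0)

first_matrix = (first_raw - col_mean) / col_std
second_matrix = (second_raw - col_mean) / col_std  # same scale as the first matrix
```

Reusing the statistics of the first matrix keeps both matrices on the same scale, so that values of the second matrix can be compared directly with values derived from the first matrix.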
The second matrix is accompanied with a second property matrix shown in Table 6. The second property matrix comprises the same properties as the first property matrix.
The property combinations of the second and the first property matrices are used for selecting a subset of the third matrix for the purpose of analyzing the second matrix. In the shown example property combinations of rows 6, 9, 7, 3 and 10 of the first property matrix correspond to the property combinations of the rows 1-5 of the second property matrix. Therefore a subset comprising rows 6, 9, 7, 3 and 10 of the third matrix is selected. The subset is shown in Table 7.
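As a non-limiting illustration, matching property combinations between the two property matrices may be sketched as follows. The property matrices are reduced to three properties for brevity, and the values "2G" and "3G" as well as the numeric matrix are hypothetical and do not reproduce Tables 2, 6 or 7.

```python
import numpy as np

# Hypothetical property matrices:
# (logical element, network technology, management system) per row.
first_properties = [
    ("RANAP", "3G", "EMSS4"),
    ("BSSMAP", "2G", "EMSS4"),
    ("DTAP CC", "3G", "EMSS5"),
    ("RANAP", "3G", "EMSS5"),
]
second_properties = [
    ("DTAP CC", "3G", "EMSS5"),
    ("RANAP", "3G", "EMSS4"),
]

# Hypothetical third matrix; row i belongs to first_properties[i].
third_matrix = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]])

# Look up, for each row of the second property matrix, the row of the first
# property matrix that has the same combination of properties.
row_of_combination = {combo: row for row, combo in enumerate(first_properties)}
selected_rows = [row_of_combination[combo] for combo in second_properties]
subset = third_matrix[selected_rows]
```

The sketch assumes that every combination in the second property matrix also occurs in the first property matrix; how a previously unseen combination would be handled is not addressed here.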
Table 8 shows the fifth matrix obtained by subtracting the subset of Table 7 from the second matrix. The fifth matrix provides a numerical indication of the amount of anomaly on each row of the second matrix. The fifth matrix may be output as a result of the analysis.
The content of the fifth matrix may be further processed to determine an aggregated score for each row of the second matrix. Table 9 shows such aggregated scores for the second matrix of Table 5. In this example, the values of each row are summed to obtain the aggregated score of Table 9. The row with the highest score, i.e. row 2 in this example, may then be considered the most anomalous result.
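As a non-limiting illustration, the subtraction that yields the fifth matrix may be sketched as follows. The numeric values are hypothetical and do not reproduce Tables 5, 7 or 8.

```python
import numpy as np

# Hypothetical standardized values; rows are incidents, columns are variables.
second_matrix = np.array([[0.2, -0.1, 0.0],
                          [3.0, 2.5, 0.1],
                          [-0.3, 0.4, 0.2]])
subset = np.array([[0.25, -0.1, 0.0],
                   [0.5, 0.5, 0.1],
                   [-0.3, 0.5, 0.2]])

# Element-wise deviation of the new data from the expected (stable) values.
fifth_matrix = second_matrix - subset

# Position (row, column) of the largest absolute deviation.
largest = np.unravel_index(np.argmax(np.abs(fifth_matrix)), fifth_matrix.shape)
```

Values close to zero indicate that the new measurement results follow the previously learned stable behavior, whereas large values point to the row and variable that deviate.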
Without in any way limiting the scope, interpretation, or application of the claims appearing below, a technical effect of one or more of the example embodiments disclosed herein is improved analysis of measurement results of a complex target system. Various embodiments are well suited for analyzing large sets of multivariate measurement results. Such analysis is impossible or at least very difficult to implement manually. Various embodiments provide, for example, that process variables of a complex target system may be monitored to check whether all parameters remain stable over time. Further, various embodiments may be used in the life science domain for learning normal or stable patterns from patients and for using this to detect anomalous or unstable patterns in new patients. New patients can be compared e.g. to some already analyzed patients with similar profile (properties of the property matrices 401 and 406 of
A further technical effect is that faster detection of anomalies in new data may be enabled. For example fitting a robust PCA model on both earlier and current data and using the resulting matrix of anomalous or unstable measurement results (the fourth matrix 304 of
If desired, the different functions discussed herein may be performed in a different order and/or concurrently with each other. Furthermore, if desired, one or more of the before-described functions may be optional or may be combined
Various embodiments have been presented. It should be appreciated that in this document, words comprise, include and contain are each used as open-ended expressions with no intended exclusivity.
The foregoing description has provided by way of non-limiting examples of particular implementations and embodiments a full and informative description of the best mode presently contemplated by the inventors for carrying out the solutions of the present disclosure. It is however clear to a person skilled in the art that the present disclosure is not restricted to details of the embodiments presented in the foregoing, but that it can be implemented in other embodiments using equivalent means or in different combinations of embodiments without deviating from the characteristics of the present disclosure.
Furthermore, some of the features of the afore-disclosed example embodiments may be used to advantage without the corresponding use of other features. As such, the foregoing description shall be considered as merely illustrative of the principles of the present disclosure, and not in limitation thereof. Hence, the scope of the solutions of the present disclosure is only restricted by the appended patent claims.
Number | Date | Country | Kind |
---|---|---|---|
20206329 | Dec 2020 | FI | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/FI2021/050820 | 11/29/2021 | WO |