The invention relates to communication networks. Embodiments of the present invention relate generally to mobile communications and more particularly to network devices and methods in communication networks. In particular, the invention relates to a method for cell anomaly detection, to a network device, to a computer program product and a computer-readable medium.
Current cellular network management systems rely on human or automated alarm capabilities to assess the state of the network domain (i.e. check for alarms). Given the complexity and the continuous growth of cellular infrastructure, this process often does not scale well.
Consequently, there may be a need for an automated process for detecting cell anomalies in cellular networks.
According to an exemplary embodiment of the present invention there may be provided a method for cell anomaly detection in a network comprising receiving first training data of a first source; receiving second training data of a second source; generating profiles based on the first training data; generating profiles based on the second training data; collecting the generated profiles of the first training data and of the second training data in a pool of profiles; associating a weight with each profile in the pool of profiles; providing a set of predictions based on the profiles and their associated weights; and generating data for root cause diagnosis based on at least one prediction.
In the following, exemplary embodiments are described in relation to the method. It should be understood that all features related to the method may be implemented as hardware and/or software in relation to one or more network devices.
According to exemplary embodiments of the present invention there may be provided a mechanism to manage an increased usage of multimedia streaming applications in mobile networks efficiently. The method may mine information from continuous streams of KPI data (KPI=Key Performance Indicator) and may determine deviation levels of KPIs/cells with high accuracy.
Moreover, according to an exemplary embodiment of the present invention, the method may further comprise managing the pool of profiles. This could include adding profiles and/or removing profiles. An aging approach could be used to remove the worst-performing profile from the pool, so that profiles are aged out over time. Human input could also be provided in order to remove profiles. Thus, automatic and manual mechanisms could be provided alone or in combination.
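The pool management described above can be illustrated with a minimal sketch. The names `Profile`, `aged_score`, `add_profile` and `remove_profile`, the value of `alpha` and the sample weights are assumptions made for illustration only. The decay score (weight times alpha to the power of age) follows the exponential decay mentioned later in the description, and the exact embodiment may differ.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """A learned 'normal' state; all fields are illustrative assumptions."""
    name: str
    weight: float = 1.0
    age: int = 0  # number of update cycles since the profile was created


def aged_score(profile, alpha=0.9):
    """Exponentially decayed weight: weight * alpha ** age (assumed form)."""
    return profile.weight * alpha ** profile.age


def add_profile(pool, new_profile, max_size, alpha=0.9):
    """Add a profile; if the pool exceeds max_size, age out the worst ones."""
    ranked = sorted(pool + [new_profile],
                    key=lambda p: aged_score(p, alpha),
                    reverse=True)
    return ranked[:max_size]


def remove_profile(pool, name):
    """Manual removal, e.g. triggered by human input."""
    return [p for p in pool if p.name != name]


pool = [
    Profile("univariate_a", weight=1.0, age=5),
    Profile("multivariate_b", weight=0.5, age=1),
    Profile("univariate_c", weight=0.8, age=10),
]
# Adding a fourth profile to a pool limited to three ages out the weakest.
pool = add_profile(pool, Profile("multivariate_d"), max_size=3)
names = sorted(p.name for p in pool)
pool_after_manual = remove_profile(pool, "multivariate_b")
```

In this sketch the oldest low-weight profile is the one aged out, while a freshly created profile starts with weight 1 and age 0 and therefore survives its first ranking.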
Self-Organizing Networks (SON) may be seen as a key enabler for automated network management in next generation mobile communication networks such as LTE or LTE-A, as well as in multi-radio technology networks known as heterogeneous networks (HetNet). SON areas include self-configuration, which may cover auto-connectivity and the initial configuration of new network elements (such as base stations), and self-optimization, which may target an optimal operation of the network by triggering automatic actions when the demand for services, user mobility or usual application usability changes significantly enough to require adjusting network parameters, and which may also cover use cases such as energy saving or mobility robustness optimization. These functionalities are complemented by self-healing, which aims at automatic anomaly detection and fault diagnosis. Related areas may be Traffic Steering (TS) and Energy Savings Management (ESM).
For self-healing, typically only cell outage detection (COD) and cell outage compensation (COC) are mentioned as SON self-healing use cases. However, for exemplary embodiments of the present invention, Cell Anomaly Detection and Cell Diagnosis may be considered: both cover the outage case as well as the case in which the cell is still able to provide a certain level of service but its performance is below the expected level by an amount clearly visible to the subscribers. In other words, a cell outage is a special case of degradation in which the cell is unable to provide any acceptable service, often meaning that users are not able to connect to it and there is no traffic in the cell at all. Furthermore, this approach clearly separates the detection functionality (detecting relevant symptoms potentially pointing to degradations in the network) from the diagnosis functionality (identifying the root cause of an incident).
Cell Anomaly Detection may be based on performance monitoring and/or alarm reporting. Performance data includes failure counters such as call drop, unsuccessful RACH access, etc., as well as more complex key performance indicators (KPIs) such as traffic load, which need to be monitored and profiled to describe the “usual” behavior of users and to detect whether patterns are changing in a direction that indicates a problem in the network. Two different approaches for Cell Anomaly Detection exist: a univariate approach, where each individual KPI is considered independently, and a multivariate approach, where the correlation between KPIs is taken into account. Both univariate and multivariate detection approaches have been analyzed in the past. They share the characteristic that a (set of) certain “normal” state(s) is learned (called “profiles”) in the respective training phase. In the actual detection phase, deviations from those states are identified. An advantage is the highly automatic nature of the process (the operator only needs to verify that the training phase is fault free and thus does not need to add per-KPI thresholds and the like). In order to analyze the root cause of a suspected fault, the different KPIs usually have to be correlated with each other to recognize the characteristic imprints of different faults.
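As an illustration of the univariate case, a minimal sketch of a training phase and a detection phase is given below. The mean and standard deviation as the learned “normal” state, the z-score threshold of 3.0, the function names and the sample values are all assumptions chosen for illustration; the description does not prescribe a particular univariate algorithm.

```python
from statistics import mean, stdev


def train_profile(training_kpi):
    """Training phase: learn a 'normal' state from fault-free KPI samples."""
    return {"mean": mean(training_kpi), "std": stdev(training_kpi)}


def detect(profile, samples, z_threshold=3.0):
    """Detection phase: flag samples deviating strongly from the profile."""
    std = profile["std"] or 1e-9  # guard against a constant training series
    return [abs(x - profile["mean"]) / std > z_threshold for x in samples]


# Invented call-drop rates (percent) from a period verified as fault free.
training = [2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9]
profile = train_profile(training)

# Two samples close to the learned state and one clear deviation.
flags = detect(profile, [2.0, 2.3, 9.5])
```

No per-KPI threshold on the raw value is configured here; only the fault-free training window and a generic deviation threshold are supplied, which reflects the automatic nature of the profiling approach.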
Because of the wide range of KPI types that need to be monitored, and the wide range of network incidents that need to be detected, no single traditional univariate or multivariate detection method (“classifier”) will be able to provide the desired detection performance. Detection performance relates to correctly identifying relevant events (true positives) and irrelevant events (true negatives), while avoiding missing relevant events (false negatives) and incorrectly identifying events as being relevant (false positives). An exemplary ensemble method, as shown in
There are conventional cell outage detection and recovery methods, especially for LTE technology. However, typically available commercial features may not contain any “profiling”, but rather simple per-KPI thresholding and rule sets. Both univariate and multivariate approaches for cell anomaly/degradation detection have been proposed earlier, but without an ensemble method according to the present invention, which takes into consideration the context information available from the network itself.
To achieve optimized detection performance when applied to the cell anomaly detection problem, the ensemble method may be trained to determine and dynamically adjust the weight parameter values of each individual detection method that is part of the ensemble.
The present invention may provide determining and maintaining weight values so that the performance of the compound ensemble method may be continuously optimized for the data monitored to detect cell anomalies. Moreover, this approach may also propose a triggering mechanism for training new individual detection profiles and an aging mechanism for eliminating the less efficient ones.
The proposed framework may apply individual univariate and multivariate methods to the training KPI data, leading to the construction of a pool of different predictors. Using the pool of predictors, the predictions obtained on the KPI data “under test” (i.e., being subject to detection), along with the weights allocated to each predictor, lead to the computation of the “KPI level” (i.e., the deviation of a KPI from its “normal” state). The proposed methods rely on context information (available for cellular networks) extracted from human-generated, Configuration Management (CM) or confirmed Fault Management (FM) input data to make informed decisions.
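The combination of predictions and weights into a KPI level can be sketched as follows. The description does not give the combining formula at this point, so the normalized weighted average used here is an assumption, as are the function name `kpi_level` and the sample values. It is chosen only because it keeps the result in the range [0, 1] when each prediction lies in that range.

```python
def kpi_level(predictions, weights):
    """Combine per-predictor deviation scores in [0, 1] into one KPI level.

    Assumed form: weighted average, normalized by the sum of the weights.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Four hypothetical predictors: 1.0 means 'abnormal', 0.0 means 'normal'.
predictions = [1.0, 0.0, 1.0, 0.5]
weights = [1.0, 0.5, 0.25, 0.25]
level = kpi_level(predictions, weights)  # 1.375 / 2.0
```

With these invented values the high-weight predictor dominates the outcome, so the level ends up closer to "abnormal" even though one predictor reports a normal state.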
Embodiments of the present invention are described below with reference to the accompanying drawings, which are not necessarily drawn to scale, wherein:
1) performance data measurement or measurement collection;
2) degradation detection;
3) root cause diagnosis; and
4) solution deployment.
The degradation detection may have the task to find problematic cells with low false positive rate. The root cause diagnosis may have the task to infer the root cause of the detected degradation. The solution deployment may be triggered by the degradation detection or the root cause diagnosis components.
In summary characteristics of exemplary features of the present invention are:
The exemplary method of
The Weighted Majority Algorithm (WMA) is a supervised meta-learning algorithm used to construct a compound algorithm from a pool of prediction methods or prediction algorithms, and it is leveraged by the proposed ensemble-based framework. WMA assumes that the problem is a binary decision problem (a sample is either normal or abnormal). Each prediction method or prediction algorithm from the pool has a weight associated with it. Initially, all weights are set to 1. The overall prediction is given by the collection of votes from all predictors. If the majority of the profiles in the pool make a mistake, their weights are multiplied by a factor β, where 0<β<1.
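A minimal sketch of the classic binary WMA is given below for orientation. It uses the common variant in which every predictor that voted wrongly is penalized in each round; the function names, the value β = 0.5 and the vote sequences are assumptions for illustration and are not taken from the description.

```python
def wma_predict(weights, votes):
    """Weighted majority vote; each vote is 0 (normal) or 1 (abnormal)."""
    abnormal = sum(w for w, v in zip(weights, votes) if v == 1)
    normal = sum(w for w, v in zip(weights, votes) if v == 0)
    return 1 if abnormal > normal else 0


def wma_update(weights, votes, truth, beta=0.5):
    """Multiply the weight of every mistaken predictor by beta (0 < beta < 1)."""
    return [w * beta if v != truth else w for w, v in zip(weights, votes)]


weights = [1.0, 1.0, 1.0]  # initially, all weights are set to 1
rounds = [
    ([1, 1, 0], 0),  # (votes of the three predictors, confirmed label)
    ([1, 0, 0], 0),
    ([1, 0, 1], 1),
]

predictions = []
for votes, truth in rounds:
    predictions.append(wma_predict(weights, votes))
    weights = wma_update(weights, votes, truth)
```

In the first round the majority is wrong and two predictors are penalized; from then on the third predictor, which has not yet made a mistake, carries the largest share of the vote.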
The proposed ensemble method may implement a modified version of WMA that may return a KPI level in the range [0, 1] and may use the context information for updating the weights and creating new models. Initially, the algorithm may start with a set of profiles built using different univariate and multivariate algorithms and then may execute in a continuous fashion. In the following, one example of such an implementation is given.
When a CM change is made in the system, a new profile set is created. If a predefined limit on the number of models is reached, the worst-performing profiles are removed from the pool using an exponential decay approach (according to ω_i·α^age).
If the algorithm has access to confirmed FM data or outlier information using homogeneous CM data, it uses this information to train the weights corresponding to the different univariate and multivariate methods (M5):
where th_perf is the threshold that determines whether data is deemed normal or abnormal.
The KPI levels (D4) are computed according to the learnt weights as follows (M3):
The scheme described herein has been implemented experimentally and evaluated against real network data, and it has been shown to achieve the anticipated superior detection performance.
CM Configuration Management
COC Cell Outage Compensation
COD Cell Outage Detection
ESM Energy Savings Management
FM Fault Management
GUI Graphical User Interface
KPI Key Performance Indicator
MDT Minimization of Drive Tests
MMI Man Machine Interface
NE Network Element
NM Network Management
OAM Operation, Administration and Maintenance
PM Performance Management
RACH Random Access Channel
RAT Radio Access Technology
SON Self-Organizing Networks
TS Traffic Steering
WMA Weighted Majority Algorithm
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2013/059914 | 5/14/2013 | WO | 00 |