METHOD AND NETWORK DEVICE FOR CELL ANOMALY DETECTION

Description

TECHNICAL FIELD

The invention relates to communication networks. Embodiments of the present invention relate generally to mobile communications and more particularly to network devices and methods in communication networks. In particular, the invention relates to a method for cell anomaly detection, to a network device, to a computer program product and a computer-readable medium.

BACKGROUND

Current cellular network management systems rely on human or automated alarm capabilities to assess the state of the network domain (i.e. check for alarms). Given the complexity and the continuous growth of cellular infrastructure, this process often does not scale well.

Consequently, there may be a need for an automated process for detecting cell anomalies in cellular networks.

SUMMARY

According to an exemplary embodiment of the present invention there may be provided a method for cell anomaly detection in a network comprising receiving first training data of a first source; receiving second training data of a second source; generating profiles based on the first training data; generating profiles based on the second training data; collecting the generated profiles of the first training data and of the second training data in a pool of profiles; associating a weight with each profile in the pool of profiles; providing a set of predictions based on the profiles and their associated weights; and generating data for root cause diagnosis based on at least one prediction.

In the following, exemplary embodiments are described in relation to the method. It should be understood that all features related to the method may be implemented as hardware and/or software in relation to one or more network devices.

According to exemplary embodiments of the present invention there may be provided a mechanism to manage an increased usage of multimedia streaming applications in mobile networks efficiently. The method may mine information from continuous streams of KPI data (KPI=Key Performance Indicator) and may determine deviation levels of KPIs/cells with high accuracy.

Moreover, according to an exemplary embodiment of the present invention, the method may further comprise managing the pool of profiles. This could include adding profiles and/or removing profiles. An aging approach could be used to remove the worst-performing profile from the pool of profiles, so that profiles are aged out. Human input could also be provided in order to remove profiles. Thus, automatic mechanisms as well as manual mechanisms could be provided alone or in combination.

Self-Organizing Networks (SON) may be seen as a key enabler for automated network management in next generation mobile communication networks such as LTE or LTE-A, as well as multi-radio technology networks known as heterogeneous networks (HetNet). SON areas include self-configuration, which may cover auto-connectivity and initial configuration of new network elements (such as base stations), and self-optimization, which may target an optimal operation of the network. Self-optimization may trigger automatic actions when the demand for services, user mobility or usual application usability changes so significantly that network parameters need to be adjusted, and may also cover use cases such as energy saving or mobility robustness optimization. These functionalities are complemented by self-healing, which aims at automatic anomaly detection and fault diagnosis. Related areas may be Traffic Steering (TS) and Energy Savings Management (ESM).

For self-healing, typically only cell outage detection (COD) and cell outage compensation (COC) are mentioned as SON self-healing use cases. However, for exemplary embodiments of the present invention, Cell Anomaly Detection and Cell Diagnosis may be considered: both refer to the outage case and the case that the cell is still able to provide a certain level of service but its performance is below the expected level by an amount clearly visible to the subscribers as well. In other words a cell outage is a special case of degradation meaning that the cell is unable to provide any acceptable service, often meaning that users are not able to connect to it and there is no traffic in the cell at all. Furthermore, this approach clearly separates the detection (detecting relevant symptoms potentially pointing to degradations in the network) and diagnosis functionality (identifying the root cause of an incident).

Cell Anomaly Detection may be based on performance monitoring and/or alarm reporting. Performance data includes failure counters such as call drop, unsuccessful RACH access, etc. as well as more complex key performance indicators (KPIs) such as traffic load, which need to be monitored and profiled to describe the “usual” behavior of users and to detect whether patterns are changing in a direction that indicates a problem in the network. Two different approaches for Cell Anomaly Detection exist: a univariate approach, where each individual KPI is considered independently, and a multivariate approach, where the correlation between KPIs is taken into account. Both univariate and multivariate detection approaches have been analyzed in the past. They share the characteristic that one or more “normal” states (called “profiles”) are learned in the respective training phase. In the actual detection phase, deviations from those states are identified. An advantage is the highly automatic nature of the process (the operator only needs to verify that the training phase is fault free and thus does not need to add per-KPI thresholds and the like). In order to analyze the root cause of a suspected fault, the different KPIs usually have to be correlated with each other to recognize the characteristic imprints of different faults. FIG. 1 shows such a process and will be described later on.
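
By way of a non-limiting illustration, the training phase and the detection phase of the two approaches could be sketched in Python as follows. The sketch assumes a z-score profile for the univariate case and a Mahalanobis distance over two KPIs for the multivariate case; these algorithms, the function names and the parameter `z_max` are assumptions made for illustration and are not prescribed by the embodiments.

```python
import math
from statistics import mean, pstdev

def train_univariate(samples):
    """Training phase: learn a 'normal' profile (mean, std) for one KPI."""
    return mean(samples), pstdev(samples)

def univariate_level(profile, x, z_max=3.0):
    """Detection phase: map the z-score of x to a deviation level in [0, 1].
    z_max is a hypothetical saturation point, not a value from the source."""
    mu, sigma = profile
    z = abs(x - mu) / sigma if sigma > 0 else 0.0
    return min(z / z_max, 1.0)

def train_bivariate(xs, ys):
    """Training phase: learn means and covariance of two KPIs,
    so that the correlation between them is part of the profile."""
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return mx, my, sxx, syy, sxy

def mahalanobis(profile, x, y):
    """Detection phase: distance of (x, y) from the learned normal state."""
    mx, my, sxx, syy, sxy = profile
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("degenerate covariance; more varied training data needed")
    dx, dy = x - mx, y - my
    # Quadratic form with the inverse of the 2x2 covariance matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(d2, 0.0))
```

In such a sketch, a sample that breaks the learned correlation between the two KPIs yields a larger multivariate distance than a sample that follows it, even when each KPI taken alone lies within its usual range.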

Because of the wide range of KPI types that need to be monitored, and the wide range of network incidents that need to be detected, no single traditional univariate or multivariate detection method (“classifier”) will be able to provide the desired detection performance. Detection performance relates to correctly identifying relevant events (true positives) and irrelevant events (true negatives), while avoiding missing relevant events (false negatives) and incorrectly identifying events as relevant (false positives). An exemplary ensemble method, as shown in FIG. 2 and described later on, may combine different classifiers and classify new data points by taking a weighted vote of their predictions, effectively creating a new compound detection method that, with optimized weight parameter values learned by profiling the monitored data, provides improved detection performance compared to any single method. Moreover, the ensemble method can also enable an increased level of automation.
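
For illustration only, the four outcomes that define detection performance could be counted as in the following Python sketch. The function names and the encoding of labels (1 for a relevant event, 0 for an irrelevant one) are assumptions made for this example.

```python
def detection_counts(predicted, actual):
    """Count true/false positives and negatives for binary labels
    (1 = relevant event, 0 = irrelevant event; hypothetical encoding)."""
    pairs = list(zip(predicted, actual))
    return {
        "tp": sum(1 for p, a in pairs if p == 1 and a == 1),
        "tn": sum(1 for p, a in pairs if p == 0 and a == 0),
        "fp": sum(1 for p, a in pairs if p == 1 and a == 0),
        "fn": sum(1 for p, a in pairs if p == 0 and a == 1),
    }

def rates(counts):
    """Return (true positive rate, false positive rate); 0.0 if undefined."""
    positives = counts["tp"] + counts["fn"]
    negatives = counts["fp"] + counts["tn"]
    tpr = counts["tp"] / positives if positives else 0.0
    fpr = counts["fp"] / negatives if negatives else 0.0
    return tpr, fpr
```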

There are conventional cell outage detection and recovery methods, especially for LTE technology. However, typically available commercial features may not contain any “profiling”, but rather simple per-KPI thresholding and rule sets. Both univariate and multivariate approaches for cell anomaly/degradation detection have been proposed earlier, but without an ensemble method according to the present invention which takes into consideration the context information available from the network itself.

The ensemble method approach to achieve optimized detection performance when applied to the cell anomaly detection problem may be trained to determine and dynamically adjust weight parameter values for each individual detection method that is part of the ensemble method.

The present invention may provide determining and maintaining weight values so that the performance of the compound ensemble method may be continuously optimized for the data monitored to detect cell anomalies. Moreover, this approach may also propose a triggering mechanism for training new individual detection profiles and an aging mechanism for eliminating the less efficient ones.

The proposed framework may apply individual univariate and multivariate methods to the training KPI data, leading to the construction of a pool of different predictors. Using the pool of predictors, the predictions obtained on the KPI data “under test” (i.e., being subject to detection) along with the weights allocated to each predictor lead to the computation of the “KPI level” (i.e., the deviation of a KPI from its “normal” state). The proposed methods rely on context information (available for cellular networks) extracted from human-generated, Configuration Management (CM) or confirmed Fault Management (FM) input data to make informed decisions.

BRIEF DESCRIPTION OF DRAWINGS

Embodiments of the present invention are described below with reference to the accompanying drawings, which are not necessarily drawn to scale, wherein:

FIG. 1 illustrates an exemplary cell anomaly/degradation detection and diagnosis;

FIG. 2 illustrates exemplary general ensemble method approaches for anomaly detection;

FIG. 3 illustrates an exemplary overall approach of the proposed ensemble method applied to a single cell in a cellular network; and

FIG. 4 illustrates exemplary aging mechanisms for the profile pool using context information.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

FIG. 1 illustrates a block diagram of a cell degradation management method, which may include four different boxes, representing tasks:

1) performance data measurement or measurement collection;

2) degradation detection;

3) root cause diagnosis; and

4) solution deployment.

The degradation detection may have the task of finding problematic cells with a low false positive rate. The root cause diagnosis may have the task of inferring the root cause of the detected degradation. The solution deployment may be triggered by the degradation detection or the root cause diagnosis components.

FIG. 2 illustrates an exemplary embodiment of general ensemble method approaches for anomaly detection according to the present invention. The ensemble method learns its weight parameter values and takes the weighted vote of the different profiles in the pool of profiles as the final outcome of the KPI level.

FIG. 3 illustrates an exemplary embodiment of a detailed ensemble method approach. A measurement collection may be provided which feeds into a root cause diagnosis as shown in FIG. 1. The ensemble method of FIG. 3 may learn its weight parameter values based on confirmed FM data, human knowledge and/or CM data, the latter being used for determining outliers among cells with homogeneous CM. The ensemble method uses CM changes to trigger the construction of new profiles and to age profiles based on their performance. The boxes D1-D6 represent data, whereas the boxes M1-M6 represent steps of a method. The rest of the elements indicate different context information. The dashed lines indicate that an event is triggered in the presence of new evidence/data.

FIG. 3 presents details of an example of an ensemble method according to the present invention, wherein it is distinguished between data, methods, context information and human expert knowledge. Each cell of a cellular network may be characterized by a set of KPI measurements generated as a stream of data. The provided ensemble method may be applied to each cell.

- Initially, for a given period of time, the KPI measurements of a given cell are selected as the training dataset (D1) for the pool of profiles of the ensemble method.
- A diverse set of univariate and multivariate algorithms (M1) is applied to the training dataset (D1). The univariate methods operate at the individual KPI level, while the multivariate methods operate across all KPIs.
- The result of (M1) is a set of profiles used as the pool of profiles for the ensemble method (D2). Each profile in the pool of profiles has a weight associated with it. For the initial pool of profiles, all profiles have the same weight value associated.
- Given the pool of profiles (D2), the stream of KPIs is used in a continuous fashion as the testing dataset (D5) against the pool of predictors.
  - Any CM change (C1) triggers the testing dataset to also become training KPI dataset, after which the method for generating a new set of profiles (M1) is executed. The CM change is determined automatically, based on the state of CM data.
  - If the pool of profiles reaches the maximum number of profiles, the CM change also triggers an aging mechanism (M4), which removes profiles from the pool based on both their age and performance.
- The testing dataset (D5) is tested against the profiles in the pool of profiles using the testing techniques corresponding to the univariate and multivariate methods (M2).
- The result of (M2) is a set of KPI level predictions provided by each individual profile in the pool of profiles (D3). Some of the predictions are binary (0 for a normal KPI level and 1 for an abnormal KPI level) and some have continuous values in the [0,1] range.
  - Ground truth information updates (human expert knowledge (C2), confirmed FM data (C3) and cell classification based on CM information (D6)) trigger the update weights method (M5), which penalizes the profiles in the pool of predictors based on their prediction with regard to the ground truth. The human expert knowledge assumes a manual process, while the confirmed FM data usage and outlier detection applied to CM homogenous cells are automated processes.
    - Based on CM data (C1), an outlier detection algorithm (M6) is applied to cells with identical configurations. The assumption is that CM homogenous cells (i.e., cells with identical/very similar configuration) should exhibit the same behavior across all KPIs. This component takes into consideration the behavior across multiple cells.
    - The result of (M6) indicates if the cell under test is considered an outlier or not (D6) with respect to cells with homogenous configurations.
  - The result of (M5) is an updated pool of profiles (D2) with adjusted weights, which continue to be used in the testing mode.
- All the predictions in (D3) along with the weights associated with the corresponding profiles are used in a modified weighted majority approach (M3) to generate the KPI level.
- The result of (M3) is the KPI level (D4) associated with each KPI measurement of each cell. The KPI level is then relayed to the Root Cause Diagnosis component.

In summary, characteristics of exemplary features of the present invention are:

- Using human expert knowledge (C2) (allowing for visual inspection and direct input as ground truth) to automatically assess the classification quality of each individual profile and update the weights
- Exploiting context information such as CM, FM and special event information to
  - Label data as abnormal and update the ensemble method weights appropriately, which corresponds to real cell degradation phenomena. This assumes that the FM information has been confirmed by human investigation.
  - Automatically trigger new profiles to be added to the pool of profiles of the ensemble methods based on CM information. With changes in the system, older models need to be aged out based on both age and/or performance (weights). For example, an exponential decay approach can be used for aging less accurate profiles.
  - Determine if a cell reached an anomalous state with regard to similarly configured cells, by leveraging homogenous CM information. Degrade the ensemble method weights corresponding to the outlier cells deemed normal by the corresponding profiles in the pool.
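
As a hedged illustration of how a cell could be compared with similarly configured cells, the following Python sketch flags outliers within a group of CM homogenous cells. The embodiments do not name a specific outlier detection algorithm; the modified z-score based on the median absolute deviation, the `cutoff` value and the function name are assumptions chosen for this example. The resulting flag could serve as the ground truth (D6) against which the profiles are penalized.

```python
from statistics import median

def outlier_cells(kpi_by_cell, cutoff=3.5):
    """Flag cells whose KPI value deviates from the group of identically
    configured cells, using a modified z-score (assumed technique).
    kpi_by_cell maps a cell identifier to one KPI value for the same period."""
    values = list(kpi_by_cell.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All typical cells agree exactly; any different value is an outlier.
        return {cell: v != med for cell, v in kpi_by_cell.items()}
    return {
        cell: 0.6745 * abs(v - med) / mad > cutoff
        for cell, v in kpi_by_cell.items()
    }
```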

The exemplary method of FIG. 3 can be categorized as “supervised learning”, i.e., it exposes an interface to a human operator, where the weights and corresponding performance associated with the different detection methods are visible, and which enables the operator to provide ground truth information on the actual state of the cell under test. Hence the respective MMI (GUI) is characteristic for the invention.

The Weighted Majority Algorithm (WMA) is a supervised meta-learning algorithm used to construct a compound algorithm from a pool of prediction methods or prediction algorithms, and it is leveraged by the proposed ensemble-based framework. WMA assumes that the problem is a binary decision problem (a sample is either normal or abnormal). Each prediction method or prediction algorithm from the pool has a weight associated with it. Initially, all weights are set to 1. The overall prediction is given by the collection of votes from all predictors. If the majority of the profiles in the pool make a mistake, their weights are decreased by a certain ratio 0<β<1.
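
For illustration, the binary WMA described above could be sketched in Python as follows. The sketch follows the textbook formulation in which every predictor whose vote disagrees with the revealed label has its weight multiplied by β; the function names and the tie-breaking towards "normal" are assumptions made for this example.

```python
def wma_predict(weights, votes):
    """Weighted vote over binary predictions (0 = normal, 1 = abnormal).
    Ties are resolved as normal (assumption)."""
    q0 = sum(w for w, v in zip(weights, votes) if v == 0)
    q1 = sum(w for w, v in zip(weights, votes) if v == 1)
    return 1 if q1 > q0 else 0

def wma_update(weights, votes, truth, beta=0.5):
    """Multiply the weight of each mistaken predictor by beta, 0 < beta < 1."""
    return [w * beta if v != truth else w for w, v in zip(weights, votes)]
```

With three predictors starting at weight 1 and votes of 1, 1 and 0, the compound prediction is 1; if the true label turns out to be 0, the first two weights drop to 0.5 and the same votes then yield a compound prediction of 0.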

The proposed ensemble method may implement a modified version of WMA that may return a KPI level in the range [0, 1] and may use the context information for updating the weights and creating new models. Initially, the algorithm may start with a set of profiles built using different univariate and multivariate algorithms and then may execute in a continuous fashion. In the following, one example of such an implementation is given.

When a CM change is made in the system, a new profile set is created. If a predefined limit on the number of models is reached, the worst-performing profiles are removed from the pool using an exponential decay approach (according to ω_i * α^{age_i}, where α ∈ [0,1] and age_i is the number of hours since the model was created).
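
A minimal Python sketch of such an aging step is given below, assuming that the profiles with the lowest decayed score ω_i * α^{age_i} are the ones removed. The tuple layout, the function names and the default value of α are hypothetical.

```python
def aging_score(weight, age_hours, alpha=0.99):
    """Decayed score: weight * alpha ** age, with alpha in [0, 1]."""
    return weight * alpha ** age_hours

def age_out(pool, max_profiles, alpha=0.99):
    """Keep the max_profiles profiles with the highest decayed score.
    pool is a list of (name, weight, age_hours) tuples with unique names."""
    if len(pool) <= max_profiles:
        return list(pool)
    ranked = sorted(pool, key=lambda p: aging_score(p[1], p[2], alpha), reverse=True)
    survivors = {name for name, _, _ in ranked[:max_profiles]}
    # Preserve the original ordering of the surviving profiles.
    return [p for p in pool if p[0] in survivors]
```

Under this scoring, an old profile can survive if its weight is high, while a recent profile with a low weight may still be removed, which reflects aging based on both age and performance.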

If the algorithm has access to confirmed FM data or outlier information using homogeneous CM data, it uses this information to train the weights corresponding to the different univariate and multivariate methods (M5):

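for all KPI levels in training data {

q_0 = \sum_{i:\ \mathrm{KPI\_level}_i < \mathrm{th\_perf}} \omega_i    (normal)

q_1 = \sum_{i:\ \mathrm{KPI\_level}_i \geq \mathrm{th\_perf}} \omega_i    (abnormal)

\mathrm{KPI\_level} = \begin{cases} \dfrac{\sum_{i:\ \mathrm{KPI\_level}_i \geq \mathrm{th\_perf}} \omega_i \cdot \mathrm{KPI\_level}_i}{\sum_{i:\ \mathrm{KPI\_level}_i \geq \mathrm{th\_perf}} \omega_i}, & \text{if } q_1 > q_0 \\ \dfrac{\sum_{i:\ \mathrm{KPI\_level}_i < \mathrm{th\_perf}} \omega_i \cdot \mathrm{KPI\_level}_i}{\sum_{i:\ \mathrm{KPI\_level}_i < \mathrm{th\_perf}} \omega_i}, & \text{if } q_1 \leq q_0 \end{cases}    (voting)

\forall i: \text{if } (\mathrm{KPI\_level}_i < \mathrm{th\_perf} \text{ and abnormal}) \text{ or } (\mathrm{KPI\_level}_i \geq \mathrm{th\_perf} \text{ and normal}), \text{ then } \omega_i \leftarrow \beta \cdot \omega_i

}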

where, th_perf is the threshold that determines if data is deemed normal or abnormal.
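
The weight training (M5) could, for example, be sketched in Python as follows. The function names, the default values of `th_perf` and `beta`, and the representation of the ground truth as a boolean are assumptions made for illustration.

```python
def update_weights(weights, levels, is_abnormal, th_perf=0.5, beta=0.5):
    """Penalize each profile whose prediction contradicts the ground truth.
    levels[i] is the KPI level predicted by profile i for one sample;
    is_abnormal is the ground truth for that sample."""
    updated = []
    for w, level in zip(weights, levels):
        predicted_abnormal = level >= th_perf
        updated.append(w * beta if predicted_abnormal != is_abnormal else w)
    return updated

def train_weights(weights, labelled_samples, th_perf=0.5, beta=0.5):
    """Apply update_weights over all (levels, is_abnormal) training samples."""
    for levels, is_abnormal in labelled_samples:
        weights = update_weights(weights, levels, is_abnormal, th_perf, beta)
    return weights
```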

The KPI levels (D4) are computed according to the learnt weights as follows (M3):

for all KPI levels in testing data {

q_0 = \sum_{i:\ \mathrm{KPI\_level}_i < \mathrm{th\_perf}} \omega_i    (normal)

q_1 = \sum_{i:\ \mathrm{KPI\_level}_i \geq \mathrm{th\_perf}} \omega_i    (abnormal)

\mathrm{KPI\_level} = \begin{cases} \dfrac{\sum_{i:\ \mathrm{KPI\_level}_i \geq \mathrm{th\_perf}} \omega_i \cdot \mathrm{KPI\_level}_i}{\sum_{i:\ \mathrm{KPI\_level}_i \geq \mathrm{th\_perf}} \omega_i}, & \text{if } q_1 > q_0 \\ \dfrac{\sum_{i:\ \mathrm{KPI\_level}_i < \mathrm{th\_perf}} \omega_i \cdot \mathrm{KPI\_level}_i}{\sum_{i:\ \mathrm{KPI\_level}_i < \mathrm{th\_perf}} \omega_i}, & \text{if } q_1 \leq q_0 \end{cases}    (voting)

}
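
The voting step (M3) could be sketched in Python as follows: the weights of the profiles voting "normal" and "abnormal" are summed, and the KPI level is the weighted average of the predictions on the winning side. The function name, the default threshold and the return value of 0.0 for an empty or zero-weight pool are assumptions made for this example.

```python
def kpi_level(weights, levels, th_perf=0.5):
    """Modified weighted majority vote returning a KPI level in [0, 1]."""
    normal = [(w, l) for w, l in zip(weights, levels) if l < th_perf]
    abnormal = [(w, l) for w, l in zip(weights, levels) if l >= th_perf]
    q0 = sum(w for w, _ in normal)
    q1 = sum(w for w, _ in abnormal)
    side, q = (abnormal, q1) if q1 > q0 else (normal, q0)
    if q == 0:
        return 0.0  # no usable votes; treated as normal (assumption)
    return sum(w * l for w, l in side) / q
```

For weights of 1.0, 0.25 and 1.0 and predictions of 0.9, 0.2 and 0.6, the abnormal side wins with a total weight of 2.0 against 0.25, and the resulting KPI level is the weighted average of 0.9 and 0.6.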

The scheme described herein has been implemented experimentally and evaluated against real network data, and has been shown to achieve the anticipated superior detection performance.

FIG. 4 illustrates an aging mechanism for a pool of profiles comprising profiles P₁-P_N including their respective weighting factors ω₁-ω_N. If context information, such as CM information, changes, a current profile, here profile P₁, is deleted due to its age compared to the other profiles P₂-P_N. This means that the oldest profile P₁ and its weighting factor ω₁ are deleted from the pool of profiles. In summary, FIG. 4 illustrates how context information can be leveraged for creating and aging out profiles (e.g., based on CM data).

LIST OF ABBREVIATIONS

CM Configuration Management

COC cell outage compensation

COD cell outage detection

ESM Energy Savings Management

FM Fault Management

GUI Graphical User Interface

KPI Key Performance Indicator

MDT Minimization of Drive Tests

MMI Man Machine Interface

NE Network Element

NM Network Management

OAM Operation, Administration and Maintenance

PM Performance Management

RACH Random Access Channel

RAT Radio Access Technology

SON Self-Organizing Networks

TS Traffic Steering

WMA Weighted Majority Algorithm

Claims

1. Method for cell anomaly detection in a network comprising: receiving first training data of a first source; receiving second training data of a second source; generating profiles based on the first training data; generating profiles based on the second training data; collecting the generated profiles of the first training data and of the second training data in a pool of profiles; associating a weight with each profile in the pool of profiles; providing a set of predictions based on the profiles and their associated weights; and generating data for root cause diagnosis based on at least one prediction.
2. Method according to claim 1, wherein the first source is an anomaly detection method based on a univariate approach and the second source is an anomaly detection method based on a multivariate approach.
3. Method according to claim 1, the method further comprises generating a further profile in the pool of profiles by using a context information, wherein the context information is a configuration management information.
4. Method according to claim 1, the method further comprises: detecting a change of a context information; and triggering an update of at least one weight.
5. Method according to claim 1, the method further comprises providing at least one weight based on a cell classification.
6. Method according to claim 1, the method further comprises providing at least one weight based on human expert knowledge.
7. Method according to claim 1, the method further comprises providing at least one weight based on confirmed Fault Management data.
8. Method according to claim 1, the method further comprises utilizing Key Performance Indicator measurements for the first training data or the second training data.
9. Method according to claim 1, the method further comprises generating a Key Performance Indicator level for a root cause diagnosis component.
10. Method according to claim 1, the method further comprises: testing a testing dataset against one or a plurality of profiles in the pool of profiles; and generating from that testing a set of predictions provided by each tested profile in the pool of profiles.
11. Method according to claim 10, the method further comprises utilizing the set of predictions for updating the weights.
12. Method according to claim 1, the method further comprises managing the pool of profiles.
13. Method according to claim 1, wherein the method is applied to cells in a network, wherein the method further comprises distinguishing between outlier cells and homogenous cells.
14. Network device installed in a network, comprising a receiving unit for receiving first training data of a first source and for receiving second training data of a second source; a computing unit for generating profiles based on the first training data and for generating profiles based on the second training data; a memory for collecting the generated profiles of the first training data and of the second training data in a pool of profiles; and wherein the computing unit is utilized for associating a weight with each profile in the pool of profiles; for providing a set of predictions based on the profiles and their associated weights; and for generating data for root cause diagnosis based on at least one prediction.
15. Computer program product embodied on a non-transitory computer-readable medium, said product comprising code portions for causing a network device, on which the computer program is executed, to carry out the method according to claim 1.
16. (canceled)

PCT Information

Filing Document	Filing Date	Country	Kind
PCT/EP2013/059914	5/14/2013	WO	00
