The present invention relates to an alarm monitoring device that monitors alarms transmitted when failures occur in a communication network.
Conventionally, there are alarm monitoring devices that, when failures occur in a communication network, monitor the alarms transmitted from the parts where the failures are detected. For example, Patent Literature 1 mentioned below discloses the following technique. In a system such as a communication network where alarms are generated from a large number of nodes in a chain reaction, an alarm monitoring device monitors generated alarms, and when a large number of alarms of the same type are generated within a certain time, one alarm is displayed to an operator as a representative alarm, so that the operator can quickly recognize that a large number of alarms have been generated.
Furthermore, Patent Literature 2 mentioned below discloses a technique in which an alarm monitoring device learns a correlation or a generation frequency of generated alarms, and selects and displays only statistically related alarms, so that an alarm required for identifying a cause of the alarm is selected automatically and an operation load of the operator is reduced.
However, according to the above conventional techniques, when a large number of alarms are generated from a large number of nodes in a chain reaction, in order to narrow down alarm information to be displayed to an operator, the alarm monitoring device filters derivatively generated alarms to automatically extract only a causative alarm and displays this alarm to the operator. Accordingly, in the technique of Patent Literature 1 mentioned above, there is a problem that every time a new device or a new function is added to a system to be managed, programs that analyze a dependence relationship between alarms need to be modified. According to the technique of Patent Literature 2 mentioned above, the alarm monitoring device automatically learns a new dependence relationship; however, it takes time to complete the learning.
The present invention has been made in view of the above problems, and an object of the present invention is to provide an alarm monitoring device that can effectively narrow down alarm information and display the alarm information for an operator, regardless of a dependence relationship between a wide range of alarms, in a communication network where a large number of alarms are generated in a chain reaction.
To solve the above problems and achieve an object, there is provided an alarm monitoring device according to the present invention that monitors, in a network which includes a plurality of transmission devices and establishes a route connecting one transmission device to another transmission device via a transmission line or via a transmission line and a transmission device other than the two transmission devices as a path, when a failure has occurred in the transmission device or the transmission line, an alarm transmitted from a transmission device that has detected the failure, the alarm monitoring device including: an alarm database for recording therein information included in the alarm, the information being on a detected time, a generated location, an alarm importance, and an ID of a path having the generated location as a route; and a monitoring control unit configured to register information of a received alarm in the alarm database, extract one alarm from the alarm database for each path ID based on an alarm importance and a detected time, determine the detected time as an alarm generation time, associate a path ID of the extracted alarm with the alarm generation time and the alarm importance, and cause to display the path ID as an alarm-generating path list.
The alarm monitoring device according to the present invention can effectively narrow down alarm information and display the alarm information for an operator.
Exemplary embodiments of an alarm monitoring device according to the present invention will be explained below in detail with reference to the accompanying drawings. The present invention is not limited to the embodiments.
The alarm monitoring device 1 performs centralized management on alarms transmitted by the respective communication nodes 100-1 to 100-N, notifies an operator who is responsible for managing the communication network of generation of an alarm, and provides alarm information. The alarm monitoring device 1 includes a monitoring control unit 11, a display unit 12, a path database (DB) 13, and an alarm database (DB) 14.
The monitoring control unit 11 controls an operation of the entire alarm monitoring device 1, receives alarms transmitted from the respective communication nodes 100-1 to 100-N, and stores the alarms in the alarm DB 14. The monitoring control unit 11 displays a path, in which a communication failure is occurring, as an alarm-generating path list in a list format on the display unit 12.
The alarm-generating path list is explained.
The display unit 12 is a display unit for displaying the alarm-generating path list created by the monitoring control unit 11. While a monitor can be provided as the display unit 12, the display unit 12 is not limited to a device that displays lists on a screen, and a printer or the like can be used.
The path DB 13 is a database for recording therein the path ID, a display mask flag, the alarm importance, the alarm generation time, and route information of a path in the communication network. The route information indicates information on a communication node, a transmission device, and a transmission line the path passes through.
13. The contents explained above are recorded for each path ID. For example, it is shown that a path having an ID “∘∘∘∘” has a display mask flag of “OFF”, an alarm importance of “Major”, an alarm generation time of “2010/11/16 11:12:15”, and route information of “node-1 device 1, transmission line-N transmission line 3, and node-N device 1”.
In the path DB 13, when the display mask flag is ON, the display mask flag indicates that a path is not displayed on the alarm-generating path list even while an alarm is generated. The alarm importance is an importance of an alarm with the highest importance out of alarms being generated in a path. The alarm generation time is a generation time of a most recently generated alarm out of alarms with the alarm importance.
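The rule described above for the alarm importance and the alarm generation time of a path can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the `Alarm` class, the `summarize_paths` function, and the ranking in `IMPORTANCE_RANK` are hypothetical names and values introduced here (the description names "Major" and refers to critical alarms, but does not define a full importance scale).

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed ranking; the description does not define the full scale.
IMPORTANCE_RANK = {"Critical": 3, "Major": 2, "Minor": 1}


@dataclass(frozen=True)
class Alarm:
    path_id: str
    importance: str
    detected_time: datetime


def summarize_paths(alarms):
    """Return {path_id: (alarm importance, alarm generation time)}.

    For each path, the alarm importance is the highest importance among
    its alarms, and the alarm generation time is the detected time of the
    most recent alarm having that importance.
    """
    best = {}
    for alarm in alarms:
        # Compare by importance first, then by detected time.
        key = (IMPORTANCE_RANK[alarm.importance], alarm.detected_time)
        current = best.get(alarm.path_id)
        if current is None or key > current[0]:
            best[alarm.path_id] = (key, alarm)
    return {
        path_id: (alarm.importance, alarm.detected_time)
        for path_id, (_, alarm) in best.items()
    }
```

Because the comparison key places the importance before the detected time, a later alarm of lower importance does not replace an earlier alarm of higher importance on the same path.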
The alarm-generating path list when the display mask flag is ON is explained.
The alarm DB 14 is a database for storing detailed information on generated alarms. For every alarm, an alarm generated location, a detected time, an alarm type, the alarm importance defined for each alarm type in advance, and an identifier (ID) of a related path are recorded. The monitoring control unit 11 can display contents of the alarm DB 14 as a generating alarm list on the display unit 12 so that an operator can refer to the list.
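As a rough illustration of the items recorded for every alarm, the alarm DB could be modeled with an in-memory SQLite table. The table name, column names, function names, and the ISO 8601 time format below are assumptions made for this sketch, and the alarm type strings used in any example call (such as "LOS") are hypothetical; the description does not specify a storage format.

```python
import sqlite3


def create_alarm_db():
    """Create an in-memory table holding the five items recorded per alarm."""
    conn = sqlite3.connect(":memory:")
    # detected_time is stored as ISO 8601 text so that text order equals
    # chronological order (an assumption of this sketch).
    conn.execute(
        """
        CREATE TABLE alarm (
            location      TEXT NOT NULL,
            detected_time TEXT NOT NULL,
            alarm_type    TEXT NOT NULL,
            importance    TEXT NOT NULL,
            path_id       TEXT NOT NULL
        )
        """
    )
    return conn


def register_alarm(conn, location, detected_time, alarm_type, importance, path_id):
    """Register one received alarm in the alarm DB."""
    conn.execute(
        "INSERT INTO alarm VALUES (?, ?, ?, ?, ?)",
        (location, detected_time, alarm_type, importance, path_id),
    )


def generating_alarm_list(conn):
    """Return all recorded alarms, newest first, for display to an operator."""
    cursor = conn.execute(
        "SELECT location, detected_time, alarm_type, importance, path_id "
        "FROM alarm ORDER BY detected_time DESC"
    )
    return cursor.fetchall()
```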
Referring back to the configuration of the communication network shown in
Each of the communication nodes 100-1 to 100-N is a communication node including a node management device and a plurality of transmission devices. Each of the communication nodes 100-1 to 100-N includes a large number of client interfaces for providing a communication service using a transmission line.
The transmission line is a communication line connecting the communication nodes to each other. While examples of the transmission line include optical fiber cables of a metro network and a submarine cable system, the transmission line is not limited thereto. While the transmission line is arranged so that all the communication nodes are connected to each other by a linear topology in
A flow of an alarm to the alarm monitoring device when a failure has occurred in the communication network shown in
The node management device 101-1 is one of the devices that configure the communication node 100-1 and is a management device that transmits alarms from the transmission devices 201-1, 201-2, . . . , and 201-n, which configure the communication node 100-1, to the alarm monitoring device 1.
The transmission devices 201-1, 201-2, . . . , and 201-n are devices that configure the communication node 100-1. Each of the transmission devices includes a plurality of client interfaces, and transmits and receives data to and from transmission devices belonging to other communication nodes by using the transmission line.
While the communication node 100-1 is explained, the communication nodes 100-2, . . . , and 100-N have the same configuration.
Transmission lines 31 and 32 each connect the communication nodes to each other. The transmission lines 31 and 32 are the same as the transmission line shown in
The path 41 is a communication line that is configured in advance to connect a client interface of one transmission device (hereinafter, “Ingress transmission device”) serving as a start point to a client interface of the other transmission device (hereinafter, “Egress transmission device”) serving as an end point. When the Ingress transmission device is not directly connected to the Egress transmission device by the transmission line, the path 41 is relayed by a communication node in the middle of connection. In
When a fault occurs in the transmission line or the transmission device, the transmission device in which the fault occurs or a transmission device connected to the failed transmission line (in most cases, a signal-receiving transmission device), and a transmission device located downstream in the communication direction, each transmit an alarm to notify an operator managing the network of a detected abnormality. At this time, the transmission device detects a failure and transmits an alarm via the node management device and the monitoring network 2 to the alarm monitoring device 1. The transmitted alarm includes information on an alarm generated location and pieces of information such as a detected time, an alarm type, an importance defined in advance for each alarm type, and an identifier of a related path (a path that includes the alarm generated location in its route).
“Node-1 device 1” displayed by the identification number in the path DB 13 shown in
An alarm monitoring operation of the alarm monitoring device 1 is explained next. When a failure occurs in a transmission device 202-1 on a route of the path 41, the transmission device 202-1 transmits an alarm via a node management device 101-2 and the monitoring network 2 to the alarm monitoring device 1. Furthermore, transmission devices 202-2 and (200+N)-1, which are located downstream of the transmission device 202-1 on the path 41 and are included in the communication nodes 100-2 to 100-N on the path 41, transmit alarms one after another via the node management device 101-2, a node management device 101-N, and the monitoring network 2 to the alarm monitoring device 1.
That is, when a failure occurs in the transmission device 202-1, in the alarm monitoring device 1, the monitoring control unit 11 one after another receives a large number of alarms from the transmission device 202-1 and its downstream transmission devices on the path 41. Concerning the order that alarms reach the alarm monitoring device 1, the alarm transmitted by the transmission device 202-1 does not necessarily reach the alarm monitoring device 1 first, and the alarms reach the alarm monitoring device 1 in a random order.
In the alarm monitoring device 1 having received alarms, the monitoring control unit 11 confirms a hold timer with respect to a path related to each of the received alarms. The hold timer is a timer that waits for a process of displaying a path related to a received alarm on the alarm-generating path list for a period set by the timer since the reception of the alarm. The hold timer prevents the contents displayed on the alarm-generating path list from being updated frequently in a short time. There is a method of associating a hold timer with a path to which a configuration indicated in items of the route information of
A process in the alarm monitoring device 1 before an alarm-generating path list is displayed after the hold timer elapses is explained next.
The monitoring control unit 11 then confirms whether the display mask flag of the corresponding path is ON in the path DB 13 (Step S16). When the display mask flag is OFF (NO at Step S16), the monitoring control unit 11 displays the path ID, the alarm generation time, and the alarm importance registered in the path DB 13 as the alarm-generating path list on the display unit 12 (Step S17). At this time, when there is an already displayed path, paths are sorted by the alarm generation time and displayed so that a path of the latest alarm generation time is at the head of the list. On the other hand, when the display mask flag is ON (YES at Step S16), the monitoring control unit 11 ends the process without displaying the corresponding path on the alarm-generating path list.
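Steps S16 and S17 described above can be sketched as follows. The `PathRecord` class and the `alarm_generating_path_list` function are hypothetical names introduced for this sketch, and it assumes that the records passed in cover only paths in which an alarm is currently generated.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PathRecord:
    path_id: str
    display_mask: bool          # True corresponds to the display mask flag ON
    importance: str
    generation_time: datetime


def alarm_generating_path_list(records):
    """Build the rows of the alarm-generating path list.

    Paths whose display mask flag is ON are skipped (Step S16), and the
    remaining paths are sorted so that the path with the latest alarm
    generation time is at the head of the list (Step S17).
    """
    visible = [r for r in records if not r.display_mask]
    visible.sort(key=lambda r: r.generation_time, reverse=True)
    return [(r.path_id, r.generation_time, r.importance) for r in visible]
```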
A method of handling a case where a critical failure occurs, such as when an optical cable serving as a transmission line is cut, is explained here.
Accordingly, an operator confirms on the display unit 12 of the alarm monitoring device 1 that a large number of paths have failed at substantially the same time. As a result of the confirmation, when the operator recognizes the failure of the transmission line between the communication nodes 100-2 and 100-3, the operator can specify the transmission line between the communication nodes 100-2 and 100-3 on the alarm-generating path list and instruct the monitoring control unit 11 to apply the display mask.
While a case of setting ON/OFF of the display mask flag has been explained above, it is also possible to configure such that the display mask flag cannot be set. In this case, as shown in
As explained above, according to the present embodiment, in a communication network where a plurality of alarms are generated by one failure, the monitoring control unit 11 of the alarm monitoring device 1 displays, for each path ID, only the latest alarm out of the alarms having the highest alarm importance as the alarm-generating path list. With this configuration, each path in which a failure is occurring is displayed in one row together with its alarm importance. An operator can confirm states of a large number of paths simultaneously and take a measure to handle failures, starting from a path in which a critical alarm is generated.
The monitoring control unit 11 sorts the alarm-generating path list by the alarm generation time and displays a path having the latest alarm generation time at the head. Accordingly, in a path in which a failure is already occurring, when a failure that is not related to the previous failure occurs after a while, the display position of this path is changed to the head of the list. Thus, an operator can easily recognize that a new failure occurs.
The monitoring control unit 11 updates the alarm-generating path list after a set time since the first alarm is generated by using a hold timer. Accordingly, even when the communication network is a wide area network such as a submarine cable system and a time lag occurs between times when alarms transmitted from the respective communication nodes reach the alarm monitoring device 1, because each of a large number of alarms caused by one failure is not searched for in the alarm DB 14, a processing load of the monitoring control unit 11 can be reduced.
The monitoring control unit 11 selects the alarm information displayed on the alarm-generating path list based not on the alarm type but on the alarm importance and the generation time. Accordingly, even when a new device is added to the communication network and thus the number of alarm types increases, no modification for the new alarm types is needed in the alarm monitoring device 1.
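The hold timer behavior can be sketched as below, under the assumption that one timer is kept per path and that alarms received while a timer is running do not restart it. The `HoldTimers` class and its method names are hypothetical, and the current time is passed in explicitly as a number of seconds so that the sketch can be exercised without waiting.

```python
class HoldTimers:
    """Per-path hold timers that delay updates of the path list."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        # path_id -> time at which the path may be processed for display
        self._deadline = {}

    def on_alarm(self, path_id, now):
        """Start a timer on the first alarm of a path.

        Later alarms for the same path leave the running timer untouched,
        so one failure producing many alarms causes a single list update.
        """
        self._deadline.setdefault(path_id, now + self.hold_seconds)

    def pop_expired(self, now):
        """Return the paths whose hold timer has elapsed and forget them."""
        due = sorted(p for p, t in self._deadline.items() if t <= now)
        for path_id in due:
            del self._deadline[path_id]
        return due
```

With this arrangement the alarm DB only needs to be searched once per path when `pop_expired` reports the path, rather than once per received alarm.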
When a failure by which alarms are generated in a large number of paths such as a failure of a transmission line occurs, an operator can hide, from the alarm-generating path list by using a display mask function, a path passing through the transmission line and/or the transmission device in which a known failure is occurring. With this configuration, an operator can easily find, on the alarm-generating path list, a path in which an alarm is generated, the alarm being caused by other failures not related to the known failure.
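The display mask function can be illustrated by a small sketch in which specifying a transmission line or transmission device sets the display mask flag of every path whose route information contains it. The dictionary layout of the path DB, the element identifiers, and the function name `apply_display_mask` are assumptions introduced here.

```python
def apply_display_mask(path_db, element_id):
    """Set the display mask flag ON for every path passing through element_id.

    path_db maps a path ID to a record with a "route" list (identifiers of
    the transmission devices and transmission lines the path passes
    through) and a "display_mask" flag. Returns the IDs of the masked paths.
    """
    masked = []
    for path_id, record in path_db.items():
        if element_id in record["route"]:
            record["display_mask"] = True
            masked.append(path_id)
    return masked
```

Paths that do not pass through the specified element keep their flag, so alarms caused by unrelated failures remain visible on the alarm-generating path list.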
According to the first embodiment, a path passing through a transmission device and/or a transmission line in which a display mask is specified is hidden from an alarm-generating path list even when an alarm is generated. According to the present embodiment, when an alarm is newly generated after a certain time elapses since the display mask is specified, the path is displayed on the alarm-generating path list. Elements different from those of the first embodiment are explained below.
The alarm monitoring device 1a performs centralized management on alarms transmitted by the respective communication nodes 100-1 to 100-N, notifies an operator who is responsible for managing the communication network of generation of an alarm, and provides alarm information. The alarm monitoring device 1a includes a monitoring control unit 11a, the display unit 12, a path database (DB) 13a, the alarm database (DB) 14, and a mask database (DB) 15.
The monitoring control unit 11a controls an operation of the entire alarm monitoring device 1a, receives alarms transmitted from the respective communication nodes 100-1 to 100-N, and stores the alarms in the alarm DB 14. The monitoring control unit 11a displays a path, in which a communication failure is occurring, as an alarm-generating path list in a list format on the display unit 12.
While the path DB 13a is a database similar to that of the first embodiment (see
The mask DB 15 is a database in which a time when the display mask ends is stored for the ID of each transmission device and/or the ID of each transmission line installed in the communication network.
The alarm-generating path list when the display-mask end time is set is explained here.
An alarm monitoring operation of the alarm monitoring device 1a according to the present embodiment is explained next. An operation from when the transmission device transmits an alarm to when the alarm monitoring device 1a receives the alarm is the same as that in the first embodiment. Similarly to the first embodiment, when the alarm monitoring device 1a receives an alarm, the alarm monitoring device 1a performs the process when receiving an alarm shown in
A process in the alarm monitoring device 1a before the alarm-generating path list is displayed after the hold timer expires is explained next.
The monitoring control unit 11a then confirms the alarm generation time of the corresponding path in the path DB 13a (Step S31) and when the alarm generation time is after the display-mask end time (YES at Step S31), displays a path ID, an alarm importance, and an alarm generation time registered in the path DB 13a as the alarm-generating path list on the display unit 12 (Step S17). At this time, when there is an already displayed path, paths are sorted by the alarm generation time and displayed so that a path of the latest alarm generation time is at the head of the list. On the other hand, when the alarm generation time is before the display-mask end time (NO at Step S31), the monitoring control unit 11a ends the process without displaying the corresponding path on the alarm-generating path list.
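The decision at Step S31 reduces to a comparison of two times, sketched below. The function name `should_display` is hypothetical. The description does not state how a path without a display-mask end time, or an alarm generation time exactly equal to the end time, is handled; this sketch assumes the former is displayed and the latter is hidden.

```python
from datetime import datetime


def should_display(generation_time, mask_end_time):
    """Decide whether a path appears on the alarm-generating path list.

    The path is displayed when its alarm generation time is after the
    display-mask end time. A path with no end time set (None) is treated
    as unmasked, which is an assumption of this sketch.
    """
    if mask_end_time is None:
        return True
    return generation_time > mask_end_time
```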
A case where an operator specifies the display mask is explained next.
Meanwhile, as a result of searching for the specified transmission device and/or the specified transmission line in the mask DB 15 (Step S42), when there is corresponding information (YES at Step S43), the display-mask end time in the mask DB 15 is updated to the time obtained by adding the set time to the current time (Step S48). A process subsequent to Step S45 is then continued.
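The update of the display-mask end time at Step S48 can be sketched as follows. The mask DB is modeled as a plain dictionary keyed by the ID of a transmission device or transmission line, and the function name `set_display_mask` is hypothetical. The sketch assumes that a newly specified element receives the same end time as an already registered one.

```python
from datetime import datetime, timedelta


def set_display_mask(mask_db, element_id, now, set_time):
    """Record the display-mask end time for a transmission device or line.

    The end time is the current time plus the set time. If the element is
    already registered in the mask DB, its end time is overwritten.
    """
    end_time = now + set_time
    mask_db[element_id] = end_time
    return end_time
```

For example, calling the function again for the same element ten minutes later moves the end time ten minutes further out, which extends the period during which the related paths stay hidden.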
An operation when an operator cancels a display mask is explained next.
When the corresponding path is found, the monitoring control unit 11a searches the mask DB 15 to determine whether information of other transmission lines and/or transmission devices through which the corresponding path passes is registered (Step S54). With respect to a path in which the corresponding mask information is not found (NO at Step S55), the monitoring control unit 11a clears the display-mask end time of the corresponding path ID in the path DB 13a (Step S56). Further, when the alarm generation time is registered, that is, when an alarm is generated, the monitoring control unit 11a displays the path ID, the alarm generation time, and the alarm importance of the corresponding path on the alarm-generating path list (Step S57).
Meanwhile, with respect to the path in which the corresponding mask information is found (YES at Step S55), the monitoring control unit 11a copies, out of the mask information, the latest display-mask end time as the display-mask end time of this path in the path DB 13a (Step S58). When the alarm generation time of this path registered in the path DB 13a is after the display-mask end time (YES at Step S59), the monitoring control unit 11a displays this path on the alarm-generating path list (Step S57). When the alarm generation time of this path is before the display-mask end time (NO at Step S59), the monitoring control unit 11a ends the process without displaying the path on the alarm-generating path list.
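The handling of the display-mask end time when a mask is canceled (Steps S54 to S58) can be sketched as follows. The dictionary layouts and the function name `cancel_display_mask` are assumptions, and the sketch further assumes that the canceled element is first removed from the mask DB, which is not stated in the passage above.

```python
from datetime import datetime


def cancel_display_mask(mask_db, path_db, element_id):
    """Cancel the display mask of one transmission device or line.

    mask_db maps an element ID to its display-mask end time. path_db maps
    a path ID to a record with a "route" list and a "mask_end_time".
    For each path through the canceled element, the end time is cleared
    when no other masked element lies on its route; otherwise the latest
    end time among the remaining masked elements is copied to the path.
    """
    mask_db.pop(element_id, None)  # assumed step: drop the canceled mask
    for record in path_db.values():
        if element_id not in record["route"]:
            continue  # path does not pass through the canceled element
        remaining = [mask_db[e] for e in record["route"] if e in mask_db]
        record["mask_end_time"] = max(remaining) if remaining else None
```

After this update, whether each affected path is shown follows from comparing its alarm generation time with the resulting end time, as in Steps S57 and S59.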
As described above, according to the present embodiment, the monitoring control unit 11a sets a set time during which a display-masked path is hidden. The monitoring control unit 11a does not display a path corresponding to the display mask on the alarm-generating path list for the set time. With this configuration, only the alarms that reach the alarm monitoring device within the set time, during which alarms related to a recognized failure arrive, are hidden from the alarm-generating path list, while newly generated alarms on the path that are not related to the recognized failure can be displayed on the alarm-generating path list. Therefore, even when failures independent of each other overlap on the same path, an operator can find the alarms.
As described above, the alarm monitoring device according to the present invention is useful for monitoring a network, and is particularly suitable for a network in which a plurality of alarms are generated by the same cause.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2010/072381 | 12/13/2010 | WO | 00 | 2/5/2013 |