The present invention relates to an access node monitoring control apparatus, access node monitoring system, access node monitoring method, and access node monitoring program for monitoring an access node apparatus.
Now that the Internet is used as a social infrastructure for various purposes, service providers are required to conclude an SLA (service level agreement) that includes a reliability index with each user and to provide services in accordance with the SLA.
It is generally demanded that carrier-grade reliability ensure 99.999% availability. This level of reliability means that service may be disrupted for only approximately 5 minutes per year. Therefore, when a fault occurs, fault recovery must be achieved within minutes. In addition, it is necessary to maintain service fairness among a plurality of users who have signed the same SLA.
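The figure of approximately 5 minutes per year follows directly from the availability target. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is illustrative and is not part of the disclosure):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the downtime budget implied by an availability target."""
    return (1.0 - availability) * period_minutes


five_nines = allowed_downtime_minutes(0.99999)
print(f"{five_nines:.2f} minutes per year")  # prints "5.26 minutes per year"
```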
When a fault occurs and then corresponding fault recovery is to be achieved, the order of actions to be taken by maintenance personnel, such as unit replacement, is largely determined at the discretion of the maintenance personnel. In such a case, therefore, optimum maintenance for attaining the SLA and maintaining inter-user fairness is not always performed. If, for instance, a plurality of access node apparatuses are faulty at one access node station, the order of actions to be taken to achieve fault recovery should be optimized in consideration of the SLA and user fairness. However, it is difficult for the maintenance personnel to determine the optimum order of actions to be taken to achieve fault recovery.
Technologies for attaining a reliability-related SLA and assuring inter-user fairness have already been disclosed. For instance, Patent Document 1 describes a subscriber QoS control method that is designed to provide QoS services to remote subscriber terminals in accordance with a subscriber-specific SLA.
A remote diagnosis/troubleshooting system disclosed, for instance, in Patent Document 2 subjects a fault analysis/troubleshooting process to load distribution and performs troubleshooting in accordance with a fault level.
The problem to be solved is to comply with a user-specific SLA concerning reliability while maintaining fairness among users who have signed the same SLA.
Solving the problem makes it possible to provide fair and highly satisfying services to the users. The subscriber QoS control method described in Patent Document 1 aims at implementing traffic-related QoS and effectively using network resources. This method does not consider the implementation of a reliability-related SLA or the provision of fairness among users who have signed the same SLA. Nor does it indicate how, when multiple faults occur for example, prompt recovery from those faults can be achieved while maintaining fairness among users who have signed the same SLA.
Meanwhile, the remote diagnosis/troubleshooting system described in Patent Document 2 can perform troubleshooting in accordance with the level of an encountered fault. However, this system does not consider the implementation of a reliability-related SLA for a plurality of users or the provision of fairness among the users. Therefore, this system cannot address the aforementioned problem, as is the case with the method described in Patent Document 1.
The present invention has been made in view of the above circumstances and has an object of providing an access node monitoring control apparatus, access node monitoring system, access node monitoring method, and access node monitoring program for increasing the possibility of implementing a user-specific SLA concerning reliability while maintaining fairness among users who have signed the same SLA.
An access node monitoring control apparatus according to the present invention monitors a plurality of access node devices that each include one or more interface cards corresponding respectively to a plurality of users.
The access node monitoring control apparatus includes: SLA information storage means for storing information about reliability-related SLAs signed by the users; fault information receiving means for receiving information about faults in a device used to provide communication services to the users; fault history storage means for storing a history of faults indicated by the information received by the fault information receiving means in association with the users affected by the faults; and next maintenance operation determination means for determining the next maintenance operation in accordance with information about a currently faulty device, information about an SLA for a user affected by a fault in the faulty device, and a fault history associated with the user.
An access node monitoring system according to the present invention monitors a plurality of access node devices that each include one or more interface cards corresponding respectively to a plurality of users. The access node monitoring system includes an access node monitoring control apparatus that is connected to each of the plurality of access node devices through a communication network. The access node monitoring control apparatus includes: SLA information storage means for storing information about reliability-related SLAs signed by the users; fault information receiving means for receiving information about faults in a device used to provide communication services to the users; fault history storage means for storing a history of faults indicated by the information received by the fault information receiving means in association with the users affected by the faults; and next maintenance operation determination means for determining the next maintenance operation in accordance with information about a currently faulty device, information about an SLA for a user affected by a fault in the faulty device, and a fault history associated with the user.
An access node monitoring method according to the present invention is used to monitor a plurality of access node devices that each include one or more interface cards corresponding respectively to a plurality of users. The access node monitoring method includes the steps of: causing a storage device to store information about reliability-related SLAs signed by the users; receiving information about faults in a device used to provide communication services to the users; causing the storage device to store a history of faults indicated by the received information in association with the users affected by the faults; and determining the next maintenance operation in accordance with information about a currently faulty device, information about an SLA for a user affected by a fault in the faulty device, and a fault history associated with the user.
An access node monitoring program according to the present invention is used to monitor a plurality of access node devices that each include one or more interface cards corresponding respectively to a plurality of users. The access node monitoring program causes a computer, the computer having SLA information storage means for storing information about reliability-related SLAs signed by the users, to perform: a fault information receiving process for receiving information about faults in a device used to provide communication services to the users; a fault history registration process for causing a storage device to store a history of faults indicated by the received information in association with the users affected by the faults; and a next maintenance operation determination process for determining the next maintenance operation in accordance with information about a currently faulty device, information about an SLA for a user affected by a fault in the faulty device, and a fault history associated with the user.
The present invention makes it possible to increase the possibility of implementing a user-specific SLA concerning reliability while maintaining fairness among users who have signed the same SLA. As a result, fair and highly satisfying services can be provided to the users.
An exemplary embodiment of the present invention will now be described with reference to the accompanying drawings.
The access node monitoring control apparatus 1 includes control means 11 and a database (SLA-DB) 12. The control means 11 is a computer having, for instance, a CPU that operates in accordance with a program. The database 12 stores, for instance, user-specific SLA information.
The control means 11 determines the priorities of maintenance operations in accordance, for instance, with the user-specific SLA information, a fault history, and a currently encountered fault, notifies the maintenance personnel of the next optimum maintenance operation, and provides control over the automatic shipment of spare parts.
At the access node station 2, the access node devices 21 and the maintenance terminal 24 are installed. The maintenance terminal 24 is operated by the maintenance personnel. The access node devices 21 each include a common section 22 and the interface cards 23.
The common section 22 exchanges a monitoring control signal with the access node monitoring control apparatus 1 through the monitoring control network. The common section 22 also exercises various setup and control functions for the local interface cards 23.
Each interface card 23 is a communication control device that provides a connected user (or more specifically, a connected user terminal) with an access link to a predetermined network such as the Internet.
The repair/delivery request receiving device 31 receives a repair request or a delivery request from the access node monitoring control apparatus 1 and performs a voucher printing process or other predetermined process as needed.
The fault information receiving section 101 receives fault information about the interface cards 23 from the access node station 2 (individual access node devices 21).
The priority judgment section 102 determines the recovery priorities of faulty devices from currently encountered fault information, fault history, and SLA information.
The fault queue priority processing section 104 performs a process on the fault queues 103 (or more specifically, the high-priority queue 1031 and the low-priority queue 1032) to store fault information in order of priority.
The maintenance operation/spare parts shipment determination section 105 determines the next maintenance action from the status of the fault queues 103 and references the spare parts inventory database 109 to determine the dispatch of spare parts as needed.
The maintenance operation/spare parts shipment instruction transmission section 106 receives instructions from the maintenance operation/spare parts shipment determination section 105, transmits a maintenance operation instruction signal to the maintenance terminal 24 in order to issue a maintenance operation instruction to the maintenance personnel at the access node station 2, and transmits a spare parts shipment instruction signal to the repair/delivery request receiving device 31 installed in the repair plant/warehouse 3.
The fault history database 107 records a fault history of each device (access node device and interface card).
The SLA database 108 stores information about an SLA that is signed between each user and a service provider, who is an operator of the access node monitoring system. In the present invention, the information about an SLA at least includes information indicated by a reliability-related index. The reliability-related index may be, for example, availability, MTTF (mean time to failure), MTTR (mean time to recovery), or a combination of these.
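One way to picture an entry in the SLA database 108 is as a per-user record holding target values for one or more reliability indices. The following is a minimal sketch only; the class name, field names, units, and the in-memory dictionary are all assumptions made for illustration, and the disclosure does not prescribe a storage layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SlaRecord:
    """Hypothetical per-user SLA entry with reliability-related targets."""
    user_id: str
    target_availability: Optional[float] = None  # fraction, e.g. 0.99999
    target_mttf_hours: Optional[float] = None    # mean time to failure
    target_mttr_hours: Optional[float] = None    # mean time to recovery

    def __post_init__(self):
        # The SLA information includes at least one reliability-related index.
        indices = (self.target_availability,
                   self.target_mttf_hours,
                   self.target_mttr_hours)
        if all(value is None for value in indices):
            raise ValueError("at least one reliability index is required")


# Illustrative in-memory stand-in for the SLA database, keyed by user.
sla_db = {
    record.user_id: record
    for record in (
        SlaRecord("user-a", target_availability=0.99999),
        SlaRecord("user-b", target_mttf_hours=8000.0, target_mttr_hours=4.0),
    )
}
```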
The spare parts inventory database 109 stores information about the inventory of spare parts for each device.
In the present exemplary embodiment, the fault information receiving section 101, the priority determination section 102, the fault queue priority processing section 104, and the maintenance operation/spare parts shipment determination section 105 are implemented, for instance, by a CPU operating in accordance with a program. The maintenance operation/spare parts shipment instruction transmission section 106 is implemented by a CPU operating in accordance with a program and a communication control device such as a network card. The fault history database 107, the SLA database 108, and the spare parts inventory database 109 are implemented, for instance, by a storage device, such as a database system, and a control section that provides access control of the storage device. The fault queues 103, which include the high-priority queue 1031 and the low-priority queue 1032, are implemented, for instance, by an internal storage device such as a RAM.
An operation of the present exemplary embodiment will now be described.
In the access node monitoring control apparatus 1, the fault information receiving section 101 receives fault information that is transmitted from an access node device 21 (step S101). Upon receipt of the fault information, the fault information receiving section 101 updates the fault history database 107 and forwards (outputs) the received fault information to the priority judgment section 102 (step S102).
Upon receipt of the fault information, the priority judgment section 102 judges recovery priorities based on the currently encountered fault information, fault history, and SLA information, and stores the fault information in the fault queues in accordance with the determination result (steps S103 and S104). The priority judgment section 102 changes as needed the order of queued items within the same class in accordance with the fault history. More specifically, the priority judgment section 102 determines the priorities of all faulty interface cards and places the fault information on the fault queues in accordance with the determined priorities. For example, a user for which the services were disrupted for an extended period of time may be placed at the beginning of the queue.
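The queue placement and in-class reordering described above can be sketched as follows. This is an illustrative sketch, not the disclosed implementation: it assumes the two-class split into business and ordinary users that this embodiment uses for its example configuration, and the field names `user_class` and `past_downtime_minutes` are hypothetical.

```python
def build_fault_queues(faults):
    """Split faults by user class, then order each queue by past disruption."""
    high = [f for f in faults if f["user_class"] == "business"]
    low = [f for f in faults if f["user_class"] != "business"]
    for queue in (high, low):
        # Users whose service was disrupted longest move to the front.
        queue.sort(key=lambda f: f["past_downtime_minutes"], reverse=True)
    return high, low


faults = [
    {"card_id": "card-1", "user_class": "ordinary", "past_downtime_minutes": 5},
    {"card_id": "card-2", "user_class": "business", "past_downtime_minutes": 2},
    {"card_id": "card-3", "user_class": "ordinary", "past_downtime_minutes": 40},
    {"card_id": "card-4", "user_class": "business", "past_downtime_minutes": 15},
]
high_queue, low_queue = build_fault_queues(faults)
```

With this input, the high-priority queue holds card-4 followed by card-2, and the low-priority queue holds card-3 followed by card-1.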
Further, when a target availability is defined as the SLA information, an actual availability may be calculated from the relationship between operating time and downtime to determine the percentage at which the SLA is satisfied (or dissatisfied). This result may be defined as the degree of SLA satisfaction. The associated fault queue may be determined in accordance with the degree of SLA satisfaction. For example, when the degree of SLA satisfaction is lower than a predetermined threshold value, the associated fault queue may be identified as the high-priority queue, and when the degree of SLA satisfaction is higher than the predetermined threshold value, the associated fault queue may be identified as the low-priority queue. An additional process may be performed, for instance, to rearrange the queued items placed on a priority-specific queue in the order from the lowest degree of SLA satisfaction to the highest. For example, a current determination result may be compared against a stored judgment history so as to change the order of queued items. As regards the MTTF and MTTR, the degree of SLA satisfaction may be similarly calculated from the relationship between target and result to determine the recovery priorities. Maintenance actions can be taken in accordance with the determined recovery priorities to fully implement a reliability SLA while maintaining fairness among users.
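The threshold-based classification above can be sketched as follows. The text does not fix a formula for the degree of SLA satisfaction, so this sketch assumes it is the ratio of actual availability to target availability, expressed as a percentage. The names, the 100% threshold, and the choice to place a value exactly equal to the threshold on the low-priority queue are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass
class FaultEntry:
    card_id: str
    uptime_hours: float
    downtime_hours: float
    target_availability: float  # fraction, e.g. 0.999


def sla_satisfaction(entry: FaultEntry) -> float:
    """Assumed definition: actual availability / target availability, in %."""
    total = entry.uptime_hours + entry.downtime_hours
    if total <= 0:
        raise ValueError("no operating record available yet")
    actual = entry.uptime_hours / total
    return 100.0 * actual / entry.target_availability


def enqueue(entries, threshold=100.0):
    """Place entries on the high or low queue, lowest satisfaction first."""
    high, low = [], []
    for entry in entries:
        target = high if sla_satisfaction(entry) < threshold else low
        target.append(entry)
    high.sort(key=sla_satisfaction)
    low.sort(key=sla_satisfaction)
    return high, low


entries = [
    FaultEntry("card-1", uptime_hours=8750, downtime_hours=10, target_availability=0.999),
    FaultEntry("card-2", uptime_hours=8759, downtime_hours=1, target_availability=0.999),
    FaultEntry("card-3", uptime_hours=8700, downtime_hours=60, target_availability=0.999),
]
high, low = enqueue(entries)
```

Here card-3 and card-1 fall short of their target and are placed on the high-priority queue in that order, while card-2 exceeds its target and is placed on the low-priority queue.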
As another method of determining the priorities from the degree of SLA satisfaction, a priority judgment table may be used. For example, the degree of SLA satisfaction may be classified into a level of 120% or higher, a level of 120% to 100%, a level of 100% to 80%, a level of 80% to 60%, and a level of lower than 60%. The priority judgment table registering the relationship between the levels and queues may be stored for use. Values to be entered into the priority judgment table should be such that items exhibiting a high degree of SLA satisfaction are placed on the low-priority queue. If, for instance, no result is available immediately after the commissioning of a newly installed device, control may be exercised so that faults involving a high target SLA level are placed on the high-priority queue 1031.
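A priority judgment table of this kind can be sketched as a lookup from satisfaction level to queue. The table values below are hypothetical, as is the handling of boundary values (a value exactly on a boundary is assigned to the upper level) and the cutoff used to decide what counts as a high target SLA level for a newly installed device.

```python
from bisect import bisect_right

# Lower bounds of the levels: <60, 60-80, 80-100, 100-120, >=120 (percent).
LEVEL_BOUNDS = [60.0, 80.0, 100.0, 120.0]
# Hypothetical table: higher satisfaction maps to the low-priority queue.
QUEUE_FOR_LEVEL = ["high", "high", "high", "low", "low"]
# Hypothetical cutoff for "a high target SLA level".
HIGH_TARGET_AVAILABILITY = 0.9999


def select_queue(satisfaction_percent, target_availability):
    """Return "high" or "low" for a fault, using the judgment table."""
    if satisfaction_percent is None:
        # No operating result yet, e.g. right after commissioning.
        if target_availability >= HIGH_TARGET_AVAILABILITY:
            return "high"
        return "low"
    level = bisect_right(LEVEL_BOUNDS, satisfaction_percent)
    return QUEUE_FOR_LEVEL[level]
```

For example, `select_queue(59.9, 0.99)` returns `"high"`, `select_queue(130.0, 0.99)` returns `"low"`, and `select_queue(None, 0.99999)` returns `"high"`.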
Next, the maintenance operation/spare parts shipment determination section 105 determines the next maintenance action in accordance with the status of the fault queues 103 and the data of the spare parts inventory database 109 (step S105). For example, the maintenance operation/spare parts shipment determination section 105 sequentially acquires fault information sets with priority given to the high-priority queue 1031 and then determines the next maintenance action so as to achieve recovery from faults indicated by the acquired fault information sets. The next maintenance action may include an action that is taken as needed to issue a spare parts shipment request to the repair plant/warehouse.
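Step S105 can be sketched as taking the next fault from the high-priority queue when it is non-empty, otherwise from the low-priority queue, and checking whether a spare is available. The inventory model (a count of on-site spares per part type), the field names, and the returned dictionary are assumptions made for illustration only.

```python
from collections import deque


def next_maintenance_action(high_queue, low_queue, on_site_spares):
    """Pick the next fault (high priority first) and decide on spare shipment."""
    queue = high_queue if high_queue else low_queue
    if not queue:
        return None  # nothing left to repair
    fault = queue.popleft()
    part = fault["part_type"]
    needs_shipment = on_site_spares.get(part, 0) == 0
    if not needs_shipment:
        on_site_spares[part] -= 1  # reserve the on-site spare
    return {
        "card_id": fault["card_id"],
        "operation": "replace_unit",
        "ship_spare": needs_shipment,
    }


high_queue = deque([{"card_id": "card-1", "part_type": "if-card"}])
low_queue = deque([{"card_id": "card-2", "part_type": "if-card"}])
spares = {"if-card": 1}

first = next_maintenance_action(high_queue, low_queue, spares)
second = next_maintenance_action(high_queue, low_queue, spares)
third = next_maintenance_action(high_queue, low_queue, spares)
```

In this run, card-1 is handled first using the on-site spare, card-2 is handled second with a shipment request because no spare remains, and the third call returns `None`.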
Further, the maintenance operation/spare parts shipment determination section 105 transmits the determination result, as a maintenance operation instruction or a spare parts shipment instruction, to the access node station 2 (or more specifically, the maintenance terminal 24 and the associated access node devices 21) and the repair plant/warehouse 3 (or more specifically, the repair/delivery request receiving device 31) through the maintenance operation/spare parts shipment instruction transmission section 106 (steps S106 and S107).
In the exemplary configuration described here, the fault queues are divided according to two classes of users, business users and ordinary users, so that faults affecting business users are placed on the high-priority queue and faults affecting ordinary users are placed on the low-priority queue. The maintenance operation instruction transmitted from the access node monitoring control apparatus 1 is issued to the maintenance personnel through the monitor of the maintenance terminal 24. The spare parts shipment instruction is issued to the repair/delivery request receiving device 31. It is preferred that upon receipt of the spare parts shipment instruction, the repair/delivery request receiving device 31 provide predetermined equipment control to send spare parts through the transport route.
As described above, the present exemplary embodiment makes it possible to retain user-specific SLA information indicated by a reliability-related index and issue an optimum next maintenance operation instruction as needed in accordance with the user-specific SLA information, fault history, and current fault status. Therefore, fair and highly satisfying services can be provided to the users. In addition, the present exemplary embodiment reduces the time required for fault recovery by automating the shipment of spare parts.
It should be noted that the number of fault queues and the method of fault queue classification can be changed as appropriate. For example, three or more classes of fault queues may be used. If the three or more fault queue classes are used, the priority judgment table can be organized, for instance, to define different queues for the aforementioned levels.
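A configuration with three or more fault queue classes can be sketched by letting the priority judgment table map each satisfaction level to a queue index. The three-class mapping, the boundary handling, and all names below are hypothetical.

```python
from bisect import bisect_right
from collections import deque

# Lower bounds of the levels: <60, 60-80, 80-100, 100-120, >=120 (percent).
LEVEL_BOUNDS = [60.0, 80.0, 100.0, 120.0]
# Hypothetical table for three classes; queue 0 is served first.
LEVEL_TO_QUEUE = [0, 0, 1, 2, 2]


class FaultQueues:
    """A set of fault queues drained strictly in priority order."""

    def __init__(self, queue_count):
        self.queues = [deque() for _ in range(queue_count)]

    def put(self, fault_id, satisfaction_percent):
        level = bisect_right(LEVEL_BOUNDS, satisfaction_percent)
        self.queues[LEVEL_TO_QUEUE[level]].append(fault_id)

    def get(self):
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # all queues are empty


queues = FaultQueues(queue_count=3)
queues.put("card-7", 105.0)  # level 3 -> queue 2
queues.put("card-2", 85.0)   # level 2 -> queue 1
queues.put("card-9", 50.0)   # level 0 -> queue 0
order = [queues.get() for _ in range(3)]
```

The faults are retrieved in the order card-9, card-2, card-7, that is, from the lowest degree of SLA satisfaction class upward.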
A display section may be added to the common section 22 of each device and used to display the order of maintenance actions for the local device while the monitor of the maintenance terminal 24 in the access node station 2 displays the maintenance operation instruction from the access node monitoring control apparatus 1.
The present invention will now be outlined.
The SLA information storage means 501 stores information about reliability-related SLAs signed by the users. The SLA information storage means 501 may, for example, store, as reliability-related SLA information, a target value that uses at least one of availability, mean time to failure, and mean time to recovery as an index.
The fault information receiving means 502 (e.g., the fault information receiving section 101) receives information about faults in a device used to provide communication services to the users. The fault information includes, for instance, information indicative of fault occurrence or recovery, the time of fault occurrence or recovery, and identification information about a faulty device.
The fault history storage means 503 (e.g., the fault history database 107) stores a history of faults indicated by the information received by the fault information receiving means 502 in association with the users affected by the faults. The users affected by the faults can be identified by referencing preregistered setup information in accordance with device identification information included in the fault information.
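The association between fault history and affected users can be sketched as follows. The fault information fields mirror those listed above (occurrence or recovery, time, and device identification), but the concrete field names, the device identifiers, the mapping table, and the use of seconds for time are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical preregistered setup information: device ID -> affected users.
CARD_TO_USERS = {
    "node1/card3": ["user-a"],
    "node1/card4": ["user-b", "user-c"],
}


class FaultHistory:
    """Stores fault events per affected user."""

    def __init__(self, card_to_users):
        self._card_to_users = card_to_users
        self._events = defaultdict(list)

    def record(self, event):
        """Store the event for every user served by the faulty device."""
        users = self._card_to_users.get(event["device_id"], [])
        for user_id in users:
            self._events[user_id].append(event)
        return users

    def downtime_seconds(self, user_id):
        """Sum completed outages (occurrence to recovery) for one user."""
        total = 0.0
        down_since = {}
        events = sorted(self._events.get(user_id, []), key=lambda e: e["time"])
        for event in events:
            device = event["device_id"]
            if event["kind"] == "occurrence":
                down_since.setdefault(device, event["time"])
            elif event["kind"] == "recovery" and device in down_since:
                total += event["time"] - down_since.pop(device)
        return total


history = FaultHistory(CARD_TO_USERS)
affected = history.record({"kind": "occurrence", "time": 100.0, "device_id": "node1/card4"})
history.record({"kind": "recovery", "time": 400.0, "device_id": "node1/card4"})
```

After these two events, user-b and user-c each have 300 seconds of recorded downtime, and user-a has none.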
The next maintenance operation determination means 504 (e.g., the control means 11 incorporating the functions of the priority judgment section 102, the fault queue priority processing section 104, and the maintenance operation/spare parts shipment determination section 105) determines the next maintenance operation in accordance with information about a currently faulty device, information about an SLA for a user affected by a fault in the faulty device, and a fault history associated with the user. The next maintenance operation may be determined, for instance, by determining the next maintenance target (that is, the device to be restored to normal with the highest priority). The next maintenance target can be determined by determining the recovery priorities of faulty devices.
The spare parts arrangement means 505 (e.g., the maintenance operation/spare parts shipment instruction transmission section 106) provides control so as to arrange for the delivery of spare parts for a faulty device. In such a configuration, the next maintenance operation determination means 504 may decide to exercise control so as to arrange for the delivery of spare parts for a faulty device as one of the maintenance actions to be taken subsequently.
The priority judgment means 5041 (e.g., the priority judgment section 102) determines the recovery priorities of all currently faulty devices indicated by the information received by the fault information receiving means 502.
The queuing means 5042 (e.g., the fault queue priority processing section 104) places the fault information about the currently faulty devices on a queue in accordance with the priorities determined by the priority judgment means 5041. For example, the queuing means 5042 may queue a plurality of pieces of fault information in order of high priority for recovery of devices indicated by the fault information.
The maintenance operation determination means 5044 (e.g., the maintenance operation/spare parts shipment determination section 105) sequentially acquires the queued pieces of fault information and determines the maintenance operation required for the recovery from faults indicated by the acquired fault information.
In the above-described configuration, the access node monitoring control apparatus 50 may further include maintenance operation notification means 506 (e.g., the maintenance operation/spare parts shipment instruction transmission section 106). The maintenance operation notification means 506 transmits information about the next maintenance action to the maintenance terminal 70.
The access node devices 60 may each include fault information transmission means 601 (e.g., common section 22). The fault information transmission means 601 transmits information about a fault in a local access node device, including a fault in an interface card inserted into the local access node device, to the access node monitoring control apparatus 50.
The maintenance terminal 70 may include display means 701 (e.g., the monitor). The display means 701 receives information indicative of the next maintenance action from the access node monitoring control apparatus and presents the received information to the maintenance personnel.
While the present invention has been described with reference to a preferred exemplary embodiment and examples thereof, it will be understood by those skilled in the art that the present invention is not limited to the preferred exemplary embodiment and examples, and that modifications and variations can be made without departing from the spirit and scope of the invention.
This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2009-31949, filed on Feb. 13, 2009, the disclosure of which is incorporated herein in its entirety by reference.
The present invention is preferably applicable to a system in which devices used to provide communication services to classified users are managed.
Number | Date | Country | Kind |
---|---|---|---|
2009-031949 | Feb 2009 | JP | national |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2010/000264 | 1/19/2010 | WO | 00 | 8/10/2011 |