The present invention relates to a network management method, a network system, an aggregated analysis apparatus, a terminal apparatus and a non-transitory medium storing a program.
A network that is utilized in an enterprise or the like for business activities and so forth is no longer limited to use within the enterprise, owing to progress in services and devices. For example, there is a case where an external terminal accesses an enterprise internal server by using a radio access network, a core network, or the like of a communication carrier, and a case where a terminal on an enterprise internal LAN (Local Area Network) or the like utilizes an external cloud service and so forth. In a case where a malfunction occurs in communication between a terminal and its destination due to network congestion, a failure, or the like, an analysis is executed for a network appliance(s) on the communication carrier side, a network appliance(s) in the enterprise internal LAN, a communication service, and so forth. This analysis operation may require increased man-hours and resources, and further skills, depending on the scale of the network and the number of its components.
Regarding the analysis of a network failure, for example, PTL (Patent Literature) 1 discloses the following problem. When data transmitted from a certain apparatus to a destination apparatus does not arrive, the transmitting apparatus can detect an error. However, a system administrator must then identify the location of the actual failure on the communication path from the transmitting apparatus to the destination apparatus, and this failure analysis takes too much time. The larger the scale of a system, the more difficult it is to identify the failure occurrence location (suspected fault location). The increased time required for failure analysis therefore becomes a problem. To address this problem, PTL 1 discloses the following as a network monitoring method for detecting a location of failure occurrence on a network. A communication status monitoring means monitors a communication status with other device(s) on the network, and an anomaly detection means detects an event indicating an anomaly from communication contents detected by the communication status monitoring means. A failure location determination means refers to a failure location determination table, in which elements that are each a possible cause of a failure on the network are classified in advance and an event indicating an anomaly in communication via the network is associated with the classified element, and determines the element that is the cause of the event detected by the anomaly detection means. A failure information output means outputs failure information indicating a determination result by the failure location determination means.
Regarding AI (Artificial Intelligence) based failure analysis, PTL 2 discloses the following problem with an existing expert system: although processing speed is not regarded as particularly problematic in a case of a single failure, when a plurality of failures are notified asynchronously, it is almost impossible to present a highly reliable inference result in a short time period, and when knowledge is lacking or a system error occurs, the system stops processing for a long period or ceases to function entirely. To address this problem, PTL 2 discloses a communication network failure management system that has excellent distributed processing capability and real-time processing performance, can be configured more flexibly, and is easy to maintain. This system includes a rule-based inference autonomous agent and a memory-based inference autonomous agent, and includes a primary isolation autonomous agent group that analyzes an event notified from an event recognition autonomous agent group and determines a failure cause or a failure location.
NPL (Non-Patent Literature) 1 discloses a network anomaly detection technology and an automatic failure location inference technology utilizing an AE (Auto Encoder), that is, a three-layer neural network subjected to supervised learning using the same data in the input layer and the output layer, which is one type of deep learning capable of learning complicated structure inherently present in data.
PTL 1: Japanese Unexamined Patent Application Publication No. JP2005-167347A
PTL 2: Japanese Unexamined Patent Application Publication No. JP Hei 09-160849A
NPL 1: Keishiro WATANABE, et al., “Creation of new value by utilizing Network-AI technology”, NTT journal, 2018 Vol. 30, No. 3, searched on Feb. 5, 2019, internet <URL: http://www.ntt.co.jp/journal/1803/files/JN20180313.pdf>
An analysis on the related technologies is provided as follows.
In PTL 1, the communication status monitoring means monitors a communication status with another apparatus on a network, and obtains a packet exchanged between a communication means and a communication interface to analyze the content of the packet. PTL 1 discloses that, for example, the communication status may be monitored for each connection, but does not disclose a configuration in which a failure analysis on the network is executed based on path information to a destination. The same applies to PTL 2 and NPL 1.
It is an object of the present invention to provide a network management method, a network system, apparatuses, and a non-transitory medium storing a program, each of which enables a suspected failure location on a network to be appropriately narrowed down, thereby enabling efficient failure analysis.
According to one aspect of the present invention, there is provided a network management method including:
According to another aspect of the present invention, there is provided a network system including: at least one terminal apparatus connecting to a network; and an aggregated analysis apparatus connecting to the terminal apparatus.
The terminal apparatus includes: a means that acquires path information from the terminal apparatus to a destination node; a storage part to store the path information; and a means that transmits the path information stored in the storage part to the aggregated analysis apparatus.
The aggregated analysis apparatus includes a means that receives the path information from one or a plurality of the terminal apparatuses and isolates, by using a learning model, a suspected failure location on the network, based on the path information received.
According to yet another aspect of the present invention, there is provided an aggregated analysis apparatus including: a means that receives, from one or a plurality of terminal apparatuses connecting to a network, path information from an individual terminal apparatus to a destination node, the path information being acquired by the individual terminal apparatus; and a means that isolates, by using a learning model, a suspected failure location on the network, based on the path information received.
According to further another aspect of the present invention, there is provided a terminal apparatus connecting to a network, wherein the terminal apparatus includes: a means that acquires path information from the terminal apparatus to a destination node; a storage part that stores the path information; and a means that transmits the path information stored in the storage part to an aggregated analysis apparatus that isolates, by using a learning model, a suspected failure location on the network, based on the path information acquired by one or a plurality of terminal apparatuses.
According to further another aspect of the present invention, there is provided a program causing a computer to execute processing including:
According to further another aspect of the present invention, there is provided a program causing a processor of a terminal apparatus to execute processing including:
According to the present invention, there is provided a computer-readable recording medium storing the above program (a non-transitory computer-readable recording medium such as a semiconductor storage (e.g., a RAM (Random Access Memory), a ROM (Read Only Memory), or an EEPROM (Electrically Erasable and Programmable ROM)), an HDD (Hard Disk Drive), a CD (Compact Disc), a DVD (Digital Versatile Disc), or the like).
According to the present invention, a suspected failure location on a network can be narrowed down, thereby enabling efficient failure analysis.
Example embodiments of the present invention will be described. In one of example embodiments of the present invention,
a terminal apparatus (terminal) obtains:
The aggregated analysis apparatus performs, by using, for example, AI, feature extraction from information received from one or a plurality of terminal apparatuses to isolate a suspected location of a network failure or the like, thereby narrowing down failure candidates. Thus, it is possible to reduce the number of elements to be analyzed in failure analysis of a network.
The destination node 120 may be a server or the like which the terminal apparatus 100 usually accesses, or a specific destination configured in advance in order to isolate a failure location on a network 140. A plurality of terminal apparatuses 100 may connect to the same destination node 120. Alternatively, a plurality of terminal apparatuses 100 may connect to different destination nodes 120, respectively.
The information acquisition part 101 of the terminal apparatus 100 obtains at least path information on the network 140 from the terminal apparatus 100 to the destination node 120. In addition to the path information about the network 140 between the terminal apparatus 100 and the destination node 120, the information acquisition part 101 may obtain one or both of transmission delay information about the network 140 between the terminal apparatus 100 and the destination node 120 and success or failure information between the terminal apparatus 100 and the destination node 120 (e.g., information about a destination, with which the terminal apparatus 100 has failed in communication).
The information storage part 102 stores, in the storage part (not shown), the path information, the transmission delay information, and the success or failure information in communication about the network 140 that are obtained by the information acquisition part 101, for each of the communication destination nodes 120.
The information transmission part 103 transmits the information stored in the information storage part 102 to the aggregated analysis apparatus 110.
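The acquire, store, and transmit flow of the terminal apparatus 100 can be illustrated with a minimal sketch. The record layout, the class names `Measurement` and `InformationStorage`, and the use of JSON as a transmission format are assumptions made only for illustration; the specification does not define a data format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Measurement:
    """One record for a single destination node (hypothetical layout)."""
    terminal_id: str
    destination: str
    path: List[str]             # hops observed from the terminal toward the destination
    delay_ms: Optional[float]   # transmission delay; None if it could not be measured
    success: bool               # success or failure of the communication


@dataclass
class InformationStorage:
    """Stands in for the information storage part 102 (assumed in-memory)."""
    records: List[Measurement] = field(default_factory=list)

    def store(self, record: Measurement) -> None:
        self.records.append(record)

    def to_payload(self) -> str:
        """Serialize the stored records for the information transmission part 103."""
        return json.dumps([asdict(r) for r in self.records])


storage = InformationStorage()
storage.store(Measurement("100-1", "121", ["11", "12", "13"], 4.2, True))
storage.store(Measurement("100-1", "121", ["11", "12"], None, False))
payload = storage.to_payload()  # string handed to the aggregated analysis apparatus 110
```

Keeping the success flag and the delay alongside the path in one record lets the aggregated analysis apparatus 110 correlate all three per destination node without a second lookup.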
The aggregated analysis apparatus 110 analyzes the information (path information, etc.) transmitted from one or a plurality of terminal apparatuses 100, extracts a feature pattern or the like, and executes isolation of a suspected failure location or the like on the network 140. The aggregated analysis apparatus 110 extracts a suspected failure location on the network 140 (e.g., a failure in a port of a NIC (Network Interface Card) of a network appliance, or a failure in a link between two opposing ports, etc.), for the path information transmitted from one or a plurality of the terminal apparatuses 100, based on a learning model (e.g., classification model), or the like, created in advance using machine learning.
The information acquisition part 101 of the terminal apparatus 100 may be configured to obtain the path information, the transmission delay information and so forth to the destination node 120 in response to an instruction from the aggregated analysis apparatus 110, store the obtained information in the information storage part 102, and transmit the stored information to the aggregated analysis apparatus 110. Alternatively, the information acquisition part 101 of the terminal apparatus 100 may be configured to obtain the path information, the transmission delay information and so forth to the destination node 120, store the obtained information in the information storage part 102, and transmit the stored information to the aggregated analysis apparatus 110 at a predetermined timing or responsive to receiving an instruction from the aggregated analysis apparatus 110. Furthermore, the information acquisition part 101 of the terminal apparatus 100 may be configured to, when a failure or the like occurs in communication with the destination node 120, obtain the path information, the transmission delay information and so forth to the destination node 120 and transmit the obtained information to the aggregated analysis apparatus 110.
In
As schematically illustrated in
CC (Continuity Check) verifies (checks) connectivity between MEPs. An MEP on one end transmits a CCM (Continuity Check Message) toward an MEP on the other end in order to detect communication link failure between the MEPs, and a CCM frame is exchanged between MEP-MEP and between MEP-MIP to perform verification of continuity and isolation of a failure (see
LB (Loop Back) transmits, by unicast, an LBM (Loopback Message) from an MEP to an MIP or an MEP which is a destination. On reception of an LBM frame, the MIP or MEP generates an LBR (Loopback Reply) frame and transmits the LBR frame to a transmission source MEP (e.g., the terminal apparatus 100 in
LT (Link Trace) verifies normality of a path by exchanging a link trace message between MEPs or between an MEP and an MIP. When a transmission source MEP (e.g., the terminal apparatus 100 in
The information acquisition part 101 may obtain the path information and the transmission delay information of the network 140 to the destination node 120 by using ping or traceroute on layer 3. Ping verifies reachability to the destination node 120 by transmitting an echo request (also referred to as a “ping request”) of ICMP (Internet Control Message Protocol) to the destination node 120 and receiving an echo reply transmitted from the destination node 120. In the case of ping, an RTT (Round-Trip Time) and/or a packet loss ratio are calculated based on the time until the echo reply is returned from the destination node 120 and/or the response ratio. Ping corresponds to LB (Loopback) in Ethernet OAM on layer 2.
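As a non-limiting illustration of how an RTT and a packet loss ratio may be derived from a series of echo requests, the following sketch works on already-collected per-request results. The function name `ping_statistics` and the convention of representing a missing echo reply as `None` are assumptions; no actual ICMP packets are sent.

```python
from typing import List, Optional, Tuple


def ping_statistics(rtts_ms: List[Optional[float]]) -> Tuple[Optional[float], float]:
    """Return (mean RTT in ms, packet loss ratio) for one series of echo requests.

    Each entry is the RTT of one echo request, or None if no echo reply arrived.
    """
    if not rtts_ms:
        raise ValueError("at least one echo request is required")
    replies = [r for r in rtts_ms if r is not None]
    # Loss ratio is the complement of the response ratio.
    loss_ratio = 1.0 - len(replies) / len(rtts_ms)
    # The mean RTT is undefined when no reply was received at all.
    mean_rtt = sum(replies) / len(replies) if replies else None
    return mean_rtt, loss_ratio


# Four echo requests, one of which received no reply.
mean_rtt, loss = ping_statistics([10.0, 14.0, None, 12.0])
```

In this example three of four requests are answered, so the loss ratio is 0.25 and the mean RTT is taken over the three replies only.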
Traceroute is a command for verifying the path of a packet to a destination. It is used to acquire the IP address(es) of the router(s) through which a packet passes from an own node to a destination node, the hop count, and the round-trip time to each router. In traceroute, a transmission source obtains path information by transmitting packets while incrementing, by 1, the TTL (Time to Live) in the IP (Internet Protocol) header (the TTL of the first packet is 1). TTL represents the lifetime of a packet, and 1 is subtracted from it every time the packet passes through a router. A router, on reception of a packet with a TTL value of 2 or more, decreases the TTL value of the packet by 1 and forwards the packet to the next router. A router, on reception of a packet with a TTL value of 1, discards the packet and returns an ICMP time exceeded packet to the transmission source.
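The TTL handling described above can be sketched as a small simulation. This is not a network implementation: the path is given as a list, the functions `forward` and `traceroute` are hypothetical names, and treating the network appliances 11, 12 and 13 as layer 3 routers in front of the server 121 is an assumption made for the example.

```python
from typing import List, Tuple


def forward(path: List[str], ttl: int) -> Tuple[str, str]:
    """Simulate one probe along `path` (routers followed by the destination).

    Returns (responding node, reply type).
    """
    for node in path[:-1]:      # routers on the way to the destination
        if ttl == 1:
            # A router receiving TTL 1 discards the packet and reports back.
            return node, "time-exceeded"
        ttl -= 1                # TTL of 2 or more: decrement by 1 and forward
    return path[-1], "destination-reached"


def traceroute(path: List[str], max_hops: int = 30) -> List[str]:
    """Collect the responding node for TTL = 1, 2, ... until the destination replies."""
    hops = []
    for ttl in range(1, max_hops + 1):
        node, reply = forward(path, ttl)
        hops.append(node)
        if reply == "destination-reached":
            break
    return hops


# Assumed layer 3 path: routers 11, 12, 13, then the server 121.
hops = traceroute(["11", "12", "13", "121"])
```

Each probe reveals exactly one more node, which is why the transmission source learns the ordered path and the hop count together.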
In the analysis part 112, a classification model (pattern recognition model) may be created by machine learning, by using, for example, training data (for example, path information from the terminal apparatus to the destination node, transmission delay information, success or failure information in communication with the destination node, or processed information thereof) and a ground-truth label (presence/absence of a failure, a type of the failure, and so forth on a network appliance and a link). On reception, by the reception part 111, of path information, transmission delay information, or success or failure information in communication with a destination node (or processed information thereof) obtained by the terminal apparatus 100, the analysis part 112 may classify the received information by using the classification model and extract a suspected failure location on the network 140. The learning model (classification model) may be a decision tree, an NN (Neural Network) (or a deep NN), an SVM (Support Vector Machine), a random forest, or the like. Parameters or the like in the classification model, such as an NN or an SVM, may be adjusted by using actual data.
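A minimal sketch of the train-then-classify flow follows. To keep the example self-contained, a 1-nearest-neighbour classifier is swapped in for the NN, SVM, or tree-based models named above; the feature encoding, the delay scaling, the label strings, and the training samples are all hypothetical and are not taken from the specification.

```python
from typing import List, Sequence, Tuple

APPLIANCES = ["11", "12", "13", "14", "15", "16"]


def encode(path: Sequence[str], delay_ms: float, success: bool) -> List[float]:
    """Turn one measurement into a numeric feature vector (assumed encoding)."""
    on_path = [1.0 if a in path else 0.0 for a in APPLIANCES]
    # Delay is scaled so that it is comparable to the 0/1 features.
    return on_path + [delay_ms / 100.0, 1.0 if success else 0.0]


def classify(sample: List[float], training: List[Tuple[List[float], str]]) -> str:
    """1-nearest-neighbour: return the label of the closest training vector."""
    def squared_distance(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(training, key=lambda item: squared_distance(sample, item[0]))[1]


# Hypothetical training data: (feature vector, ground-truth label).
training = [
    (encode(["11", "12", "13"], 5.0, True), "no-failure"),
    (encode(["11", "12", "13"], 5.0, False), "appliance-12"),
    (encode(["16", "13"], 4.0, True), "no-failure"),
    (encode(["16", "13"], 90.0, True), "link-13-16"),
]

# A new measurement: reachable via 16 and 13, but with a large delay.
label = classify(encode(["16", "13"], 80.0, True), training)
```

With realistic data volumes, the same `encode` step could feed any of the model families listed above; only `classify` would be replaced.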
The aggregated analysis apparatus 110 may be installed in, for example, a server of a cloud system or the like (aggregated analysis system) to provide analysis and isolation of a failure location (candidate) on the network 140 as a cloud service.
As a non-limiting example, in
The terminal apparatus 100-1, the terminal apparatus 100-4 and the terminal apparatus 100-5 are connected to the server 121 via network appliances 11, 12 and 13 on the network 140 (route 17).
The terminal apparatus 100-2 is connected to the server 121 via network appliances 14, 15, 12 and 13 on the network 140.
The terminal apparatus 100-3 is connected to the server 121 via network appliances 16 and 13 on the network 140.
Reachability to the server 121 may be verified by transmitting a ping request (echo request) in each of the terminal apparatuses 100-1 to 100-5 to the server 121 that corresponds to the destination node 120 of
In a case where the network appliances 11 to 16 are a layer 2 switch or the like that is connected via a layer 2 link of Ethernet or the like, the server 121 that corresponds to the destination node 120 of
Alternatively, Link Trace of Ethernet OAM may be performed. The terminal apparatuses 100-1 to 100-5 may respectively transmit LTM of
In
In
Measurement information obtained by the terminal apparatuses 100-1 to 100-5 is transmitted to the aggregated analysis apparatus 110. The aggregated analysis apparatus 110 analyzes the path information collected from each of the terminal apparatuses, using a learning model obtained by machine learning, to perform feature extraction. When it finds that the paths to the server 121 from the terminal apparatuses that failed in communication with the server 121 pass through a common network appliance, the aggregated analysis apparatus 110 outputs this result (the common network appliance) as an isolation result of a suspected location.
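The idea of a common point among failed paths can be shown with plain set operations. This is a simple heuristic used for illustration, not the learning model of the example embodiment. The paths are those given above for the terminal apparatuses 100-1 to 100-5; the function name `suspected_appliances` and the failure scenario are assumptions.

```python
from typing import Dict, List, Set

# Paths to the server 121, as described for the example topology.
PATHS: Dict[str, List[str]] = {
    "100-1": ["11", "12", "13"],
    "100-2": ["14", "15", "12", "13"],
    "100-3": ["16", "13"],
    "100-4": ["11", "12", "13"],
    "100-5": ["11", "12", "13"],
}


def suspected_appliances(paths: Dict[str, List[str]], failed: Set[str]) -> Set[str]:
    """Appliances shared by every failed path and absent from every successful path."""
    failed_paths = [set(p) for t, p in paths.items() if t in failed]
    if not failed_paths:
        return set()
    common = set.intersection(*failed_paths)
    # An appliance that still carries successful traffic is unlikely to be at fault.
    for terminal, path in paths.items():
        if terminal not in failed:
            common -= set(path)
    return common


# Assumed scenario: every terminal except 100-3 fails to reach the server 121.
suspects = suspected_appliances(PATHS, {"100-1", "100-2", "100-4", "100-5"})
```

In this assumed scenario the failed paths share the network appliances 12 and 13, and the successful path of the terminal apparatus 100-3 clears the network appliance 13, leaving the network appliance 12 as the suspected location.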
In
In contrast, the present example embodiment can cope with a large-scale network by, for example, creating a learning model (classification model) based on supervised machine learning, classifying measurement information obtained by the terminal apparatuses 100-1 to 100-5 with the classification model, and extracting a suspected location(s).
According to the present example embodiment, since an aggregated analysis apparatus executes analysis and isolation of a suspected failure location(s) based on information collected in advance and information at a time when a problem occurs, network appliances and communication services to be analyzed can be narrowed down, and resources required for isolation and analysis of the suspected failure location can be suppressed.
The aggregated analysis apparatus 110 may be configured to periodically analyze transmission delay information collected from each terminal apparatus to monitor for the presence of a characteristic change therein.
Referring to
When the aggregated analysis apparatus 110 confirms, through periodic analysis, that the transmission delay of communication from the terminal apparatus 100-3 to the server 121 has become large, the analysis part 112 analyzes the path information collected up to that time from each of the terminal apparatuses 100-1 to 100-5 and performs feature extraction. In this case, the analysis part 112 finds, as a feature of the path from the terminal apparatus 100-3 to the server 121 whose transmission delay has increased, that only the communication in question uses the path between the network appliance 13 and the terminal apparatus 100-3. The output part 113 outputs this result as an isolation result of a suspected location. Such a configuration makes it possible to detect, for example, a sign of failure of a link (cable) that connects ports of network appliances, of a port, of a module, or the like, and to detect a shortage of communication bandwidth in the network 140.
According to the present example embodiment, a terminal apparatus connected to a network stores communication path information and so forth for a communication party (destination node), and this information is aggregated in the aggregated analysis apparatus 110. This makes it possible to isolate a failure candidate without affecting the network appliances and communication services used on the communication path between the terminal apparatus and the destination node.
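One possible way to sketch this periodic check is shown below. The detection rule (latest delay exceeding the historical mean by `k` standard deviations), the function names, and the sample delay values are assumptions; the specification does not prescribe a particular change-detection criterion. Paths here include the terminal and the server as end points so that links can be compared.

```python
from statistics import mean, stdev
from typing import Dict, List, Set, Tuple


def delay_increased(history_ms: List[float], latest_ms: float, k: float = 3.0) -> bool:
    """Flag the latest delay if it exceeds the historical mean by k standard deviations."""
    if len(history_ms) < 2:
        return False            # not enough history to estimate a deviation
    return latest_ms > mean(history_ms) + k * stdev(history_ms)


def links(path: List[str]) -> Set[Tuple[str, str]]:
    """Consecutive node pairs along a path."""
    return set(zip(path, path[1:]))


def exclusive_links(paths: Dict[str, List[str]], terminal: str) -> Set[Tuple[str, str]]:
    """Links used by `terminal` and by no other terminal."""
    others: Set[Tuple[str, str]] = set()
    for name, path in paths.items():
        if name != terminal:
            others |= links(path)
    return links(paths[terminal]) - others


PATHS = {
    "100-1": ["100-1", "11", "12", "13", "121"],
    "100-2": ["100-2", "14", "15", "12", "13", "121"],
    "100-3": ["100-3", "16", "13", "121"],
}

history = [4.0, 4.2, 3.9, 4.1, 4.0]     # hypothetical past delays of 100-3, in ms
if delay_increased(history, 9.5):
    suspects = exclusive_links(PATHS, "100-3")
else:
    suspects = set()
```

Here the links shared with other terminals, such as the one between the network appliance 13 and the server 121, are excluded because the other terminals show no delay increase over them.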
The aggregated analysis apparatus 110 may be also implemented by the computer apparatus 200 in
Each disclosure of the above cited PTLs 1 and 2, and NPL 1 is contemplated to be incorporated herein in its entirety by reference thereto, and to be used as basis or part of the present invention, as necessary. Modifications and adjustments of example embodiments and examples may be made within the bounds of the entire disclosure (including the scope of the claims) of the present invention, and also based on fundamental technological concepts thereof. Furthermore, various combinations and selections of various disclosed elements (including respective elements of the respective appendices, respective elements of the respective example embodiments, respective elements of the respective drawings, and the like) are possible within the scope of the claims of the present invention. That is, the present invention clearly includes every type of transformation and modification that a person skilled in the art can realize according to the entire disclosure including the scope of the claims and to technological concepts thereof. Further, each of the disclosures in the above-cited documents may be used, if necessary, as part of the disclosure of the present invention in accordance with the gist of the present invention, in part or as a whole, in combination with the descriptions in the present disclosure, and shall be deemed to be included in the disclosure of the present application.
11 to 16 network appliances
100, 100-1 to 100-5 terminal apparatuses
101 information acquisition part
102 information storage part
103 information transmission part
110 aggregated analysis apparatus
111 reception part
112 analysis part
113 output part
120 destination node
121 server
140 network
150 carrier network
200 computer apparatus
201 processor
202 storage (memory)
203 display apparatus
204 communication interface
Number | Date | Country | Kind |
---|---|---|---|
2019-037194 | Mar 2019 | JP | national |
This application is a National Stage Entry of PCT/JP2020/008454 filed on Feb. 28, 2020, which claims priority from Japanese Patent Application 2019-037194 filed on Mar. 1, 2019, the contents of all of which are incorporated herein by reference, in their entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2020/008454 | 2/28/2020 | WO | 00 |