The present invention relates to a communication network management technique that performs centralized management of a communication network by using a management computer.
In recent years, a communication network has a significant role as a social infrastructure that provides various services, and failure of the communication network has an incalculable impact on users. Therefore, health-checking of the communication network has become a very important issue.
Patent Literature 1 (International Publication WO2005/048540) discloses a technique that uses a keep-alive frame to detect a failure in a communication network. More specifically, in a communication system in which a plurality of base nodes perform communication through one or more relay nodes, each base node transmits a keep-alive frame that is broadcast by the relay node. Here, the plurality of base nodes mutually transmit and receive the keep-alive frame and detect a failure by monitoring the arrival state of the keep-alive frame transmitted from the other-side node. In this case, in order to health-check all physical links in the communication network, it is necessary to configure a plurality of communication routes so as to cover all the physical links and to transmit and receive the keep-alive frame with respect to each communication route. That is, a large number of keep-alive frames must be transmitted and received. This increases the transmission and reception burden placed on each base node.
Non-Patent Literature 1 (S. Shah and M. Yip, “Extreme Networks' Ethernet Automatic Protection Switching (EAPS) Version 1”, The Internet Society, October 2003; (http://tools.ietf.org/html/rfc3619).) discloses a health-check technique in a communication network that is configured in a ring shape. In this case, a plurality of switches are connected through communication lines to form a ring shape, and one health-check frame is transferred sequentially along the ring. For example, a master switch on the ring transmits the health-check frame from a first port. Another switch forwards the received health-check frame to the next switch. The master switch receives the self-transmitted health-check frame at a second port, and thereby can confirm that no failure occurs. This technique assumes such a ring-shaped network structure and thus is not versatile.
Patent Literature 2 (Japanese Patent No. 3740982) discloses a technique in which a management host computer performs health-checking of a plurality of host computers. First, the management host computer determines an order of the health-check for the plurality of host computers. Next, the management host computer generates a health-check packet into which a health-check table is incorporated. The health-check table has a plurality of entries respectively related to the plurality of host computers, and the plurality of entries are arranged in the above-determined order. Each entry includes an address of the related host computer and a check flag. Then, the management host computer transmits the health-check packet to a first host computer. A host computer that receives the health-check packet searches for the related entry in the health-check table and marks the check flag of the corresponding entry. After that, the host computer refers to the address in the next entry and transmits the health-check packet to the next host computer. By repetition of the above-mentioned processing, one health-check packet travels through the host computers. Eventually, the management host computer receives the health-check packet that has traveled in this manner. Then, the management host computer determines that a failure is occurring in any host computer whose check flag is not marked.
According to Patent Literature 3 (Japanese Patent Publication JP-2006-332787), one health-check packet travels a plurality of monitor-target terminals, as in the case of Patent Literature 2. A similar health-check table is incorporated into the health-check packet. However, each entry includes, instead of the above-mentioned check flag, a check list in which such information as a date and time and an operating status is to be written. A monitoring terminal transmits the health-check packet to a first monitor-target terminal. When receiving the health-check packet, the monitor-target terminal judges whether or not itself is operating normally. In a case of a normal operation, the monitor-target terminal searches for the related entry in the health-check table and writes designated information such as the date and time and the operating status in the check list of the corresponding entry. Then, the monitor-target terminal refers to the address in the next entry and transmits the health-check packet to the next monitor-target terminal. Here, if communication with the next monitor-target terminal is impossible, the monitor-target terminal transmits the health-check packet to the monitor-target terminal after the next monitor-target terminal. Due to repetition of the above-mentioned processing, one health-check packet travels the monitor-target terminals. Eventually, the monitoring terminal receives the health-check packet that has traveled in this manner. If the designated information is not written in any check list, the monitoring terminal determines that a failure occurs.
[Patent Literature 1] International Publication WO2005/048540
[Patent Literature 2] Japanese Patent No. 3740982
[Patent Literature 3] Japanese Patent Publication JP-2006-332787
[Non-Patent Literature 1] S. Shah and M. Yip, “Extreme Networks' Ethernet Automatic Protection Switching (EAPS) Version 1”, The Internet Society, October 2003; (http://tools.ietf.org/html/rfc3619).
According to the technique described in the above-mentioned Patent Literature 2 or Patent Literature 3, the management computer (management host computer, monitoring terminal) performs centralized management of a communication network. More specifically, the management computer transmits a health-check packet, and the health-check packet returns to the management computer after traveling through a plurality of nodes (host computers, monitor-target terminals). The management computer performs health-checking of each node, based on information written in the returned health-check packet. However, according to the technique, in order to perform the health-checking continuously, the management computer needs to periodically repeat the transmission of the health-check packet. This increases the burden placed on the management computer.
An object of the present invention is to reduce burden placed on a management computer that performs centralized management of a communication network.
In an aspect of the present invention, a communication network management system is provided. The communication network management system has: a communication network including a plurality of nodes and a plurality of links connecting between the plurality of nodes; and a management computer connected to the plurality of nodes. A loop-shape communication route in the communication network is a loop route. The management computer transmits a frame to one node on the loop route. When receiving the frame, each node on the loop route forwards the received frame to a subsequent node along the loop route and, if a predetermined condition is satisfied, returns a response back to the management computer. The management computer determines whether or not a failure is occurring, based on reception state of the response.
In another aspect of the present invention, a management computer that manages a communication network is provided. The communication network includes a plurality of nodes and a plurality of links connecting between the plurality of nodes. The management computer has: a storage unit in which loop route information indicating a loop route being a loop-shape communication route in the communication network is stored; an entry control unit configured to instruct each node on the loop route; and a monitoring unit configured to transmit a frame to one node on the loop route. The entry control unit instructs the each node to, when receiving the frame, forward the received frame to a subsequent node along the loop route and, if a predetermined condition is satisfied, return a response back to the management computer. The monitoring unit determines whether or not a failure is occurring, based on reception state of the response.
In still another aspect of the present invention, a communication network management method that manages a communication network by using a management computer is provided. The communication network includes a plurality of nodes and a plurality of links connecting between the plurality of nodes. A loop-shape communication route in the communication network is a loop route. The communication network management method includes: (A) transmitting a frame from the management computer to one node on the loop route; (B) forwarding, by each node on the loop route, the frame to a subsequent node along the loop route and, if a predetermined condition is satisfied, returning a response back to the management computer; and (C) determining whether or not a failure is occurring, based on reception state of the response in the management computer.
In still another aspect of the present invention, a management program recorded on a tangible computer-readable medium that, when executed, causes a management computer to perform management processing of a communication network is provided. The communication network includes a plurality of nodes and a plurality of links connecting between the plurality of nodes. The management processing includes: (a) storing loop route information indicating a loop route being a loop-shape communication route in the communication network, in a storage device; (b) instructing each node on the loop route to, when receiving a frame, forward the received frame to a subsequent node along the loop route and, if a predetermined condition is satisfied, return a response back to the management computer; (c) transmitting a frame to one node on the loop route; and (d) determining whether or not a failure is occurring, based on reception state of the response.
According to the present invention, it is possible to reduce burden placed on a management computer that performs centralized management of a communication network.
The above and other objects, advantages and features of the present invention will be more apparent from the following description of certain exemplary embodiments taken in conjunction with the accompanying drawings.
The communication network NET includes a plurality of nodes 2 to 5 and a plurality of physical links 71 to 75 connecting between the nodes 2 to 5. The physical link 71 is a signal line that bi-directionally connects the node 2 and the node 4. The node 2 and the node 4 can communicate bi-directionally through the physical link 71. The physical link 72 is a signal line that bi-directionally connects the node 4 and the node 5. The node 4 and the node 5 can communicate bi-directionally through the physical link 72. The physical link 73 is a signal line that bi-directionally connects the node 5 and the node 2. The node 5 and the node 2 can communicate bi-directionally through the physical link 73. The physical link 74 is a signal line that bi-directionally connects the node 2 and the node 3. The node 2 and the node 3 can communicate bi-directionally through the physical link 74. The physical link 75 is a signal line that bi-directionally connects the node 3 and the node 5. The node 3 and the node 5 can communicate bi-directionally through the physical link 75.
A control link 62 is a signal line that bi-directionally connects the management computer 1 and the node 2. A control link 63 is a signal line that bi-directionally connects the management computer 1 and the node 3. A control link 64 is a signal line that bi-directionally connects the management computer 1 and the node 4. A control link 65 is a signal line that bi-directionally connects the management computer 1 and the node 5. The management computer 1 and the nodes 2 to 5 can communicate bi-directionally through the control links 62 to 65, respectively.
A loop-shape communication route in the communication network NET is hereinafter referred to as a “loop route LP”. For example, as shown in
The management computer 1 transmits one frame for health-checking (hereinafter referred to as a “check frame FR”) to one node on the loop route LP. For example, the management computer 1 transmits one check frame FR to the node 2 on the loop route LP. The node 2, when receiving the check frame FR, forwards the check frame FR to the subsequent node 5 along the loop route LP. The node 5, when receiving the check frame FR, forwards the check frame FR to the subsequent node 3 along the loop route LP. The node 3, when receiving the check frame FR, forwards the check frame FR to the subsequent node 2 along the loop route LP. Thereafter, each node on the loop route LP repeats the forwarding of the check frame FR along the loop route LP in a similar way. As a result, the check frame FR repeatedly circulates in the loop route LP, if no failure is occurring on the loop route LP.
Moreover, when receiving the check frame FR and if a predetermined condition is satisfied, each node returns a “response RES” indicating the reception of the check frame FR back to the management computer 1. For example, the predetermined condition is that TTL (Time To Live) of the check frame FR reaches a predetermined threshold value. Alternatively, the predetermined condition may be that a hop number of the check frame FR reaches a predetermined threshold value. Alternatively, the predetermined condition may be satisfied in each node with a predetermined probability or with a predetermined period. A node among the nodes on the loop route LP where the above-mentioned predetermined condition is satisfied is hereinafter referred to as a “response node”. It is the response node that returns the response RES indicating the reception of the check frame FR back to the management computer 1. The response RES may be a reception notification signal indicating the reception or may be a copy of the received check frame FR. The response node transmits the response RES to the management computer 1 and continues the transfer of the check frame FR along the loop route LP. For example, as shown in
The management computer 1 can determine whether or not a failure relating to the loop route LP is occurring, based on the reception state of the response RES from the loop route LP. For example, if the management computer 1 receives the response RES from any node on the loop route LP within a predetermined period of time, it means that the check frame FR is steadily circulating in the loop route LP. Therefore, the management computer 1 can determine that no failure is occurring in the loop route LP. On the other hand, if the management computer 1 receives no response RES from any node on the loop route LP within the predetermined period of time, it means that the circulation of the check frame FR in the loop route LP has ceased. Therefore, the management computer 1 can determine that some sort of failure is occurring in the loop route LP.
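The determination described above can be sketched as a simple timeout check. This is a minimal illustration, not the claimed implementation: the class name `LoopMonitor`, its methods, and the use of seconds for the predetermined period are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class LoopMonitor:
    """Tracks the last response RES seen per loop route (hypothetical helper)."""

    timeout: float  # the "predetermined period of time"; seconds are assumed
    last_response: dict = field(default_factory=dict)

    def start(self, loop_id: str, now: float) -> None:
        # Record when the check frame FR was injected into the loop route.
        self.last_response[loop_id] = now

    def on_response(self, loop_id: str, now: float) -> None:
        # Any node on the loop route may be the response node; only the
        # arrival time matters for this check.
        self.last_response[loop_id] = now

    def has_failure(self, loop_id: str, now: float) -> bool:
        # No response from any node within the timeout: circulation has ceased.
        return now - self.last_response[loop_id] > self.timeout


monitor = LoopMonitor(timeout=5.0)
monitor.start("LP1", now=0.0)
monitor.on_response("LP1", now=3.0)
failure_at_6 = monitor.has_failure("LP1", now=6.0)  # 3.0 s since last RES
failure_at_9 = monitor.has_failure("LP1", now=9.0)  # 6.0 s since last RES
```

In this sketch the management computer only records arrival times, so its work does not grow with the number of circulations of the check frame FR.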
According to the present exemplary embodiment, as described above, once a check frame FR is transmitted from the management computer 1 to the loop route LP, the check frame FR repeatedly circulates in the loop route LP. When the predetermined condition is satisfied, each node on the loop route LP returns the response RES to the management computer 1. The management computer 1 can determine whether or not a failure relating to the loop route LP is occurring, based on the reception state of the response RES from the loop route LP. According to the present exemplary embodiment, the management computer 1 need not periodically repeat the transmission of the check frame FR in order to perform the health-checking continuously. It is therefore possible to reduce the burden placed on the management computer 1.
The present invention can be applied to health-checking of nodes and physical links on LANs of companies, data centers, universities and the like, and to health-checking of communication equipment and physical links of telecommunication carriers.
Hereinafter, an exemplary embodiment of the present invention will be described in more detail. Various methods are possible as a method for achieving the traveling of the check frame FR along a predetermined route in the communication network NET. In the following description, for example, each node is provided with a “forwarding table” in order to achieve the traveling of the check frame FR. The forwarding table is a table that indicates a correspondence relationship between input sources and forwarding destinations of the check frames FR. Each node can forward the check frame FR received from an input source to a designated forwarding destination, by referring to the forwarding table.
Contents of the forwarding table of each node are set up by each node in accordance with an instruction from the management computer 1. More specifically, the management computer 1 uses the control link (62, 63, 64, 65) to instruct each node (2, 3, 4, 5) to set up the forwarding table. Here, the management computer 1 instructs each node to set up the forwarding table such that the check frames FR are forwarded along the loop route LP. Each node sets up the contents of the forwarding table in accordance with the instruction from the management computer 1.
Various interfaces are possible as an interface between the management computer and the nodes for achieving the processing described above. For example, Openflow (refer to http://www.openflowswitch.org/) is applicable. In this case, an “Openflow Controller” serves as the management computer 1 and an “Openflow Switch” serves as each of the nodes 2 to 5. It is possible to set up the forwarding table by using “Secure Channel” of the Openflow. Alternatively, GMPLS (Generalized Multi-Protocol Label Switching) also is applicable. In this case, the management computer instructs a GMPLS switch to set up the forwarding table. Alternatively, VLAN (Virtual LAN) also is applicable. In this case, the management computer can control VLAN setting of each switch by using an MIB (Management Information Base) interface.
In the following description, let us consider a case where the Openflow is used as the interface between the management computer and the nodes.
The management host 1 has a storage unit 10, a topology management unit 11, a loop designing unit 12, an entry designing unit 13, an entry control unit 14, a monitoring unit 15, a node communication unit 16 and a display unit 17. The node communication unit 16 is connected to the switches 2 to 5 through the control links 62 to 65, respectively. The management host 1 can communicate bi-directionally with the switches 2 to 5 by using the node communication unit 16 and the control links 62 to 65.
The storage unit 10 is a storage device such as a RAM or an HDD. A topology table TPL, a loop table LOP, entry data ENT and the like are stored in the storage unit 10. The topology table TPL (topology information) indicates the above-mentioned physical topology of the communication network NET, namely, a connection relationship between the switches 2 to 5. The loop table LOP (loop route information) indicates the loop route LP in the communication network NET. The entry data ENT indicates contents of an entry to be set in a forwarding table of each node. The details will be described later.
The topology management unit 11, the loop designing unit 12, the entry designing unit 13, the entry control unit 14 and the monitoring unit 15 are functional blocks that are realized by a processor executing a computer program (management program). A function and an operation of each functional block will be described later.
The switch 2 has a table storage unit 20, a forwarding processing unit 21, a host communication unit 23, a table setup unit 24, a port 27, a port 28 and a port 29. The host communication unit 23 is equivalent to the “Secure Channel” of the “Openflow Switch”. The host communication unit 23 is connected to the management host 1 through the control link 62, and the switch 2 can communicate bi-directionally with the management host 1 by using the host communication unit 23 and the control link 62. Moreover, each port (communication interface) is connected to another switch through the physical link, and the switch 2 can communicate bi-directionally with another switch by using the port and the physical link.
The table storage unit 20 is a storage device such as a RAM or an HDD. The above-mentioned forwarding table 22 that indicates a correspondence relationship between input sources and forwarding destinations of the check frames FR is stored in the table storage unit 20.
The forwarding processing unit 21 receives the check frame FR from the host communication unit 23 (i.e. management host 1). Alternatively, the forwarding processing unit 21 receives the check frame FR from any port (i.e. another switch). Then, by referring to the forwarding table 22 stored in the table storage unit 20, the forwarding processing unit 21 forwards the check frame FR received from an input source to a forwarding destination (host communication unit 23 or port) designated by the forwarding table 22. In a case where a plurality of forwarding destinations are designated, the forwarding processing unit 21 copies the check frame FR and forwards them respectively to the plurality of forwarding destinations.
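The lookup-and-copy behavior of the forwarding processing unit can be sketched as follows. The table contents and the function name `forward` are hypothetical and chosen only for illustration; the port numbers are those of the switch 2, but the pairing of inputs to outputs is an assumption.

```python
HOST = "HOST"  # stands for the host communication unit

# Hypothetical forwarding table: input source -> list of forwarding destinations.
forwarding_table = {
    HOST: [29],
    28: [29, HOST],
}


def forward(table, input_source, frame):
    """Return (destination, frame copy) pairs for a received check frame."""
    destinations = table.get(input_source, [])
    # One independent copy per destination, as when several are designated.
    return [(dest, dict(frame)) for dest in destinations]


frame = {"mac_da": "00-00-4c-00-aa-01", "ttl": 11}
out = forward(forwarding_table, 28, frame)
```

Here a frame arriving on port 28 is copied to both port 29 and the host communication unit, while a frame from an input source with no entry is not forwarded at all.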
The table setup unit 24 receives the above-mentioned instruction (table setup command) from the management host 1 through the host communication unit 23. Then, in accordance with the instruction (table setup command) from the management host 1, the table setup unit 24 sets (adds, deletes, changes) the contents of the forwarding table 22 stored in the table storage unit 20.
The forwarding processing unit 21, the host communication unit 23 and the table setup unit 24 can be realized by a processor executing a computer program.
The other switches 3 to 5 each have a configuration similar to that of the switch 2. That is, the switch 3 has a table storage unit 30, a forwarding processing unit 31, a host communication unit 33, a table setup unit 34, a port 37, a port 38 and a port 39. A forwarding table 32 is stored in the table storage unit 30. The switch 4 has a table storage unit 40, a forwarding processing unit 41, a host communication unit 43, a table setup unit 44, a port 47, a port 48 and a port 49. A forwarding table 42 is stored in the table storage unit 40. The switch 5 has a table storage unit 50, a forwarding processing unit 51, a host communication unit 53, a table setup unit 54, a port 57, a port 58 and a port 59. A forwarding table 52 is stored in the table storage unit 50.
In the example shown in
The topology management unit 11 creates the topology table TPL (topology information) indicating the physical topology of the communication network NET and stores it in the storage unit 10. Moreover, the topology management unit 11 receives a topology change notification from each switch through the node communication unit 16. Here, the topology change notification is information indicating a change in the physical topology of the communication network NET and includes connection information of a new switch, up/down notification of a physical link and the like. The topology management unit 11 updates the topology table TPL in accordance with the topology change notification.
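How an up/down notification might be reflected in the topology table TPL can be sketched as below. The field names mirror the entry fields described for the topology table in this section (link ID, source switch and port, destination switch and port, status flag); the class `LinkEntry`, the function `apply_link_notification`, and the concrete port pairing are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinkEntry:
    link_id: str
    src_switch: int
    src_port: int
    dst_switch: int
    dst_port: int
    status: int = 0  # 1 = available, 0 = not available


def apply_link_notification(table, src_switch, src_port, up):
    """Update the status flag of the entry starting at (src_switch, src_port)."""
    for entry in table:
        if (entry.src_switch, entry.src_port) == (src_switch, src_port):
            entry.status = 1 if up else 0
            return entry
    return None  # unknown link; a fuller version might add a new entry


topology = [
    LinkEntry("A1", 2, 27, 4, 47, status=1),  # port pairing is illustrative
    LinkEntry("A2", 4, 47, 2, 27, status=1),
]
changed = apply_link_notification(topology, 2, 27, up=False)
```

Each direction of a physical link has its own entry in this sketch, so a down notification for one direction leaves the reverse entry untouched.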
Here, let us consider a case where the physical topology of the communication network NET is as shown in
The link ID is an identifier given to the associated physical link. The source switch is a switch as a start-point of the physical link, and the source port is a port of the source switch. The destination switch is a switch as an end-point of the physical link, and the destination port is a port of the destination switch. For example, in
The status flag included in each entry indicates whether the associated physical link is available or not. If validity of a physical link is confirmed, the status flag of the entry associated with the physical link is set to “1 (available)”. On the other hand, if validity of a physical link is not yet confirmed or failure occurs at the physical link, the status flag of the entry associated with the physical link is set to “0 (not available)”. In the example shown in
The loop designing unit 12 refers to the topology table TPL stored in the storage unit 10 to determine (design) the loop route LP in the communication network NET. Then, the loop designing unit 12 creates the loop table LOP (loop route information) indicating the determined loop route LP and stores it in the storage unit 10.
Here, the loop designing unit 12 may determine one or more loop routes LP such that all the physical links 71 to 75 in the communication network NET (all the link IDs: A1 to A10) are covered by the one or more loop routes LP. In other words, the loop designing unit 12 designs a group of loop routes such that the link of each link ID belongs to at least one loop route. In a case where a plurality of loop routes are obtained, successive loop IDs such as “LP1”, “LP2”, “LP3” . . . are given to the respective loop routes.
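One possible way to obtain such a covering group of loop routes is a greedy search, sketched below. The specification does not state which algorithm the loop designing unit 12 uses, so this is an assumed approach: for each uncovered directed link, a breadth-first search finds a path back to its start. It does not try to minimize the number of loops, so its output need not match the loop routes used in the example.

```python
from collections import deque


def design_loops(links):
    """Greedy sketch: cover every directed link with at least one loop route."""
    adjacency = {}
    for u, v in links:
        adjacency.setdefault(u, []).append(v)

    def path_back(start, goal, banned):
        # Breadth-first search from start to goal that never uses `banned`.
        queue, parent = deque([start]), {start: None}
        while queue:
            node = queue.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in adjacency.get(node, []):
                if (node, nxt) != banned and nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        return None

    loops, covered = {}, set()
    for u, v in links:
        if (u, v) in covered:
            continue
        back = path_back(v, u, banned=(v, u))  # avoid a trivial u-v-u loop
        if back is None:
            continue  # link lies on no loop; left uncovered in this sketch
        route = [u] + back  # u -> v -> ... -> u
        loops[f"LP{len(loops) + 1}"] = route
        covered.update(zip(route, route[1:]))
    return loops, covered


# Physical links 71 to 75 as switch pairs, expanded to both directions.
physical = [(2, 4), (4, 5), (5, 2), (2, 3), (3, 5)]
links = physical + [(b, a) for a, b in physical]
loops, covered = design_loops(links)
```

For this topology every one of the ten directed links ends up on at least one loop, and each route starts and ends at the same switch.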
The present exemplary embodiment has the following characteristics.
(1) In the check frame FR, TTL (Time to Live) is set as forwarding count information.
(2) The management host 1 transmits the check frame FR in which the TTL is set to a predetermined initial value, to the loop route LP.
(3) Each switch on the loop route LP, when receiving the check frame FR, decreases the TTL by 1 and then forwards the check frame FR to the subsequent switch along the loop route LP.
(4) If the TTL becomes 0 (threshold value) in a certain switch, the switch becomes the “response switch” that returns the response RES back to the management computer 1. The response switch forwards the received check frame FR as the response RES to the management computer 1. Moreover, the response switch forwards the check frame FR to the subsequent switch along the loop route after resetting the TTL to the predetermined initial value.
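Operations (2) to (4) above can be traced with a short simulation. This is a sketch under stated assumptions: the function name `circulate` is hypothetical, and the loop order 2 → 3 → 5 follows the walk-through of the loop route LP1 given later in this description.

```python
INITIAL_TTL = 11  # the initial value used in this example


def circulate(loop, hops, initial_ttl=INITIAL_TTL):
    """Simulate operations (2) to (4); return the switches that sent a response RES."""
    responses = []
    ttl = initial_ttl          # (2) TTL of the frame sent by the management host
    for hop in range(hops):
        switch = loop[hop % len(loop)]
        ttl -= 1               # (3) the receiving switch decreases the TTL by 1
        if ttl == 0:           # (4) threshold reached: this is the response switch
            responses.append(switch)
            ttl = initial_ttl  # reset before forwarding along the loop
    return responses


responders = circulate([2, 3, 5], hops=33)
```

In this run one response is produced every 11 hops, and the response switch changes each time (3, then 2, then 5) because 11 is not a multiple of the loop length 3.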
The entry designing unit 13 designs the entry in the forwarding table of each switch such that the above-described operations (3) and (4) can be achieved in each switch. Here, the entry designing unit 13 refers to the topology table TPL and the loop table LOP stored in the storage unit 10. Data indicating the entry designed by the entry designing unit 13 is the entry data ENT and is stored in the storage unit 10.
The input port represents the input source (port or host communication unit) to which the check frame FR is input. If the input source is any port (i.e. another switch), the input port is expressed by its port number. If the input source is the host communication unit (i.e. the management host 1), the input port is expressed by “HOST”.
The destination MAC address is used as identification information for identifying the loop route in which the check frame FR circulates. For that purpose, a special MAC address is used as the destination MAC address. Different destination MAC addresses are allocated to different loop routes. In the present example, “00-00-4c-00-aa-01”, “00-00-4c-00-aa-02” and “00-00-4c-00-aa-03” are allocated to the loop routes LP1 to LP3, respectively.
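The allocation of one special destination MAC address per loop route can be sketched as below. The rule that the last octet carries the loop number is an assumption inferred from the three addresses listed above; the specification does not state how further addresses would be chosen, and `loop_mac` is a hypothetical helper.

```python
def loop_mac(loop_number: int) -> str:
    """Return the special destination MAC address for loop route LP<loop_number>."""
    if not 1 <= loop_number <= 0xFF:
        raise ValueError("this sketch supports loop numbers 1 to 255 only")
    # Assumption: the last octet carries the loop number, as in the three
    # addresses listed for LP1 to LP3.
    return f"00-00-4c-00-aa-{loop_number:02x}"


# Reverse map a receiver could use to identify the loop route of a frame.
mac_to_loop = {loop_mac(n): f"LP{n}" for n in (1, 2, 3)}
```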
The input TTL indicates a TTL condition of the check frame FR input from the input port.
The output port represents the forwarding destination (port or host communication unit) of the check frame FR. If the forwarding destination is any port (i.e. another switch), the output port is expressed by its port number. If the forwarding destination is the host communication unit (i.e. management host 1), the output port is expressed by “HOST”. It should be noted that a plurality of output ports may be set with respect to one entry. In this case, the check frame FR is output to the respective output ports.
The output TTL defines a TTL operation method or a TTL setting method in the switch receiving the check frame FR.
The entry in the entry data ENT is created with respect to each loop route and each switch. Also, as shown in
The first entry is “switch=3, input port=37, MAC DA=00-00-4c-00-aa-01, input TTL=2 or more, output port=39, output TTL=input TTL−1”. As shown in
The second entry defines the processing in a case of receiving the check frame FR from the management host 1, and is different in the “input port” as compared with the first entry. More specifically, in the second entry, the input port is set to HOST. As a result, the switch 3 on the loop route LP1, when receiving the check frame FR from the management host 1, forwards the check frame FR to the subsequent switch 5 along the loop route LP1. In this manner, the above-described operation (3) in the switch 3 on the loop route LP1 can be achieved by the first entry and the second entry.
The third entry is for achieving the above-described operation (4) and defines the processing of the response switch. More specifically, in the third entry, the input TTL is set to 1, the output port is set to 39 and HOST, and further the output TTL is set to the initial value (=11). When the switch 3 receives the check frame FR whose TTL=1, the TTL becomes 0 in the switch 3. In this case, the switch 3 resets the TTL to the initial value (=11) in order to continue the circulation of the check frame FR in the loop route LP1. Moreover, the switch 3 not only forwards the check frame FR to the subsequent switch 5 along the loop route LP1 but also forwards it to the management host 1. The check frame FR transmitted from the switch 3 to the management host 1 is equivalent to the response RES. In this manner, the operation (4) as the response switch can be achieved by the third entry.
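The three entries described for the switch 3 on the loop route LP1, and the way a received check frame is matched against them, can be sketched as follows. The dictionary layout and the names `design_entries` and `process` are hypothetical. The sketch also assumes that the third entry matches on the same input port (37) as the first, and it omits the source MAC address from the match for brevity.

```python
HOST = "HOST"
INITIAL_TTL = 11


def design_entries(switch, in_port, out_port, mac_da):
    """Build the three entries described above for one switch on one loop route."""
    def entry(port, ttl_match, outputs, ttl_action):
        return {"switch": switch, "in_port": port, "mac_da": mac_da,
                "in_ttl": ttl_match, "out_ports": outputs, "out_ttl": ttl_action}
    return [
        entry(in_port, ">=2", [out_port], "dec"),          # first entry
        entry(HOST, ">=2", [out_port], "dec"),             # second entry
        entry(in_port, "==1", [out_port, HOST], "reset"),  # third entry
    ]


def process(entries, in_port, mac_da, ttl):
    """Match a received check frame and return [(output port, new TTL), ...]."""
    for e in entries:
        ttl_ok = ttl >= 2 if e["in_ttl"] == ">=2" else ttl == 1
        if e["in_port"] == in_port and e["mac_da"] == mac_da and ttl_ok:
            new_ttl = ttl - 1 if e["out_ttl"] == "dec" else INITIAL_TTL
            return [(port, new_ttl) for port in e["out_ports"]]
    return []  # no matching entry: the frame is not forwarded in this sketch


LP1_MAC = "00-00-4c-00-aa-01"
entries = design_entries(3, in_port=37, out_port=39, mac_da=LP1_MAC)
normal = process(entries, 37, LP1_MAC, 10)   # ordinary forwarding
response = process(entries, 37, LP1_MAC, 1)  # switch 3 acts as response switch
```

With TTL=10 the frame leaves port 39 with TTL=9; with TTL=1 it is sent to both port 39 and the management host with the TTL reset to 11.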
It should be noted that the initial value of the TTL is set to 11 in the present example, but it is not limited to that. Also, the initial value of the TTL may be set separately for the respective loop routes LP1 to LP3. A preferable method of setting the initial value of the TTL will be described later.
The entry control unit 14 instructs each switch (2, 3, 4, 5) to set up the forwarding table (22, 32, 42, 52). More specifically, the entry control unit 14 refers to the entry data ENT stored in the storage unit 10 and puts together the entries included in the entry data ENT with respect to each switch. Then, the entry control unit 14 instructs each switch to set the associated entry in the forwarding table. In other words, the entry control unit 14 instructs each switch on the loop route to forward the received check frame FR to the subsequent switch along the loop route and, if the predetermined condition is satisfied, return the response RES back to the management host 1. The entry control unit 14 transmits a table setup command indicating the instruction to each switch (2, 3, 4, 5) through the node communication unit 16 and the control link (62, 63, 64, 65).
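The step of putting the entries together per switch before issuing the table setup commands can be sketched as below. The command format (a dictionary with `switch`, `op` and `entries` keys) is purely illustrative and is not the OpenFlow message format; the sample entries are hypothetical.

```python
from collections import defaultdict


def build_setup_commands(entry_data):
    """Group entries by switch: one table setup command per switch."""
    per_switch = defaultdict(list)
    for entry in entry_data:
        per_switch[entry["switch"]].append(entry)
    return [{"switch": sw, "op": "add", "entries": entries}
            for sw, entries in sorted(per_switch.items())]


# Hypothetical entry data ENT holding entries for two switches.
entry_data = [
    {"switch": 3, "in_port": 37, "out_ports": [39]},
    {"switch": 2, "in_port": "HOST", "out_ports": [29]},
    {"switch": 3, "in_port": "HOST", "out_ports": [39]},
]
commands = build_setup_commands(entry_data)
```

Each resulting command would then be sent over the control link of the corresponding switch.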
In the switch 2, the table setup unit 24 receives the table setup command from the host communication unit 23. Then, the table setup unit 24 sets, in accordance with the table setup command, the contents of the forwarding table 22 stored in the table storage unit 20. The setting of the forwarding table 22 is based on the entries associated with the switch 2 included in the entry data ENT shown in
In the switch 3, the table setup unit 34 receives the table setup command from the host communication unit 33. Then, the table setup unit 34 sets, in accordance with the table setup command, the contents of the forwarding table 32 stored in the table storage unit 30.
In the switch 4, the table setup unit 44 receives the table setup command from the host communication unit 43. Then, the table setup unit 44 sets, in accordance with the table setup command, the contents of the forwarding table 42 stored in the table storage unit 40.
In the switch 5, the table setup unit 54 receives the table setup command from the host communication unit 53. Then, the table setup unit 54 sets, in accordance with the table setup command, the contents of the forwarding table 52 stored in the table storage unit 50.
After the setting of the forwarding table in each switch is completed, the monitoring unit 15 of the management host 1 transmits the check frame FR to each of the loop routes LP1 to LP3. As an example, the transmission of the check frame FR from the management host 1 to the loop route LP1 and the circulation of the check frame FR in the loop route LP1 are described hereinafter. The same applies to the other loop routes LP2 and LP3.
The monitoring unit 15 of the management host 1 transmits the check frame FR shown in
The forwarding processing unit 21 of the switch 2 receives the check frame FR (TTL=11) from the host communication unit 23. The forwarding processing unit 21 searches the forwarding table 22 by using the input port (HOST), MAC DA (00-00-4c-00-aa-01), MAC SA (00-00-4c-00-12-34) and input TTL (11) of the received check frame FR as the search keyword. Referring to
The forwarding processing unit 31 of the switch 3 receives the check frame FR (TTL=10) from the port 37. The forwarding processing unit 31 searches the forwarding table 32 by using the input port (37), MAC DA (00-00-4c-00-aa-01), MAC SA (00-00-4c-00-12-34) and input TTL (10) of the received check frame FR as the search keyword. Referring to
The forwarding processing unit 51 of the switch 5 receives the check frame FR (TTL=9) from the port 59. The forwarding processing unit 51 searches the forwarding table 52 by using the input port (59), MAC DA (00-00-4c-00-aa-01), MAC SA (00-00-4c-00-12-34) and input TTL (9) of the received check frame FR as the search keyword. Referring to
The forwarding processing unit 21 of the switch 2 receives the check frame FR (TTL=8) from the port 28. The forwarding processing unit 21 searches the forwarding table 22 by using the input port (28), MAC DA (00-00-4c-00-aa-01), MAC SA (00-00-4c-00-12-34) and input TTL (8) of the received check frame FR as the search keyword. Referring to
Thereafter, each switch on the loop route LP1 repeats the forwarding of the check frame FR along the loop route LP1 in a similar way. As a result, the check frame FR repeatedly circulates in the loop route LP1.
The check frame FR circulates in the loop route LP1, and the TTL becomes 1 at a certain timing. The switch 3 that receives the check frame FR whose TTL=1 becomes the response switch that returns the response RES back to the management host 1 (Step S23).
More specifically, the forwarding processing unit 31 of the switch 3 receives the check frame FR (TTL=1) from the port 37. The forwarding processing unit 31 searches the forwarding table 32 by using the input port (37), MAC DA (00-00-4c-00-aa-01), MAC SA (00-00-4c-00-12-34) and input TTL (1) of the received check frame FR as the search keyword. Referring to
The port 39 is connected to the port 59 of the switch 5, and thus the check frame FR (TTL=11) is forwarded from the switch 3 to the subsequent switch 5 on the loop route LP1. Thereafter, the circulation of the check frame FR in the loop route LP1 continues in a similar way. Meanwhile, the host communication unit 33 transmits the check frame FR to the management host 1 through the control link 63. The check frame FR serves as the response RES that is transmitted from the switch 3 as the response switch to the management host 1.
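The table lookup performed by the forwarding processing unit can be sketched as follows. This is an illustrative reconstruction only: the dictionary layout and the helper names `build_table` and `forward` are assumptions, and the table contents are derived from the behavior described above for the switch 3 (input port 37, output port 39), not copied from the forwarding table 32 itself.

```python
HOST = "HOST"
MAC_DA = "00-00-4c-00-aa-01"
MAC_SA = "00-00-4c-00-12-34"
INITIAL_TTL = 11

def build_table(in_port, out_port):
    """Hypothetical forwarding table keyed by (input port, MAC DA, MAC SA, input TTL)."""
    table = {}
    for ttl in range(2, INITIAL_TTL + 1):
        # Ordinary case: forward to the subsequent switch with the TTL decreased by 1.
        table[(in_port, MAC_DA, MAC_SA, ttl)] = ([out_port], ttl - 1)
    # TTL = 1: act as the response switch, reset the TTL and also send to the host.
    table[(in_port, MAC_DA, MAC_SA, 1)] = ([out_port, HOST], INITIAL_TTL)
    return table

def forward(table, in_port, mac_da, mac_sa, ttl):
    """Return (output port, output TTL) pairs for a received check frame."""
    hit = table.get((in_port, mac_da, mac_sa, ttl))
    if hit is None:
        return []  # Assumption: a frame without a matching entry is not forwarded.
    out_ports, out_ttl = hit
    return [(port, out_ttl) for port in out_ports]

table_32 = build_table(in_port=37, out_port=39)
```

With this table, a frame received at the port 37 with TTL=10 yields a single output to the port 39 with TTL=9, whereas a frame with TTL=1 yields outputs to both the port 39 and HOST with the TTL reset to 11.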
The monitoring unit 15 of the management host 1 receives the check frame FR from the switch 3 through the node communication unit 16. When receiving the check frame FR, the monitoring unit 15 resets the timer TM for the loop route LP1.
Meanwhile, the circulation of the check frame FR in the loop route LP1 is continuing as shown in
The monitoring unit 15 of the management host 1 can determine whether or not a failure relating to the loop route LP1 is occurring, based on reception state of the check frame FR (response RES) from each switch on the loop route LP1.
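A minimal sketch of the timer handling on the management host side is given below. It assumes that a failure is presumed when the timer TM expires without a response having been received; the class name, the method names and the use of explicit time values instead of a real clock are choices made for illustration only.

```python
class LoopMonitor:
    """Tracks the timer TM for one loop route (illustrative sketch)."""

    def __init__(self, timeout, start_time=0.0):
        self.timeout = timeout
        self.deadline = start_time + timeout

    def on_response(self, now):
        """Reset the timer when a check frame (response RES) is received."""
        self.deadline = now + self.timeout

    def failure_detected(self, now):
        """True if the timer has expired without a response being received."""
        return now >= self.deadline

monitor_lp1 = LoopMonitor(timeout=5.0)
```

Passing the current time as an argument keeps the sketch deterministic; an actual implementation would presumably read a monotonic clock instead.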
In the case of the example shown in
In the case of the failure occurrence determination, the monitoring unit 15 updates the topology table TPL stored in the storage unit 10. More specifically, the monitoring unit 15 rewrites the status flags associated with all the link IDs (A3, A5, A9) constituting the loop route LP1 that is determined to have the failure occurrence, from “1” to “0”. As a result, the topology table TPL is updated from one shown in
In the example shown in
The reason why all the switches 3, 2 and 5 on the loop route LP1 become the response switch is that the initial value of the TTL of the check frame FR is set to “11”. In other words, the reason is that the number of links (the number of switches) included in the loop route LP1 is 3 and the initial value (=11) of the TTL is set to a natural number other than a multiple of 3. In this case, every eleventh switch along the loop route LP1 becomes the response switch.
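The order in which the switches become the response switch can be checked with a short simulation. The sketch below assumes, as in the example, that the frame is first received by the switch 2, that each switch decreases the TTL by 1 on forwarding, and that the switch receiving TTL=1 responds and resets the TTL to the initial value; the function name is hypothetical.

```python
def simulate_circulation(loop, initial_ttl, receptions):
    """Return the switches that become the response switch, in order.

    loop        -- switch IDs in forwarding order, starting with the first receiver
    initial_ttl -- TTL of the check frame as sent by the management host
    receptions  -- number of frame receptions to simulate
    """
    responders = []
    ttl = initial_ttl
    for k in range(receptions):
        switch = loop[k % len(loop)]
        if ttl == 1:
            responders.append(switch)  # response switch: respond and reset the TTL
            ttl = initial_ttl
        else:
            ttl -= 1                   # ordinary forwarding
    return responders

order = simulate_circulation([2, 3, 5], initial_ttl=11, receptions=33)
```

For the loop route LP1 the simulation yields the switches 3, 2 and 5 in that order, one response every 11 receptions.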
This can be generalized as follows. Let the number of links (the number of switches) included in a certain loop route LP be “n”. In this case, it is preferable that the initial value of the TTL of the check frame FR for the loop route LP is selected from the group consisting of natural numbers that are neither multiples of n nor multiples of any factor of n other than 1, that is, natural numbers coprime to n. As a simple method, the initial value may be set to a number obtained by adding 1 to or subtracting 1 from a multiple of n. Let the initial value be denoted by “i”. In this case, every i-th switch along the loop route LP becomes the response switch. That is, all of the n switches on the loop route LP can become the response switch.
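The condition on the initial value can be verified numerically. In the sketch below, positions are counted from 0 starting at the switch that first receives the check frame, and the function names are hypothetical. The number of distinct response switches equals n divided by the greatest common divisor of n and i, so all n switches respond exactly when n and i share no factor other than 1.

```python
from math import gcd

def response_positions(n, initial_ttl):
    """Positions (0..n-1) on an n-switch loop that become the response switch."""
    positions = set()
    reception = initial_ttl - 1      # index of the first reception with TTL = 1
    for _ in range(n):               # n responses suffice to reach every possible position
        positions.add(reception % n)
        reception += initial_ttl     # the next response occurs initial_ttl receptions later
    return positions

def covers_all_switches(n, initial_ttl):
    """True if every switch on the loop eventually becomes the response switch."""
    return gcd(n, initial_ttl) == 1
```

For example, with n=3 the initial value 11 reaches all three positions, whereas 9 always selects the same switch; with n=4 the initial value 6, a multiple of the factor 2, reaches only two of the four switches.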
In the example shown in
The initial value of the TTL contributes to a frequency at which the management host 1 receives the response RES. To put it the other way around, a frequency at which the management host 1 receives the response RES is adjustable based on the setting of the initial value of the TTL. The initial value of the TTL is preferably set to an appropriate value depending on system specification.
In the example shown in
First, the entry control unit 14 instructs all the switches (2, 3, 5) belonging to the loop route LP1 where the failure occurrence is detected to temporarily change the forwarding table (22, 32, 52). More specifically, the entry control unit 14 instructs rewriting of the forwarding table such that the response switch forwards the check frame FR only to the management host 1. Therefore, the entry control unit 14 instructs each of the switches 2, 3 and 5 to change the output port in the entry associated with “MAC DA=00-00-4c-00-aa-01, input TTL=1” in the forwarding table to just “HOST”.
As a result, the forwarding table 22 of the switch 2 is temporarily changed from one shown in
After the changing of the forwarding table in each switch is completed, the monitoring unit 15 transmits the check frame FR concurrently to all the switches (2, 3, 5) belonging to the loop route LP1. At this time, the TTL of each check frame FR is set to “2”. Also, the monitoring unit 15 starts the timer TM at the same time as the transmission of the check frame FR.
The forwarding processing unit 21 of the switch 2 receives the check frame FR (TTL=2) from the host communication unit 23 (HOST). The forwarding processing unit 21 refers to the forwarding table 22 shown in
The forwarding processing unit 31 of the switch 3 receives the check frame FR (TTL=2) from the host communication unit 33 (HOST). The forwarding processing unit 31 refers to the forwarding table 32 shown in
The monitoring unit 15 of the management host 1 receives the check frame FR from the switch 5. This means that the check frame FR is transferred to the switch 5 along the loop route LP1. In other words, this means that the physical link (link ID=A5) whose destination switch is the switch 5 and included in the loop route LP1 is normal. Therefore, the monitoring unit 15 rewrites the status flag of the entry “destination switch=5, status flag=0” in the topology table TPL from “0” to “1”. As a result, the status flag of the entry of link ID=A5 returns back to “1” as shown in
Moreover, the forwarding processing unit 51 of the switch 5 receives the check frame FR (TTL=2) from the host communication unit 53 (HOST). The forwarding processing unit 51 refers to the forwarding table 52 shown in
The monitoring unit 15 of the management host 1 receives the check frame FR from the switch 2. This means that the check frame FR is transferred to the switch 2 along the loop route LP1. In other words, this means that the physical link (link ID=A9) whose destination switch is the switch 2 and included in the loop route LP1 is normal.
Therefore, the monitoring unit 15 rewrites the status flag of the entry “destination switch=2, status flag=0” in the topology table TPL from “0” to “1”. As a result, the status flag of the entry of link ID=A9 returns back to “1” as shown in
After that, the timer TM expires. At this time, the monitoring unit 15 has not yet received the check frame FR (response RES) from the switch 3. Therefore, as shown in
The monitoring unit 15 has the display unit 17 display the contents of the topology table TPL shown in
According to the present exemplary embodiment, as described above, once a check frame FR is transmitted from the management host 1 to the loop route LP, the check frame FR repeatedly circulates in the loop route LP. When the predetermined condition is satisfied, each switch on the loop route LP returns the response RES to the management host 1. The management host 1 can determine whether or not a failure relating to the loop route LP is occurring, based on reception state of the response RES from the loop route LP. According to the present exemplary embodiment, the management host 1 need not periodically repeat the transmission of the check frame FR in order to perform the health-checking continuously. It is therefore possible to reduce the burden placed on the management host 1.
Moreover, in the case where all switches on the loop route LP become the response switch, it is also possible to determine whether or not a failure is occurring at the control link between the management host 1 and the switch. In other words, it is possible to concurrently perform the health-checking of not only the physical link between the switches but also the control link between the management host 1 and the switch. To this end, it is preferable that the initial value of the TTL of the check frame FR is designed such that all switches on the loop route LP become the response switch.
Furthermore, according to the present exemplary embodiment, each switch is provided with the forwarding table. The transfer of the check frame FR along the loop route LP can be achieved by appropriately setting up the contents of the forwarding table of each switch in accordance with the instruction from the management host 1. Therefore, each switch just needs to refer to the forwarding table to forward the received check frame FR to a designated forwarding destination. In the present exemplary embodiment, there is no need to incorporate the health-check table including information of the transfer route, the check list and the like (see Patent Literature 2, Patent Literature 3) into the check frame FR. Each switch need not perform such complicated processing as described in Patent Literature 2 and Patent Literature 3 in order to achieve the forwarding of the check frame FR. As a result, the burden placed on each switch is also reduced.
Moreover, according to the present exemplary embodiment, it is possible to identify the location of failure by simple processing. The reason is that the response switch on the loop route LP returns the response RES to the management host 1. The management host 1 can easily identify the location of failure in the loop route LP based on reception state of the response RES from the loop route LP. Complicated processing such as that required in Patent Literature 2 or Patent Literature 3 is not necessary for identifying the location of failure. For example, such processing as described in Patent Literature 3, in which each node investigates whether or not it can communicate with the next node, is not necessary. As a result, the burden placed on each switch is reduced.
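The failure location identification processing (Step S40) described above can be sketched as follows. The topology table is represented here as a list of dictionaries with hypothetical field names; the pairing of the link ID A3 with the destination switch 3 is inferred from the example and should be read as an assumption. The set `responders` stands for the switches from which the management host 1 received the check frame FR before the timer TM expired.

```python
def locate_failures(topology, responders):
    """Restore the status flag of every link confirmed by a response and
    return the link IDs that remain flagged as failed.

    topology   -- list of dicts with keys "link_id", "dst_switch", "status"
    responders -- set of switch IDs that returned the check frame in time
    """
    for entry in topology:
        if entry["status"] == 0 and entry["dst_switch"] in responders:
            entry["status"] = 1  # the frame reached this switch, so the link is normal
    return [entry["link_id"] for entry in topology if entry["status"] == 0]

# Loop route LP1 after failure detection: all status flags set to 0.
tpl_lp1 = [
    {"link_id": "A3", "dst_switch": 3, "status": 0},
    {"link_id": "A5", "dst_switch": 5, "status": 0},
    {"link_id": "A9", "dst_switch": 2, "status": 0},
]
failed_links = locate_failures(tpl_lp1, responders={5, 2})
```

In the example, responses arrive from the switches 5 and 2 but not from the switch 3, so only the link whose destination switch is the switch 3 keeps the status flag “0”.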
In the above-described example, TTL is used as the forwarding count information of the check frame FR. However, it is not limited to that. For example, a hop number instead of TTL can be used as the forwarding count information. Even in this case, similar processing can be achieved. More specifically, the management host 1 transmits a check frame FR whose hop number is set to an initial value to one switch on the loop route LP. Each switch on the loop route LP, when receiving the check frame FR, increases the hop number of the check frame FR. The check frame FR circulates in the loop route LP, and a switch where the hop number reaches a predetermined threshold value becomes the response switch. The response switch returns the response RES back to the management host 1 and forwards the check frame FR to the subsequent switch after resetting the hop number to the initial value. Also, it is preferable that the initial value and the threshold value of the hop number are designed such that all switches on the loop route LP become the response switch, as in the case of TTL.
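The hop-number variant can be simulated in the same manner. The sketch below assumes that each switch increases the hop number by 1 on reception and that the switch at which the hop number reaches the threshold value responds and resets the hop number to the initial value; the function name and the concrete values 0 and 11 are illustrative only.

```python
def simulate_hop_count(loop, initial, threshold, receptions):
    """Return the switches that become the response switch, in order."""
    responders = []
    hop = initial
    for k in range(receptions):
        switch = loop[k % len(loop)]
        hop += 1                      # each switch increases the hop number
        if hop >= threshold:
            responders.append(switch) # threshold reached: respond to the host
            hop = initial             # reset before forwarding to the next switch
    return responders

hop_order = simulate_hop_count([2, 3, 5], initial=0, threshold=11, receptions=33)
```

With an initial value of 0 and a threshold value of 11, a response is returned every 11 receptions, which reproduces the order of response switches obtained with TTL=11 on the loop route LP1.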
Alternatively, the forwarding count information such as TTL and hop number may not be used. That is, each switch on the loop route LP, when receiving the check frame FR, forwards the check frame FR to the subsequent switch and returns the response RES back to the management host 1 “with a predetermined probability”. For example, each switch on the loop route LP generates a random number when receiving the check frame FR and, if the random number is equal to or larger than a predetermined threshold value, transmits the response RES to the management host 1. In this case, each switch on the loop route LP becomes the response switch with a predetermined probability. It is also possible to determine whether or not a failure is occurring at the control link between the management host 1 and the switch, because all switches on the loop route LP can become the response switch.
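A minimal sketch of the probabilistic variant follows. It assumes a uniform random number in the range [0, 1) and a threshold value in the same range, so that each reception produces a response with probability (1 − threshold); the function name, the seeded generator and the returned counter are assumptions made for illustration.

```python
import random

def simulate_probabilistic(loop, threshold, receptions, seed=0):
    """Count the responses returned by each switch on the loop."""
    rng = random.Random(seed)        # seeded only to make the sketch reproducible
    counts = {switch: 0 for switch in loop}
    for k in range(receptions):
        switch = loop[k % len(loop)]
        # The frame is always forwarded; a response is returned only
        # if the random number is equal to or larger than the threshold.
        if rng.random() >= threshold:
            counts[switch] += 1
    return counts

counts = simulate_probabilistic([2, 3, 5], threshold=0.9, receptions=3000)
```

A threshold value of 0 makes every switch respond on every reception, and a threshold value of 1 suppresses all responses; intermediate values trade response frequency against the load on the management host 1.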
As to the failure location identification processing (Step S40), the following modification example also is possible. That is, the forwarding table used in the Step S40 (see
It should be noted that although the term “frame” is used in the above description, the same applies to a case of “packet (IP packet etc.)”.
While the exemplary embodiments of the present invention have been described above with reference to the attached drawings, the present invention is not limited to these exemplary embodiments and can be modified as appropriate by those skilled in the art without departing from the spirit and scope of the present invention.
This application is based upon and claims the benefit of priority from Japanese patent application No. 2009-021330, filed on Feb. 2, 2009, the disclosure of which is incorporated herein in its entirety by reference.
Number | Date | Country | Kind |
---|---|---|---|
2009-021330 | Feb 2009 | JP | national |
This is a continuation of International Application No. PCT/JP2010/050905, filed on Jan. 25, 2010.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/JP2010/050905 | Jan 2010 | US |
Child | 13067848 | | US |