This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2017-105020, filed on May 26, 2017, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to a system management technique.
A cloud system has a complicated configuration constructed from many servers, switches, and other devices to provide services to multiple customers. When a failure occurs in such a complicated environment, to support a cloud provider, a cloud management apparatus that manages the cloud system specifies the customers affected by the failure based on physical path information and configuration information of a virtual system stored in advance.
Note that there is a technique for associating network identifiers for routing with computer identifiers by: grouping multiple computers to execute a program to be executed in parallel, for each relay apparatus in the bottom layer among relay apparatuses of a hierarchical configuration; sorting the groups thus formed; and allocating the identifiers to the computers in the order of the sorting.
There is also a technique for generating a VLAN setting information table in such a way that redundant paths for paths that connect switches A and B connected to terminals that configure a VLAN are specified based on information on physical connection states of network connection devices and connection states thereof in a spanning tree.
The related arts are disclosed in, for example, Japanese Laid-open Patent Publication Nos. 2012-98881 and 2007-158764.
According to an aspect of the invention, a system management apparatus for managing a network system includes a memory and a processor. The processor is configured to specify a first communication path, which includes an L3 relay apparatus, between a first pair of information processing apparatuses included in the network system, and a second communication path, which does not include any L3 relay apparatus, between a second pair of information processing apparatuses included in the network system. The processor is configured to store management information in the memory, the management information associating the first communication path and the second communication path with the first pair of information processing apparatuses and the second pair of information processing apparatuses, respectively. The processor is configured to, when a failure occurs in the network system, detect communication between a third pair of information processing apparatuses affected by the failure.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
When a layer 3 (L3) relay apparatus, which handles packets at layer 3 or higher, is present in a cloud system, a communication path sometimes turns back at the L3 relay apparatus. In the related art, information on a physical path that turns back at the L3 relay apparatus is not included in the physical path information used in the processing of specifying the customers affected when a failure occurs. Therefore, the affected customers may not be accurately specified.
System management apparatuses, system management methods, and computer programs according to embodiments are explained in detail below with reference to the drawings. In a first embodiment, an information processing system is explained that reduces an amount of physical path information used for specifying customers affected by a failure to reduce a time taken for the processing of specifying the affected customers. In a second embodiment, an information processing system is explained that specifies a physical path affected by a failure including a physical path turning back in the L3 relay apparatus. Note that the embodiments do not limit the disclosed technique.
First, an information processing system according to a first embodiment is explained.
The servers 41 are information processing apparatuses that perform information processing. The switches 42 are apparatuses that relay communication among the servers 41. Note that, in
A VM #1 operates in the server #1, a VM #2 operates in the server #2, and a VM #3 operates in the server #3. The VM indicates a virtual machine that operates on the server 41. VMs are allocated to tenants that use the information processing system 10. Virtual networks are allocated to the tenants that use the information processing system 10. In
The cloud management apparatus 1 is an apparatus that, when a failure occurs in a network, specifies affected customers by specifying affected inter-VM communication. For example, when a failure occurs in a network infrastructure, a cloud provider 7, which operates a cloud system, inquires about an influence range to the cloud management apparatus 1. The cloud management apparatus 1 specifies affected customers by specifying affected inter-VM communication and displays a specified result on a display apparatus used by the cloud provider 7. In
The cloud management apparatus 1 manages, as the same group, the servers 41 that are connected to the same set of edge switches, and manages communication paths among server groups. The edge switches are the switches 42 connected to the servers 41 by one link 43. In
The cloud management apparatus 1 is explained.
In the redundancy management table 11, information on redundant configurations of the information processing system 10 is registered.
In the connection link management table 12, information on the links 43 connected to the switches 42 or the servers 41 is registered.
In the VM management table 13, the VMs 44 operating in the servers 41 are registered.
The server-group creating unit 14 groups the servers 41 by referring to the connection link management table 12 and creates the server management table 15 and the server group management table 16. The server-group creating unit 14 puts the servers 41 that are connected to the same set of edge switches into the same group.
In the server management table 15, information on a server group is registered for each of the servers. In the server group management table 16, information on the edge switch to which the server group is connected is registered.
As illustrated in
As illustrated in
As illustrated in
The server-group creating unit 14 performs group allocation under a policy of allocating the servers 41 that are connected to the same set of edge switches to the same group. On the other hand, a policy of allocating all the servers 41 subordinate to a switch to the same group is also conceivable.
As illustrated in
When a failure occurs in the link #5, the server #1 is not affected because a path passing through the link #6 is present in communication with the server #3. However, the server #2 is affected because another path is absent in communication with the server #3. That is, in the example 1 of the group allocation, the servers 41 different in presence or absence of influence are present in the same group G#1.
On the other hand, as illustrated in
When a failure occurs in the link #5, the server #1 is not affected because a path passing through the link #6 is present in communication with the server #3. However, the server #2 is affected because another path is absent in communication with the server #3. However, since different groups are allocated to the servers #1 and #2, no servers 41 that differ in the presence or absence of influence are present in the same group. In this way, by allocating the servers 41 that are connected to the same set of edge switches to the same group, the server-group creating unit 14 may ensure that all the servers 41 in the same group are equally affected by a failure.
The server-group creating unit 14 creates a server group by performing the following (1) to (5) on all the edge switches.
(1) Select one edge switch.
(2) Extract the server 41 adjacent to the edge switch selected in (1) and not allocated with a server group, allocate a server group to the server 41, and extract all edge switches to which the extracted server 41 is connected.
(3) Extract another server 41 adjacent to the edge switch selected in (1) and not allocated with a server group and extract all edge switches to which the extracted other server 41 is connected.
(4) Compare the edge switches extracted in (2) and the edge switches extracted in (3) and, when all the edge switches are the same, allocate the server group allocated in (2) to the other server 41.
(5) Repeat (3) and (4) until no server 41 adjacent to the selected edge switch is left, and repeat (1) to (4) until no edge switch is left.
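One way to realize the same allocation policy as the procedure (1) to (5) above is to key the groups by the set of edge switches each server is connected to. The following is a hedged sketch: the function name and the topology data are illustrative assumptions (modeled on the four-server example in this embodiment), not part of the embodiment itself.

```python
# Hypothetical sketch: servers that share an identical set of edge switches
# receive the same server group, matching the policy of steps (1) to (5).
def allocate_server_groups(edge_switches_of):
    """edge_switches_of: server name -> set of connected edge switches.
    Returns a mapping from server name to an allocated group ID."""
    groups = {}      # frozenset of edge switches -> group ID
    allocation = {}  # server -> group ID
    for server in sorted(edge_switches_of):
        key = frozenset(edge_switches_of[server])
        if key not in groups:
            groups[key] = "G#%d" % (len(groups) + 1)
        allocation[server] = groups[key]
    return allocation

# Assumed topology: server #1 reaches only switch #1, servers #2 and #3
# reach switches #1 and #2, and server #4 reaches switches #3 and #4.
edge_switches_of = {
    "server#1": {"switch#1"},
    "server#2": {"switch#1", "switch#2"},
    "server#3": {"switch#1", "switch#2"},
    "server#4": {"switch#3", "switch#4"},
}
print(allocate_server_groups(edge_switches_of))
# → {'server#1': 'G#1', 'server#2': 'G#2', 'server#3': 'G#2', 'server#4': 'G#3'}
```

Grouping by the frozenset of edge switches yields the same result as the pairwise comparison in steps (3) and (4) while visiting each server only once.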
The physical-path creating unit 17 specifies, referring to the connection link management table 12 and the server group management table 16, a set of the links 43 connecting two edge switches as a physical path and creates the physical path table 18. The physical path and two server groups that perform communication using the physical path are registered in the physical path table 18.
As illustrated in
The physical-path creating unit 17 specifies all physical paths by retrieving, for all the edge switches, a path from an edge switch to another edge switch. The physical-path creating unit 17 extracts server groups subordinate to edge switches at both ends of the physical paths referring to the server group management table 16, creates a combination of the server groups, and registers the combination in the physical path table 18 in association with the physical paths.
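The retrieval of paths from edge switch to edge switch can be sketched as a link-by-link walk that never descends into servers. The data structures and names below are assumptions for illustration, not the embodiment's implementation; recording each path under a sorted pair of edge switches also deduplicates the two traversal directions, as the deletion of overlapping paths described later does.

```python
# Hedged sketch of physical path enumeration: from each edge switch, follow
# links while skipping servers, and record the traversed links when another
# edge switch is reached.
def find_physical_paths(neighbors, edge_switches, servers):
    """neighbors: node -> {adjacent node: connecting link}. Returns a dict
    mapping a sorted pair of edge switches to one list of traversed links."""
    paths = {}
    for start in edge_switches:
        stack = [(start, [])]
        while stack:
            node, path = stack.pop()
            for nxt, link in neighbors.get(node, {}).items():
                if link in path or nxt in servers:
                    continue  # do not reuse links or pass through servers
                if nxt in edge_switches:
                    paths.setdefault(tuple(sorted((start, nxt))), path + [link])
                else:
                    stack.append((nxt, path + [link]))
    return paths

# Assumed inter-switch topology: switches #1-#3 joined by link #6,
# switches #2-#4 joined by link #7.
neighbors = {
    "switch#1": {"switch#3": "link#6"},
    "switch#2": {"switch#4": "link#7"},
    "switch#3": {"switch#1": "link#6"},
    "switch#4": {"switch#2": "link#7"},
}
edges = {"switch#1", "switch#2", "switch#3", "switch#4"}
print(find_physical_paths(neighbors, edges, set()))
```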
The specifying unit 19 specifies inter-VM communication affected by an occurred failure. The specifying unit 19 includes an inter-group-communication specifying unit 21 and an inter-VM-communication specifying unit 22.
The inter-group-communication specifying unit 21 specifies inter-server group communication affected by the occurred failure. That is, the inter-group-communication specifying unit 21 specifies a physical path affected by the occurred failure referring to the physical path table 18 and determines whether the specified physical path is active referring to the redundancy management table 11 and the connection link management table 12. When the specified physical path is active, the inter-group-communication specifying unit 21 specifies, referring to the physical path table 18, inter-server group communication corresponding to the physical path and determines whether another physical path is present in the specified inter-server group communication. The inter-group-communication specifying unit 21 specifies inter-server group communication without another physical path in the specified inter-server group communication as inter-server group communication affected by the occurred failure.
The inter-VM-communication specifying unit 22 specifies inter-server communication affected by the failure from the inter-server group communication specified by the inter-group-communication specifying unit 21 and specifies inter-VM communication affected by the failure from the specified inter-server communication. That is, the inter-VM-communication specifying unit 22 respectively extracts, referring to the server management table 15, the servers 41 in two server groups set as targets of the inter-server group communication specified by the inter-group-communication specifying unit 21. The inter-VM-communication specifying unit 22 creates a combination of the servers 41 between different server groups and specifies inter-VM communication affected by the occurred failure referring to the VM management table 13.
In this way, the specifying unit 19 specifies affected inter-VM communication considering whether a physical path affected by the occurred failure is active and, when the physical path is active, considering whether redundant paths are present for affected inter-server group communication or inter-server communication.
The communication between the server groups G#1 and G#3 is not affected by the failure because a standby path passing through the link #6 is present. On the other hand, in the communication between the server groups G#2 and G#3, communication between the servers #2 and #3 is affected by the failure because a standby path is absent. The communication between the VMs #2 and #3 is specified as affected inter-VM communication.
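The influence check for a failed inter-switch link can be sketched as follows: find the physical paths through the failed link, drop inter-server group communications that still have a standby path, and expand the rest to server pairs and then VM pairs. The tables below are illustrative assumptions, not tables defined by the embodiment.

```python
from itertools import product

# Assumed tables: each physical path carries its links and the inter-group
# communications it serves; each group lists its servers; each server hosts
# one VM.
physical_paths = {  # path name -> (links on the path, inter-group pairs)
    "path#1": (["link#6"], [("G#1", "G#3"), ("G#2", "G#3")]),
    "path#2": (["link#7"], [("G#2", "G#3")]),
}
servers_of = {"G#1": ["server#1"], "G#2": ["server#2", "server#3"],
              "G#3": ["server#4"]}
vm_of = {"server#1": "VM#1", "server#2": "VM#2",
         "server#3": "VM#3", "server#4": "VM#4"}

def affected_vm_pairs(failed_link):
    affected = []
    for path, (links, group_pairs) in physical_paths.items():
        if failed_link not in links:
            continue
        for pair in group_pairs:
            # A standby path exists when another physical path serves the pair.
            if any(pair in g and p != path
                   for p, (_, g) in physical_paths.items()):
                continue
            for s1, s2 in product(servers_of[pair[0]], servers_of[pair[1]]):
                affected.append((vm_of[s1], vm_of[s2]))
    return affected

print(affected_vm_pairs("link#6"))  # → [('VM#1', 'VM#4')]
```

In this sketch, a failure of link#6 affects only G#1-G#3, because G#2-G#3 survives via path#2.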
When a failure occurs in a physical path between the server 41 and an edge switch, the inter-group-communication specifying unit 21 specifies a physical path passing through an edge switch connected to a failure part referring to the connection link management table 12 and the physical path table 18. The inter-group-communication specifying unit 21 determines whether the specified physical path is active referring to the redundancy management table 11 and the connection link management table 12. When the specified physical path is active, the inter-group-communication specifying unit 21 specifies inter-server group communication in which the specified physical path is used. However, the inter-server group communication to be specified is communication including a server group to which the server 41 connected to the failure part belongs.
The inter-group-communication specifying unit 21 determines whether another physical path is present in the specified inter-server group communication referring to the physical path table 18. The inter-group-communication specifying unit 21 specifies, as inter-server group communication affected by the occurred failure, inter-server group communication without another physical path in the specified inter-server group communication.
The inter-VM-communication specifying unit 22 respectively extracts, referring to the server management table 15, the servers 41 in two server groups set as targets of the inter-server group communication specified by the inter-group-communication specifying unit 21. However, the inter-VM-communication specifying unit 22 extracts only the server 41 connected to the failure part from the server group to which the server 41 connected to the failure part belongs. The inter-VM-communication specifying unit 22 creates a combination of the servers 41 between the server groups and specifies inter-VM communication affected by the occurred failure referring to the VM management table 13.
When a failure occurs in the path between the server 41 and the edge switch, the inter-VM-communication specifying unit 22 extracts, in a server group to which the server 41 connected to a failure part belongs, a physical path of affected inter-server communication. The inter-VM-communication specifying unit 22 determines whether the extracted physical path is active referring to the redundancy management table 11 and the connection link management table 12. When the extracted physical path is active, the inter-VM-communication specifying unit 22 determines whether another path is present referring to the redundancy management table 11 and the connection link management table 12. When another path is absent, the inter-VM-communication specifying unit 22 extracts the VMs 44 operating on the servers 41 set as targets of the affected inter-server communication and specifies a combination of VMs on different servers as affected inter-VM communication.
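The narrowing described above, where only communication involving the server attached to the failed link is considered, can be sketched as a filter over server pairs. The names and tables below are illustrative assumptions.

```python
# Hedged sketch: when the failed link connects a server to an edge switch,
# only server pairs that include that attached server remain candidates for
# affected inter-server communication.
servers_of = {"G#1": ["server#1"], "G#2": ["server#2", "server#3"]}
server_on_link = {"link#2": "server#2"}  # assumed failed-link attachment

def candidate_server_pairs(failed_link, group_pair):
    attached = server_on_link[failed_link]
    pairs = []
    for s1 in servers_of[group_pair[0]]:
        for s2 in servers_of[group_pair[1]]:
            if attached in (s1, s2):  # keep only pairs touching the failed link
                pairs.append((s1, s2))
    return pairs

print(candidate_server_pairs("link#2", ("G#1", "G#2")))
# → [('server#1', 'server#2')]
```

The pair server#1-server#3 is excluded even though both groups communicate over the same physical path, because server#3 is not attached to the failed link.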
A flow of processing of creating a server group and creating the physical path table 18 is explained.
As illustrated in
On the other hand, when the processing of retrieving all the switches 42 is completed, the server-group creating unit 14 determines whether processing of specifying a server group is completed for all the edge switches (step S4). As a result, when edge switches on which the processing of specifying a server group is not performed are present, the server-group creating unit 14 selects one edge switch (step S5). The server-group creating unit 14 determines whether the server group allocation to all servers subordinate to the selected edge switch is completed (step S6).
When the server 41 on which the server group allocation is not performed is present, the server-group creating unit 14 extracts the server 41 to which a server group is not allocated, allocates a new server group to the server 41, and registers the new server group in the server management table 15 (step S7). The server-group creating unit 14 determines whether the server group allocation to all the servers subordinate to the selected edge switch is completed (step S8).
When the server 41 on which the server group allocation is not performed is present, the server-group creating unit 14 extracts the server 41 to which a server group is not allocated (step S9). The server-group creating unit 14 determines whether edge switch connection configurations of the extracted server 41 and the server 41 to which the server group is allocated in step S7 are the same (step S10). As a result, when the edge switch connection configurations are the same, the server-group creating unit 14 allocates the same server group to the extracted server 41 and registers the server group in the server management table 15 (step S11) and returns to step S8. When the edge switch connection configurations are not the same, the server-group creating unit 14 returns to step S8.
When determining in step S8 that the server group allocation to all the servers is completed, the server-group creating unit 14 registers the selected edge switch and the allocated server groups in the server group management table 16 (step S12). When determining in step S6 that the server group allocation to all the servers is completed, the server-group creating unit 14 also registers the selected edge switch and the allocated server groups in the server group management table 16 (step S12). The server-group creating unit 14 returns to step S4.
When determining in step S4 that the processing of specifying a server group is completed for all the edge switches, the server-group creating unit 14 ends the processing. The physical-path creating unit 17 starts the processing of creating the physical path table 18.
As illustrated in
The physical-path creating unit 17 determines whether the selected adjacent node is an edge switch (step S25). When the selected adjacent node is not an edge switch, the physical-path creating unit 17 determines whether the adjacent node is the server 41 (step S26). As a result, when the adjacent node is not the server 41, the physical-path creating unit 17 determines whether the processing of retrieving all adjacent links is completed for the adjacent node (step S27). When adjacent links not retrieved are present, the physical-path creating unit 17 returns to step S24.
On the other hand, when the processing of retrieving all adjacent links is completed for the adjacent node or when the adjacent node is the server 41, the physical-path creating unit 17 returns to step S23. When determining in step S25 that the adjacent node is an edge switch, the physical-path creating unit 17 creates a combination of server groups corresponding to edge switches at both ends of the retrieved physical path and registers the combination in the physical path table 18 together with the physical path (step S28). The physical-path creating unit 17 returns to step S23.
When determining in step S23 that the processing of retrieving all adjacent links is completed, the physical-path creating unit 17 returns to step S21. When determining in step S21 that the processing of specifying a physical path is completed for all the edge switches, the physical-path creating unit 17 deletes overlapping paths from the physical path table 18 (step S29) and ends the processing of creating the physical path table 18.
In this way, the server-group creating unit 14 creates a server group and the physical-path creating unit 17 creates the physical path table 18 based on the server group. Consequently, the specifying unit 19 may specify an influence range of a failure referring to the physical path table 18.
A flow of processing of specifying an influence range is explained.
As illustrated in
On the other hand, when physical paths not confirmed are present, the specifying unit 19 determines whether one of the specified physical paths is active (step S34). When the physical path is not active, the specifying unit 19 returns to step S33. On the other hand, when the physical path is active, the specifying unit 19 determines whether a standby path is present (step S35). When a standby path is present, the specifying unit 19 returns to step S33.
On the other hand, when a standby path is absent, the specifying unit 19 specifies inter-server group communication corresponding to the physical path (step S36) and specifies a combination of the servers 41 that perform communication based on the specified inter-server group communication (step S37). The specifying unit 19 specifies the VMs 44 on the specified servers (step S38) and specifies a combination of the specified VMs 44 as affected inter-VM communication (step S39). The specifying unit 19 returns to step S33.
When determining in step S31 that the failure part is the connection link of the server 41, as illustrated in
The specifying unit 19 determines whether the confirmation of all the physical paths is completed (step S41). When physical paths not confirmed are present, the specifying unit 19 determines whether one of the specified physical paths is active (step S42). When the physical path is not active, the specifying unit 19 returns to step S41. On the other hand, when the physical path is active, the specifying unit 19 determines whether a standby path is present (step S43). When a standby path is present, the specifying unit 19 returns to step S41.
On the other hand, when a standby path is absent, the specifying unit 19 specifies inter-server group communication corresponding to the physical path (step S44) and specifies a combination of the servers 41 that perform communication based on the specified inter-server group communication (step S45). However, in a server group to which the server 41 connected to the failure link belongs, the specifying unit 19 specifies only a combination including the server 41 connected to the failure link. The specifying unit 19 specifies the VMs 44 on the specified servers (step S46) and specifies a combination of the specified VMs 44 as affected inter-VM communication (step S47).
When determining in step S41 that the confirmation of all the physical paths is completed, the specifying unit 19 specifies a physical path among servers, including the connected server that is connected to the failure link, in the server group to which the connected server belongs (step S48). The specifying unit 19 determines whether the confirmation of all the physical paths is completed (step S49). When the confirmation of all the physical paths is completed, the specifying unit 19 ends the processing.
On the other hand, when physical paths not confirmed are present, the specifying unit 19 determines whether one of the specified physical paths is active (step S50). When the physical path is not active, the specifying unit 19 returns to step S49. On the other hand, when the physical path is active, the specifying unit 19 determines whether a standby path is present (step S51). When a standby path is present, the specifying unit 19 returns to step S49.
On the other hand, when a standby path is absent, the specifying unit 19 specifies the VMs 44 on the servers that perform inter-server communication corresponding to the physical path (step S52) and specifies a combination of the specified VMs 44 as affected inter-VM communication (step S53).
In this way, the specifying unit 19 specifies affected inter-server group communication, specifies affected inter-server communication based on the specified inter-server group communication, and specifies affected inter-VM communication based on the specified inter-server communication. Therefore, the specifying unit 19 may reduce a time taken for the processing of specifying affected inter-VM communication.
An example of specifying an influence range is explained with reference to
The server #1 is connected to the switch #1 by the link #1. The server #2 is connected to the switch #1 by the link #2 and connected to the switch #2 by the link #3. The server #3 is connected to the switch #1 by the link #4 and connected to the switch #2 by the link #5. The switches #1 and #3 are connected by the link #6. The switches #2 and #4 are connected by the link #7. The server #4 is connected to the switch #3 by the link #8 and connected to the switch #4 by a link #9.
Connection of the switch #1 to the links #1, #2, #4, and #6 and connection of the switch #2 to the links #3, #5, and #7 are registered in the connection link management table 12. Connection of the switch #3 to the links #6 and #8 and connection of the switch #4 to the links #7 and #9 are registered in the connection link management table 12. Connection of the server #1 to the link #1, connection of the server #2 to the links #2 and #3, connection of the server #3 to the links #4 and #5, and connection of the server #4 to the links #8 and #9 are registered in the connection link management table 12.
Operation of the VM #1 on the server #1, operation of the VM #2 on the server #2, operation of the VM #3 on the server #3, and operation of the VM #4 on the server #4 are registered in the VM management table 13.
First, the server-group creating unit 14 creates the server management table 15 and the server group management table 16. That is, the server-group creating unit 14 extracts the servers #1, #2, and #3 as the servers 41 subordinate to the switch #1 based on the connection link management table 12. The server-group creating unit 14 allocates the server group G#1 to the server #1 and allocates the server group G#2 to the servers #2 and #3. The server-group creating unit 14 registers the allocated server groups subordinate to the switch #1 in the server management table 15 and the server group management table 16.
The server-group creating unit 14 performs the same processing for the switches #2, #3, and #4 to allocate the server group G#3 to the server #4.
Subsequently, the physical-path creating unit 17 creates the physical path table 18. That is, the physical-path creating unit 17 extracts the servers #1, #2, and #3 and the switch #3 as adjacent nodes of the switch #1 based on the connection link management table 12. Only a physical path from the switch #1 to the switch #3 is a physical path from an edge switch to an edge switch. Therefore, the physical-path creating unit 17 registers the link #6 from the switch #1 to the switch #3 in the physical path table 18 as a communication path of a path #1. The physical-path creating unit 17 specifies the server groups G#1 and G#2 as server groups associated with the switch #1 and specifies the server group G#3 as a server group associated with the switch #3 referring to the server group management table 16. The physical-path creating unit 17 registers G#1-G#3 and G#2-G#3 in the physical path table 18 as communication groups corresponding to the path #1.
The physical-path creating unit 17 performs the same processing for the switches #2, #3, and #4 and respectively registers, in the physical path table 18, a path #2 with the link #7 set as a physical path, a path #3 with the link #6 set as a physical path, and a path #4 with the link #7 set as a physical path.
Subsequently, the physical-path creating unit 17 deletes overlapping physical paths from the physical path table 18. In
When a failure occurs, the specifying unit 19 specifies inter-VM communication affected by a failure.
When a failure occurs in the link #6, the specifying unit 19 extracts the path #1 passing through the link #6 referring to the physical path table 18. Since the switches #1 and #3 are active, the specifying unit 19 determines that the path #1 is active referring to the redundancy management table 11. The specifying unit 19 extracts G#1-G#3 and G#2-G#3 as affected inter-server group communications referring to the physical path table 18. The specifying unit 19 confirms whether a standby path is present or not for the affected inter-server group communications referring to the physical path table 18. Then, since the path #2 is present in G#2-G#3, the specifying unit 19 determines that a standby path is present.
For G#1-G#3, the specifying unit 19 extracts communication between the servers #1 and #4 as affected inter-server communication referring to the server management table 15. The specifying unit 19 extracts communication between the VMs #1 and #4 as affected inter-VM communication referring to the VM management table 13.
The specifying unit 19 extracts the path #1 passing through the switch #1, to which the link #2 is connected, as an affected physical path referring to the connection link management table 12 and the physical path table 18. Since the switches #1 and #3 are active, the specifying unit 19 determines that the path #1 is active referring to the redundancy management table 11. The specifying unit 19 extracts G#2-G#3 as affected inter-server group communication referring to the physical path table 18. Note that, since the specifying unit 19 extracts only a path including the server group G#2 to which the server #2, to which the link #2 is connected, belongs, the specifying unit 19 does not extract G#1-G#3. For G#2-G#3, the specifying unit 19 determines that the path #2 is present as a standby path referring to the physical path table 18. Therefore, for the path #1, the specifying unit 19 determines that inter-server group communication affected by the failure of the link #2 is absent.
The specifying unit 19 creates a physical path of G#1-G#2 between server groups connected to the switch #1 referring to the server group management table 16. Since the switch #1 is active, the specifying unit 19 determines that G#1-G#2 is active referring to the redundancy management table 11. Since the switch 42 connected to the server groups G#1 and G#2 is absent other than the switch #1, the specifying unit 19 determines that a standby path is absent in G#1-G#2 referring to the server group management table 16. For G#1-G#2, the specifying unit 19 extracts communication between the servers #1 and #2 as affected inter-server communication referring to the server management table 15. Note that, for G#2, since only the server #2 connected to the link #2 is set as a target, the specifying unit 19 does not extract communication between the servers #1 and #3. The specifying unit 19 extracts communication between the VMs #1 and #2 as affected inter-VM communication referring to the VM management table 13.
The specifying unit 19 specifies, referring to the server management table 15, communication between the servers #2 and #3 as inter-server communication in the server group G#2 to which the server #2 connected to the link #2 belongs. Since the switch #1 is active, the specifying unit 19 determines that a physical path between the servers #2 and #3 is active referring to the redundancy management table 11. The specifying unit 19 determines that a standby path is present between the servers #2 and #3 referring to the connection link management table 12. Therefore, the specifying unit 19 determines that affected inter-server communication is absent in server groups including the servers 41 connected to the link 43 in which the failure occurs.
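The intra-group standby check above can be sketched as follows: after removing the failed link, any switch that still reaches both servers provides a surviving path between them. The link assignments below are illustrative assumptions.

```python
# Hedged sketch of the intra-group standby check: a switch that still
# connects both servers after the failed link is removed is a standby path.
server_links = {"server#2": {"link#2", "link#3"},
                "server#3": {"link#4", "link#5"}}
switch_links = {"switch#1": {"link#2", "link#4"},
                "switch#2": {"link#3", "link#5"}}

def surviving_switches(s1, s2, failed_link):
    """Return the switches that still connect s1 and s2 without failed_link."""
    alive1 = server_links[s1] - {failed_link}
    alive2 = server_links[s2] - {failed_link}
    return [sw for sw, links in switch_links.items()
            if alive1 & links and alive2 & links]

print(surviving_switches("server#2", "server#3", "link#2"))  # → ['switch#2']
```

With link#2 failed, switch#2 still connects the two servers through links #3 and #5, so the inter-server communication is not affected.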
Effects obtained when the servers 41 are grouped are explained.
As illustrated in
As explained above, in the first embodiment, the inter-group-communication specifying unit 21 specifies inter-server group communication affected by a failure referring to the physical path table 18, which associates a physical path with the two server groups that perform communication using the physical path. The inter-VM-communication specifying unit 22 specifies inter-server communication affected by the failure based on the inter-server group communication specified by the inter-group-communication specifying unit 21, referring to the server management table 15, which associates the servers 41 with the server groups. The inter-VM-communication specifying unit 22 specifies inter-VM communication affected by the failure referring to the VM management table 13. Therefore, the cloud management apparatus 1 may specify, in a short time, the inter-VM communication affected by the failure and may reduce a time taken for the processing of specifying the customers affected by the failure.
In the first embodiment, the inter-group-communication specifying unit 21 confirms whether a standby path is present or not for the specified inter-server group communication referring to the physical path table 18. When a standby path is present, the inter-group-communication specifying unit 21 determines that the inter-server group communication is not affected by the failure. Therefore, the cloud management apparatus 1 may accurately specify customers affected by the failure.
In the first embodiment, when a failure occurs in the link 43 between the server 41 and the edge switch, the inter-VM-communication specifying unit 22 specifies only inter-server communication including a connected server as inter-server communication affected by the failure. Therefore, the cloud management apparatus 1 may accurately specify the inter-server communication affected by the failure.
In the first embodiment, when a failure occurs in the link 43 between the server 41 and the edge switch, the inter-VM-communication specifying unit 22 specifies, as inter-server communication affected by the failure, communication performed by the connected server with the other servers 41 in a server group. Therefore, the cloud management apparatus 1 may accurately specify the inter-server communication affected by the failure.
In the first embodiment, the server-group creating unit 14 creates the server group management table 16 referring to the connection link management table 12. The physical-path creating unit 17 creates the physical path table 18 referring to the connection link management table 12 and the server group management table 16. Therefore, the cloud management apparatus 1 may reduce a time taken for the processing of creating the physical path table 18.
Note that, in the first embodiment, the cloud management apparatus 1 is explained. However, an influence range specifying program having the same function may be obtained by realizing, with software, the configuration included in the cloud management apparatus 1. Therefore, a computer that executes the influence range specifying program is explained.
The main memory 51 is a memory that stores a computer program, intermediate execution results of the computer program, and the like. The CPU 52 is a central processing device that reads out the computer program from the main memory 51 and executes the computer program. The CPU 52 includes a chip set including a memory controller.
The LAN interface 53 is an interface for connecting the computer 50 to other computers through a LAN. The HDD 54 is a disk device that stores computer programs and data. The super IO 55 is an interface for connecting input devices such as a mouse and a keyboard. The DVI 56 is an interface for connecting a liquid crystal display device. The ODD 57 is a device that performs reading and writing of a DVD.
The LAN interface 53 is connected to the CPU 52 by PCI Express (PCIe). The HDD 54 and the ODD 57 are connected to the CPU 52 by serial advanced technology attachment (SATA). The super IO 55 is connected to the CPU 52 by a low pin count (LPC) interface.
The influence range specifying program executed in the computer 50 is stored in the DVD, read out from the DVD by the ODD 57, and installed in the computer 50. Alternatively, the influence range specifying program is stored in a database or the like of another computer system connected via the LAN interface 53, read out from the database, and installed in the computer 50. The installed influence range specifying program is stored in the HDD 54, read out to the main memory 51, and executed by the CPU 52.
Incidentally, in the above explanation in the first embodiment, an L3 relay apparatus that processes packets at layer 3 or higher is not included in the information processing system. However, the L3 relay apparatus is sometimes included in the information processing system. Communication sometimes turns back at the L3 relay apparatus. Therefore, in the following explanation in a second embodiment, an information processing system includes the L3 relay apparatus.
Therefore, in the information processing system 10b, there is a physical path that reaches the server group G#2 from the server group G#1 by turning back at the firewall 62. In the physical path, a packet passes through the link #6 twice. Therefore, a cloud management apparatus 6 according to the second embodiment has to create a physical path table including such a turning back path.
A cloud system may manage information on an information processing system in a data center but may be unable to manage information on a range exceeding a border edge of the data center. However, in a cloud system that operates in cooperation with an information processing system of a client, when a failure occurs, it is particularly important to specify presence or absence of influence on the information processing system of the client.
Therefore, the cloud management apparatus 6 collects configuration information of the information processing system of the client outside the data center.
Alternatively, as illustrated in
In the case of a network illustrated in
When a border edge on the data center side is represented by B#1 and server groups on the client side are represented by C#1, C#2, and C#3, as configuration information, an agent program on the server in the client environment may export a physical path table illustrated in
The administrator of the client environment passes exported or created data to an administrator of the data center. The administrator of the data center may cause the cloud management apparatus 6 to import the data.
A functional configuration of the cloud management apparatus 6 is explained.
Compared with the storing unit 1a, the storing unit 6a includes a physical path table 68 instead of the physical path table 18 and includes an apparatus management table 70 anew. Compared with the control unit 1b, the control unit 6b includes a physical-path creating unit 67 instead of the physical-path creating unit 17, includes a specifying unit 69 instead of the specifying unit 19, and includes a configuration-information collecting unit 72 anew. Compared with the specifying unit 19, the specifying unit 69 includes an inter-group-communication specifying unit 71 instead of the inter-group-communication specifying unit 21.
In the physical path table 68, when L3 relay apparatuses are not included in a physical path, the physical path and two server groups that perform communication using the physical path are registered. When L3 relay apparatuses are included in the physical path, in the physical path table 68, a physical path between one server group and the L3 relay apparatus, a physical path between the other server group and the L3 relay apparatus, and a physical path between the L3 relay apparatuses are registered.
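One possible in-memory layout of such a physical path table is sketched below; the node names and the dictionary representation are illustrative assumptions, not the actual format of the physical path table 68.

```python
# Illustrative layout of the physical path table (names are assumptions).
# A path without an L3 relay apparatus is registered end to end between
# two server groups; a path through an L3 relay apparatus is registered
# as segments that end at the relay, so that turning back and crossing
# paths can later be composed from the segments.
physical_path_table = {
    # no L3 relay on the path: one entry per server-group pair
    ("G#3", "G#4"): ["S#3", "SW#3", "SW#4", "S#4"],
    # L3 relay R#1 on the path: one segment per (group, relay) pair
    ("G#1", "R#1"): ["S#1", "SW#1", "R#1"],   # path #1
    ("G#2", "R#1"): ["S#6", "SW#2", "R#1"],   # path #2
}
```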
As illustrated in
As a path between S#1 and S#6 across R#1, a path of G#1-R#1-G#2, that is, S#1-SW#1-R#1-SW#2-S#6 is calculated using information on the paths #1 and #2 of the physical path table 68. As a path between S#1 and S#2 not across R#1, a path of G#1-R#1-G#1, that is, S#1-SW#1-R#1-SW#1-S#2 is calculated using the information on the path #1 twice. Note that the path of S#1-SW#1-S#2 is calculated by the processing explained in the first embodiment.
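The composition of segment paths described above may be sketched as follows; the function name compose and the list representation of paths are illustrative assumptions.

```python
def compose(segment_a, segment_b):
    """Join two segments that both end at the same L3 relay apparatus.

    The relay appears once in the joined path; the second segment is
    walked back from the relay toward its server group.
    """
    assert segment_a[-1] == segment_b[-1]  # both segments end at the relay
    return segment_a + segment_b[-2::-1]

path1 = ["S#1", "SW#1", "R#1"]   # G#1 - R#1 (path #1)
path2 = ["S#6", "SW#2", "R#1"]   # G#2 - R#1 (path #2)

# Crossing path across R#1: S#1-SW#1-R#1-SW#2-S#6.
crossing = compose(path1, path2)
# Turning back path at R#1 uses the path #1 information twice,
# toward a different server S#2: S#1-SW#1-R#1-SW#1-S#2.
turn_back = compose(path1, ["S#2", "SW#1", "R#1"])
```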
In the apparatus management table 70, types and setting information of apparatuses are registered.
In the type in
The setting information is used when specifying an influence range. For example, in the case of the switch 42, information on which VLAN-ID is allocated to which link 43 is retained as the setting information. In the case of the router, what kind of routing table the router has is managed as the setting information. In the case of the firewall 62, what kind of filtering is performed is managed as the setting information. A path over which, according to these kinds of setting information, communication is not performed in the first place is not used for specifying the influence range.
It is also possible to specify an influence range on the client side more finely by defining, in the configuration information of the client environment, which service in the data center the servers on the client side use, and linking the definition with the setting information.
Note that, as a method of creating the apparatus management table 70, there is a method of creating the apparatus management table 70 using a simple network management protocol (SNMP). Apparatuses (in the case of the servers 41, OSs) compliant with the SNMP retain, as sysObjectIDs, values of management information bases (MIBs) that may uniquely specify vendors and types. Therefore, the cloud management apparatus 6 may retain, in advance, a table that associates sysObjectIDs and types and create the apparatus management table 70 by linking values of the sysObjectIDs collected from the apparatuses and the types.
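The association of collected sysObjectID values with apparatus types may be sketched as follows; the OID values below are fabricated placeholders, not real vendor sysObjectIDs, and the function and table names are assumptions.

```python
# Pre-retained table associating sysObjectIDs with apparatus types.
# The OIDs are illustrative placeholders, not real vendor identifiers.
KNOWN_TYPES = {
    "1.3.6.1.4.1.100.1": "switch",
    "1.3.6.1.4.1.100.2": "router",
    "1.3.6.1.4.1.100.3": "firewall",
    "1.3.6.1.4.1.100.4": "server",
}

def build_apparatus_table(collected):
    """collected: {apparatus_name: sysObjectID} gathered over SNMP.

    Links each collected sysObjectID value to a type; an unrecognized
    OID is marked 'unknown'.
    """
    return {name: KNOWN_TYPES.get(oid, "unknown")
            for name, oid in collected.items()}
```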
The configuration-information collecting unit 72 reads network configuration information from a target system 4 and reads network configuration information from a client environment 5. The configuration-information collecting unit 72 creates the connection link management table 12 including network configuration information of the client environment 5.
Like the physical-path creating unit 17, the physical-path creating unit 67 specifies, referring to the connection link management table 12 and the server group management table 16, a set of the links 43 connecting two edge switches as a physical path and creates the physical path table 68. However, when L3 relay apparatuses are included between the two edge switches, the physical-path creating unit 67 creates the physical path table 68 divided into a path between one edge switch and the L3 relay apparatus, a path between the other edge switch and the L3 relay apparatus, and a path between the L3 relay apparatuses.
When the cloud management apparatus 6 imports the physical path table illustrated in
Like the inter-group-communication specifying unit 21, the inter-group-communication specifying unit 71 specifies inter-server group communication affected by an occurred failure. However, for a physical path in which one end or both ends of a communication group including the link 43 in which the failure occurs are L3 relay apparatuses, the inter-group-communication specifying unit 71 creates an inter-server group physical path crossing across the L3 relay apparatuses or turning back at the L3 relay apparatuses. The inter-group-communication specifying unit 71 specifies, based on information on the created physical path, inter-server group communication affected by the occurred failure.
The inter-group-communication specifying unit 71 excludes a physical path found as not being used according to the setting information of the apparatus management table 70 and specifies inter-server group communication affected by the occurred failure. For example, when a physical path in which the server #1 and the server #2 communicate across the firewall 62 is included as a physical path determined as an influence range, the inter-group-communication specifying unit 71 confirms setting information for the firewall 62 from the apparatus management table 70. When a definition “all packets addressed to the server #2 are discarded” is included in the setting information, the physical path is not used. Therefore, the inter-group-communication specifying unit 71 excludes the physical path from the influence range.
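The exclusion based on firewall setting information may be sketched as follows; the setting format (a list of destinations whose packets are discarded) and the names path_is_used and discard_to are illustrative assumptions.

```python
# Sketch of excluding unused paths by the apparatus setting information.
# A path through a firewall that discards all packets addressed to the
# destination server is judged unused and removed from the influence range.
def path_is_used(path, dst_server, apparatus_table):
    for node in path:
        entry = apparatus_table.get(node)
        if entry and entry["type"] == "firewall":
            if dst_server in entry["setting"].get("discard_to", []):
                return False  # all packets to the destination are discarded
    return True
```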
A flow of processing of the cloud management apparatus 6 is explained with reference to
The cloud management apparatus 6 creates a server group and creates the server management table 15 and the server group management table 16 (step S64). The cloud management apparatus 6 specifies a physical path referring to the apparatus management table 70 in addition to the connection link management table 12 and the server group management table 16 and creates the physical path table 68 (step S65).
The physical-path creating unit 67 determines whether the selected adjacent node is an edge switch (step S75). When the selected adjacent node is not an edge switch, the physical-path creating unit 67 determines whether the adjacent node is an L3 relay apparatus (step S76). When the adjacent node is not an L3 relay apparatus, the physical-path creating unit 67 determines whether the adjacent node is the server 41 (step S77). As a result, when the adjacent node is not the server 41, the physical-path creating unit 67 determines whether the processing of retrieving all adjacent links is completed for the adjacent node (step S78). When adjacent links not retrieved are present, the physical-path creating unit 67 returns to step S74.
On the other hand, when the processing of retrieving all adjacent links is completed for the adjacent node or when the adjacent node is the server 41, the physical-path creating unit 67 returns to step S73. When determining in step S76 that the adjacent node is an L3 relay apparatus, the physical-path creating unit 67 creates a combination of a server group corresponding to the edge switch and the L3 relay apparatus and registers the combination in the physical path table 68 together with the physical path (step S80). The physical-path creating unit 67 returns to step S73.
When determining in step S75 that the adjacent node is an edge switch, the physical-path creating unit 67 creates a combination of server groups corresponding to edge switches at both ends of the retrieved physical path and registers the combination in the physical path table 68 together with the physical path (step S79). The physical-path creating unit 67 returns to step S73.
When determining in step S73 that the processing of retrieving all adjacent links is completed, the physical-path creating unit 67 returns to step S71. When determining in step S71 that the processing of specifying a physical path is completed for all the edge switches, the physical-path creating unit 67 deletes overlapping paths from the physical path table 68 (step S81).
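The retrieval in steps S71 to S81 may be sketched as a depth-first walk from each edge switch: a walk that reaches another edge switch yields a group-to-group path (step S79), a walk that reaches an L3 relay apparatus yields a group-to-relay segment (step S80), and a walk that reaches a server 41 is abandoned (step S77). The adjacency representation and the names are illustrative assumptions.

```python
# Sketch of the physical path retrieval rooted at an edge switch.
# adjacency: {node: set of neighbouring nodes}
# kind: {node: 'edge' | 'l3' | 'server' | 'switch'}
def find_paths(start, adjacency, kind):
    results = []

    def walk(node, path):
        for nxt in sorted(adjacency[node]):
            if nxt in path:
                continue                      # never revisit a node on the walk
            if kind[nxt] in ("edge", "l3"):
                results.append(path + [nxt])  # register the path (S79) or segment (S80)
            elif kind[nxt] != "server":       # a server terminates the walk (S77)
                walk(nxt, path + [nxt])

    walk(start, [start])
    return results

# Small sample topology: two edge switches and an L3 relay hang off an
# aggregation switch SW#0.
adjacency = {
    "SW#1": {"SW#0"},
    "SW#0": {"SW#1", "SW#2", "R#1"},
    "SW#2": {"SW#0"},
    "R#1": {"SW#0"},
}
kind = {"SW#1": "edge", "SW#0": "switch", "SW#2": "edge", "R#1": "l3"}
paths = find_paths("SW#1", adjacency, kind)
```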
As illustrated in
The physical-path creating unit 67 determines whether the selected adjacent node is an edge switch (step S86). When the selected adjacent node is not an edge switch, the physical-path creating unit 67 determines whether the adjacent node is an L3 relay apparatus (step S87). When the adjacent node is not an L3 relay apparatus, the physical-path creating unit 67 determines whether the adjacent node is the server 41 (step S88). As a result, when the adjacent node is not the server 41, the physical-path creating unit 67 determines whether the processing of retrieving all adjacent links is completed for the adjacent node (step S89). When adjacent links not retrieved are present, the physical-path creating unit 67 returns to step S85.
On the other hand, when the processing of retrieving all adjacent links is completed for the adjacent node or when the adjacent node is the server 41, the physical-path creating unit 67 returns to step S84. When determining in step S87 that the adjacent node is an L3 relay apparatus, the physical-path creating unit 67 creates a combination of relay apparatuses at both ends and registers the combination in the physical path table 68 together with the physical path (step S91). The physical-path creating unit 67 returns to step S84.
When determining in step S86 that the adjacent node is an edge switch, the physical-path creating unit 67 creates a combination of a server group corresponding to the edge switch and the relay apparatus and registers the combination in the physical path table 68 together with the physical path (step S90). The physical-path creating unit 67 returns to step S84.
When determining in step S84 that the processing of retrieving all adjacent links is completed, the physical-path creating unit 67 returns to step S82. When determining in step S82 that the processing of specifying a physical path is completed for all the L3 relay apparatuses, the physical-path creating unit 67 deletes overlapping paths from the physical path table 68 (step S92) and ends the processing of creating the physical path table 68.
On the other hand, when physical paths not confirmed are present, the specifying unit 69 determines whether one of the specified physical paths is active (step S104). When the physical path is not active, the specifying unit 69 returns to step S103. On the other hand, when the physical path is active, the specifying unit 69 determines whether a standby path is present (step S105). When a standby path is present, the specifying unit 69 returns to step S103.
On the other hand, when a standby path is absent, the specifying unit 69 determines whether one end or both ends of the physical path are L3 relay apparatuses (step S106). When the one end or both the ends are L3 relay apparatuses, the specifying unit 69 creates, for the physical path, the one end or both the ends of which are the L3 relay apparatuses, a physical path between server groups crossing across the L3 relay apparatuses or turning back at the L3 relay apparatuses (step S107). However, the specifying unit 69 excludes the physical path found as not being used according to the setting information of the apparatus management table 70.
The specifying unit 69 specifies inter-server group communication corresponding to the physical path (step S108) and determines, based on the specified inter-server group communication, a combination of the servers 41 that perform communication (step S109). The specifying unit 69 specifies the VMs 44 on the specified servers (step S110) and specifies a combination of the specified VMs 44 as affected inter-VM communication (step S111). The specifying unit 69 returns to step S103.
When determining in step S101 that the failure part is a connection link of the server 41, the specifying unit 69 shifts to step S40 in
In this way, the physical-path creating unit 67 creates, referring to the apparatus management table 70, the physical path table 68 including a communication group, one end or both ends of which are L3 relay apparatuses. In the physical path table 68, when one end or both ends of a communication group corresponding to the physical path including the link 43 in which a failure occurs are L3 relay apparatuses, the specifying unit 69 specifies inter-server group communication turning back at the L3 relay apparatuses or crossing across the L3 relay apparatuses. Therefore, the cloud management apparatus 6 may accurately specify an influence range when a failure occurs in the information processing system 10b including the L3 relay apparatuses.
The cloud management apparatus 6 may specify presence or absence of influence on the client environment 5 during failure occurrence by reading network information of the client environment 5 and creating the physical path table 68. The cloud management apparatus 6 may specify an influence range excluding a physical path not in use by specifying an influence range referring to the setting information of the apparatus management table 70.
An example of specifying an influence range is explained with reference to
As illustrated in
G#21 is connected to S#21, G#22 is connected to S#22, G#23 is connected to S#23, G#24 is connected to S#24, and G#25 is connected to S#25. S#21 is connected to S#20 by L#21, S#22 is connected to S#20 by L#22, S#23 is connected to S#20 by L#23, S#24 is connected to S#20 by L#24, and S#25 is connected to S#20 by L#25. SW#20 is connected to R#20 by L#20. R#20 is connected to R#100 by L#120.
When a failure is detected in L#10 in
Specifically, for the path #1, physical paths including R#10 are the paths #6, #10, #13, #15, and #16 excluding the path #1. Therefore, as inter-server group communications turning back at R#10, G#11-G#12 (the paths #1 and #6), G#11-G#13 (the paths #1 and #10), G#11-G#14 (the paths #1 and #13), and G#11-G#15 (the paths #1 and #15) are specified.
G#11-R#100 (the paths #1 and #16) is specified as a communication group crossing across R#10. Since R#100 is an L3 relay apparatus, G#11-R#20 is specified using a path #17, which is a physical path including R#100 and excluding the path #16. Since R#20 is an L3 relay apparatus, paths #18, #23, #27, #30, and #32 are specified as physical paths including R#20 and excluding the path #17.
G#11-G#21 (the paths #1, #16, #17, and #18) is specified using the path #18. G#11-G#22 (the paths #1, #16, #17, and #23) is specified using the path #23. G#11-G#23 (the paths #1, #16, #17, and #27) is specified using the path #27. G#11-G#24 (the paths #1, #16, #17, and #30) is specified using the path #30. G#11-G#25 (the paths #1, #16, #17, and #32) is specified using the path #32.
Similarly, for the path #6, as inter-server group communications turning back at R#10, G#12-G#11, G#12-G#13, G#12-G#14, and G#12-G#15 are specified. As inter-server group communications crossing across R#10, R#100, and R#20, G#12-G#21, G#12-G#22, G#12-G#23, G#12-G#24, and G#12-G#25 are specified.
Similarly, for the path #10, as inter-server group communications turning back at R#10, G#13-G#11, G#13-G#12, G#13-G#14, and G#13-G#15 are specified. As inter-server group communications crossing across R#10, R#100, and R#20, G#13-G#21, G#13-G#22, G#13-G#23, G#13-G#24, and G#13-G#25 are specified.
Similarly, for the path #13, as inter-server group communications turning back at R#10, G#14-G#11, G#14-G#12, G#14-G#13, and G#14-G#15 are specified. As inter-server group communications crossing across R#10, R#100, and R#20, G#14-G#21, G#14-G#22, G#14-G#23, G#14-G#24, and G#14-G#25 are specified.
Similarly, for the path #15, as inter-server group communications turning back at R#10, G#15-G#11, G#15-G#12, G#15-G#13, and G#15-G#14 are specified. As inter-server group communications crossing across R#10, R#100, and R#20, G#15-G#21, G#15-G#22, G#15-G#23, G#15-G#24, and G#15-G#25 are specified.
The specifying unit 69 removes overlaps from the specified inter-server group communications and specifies inter-server group communications illustrated in
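The expansion illustrated by the paths #1 to #32 above may be sketched as a recursive retrieval over segment endpoints; the segment representation and the function name expand are illustrative assumptions.

```python
# Sketch of expanding turning back and crossing communications: from a
# segment whose far end is an L3 relay apparatus, every other segment
# touching that relay yields either a server-group pair or, when its far
# end is again a relay, a further hop to expand.
def expand(group, relay, segments, visited=None):
    """segments: iterable of (a, b) endpoint pairs; an endpoint is a
    server group 'G#..' or an L3 relay apparatus 'R#..'."""
    visited = {relay} if visited is None else visited | {relay}
    pairs = set()
    for a, b in segments:
        if relay not in (a, b):
            continue
        other = b if a == relay else a
        if other == group or other in visited:
            continue
        if other.startswith("G#"):
            pairs.add(tuple(sorted((group, other))))   # turn-back / crossing pair
        else:
            pairs |= expand(group, other, segments, visited)  # next relay hop
    return pairs

# Segment endpoints matching the example: five groups on R#10, five
# groups on R#20, with R#100 between the two routers.
segments = [
    ("G#11", "R#10"), ("G#12", "R#10"), ("G#13", "R#10"),
    ("G#14", "R#10"), ("G#15", "R#10"),
    ("R#10", "R#100"), ("R#100", "R#20"),
    ("G#21", "R#20"), ("G#22", "R#20"), ("G#23", "R#20"),
    ("G#24", "R#20"), ("G#25", "R#20"),
]
pairs = expand("G#11", "R#10", segments)
```

Under these segments, G#11 is paired with G#12 to G#15 (turning back at R#10) and with G#21 to G#25 (crossing across R#10, R#100, and R#20), matching the nine communications specified for the path #1 above.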
Note that the specifying unit 69 confirms the setting information of the apparatus management table 70 at the timing when inter-server group communication turning back at an L3 relay apparatus or inter-server group communication crossing across the L3 relay apparatus is specified. When communication is not performed, the specifying unit 69 excludes the inter-server group communication.
For example, at the timing when the inter-server group communication G#11-G#12 of the path #1 is specified, the specifying unit 69 understands that the inter-server group communication passes through R#10, S#10, S#11, and S#12. Therefore, the specifying unit 69 checks setting information of R#10, S#10, S#11, and S#12 from the apparatus management table 70.
Specifically, the specifying unit 69 analyzes setting information of ports of the apparatuses and routing information of R#10. When determining that G#11 and G#12 belong to the same network (on the same VLAN) and do not perform communication through R#10, the specifying unit 69 excludes G#11-G#12 from an influence range. Conversely, when determining that G#11 and G#12 belong to different networks (on different VLANs) and communication is turned back at R#10, the specifying unit 69 does not exclude G#11-G#12.
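The decision described above may be sketched as follows; the vlan_of table is an illustrative stand-in for the analyzed port setting information and routing information.

```python
# Sketch of the same-network check: server groups on the same VLAN reach
# each other at layer 2 and do not use the turning back path through the
# router, so their pair is excluded from the influence range.
def keep_turn_back_pair(g1, g2, vlan_of):
    return vlan_of[g1] != vlan_of[g2]
```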
As explained above, in the second embodiment, when L3 relay apparatuses are included in the target system 4, the physical-path creating unit 67 creates the physical path table 68 including a communication group, one end or both ends of which are the L3 relay apparatuses. The inter-group-communication specifying unit 71 specifies, for a physical path in which one end or both ends of a communication group including the link 43 in which a failure occurs are L3 relay apparatuses, inter-server group communications crossing across the L3 relay apparatuses or turning back at the L3 relay apparatuses. Therefore, when a failure occurs in the target system 4 including the L3 relay apparatuses, the cloud management apparatus 6 may accurately specify customers affected by the failure.
In the second embodiment, the configuration-information collecting unit 72 collects network configuration information of the client environment 5. The physical-path creating unit 67 creates the physical path table 68 including the client environment 5. The inter-group-communication specifying unit 71 specifies, using the physical path table 68, inter-server group communication affected by a failure including the client environment 5. Therefore, when a failure occurs, the cloud management apparatus 6 may specify presence or absence of influence on the client environment 5.
In the second embodiment, when specifying inter-server group communication affected by a failure, the inter-group-communication specifying unit 71 excludes, using the setting information of the apparatus management table 70, inter-server group communication in which communication is not performed. Therefore, the cloud management apparatus 6 may accurately specify customers affected by the failure.
Note that, in the above explanation in the second embodiment, server groups are created and inter-server group communication affected by a failure is specified. However, the present disclosure is not limited to this and may be applied when inter-server communication affected by the failure is specified. For example, the inter-server group communication may be changed to the inter-server communication by providing a server group for each server. Alternatively, the inter-server communication may be specified without performing the creation of a server group by the server-group creating unit 14.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2017-105020 | May 2017 | JP | national |