This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2017-101098, filed on May 22, 2017, the entire contents of which are incorporated herein by reference.
The embodiment discussed herein is related to an information processing device and a method for controlling an information processing device.
For a service to be provided using an information processing system including an information processing device such as a server, countermeasures such as disaster recovery (DR) are taken in some cases. In DR, a site that is similar to a certain site in which a service is actually operated is prepared as a backup site at a location geographically separated from the certain site for a disaster such as an earthquake. If a failure has occurred in the certain site used for the operation of the service upon the occurrence of a disaster, the site prepared as the backup site is used. By using this site, the service is quickly restored upon the occurrence of the disaster.
In addition, a technique for preparing multiple backup sites and using the multiple backup sites to restore, in a distributed manner, multiple information processing systems operating in a site in which a failure has occurred has been proposed. In addition, a technique for setting restoration priorities for information processing systems and restoring the information processing systems in order from an information processing system with the highest priority has been proposed.
Related techniques are disclosed in, for example, Japanese National Publication of International Patent Application No. 2010-530108, Japanese National Publication of International Patent Application No. 2015-510201, and Japanese Laid-open Patent Publication No. 2010-102468.
A function specific to a site, however, is used to provide a service in some cases. An example of the specific function is a function to be provided by a so-called public cloud. If public clouds are used, a region in which a certain public cloud is provided may be limited or a specification indicating that a dedicated line connected to a public cloud is used may be already determined. Thus, in the aforementioned techniques, an information processing system may be restored in a site located outside a region in which a public cloud is provided or may be restored in a site that does not satisfy the aforementioned specification or in which the dedicated line is not usable. As a result, in the site in which the information processing system is restored, the information processing system may not be able to use the service provided by the public cloud. Such problems are not limited to the service provided by the public cloud. Specifically, in the case where a function to be used in a site in which an information processing system is restored is different from a function to be provided in a backup site, the same problems may occur.
According to an aspect of the present invention, provided is an information processing device including a memory and a processor coupled to the memory. The processor is configured to acquire first information indicating functions used by respective information processing systems. The processor is configured to acquire second information indicating functions that are usable by the information processing systems in respective sites. The processor is configured to determine, when a failure has occurred in a first information processing system of the information processing systems installed in a first site of the sites, a second site in which a first function used by the first information processing system is usable based on the acquired first information and the acquired second information. The second site is one of the sites and different from the first site. The second site serves as a first restoration destination in which the first information processing system is restored.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
Hereinafter, an embodiment related to techniques disclosed herein is described with reference to the accompanying drawings. The following detailed description is an example and does not limit configurations described in the embodiment.
As illustrated in
The site A1 includes an information processing system A4, an information processing system B5, and a managing server 10. The site B2 includes an information processing system C6 and a managing server 11. The site C3 includes an information processing system D7, an information processing system E8, and a managing server 12. In the site B2, a single available resource 9 in which an information processing system is able to be installed remains. The managing server 10 is an example of an information processing device that determines a site that serves as a restoration destination of an information processing system and is among multiple sites in which information processing systems operate.
The site A1, the site B2, and the site C3 are connected to a management network 15. Thus, the managing servers 10 to 12 communicate with each other via the management network 15. The sites A1 and B2 are connected to a public cloud 13 providing a service A. The sites B2 and C3 are connected to a public cloud 14 providing a service B. Thus, the information processing systems A4 and B5 included in the site A1 may use the service A provided by the public cloud 13. The information processing system C6 included in the site B2 may use the service A provided by the public cloud 13 and the service B provided by the public cloud 14. The information processing systems D7 and E8 included in the site C3 may use the service B provided by the public cloud 14. The embodiment assumes that the service B provided by the public cloud 14 is not usable in the site A1 and that the service A provided by the public cloud 13 is not usable in the site C3.
A user of the managing server 10 uses the input device 30 to provide various instructions to the managing server 10 and uses the monitor 20 to confirm processing results of the managing server 10. In the embodiment, the CPU 101 loads various programs stored in the HDD 103 into the RAM 102 and executes various processes described later.
In the embodiment, in the HDD 103 of the managing server 10, information on the types of the services usable in the sites and the numbers (capacities) of information processing systems able to be installed in the sites is stored in advance. The capacities are an example of resource amounts for the information processing systems able to be installed in the sites. In the HDD 103, information on the services to be used by the information processing systems and restoration priorities to be used to restore the information processing systems is stored in advance. In the following description, data obtained by organizing, as a table, information on the types of the services usable by the information processing systems in the sites and the capacities is referred to as a site management table. In the following description, data obtained by organizing, as a table, information on the services used by the information processing systems and the restoration priorities to be used to restore the information processing systems is referred to as an information processing system management table.
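As an illustration only, the two tables described above could be represented as follows. This is a minimal sketch, not the stored format of the embodiment: the class names `SiteEntry` and `SystemEntry`, the dictionary layout, the numeric priority encoding (1 is the highest), and the pre-disaster capacity of the site A1 are assumptions. The services and the capacities of the sites B2 and C3 follow the example described in the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteEntry:
    """One row of a site management table (hypothetical layout)."""
    services: frozenset  # types of services usable in the site
    capacity: int        # number of systems able to be installed in the site


@dataclass(frozen=True)
class SystemEntry:
    """One row of an information processing system management table."""
    services: frozenset  # services used by the system
    priority: int        # restoration priority; 1 = highest (assumed encoding)


# Site management table. The capacity of A1 (2) is an assumption.
site_table = {
    "A1": SiteEntry(frozenset({"A"}), 2),
    "B2": SiteEntry(frozenset({"A", "B"}), 2),
    "C3": SiteEntry(frozenset({"B"}), 2),
}

# Information processing system management table. Priorities are assumed
# to follow the order in which the example processes the systems.
system_table = {
    "A4": SystemEntry(frozenset({"A"}), 1),
    "B5": SystemEntry(frozenset({"A"}), 2),
    "C6": SystemEntry(frozenset({"B"}), 3),
    "D7": SystemEntry(frozenset({"B"}), 4),
    "E8": SystemEntry(frozenset({"B"}), 5),
}
```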
As illustrated in
In the embodiment, as an example, the managing servers 10 to 12 of the sites A1, B2, and C3 (illustrated in
A process to be executed by the managing server 10 according to the embodiment is described with reference to flowcharts illustrated in
In OP101, the CPU 101 determines whether or not a failure has occurred in one or more of the information processing systems A4 to E8 installed in the sites A1, B2, and C3. Specifically, each of the managing servers 10 to 12 monitors the failure occurrence states of the information processing systems installed in its own site (the site A1 for the managing server 10, the site B2 for the managing server 11, and the site C3 for the managing server 12). Then, if any of the managing servers 10 to 12 detects that a failure has occurred in one or more of the information processing systems installed in its own site, the managing server notifies the other managing servers of the occurrence of the failure. Thus, the managing servers 10 to 12 share information on the failure occurrence states of the information processing systems A4 to E8. As a result, in OP101, the CPU 101 may determine whether or not a failure has occurred in the information processing systems A4 to E8.
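The sharing of failure occurrence states may be pictured with the following sketch. It is an in-memory stand-in under stated assumptions: the class `ManagingServer` and its methods are hypothetical, and direct method calls take the place of communication over the management network 15.

```python
class ManagingServer:
    """Hypothetical in-memory stand-in for one managing server."""

    def __init__(self, site):
        self.site = site
        self.peers = []              # the other managing servers
        self.failed_systems = set()  # shared view of failed systems

    def detect_failure(self, system):
        """Record a failure found in this server's own site and notify peers."""
        self.failed_systems.add(system)
        for peer in self.peers:
            peer.receive_notification(system)

    def receive_notification(self, system):
        """Record a failure reported by another managing server."""
        self.failed_systems.add(system)

    def failure_occurred(self):
        """Check corresponding to OP101: has any failure been reported?"""
        return bool(self.failed_systems)


servers = {site: ManagingServer(site) for site in ("A1", "B2", "C3")}
for server in servers.values():
    server.peers = [s for s in servers.values() if s is not server]

# A disaster in the site A1 causes failures in the systems A4 and B5.
servers["A1"].detect_failure("A4")
servers["A1"].detect_failure("B5")
```

After the two notifications, every server in this sketch holds the same set of failed systems, so any of them can answer the OP101 check.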
In the embodiment, if a failure has occurred in any of the information processing systems A4 to E8, information stored in the site management table 50 and related to the capacities is updated and the updated site management table 50 is shared by the managing servers 10 to 12. As an example, it is assumed that a disaster has occurred in the site A1 and a failure has occurred in each of the information processing systems A4 and B5. In this case, the managing server 10 updates the values indicated in the “capacity” item 53 of the site management table 50 stored in the HDD 103 from a state illustrated in
If the CPU 101 determines that the failure has occurred in one or more of the information processing systems A4 to E8 (Yes in OP101), the CPU 101 causes the process to proceed to OP102. On the other hand, if the CPU 101 determines that the failure has not occurred in the information processing systems A4 to E8 (No in OP101), the CPU 101 repeatedly executes the process of OP101.
In OP102, the CPU 101 functions as an acquirer and acquires the site management table 50 and the information processing system management table 60 from the HDD 103. Then, in OP103, the CPU 101 uses the acquired site management table 50 and the acquired information processing system management table 60 to determine restoration destination candidates of the information processing systems A4 to E8.
For example, if a disaster has occurred in the site A1 and a failure has occurred in each of the information processing systems A4 and B5, the CPU 101 acquires the site management table 50 illustrated in
Examples of the details of the restoration process are restoration, movement, maintenance, and stop. “Restoration” indicates that an information processing system in which a failure has occurred is restored. “Movement” indicates that an information processing system in which a failure has not occurred is moved to another site. “Maintenance” indicates that an information processing system in which a failure has not occurred remains installed in the site in which the information processing system is currently installed. “Stop” indicates that an information processing system is not installed since an available resource does not exist in any site. If a restoration destination of an information processing system does not exist, a stop process is executed based on the services usable in the sites, the capacities of the sites, the services used by the information processing systems, the restoration priorities of the information processing systems, and the like. In addition, the stop process may be a so-called degradation process, a process of stopping a part of a resource of an information processing system, or a process of reducing the performance of an information processing system.
As illustrated in
According to the site management table 50 and the information processing system management table 60, the information processing system A4 is able to be restored in the site B2 or the site C3. The information processing system A4 uses the service A. Thus, if the information processing system A4 is restored in the site B2, the information processing system A4 is able to use the service A. If the information processing system A4 is restored in the site C3, the information processing system A4 is not able to use the service A. Thus, the CPU 101 determines that a restoration destination candidate of the information processing system A4 is the site B2. Similarly, the CPU 101 determines that a restoration destination candidate of the information processing system B5 is the site B2, and the CPU 101 determines that restoration destination candidates of the information processing systems C6 to E8 are the site B2 or the site C3. Then, the CPU 101 causes, based on the results of the determination, the restoration destination candidates of the information processing systems to be stored in the “restoration destination candidates” item 83 of the restoration process table 80, as illustrated in
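The candidate determination of OP103 can be sketched as a filter over the sites. The sketch below assumes that a site qualifies when its capacity is greater than zero and when it offers every service the system uses; the function name `restoration_candidates` and the dictionary layout are hypothetical. The capacity of the site A1 is set to 0 to reflect the disaster assumed in the example.

```python
def restoration_candidates(used_services, sites):
    """Return the sites in which all of the used services are usable.

    sites maps a site name to (usable services, capacity). Sites whose
    capacity is 0 are skipped because no system can be installed there.
    """
    return sorted(
        name
        for name, (services, capacity) in sites.items()
        if capacity > 0 and used_services <= services
    )


# State after the disaster in the site A1 (capacity of A1 reduced to 0).
sites = {
    "A1": ({"A"}, 0),
    "B2": ({"A", "B"}, 2),
    "C3": ({"B"}, 2),
}
# Services used by each system, as in the example of the embodiment.
systems = {"A4": {"A"}, "B5": {"A"}, "C6": {"B"}, "D7": {"B"}, "E8": {"B"}}

candidates = {
    name: restoration_candidates(used, sites) for name, used in systems.items()
}
```

Under these assumptions the systems A4 and B5 have only the site B2 as a candidate, and the systems C6 to E8 have the sites B2 and C3.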
In OP104, the CPU 101 determines restoration destinations of the information processing systems in order from an information processing system with the highest restoration priority. Specifically, the CPU 101 determines the restoration destinations of the information processing systems based on restoration destination candidates of the information processing systems within the restoration process table 80 created in OP103, the restoration priorities within the information processing system management table 60, and the capacities of the sites within the site management table 50.
A specific example of processes of OP201 and OP202 is described below. If the information processing system management table 60 illustrated in
In the aforementioned manner, the restoration destinations of the information processing systems are determined based on the restoration priorities of the information processing systems in the embodiment. Thus, even if the amount of resources in which information processing systems are able to be installed in each of the sites is reduced upon the occurrence of a disaster, resources may be assigned to the information processing systems in order from the information processing system with the highest restoration priority.
In the embodiment, the restoration destinations of the information processing systems are determined so that resource amounts assigned to the information processing systems installed in the sites do not exceed the capacities. Thus, when the restoration destinations of the information processing systems are determined so that the information processing systems are able to use the services in the restoration destinations, it may be possible to suppress insufficiency of the capacities of the sites and a state in which an information processing system is not able to be installed. The CPU 101 sets, to ON, a flag indicating whether or not the process of determining a restoration destination has been executed on the information processing system A4, and the CPU 101 causes the process to return to OP201.
The process of determining a restoration destination has yet to be executed on the information processing systems B5 to E8. Thus, the CPU 101 causes the process to proceed from OP201 to OP202. Then, in OP202, the CPU 101 determines that the restoration destination of the information processing system B5 is the site B2 in the same manner as the information processing system A4. In addition, the CPU 101 sets, to ON, a flag indicating whether or not the process of determining a restoration destination has been executed on the information processing system B5, and the CPU 101 causes the process to return to OP201.
The process of determining a restoration destination has yet to be executed on the information processing systems C6 to E8. Thus, the CPU 101 causes the process to proceed from OP201 to OP202. Then, in OP202, the CPU 101 determines the restoration destination of the information processing system C6. As indicated in the “restoration destination candidates” item 83 of the restoration process table 80, the restoration destination candidates of the information processing system C6 are the sites B2 and C3. Specifically, the information processing system C6 may be restored in the site B2 or the site C3. The capacity of the site B2 is 2, and the restoration destinations of the information processing systems A4 and B5 have already been determined to be the site B2. Thus, an available resource in which the information processing system C6 is able to be restored does not exist in the site B2. Thus, the CPU 101 determines that the restoration destination of the information processing system C6 is the site C3. The CPU 101 sets, to ON, a flag indicating whether or not the process of determining a restoration destination has been executed on the information processing system C6, and the CPU 101 causes the process to return to OP201.
The process of determining a restoration destination has yet to be executed on the information processing systems D7 and E8. Thus, the CPU 101 causes the process to proceed from OP201 to OP202. Then, in OP202, the CPU 101 determines that the restoration destination of the information processing system D7 is the site C3 in the same manner as the information processing system C6. In addition, the CPU 101 sets, to ON, a flag indicating whether or not the process of determining a restoration destination has been executed on the information processing system D7, and the CPU 101 causes the process to return to OP201.
The process of determining a restoration destination has yet to be executed on the information processing system E8. Thus, the CPU 101 causes the process to proceed from OP201 to OP202. Then, in OP202, the CPU 101 determines the restoration destination of the information processing system E8. In this case, the restoration destination candidates of the information processing system E8 are the sites B2 and C3. An available resource in which the information processing system E8 is able to be restored does not exist in the site B2, like the case of the information processing systems C6 and D7. In addition, the capacity of the site C3 is 2, and the restoration destinations of the information processing systems C6 and D7 have already been determined to be the site C3. Thus, an available resource in which the information processing system E8 is able to be restored does not exist in the site C3. Thus, since the information processing system E8 is not able to be restored in any of the sites A1, B2, and C3, the CPU 101 determines that the restoration destination of the information processing system E8 does not exist.
Then, the CPU 101 sets, to ON, a flag indicating whether or not the process of determining a restoration destination has been executed on the information processing system E8, and the CPU 101 causes the process to return to OP201. In this case, an information processing system that has yet to be subjected to the process of determining a restoration destination does not exist. Thus, the CPU 101 terminates the subroutine process and causes the process to proceed to OP105.
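The loop over OP201 and OP202 walked through above can be sketched as a greedy assignment. This is a sketch under assumptions: the function name `assign_destinations` is hypothetical, priorities are encoded as integers with 1 as the highest, and candidates are tried in the listed order (the embodiment does not state a preference among candidates with available resources). A destination of `None` stands for "a restoration destination does not exist".

```python
def assign_destinations(priority, candidates, capacity):
    """Assign each system a site, highest restoration priority first.

    priority:   system name -> restoration priority (1 = highest, assumed)
    candidates: system name -> list of candidate sites
    capacity:   site name -> number of systems the site can host
    """
    remaining = dict(capacity)  # copy so the caller's table is untouched
    destinations = {}
    for name in sorted(priority, key=priority.get):
        destinations[name] = None  # None = no restoration destination
        for site in candidates[name]:
            if remaining[site] > 0:
                remaining[site] -= 1
                destinations[name] = site
                break
    return destinations


priority = {"A4": 1, "B5": 2, "C6": 3, "D7": 4, "E8": 5}
candidates = {
    "A4": ["B2"],
    "B5": ["B2"],
    "C6": ["B2", "C3"],
    "D7": ["B2", "C3"],
    "E8": ["B2", "C3"],
}
capacity = {"A1": 0, "B2": 2, "C3": 2}

destinations = assign_destinations(priority, candidates, capacity)
```

Because the remaining capacity is decremented on every assignment, the number of systems assigned to a site never exceeds its capacity, which matches the behavior described for the example.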
By executing the process of OP202, the information processing systems may use the services in the restoration destinations. Specifically, the restoration destinations of the information processing systems may be determined so that each of the information processing systems is not restored in a site in which the information processing system is not able to use a service.
In OP105, the CPU 101 determines details of the restoration process to be executed on the information processing systems A4 to E8 of which the restoration destinations have been determined in OP104 and causes the determined details of the restoration process to be stored in the “restoration process” item 84 of the restoration process table 80. If the restoration process table 80 illustrated in
In the embodiment, if an information processing system with a lower restoration priority (the information processing system C6 in the aforementioned example) remains installed in the site B2, information processing systems with higher restoration priorities (the information processing systems A4 and B5 in the aforementioned example) may not be able to be restored in the site B2 because the capacity of the site B2 is insufficient. Thus, the restoration destinations are determined so that the information processing system with the lower restoration priority is moved from the site in which it is currently installed (the site B2 in the aforementioned example) to a site in which a function used by the information processing system with the lower restoration priority is usable (the site C3 in the aforementioned example). As a result, the restoration destinations of the information processing systems are determined so that both the information processing systems with the higher restoration priorities and the information processing system with the lower restoration priority are able to use their services in their restoration destinations.
In addition, a failure has not occurred in the information processing system D7, and the information processing system D7 is continuously installed in the site C3 in which the information processing system D7 is currently installed. Thus, the CPU 101 determines that a detail of the restoration process to be executed on the information processing system D7 is “maintenance”. In addition, the restoration destination of the information processing system E8 does not exist. Thus, the CPU 101 determines that a detail of the restoration process to be executed on the information processing system E8 is “stop”.
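The classification performed in OP105 can be summarized by the following sketch. The function name `restoration_detail` and its parameters are hypothetical, and checking for a missing destination before the other conditions is an assumption; the four labels follow the definitions of "restoration", "movement", "maintenance", and "stop" given in the embodiment.

```python
def restoration_detail(failed, current_site, destination):
    """Classify the restoration process for one system.

    failed:       True if a failure has occurred in the system
    current_site: site in which the system is currently installed
    destination:  determined restoration destination, or None if none exists
    """
    if destination is None:
        return "stop"          # no available resource in any site
    if failed:
        return "restoration"   # failed system is restored at the destination
    if destination != current_site:
        return "movement"      # healthy system is moved to another site
    return "maintenance"       # healthy system stays where it is


# (failed, current site, destination) for the example of the embodiment.
example = {
    "A4": (True, "A1", "B2"),
    "B5": (True, "A1", "B2"),
    "C6": (False, "B2", "C3"),
    "D7": (False, "C3", "C3"),
    "E8": (False, "C3", None),
}
details = {name: restoration_detail(*args) for name, args in example.items()}
```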
When the CPU 101 determines the details of the restoration process to be executed on the information processing systems and causes the determined details of the restoration process to be stored in the “restoration process” item 84 of the restoration process table 80, the CPU 101 transmits the restoration process table 80 to the other managing servers 11 and 12 via the management network 15. By executing this, the restoration process table 80 is shared by the managing servers 10 to 12. Then, the CPU 101 causes the process to proceed to OP106. In the following OP106 and OP107, the CPUs of the managing servers 10 to 12 mainly execute processes. The information processing systems installed in the sites are restored in the determined sites by the communication between the managing servers 10 to 12 via the management network 15.
In OP106, the CPUs of the managing servers 10 to 12 execute the restoration process on the information processing systems A4 to E8 based on the restoration process table 80 and the information processing system management table 60. Specifically, first, the CPUs of the managing servers 10 to 12 execute the stop process on an information processing system for which a detail of the restoration process has been set to “stop” in the restoration process table 80. Next, the CPUs of the managing servers 10 to 12 execute a process of moving an information processing system for which a detail of the restoration process has been set to “movement” in the restoration process table 80. Next, the CPUs of the managing servers 10 to 12 execute the process of restoring information processing systems for which details of the restoration process have been set to “restoration” in the restoration process table 80.
As an example, the case where the restoration process table 80 illustrated in
The information processing system for which the detail of the restoration process has been set to “movement” is the information processing system C6, and a destination to which the information processing system C6 is moved is the site C3, as indicated in the “restoration destination candidates” item 83 of the restoration process table 80. Thus, the CPU of the managing server 11 of the site B2 in which the information processing system C6 is installed makes available a resource in which the information processing system C6 has been previously installed. In the embodiment, the managing servers 10 to 12 share information of the information processing systems. Thus, the CPU of the managing server 12 of the site C3 installs the information processing system C6 in the resource made available by executing the stop process on the information processing system E8. As a result, the information processing system C6 previously installed in the site B2 is moved to the site C3.
The information processing systems for which the details of the restoration process have been set to “restoration” are the information processing systems A4 and B5. In addition, the restoration destinations of the information processing systems A4 and B5 are the site B2. The CPUs of the managing servers 10 to 12 execute the process of restoring the information processing systems in order from the information processing system with the higher restoration priority stored in the information processing system management table 60. Thus, first, the CPU of the managing server 11 of the site B2 installs the information processing system A4 in an available resource. Next, the CPU of the managing server 11 of the site B2 installs the information processing system B5 in an available resource. As a result, the information processing systems A4 and B5 that are using the service A are restored in the site B2 in which the service A is usable, and are not restored in the site C3 in which the service A is not usable.
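The execution order of OP106 (stop first, then movement, then restoration in order of restoration priority) can be sketched as a sort with a two-part key. The names `PHASE_ORDER` and `execution_order` are hypothetical, and the integer priority encoding (1 is the highest) is an assumption. Systems marked "maintenance" are left out because no action is executed on them.

```python
# Phases in the order in which they are executed in OP106.
PHASE_ORDER = {"stop": 0, "movement": 1, "restoration": 2}


def execution_order(details, priority):
    """Return system names in the order in which they are processed.

    Stops come first so that resources are made available, movements
    second, and restorations last; ties are broken by restoration
    priority (1 = highest, assumed encoding).
    """
    actionable = [name for name, d in details.items() if d in PHASE_ORDER]
    return sorted(
        actionable,
        key=lambda name: (PHASE_ORDER[details[name]], priority[name]),
    )


details = {
    "A4": "restoration",
    "B5": "restoration",
    "C6": "movement",
    "D7": "maintenance",
    "E8": "stop",
}
priority = {"A4": 1, "B5": 2, "C6": 3, "D7": 4, "E8": 5}

order = execution_order(details, priority)
```

Executing the stop process first matters in the example: the resource released by stopping the system E8 in the site C3 is the one in which the system C6 is installed, and moving C6 in turn releases a resource in the site B2 for the restored systems.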
When the process of restoring the information processing systems is completed in OP106, the CPUs of the managing servers 10 to 12 cause the process to proceed to OP107. In OP107, the CPUs of the managing servers 10 to 12 display, on monitors (the managing server 10 displays the results on the monitor 20), the results of executing the restoration process on the aforementioned information processing systems. The results of the restoration process include the details of the restoration process executed on the information processing systems, the sites in which the information processing systems are installed by the restoration process, and information indicating whether the restoration process has been successfully executed or has failed. Users of the managing servers 10 to 12 may confirm, based on the results of the restoration process that are displayed on the monitors, that the information processing systems have been restored in the sites in which the information processing systems are able to continuously use the services.
Although the embodiment is described above, the configurations and processes of the aforementioned servers and the like are not limited to those described in the embodiment and may be variously changed without departing from the technical idea of the present disclosure. For example, although the embodiment assumes the case where the sites use the services of the public clouds, information on usable hardware and software functions specific to the sites may be shared by the managing servers, like the aforementioned tables. Thus, the restoration destinations of the information processing systems may be determined so that the functions used by the information processing systems are usable in the restoration destinations.
In the embodiment, if a detail of the restoration process to be executed on an information processing system is “stop”, the entire resource of the information processing system is made available. However, degradation, in which only a certain resource of the information processing system is stopped, may be applied as a detail of the restoration process instead of the stop process, for example. In this case, the resource stopped by the degradation may be used as a resource in which an information processing system is to be restored.
In the embodiment, a part of the aforementioned processes may be executed by a processor other than the CPUs, for example, by a dedicated processor such as a digital signal processor (DSP), a graphics processing unit (GPU), a numerical processor, a vector processor, or an image processing processor. In addition, a part of the aforementioned units may be an integrated circuit (IC) or another digital circuit. Furthermore, an analog circuit may be included in a part of the aforementioned units. Examples of the integrated circuit include a large-scale integration (LSI), an application specific integrated circuit (ASIC), and a programmable logic device (PLD). Examples of the PLD include a field-programmable gate array (FPGA). The aforementioned units may be a combination of the processor and the integrated circuit. The combination is referred to as microcontroller (MCU), system-on-a-chip (SoC), system LSI, chipset, or the like, for example.
Computer-Readable Recording Medium
A program for achieving a management tool for setting the aforementioned servers in a computer, another machine, or another device (hereinafter referred to as computer or the like), an operating system (OS), and the like may be stored in a computer-readable recording medium that is readable by the computer or the like. The program stored in the recording medium is read into the computer or the like and executed by the computer or the like, thereby providing the functions.
The computer-readable recording medium electrically, magnetically, optically, mechanically, or chemically accumulates information such as data, programs, and the like and is readable by the computer or the like. The recording medium may be detachable from the computer or the like. The detachable recording medium is, for example, a flexible disk, a magneto-optical disc, a CD-ROM, a CD-R, a CD-RW, a DVD, a Blu-ray disc, a DAT, an 8 mm tape, a memory card such as a flash memory, or the like. The recording medium may be fixed to the computer or the like. The recording medium fixed to the computer or the like is a hard disk, a ROM, or the like.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiment of the present invention has been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2017-101098 | May 2017 | JP | national |