The present invention relates to an information collection technique in an IT system.
In recent years, society has become increasingly dependent on IT systems as they have been applied to more areas, and IT systems have become part of the social infrastructure. With globalization, the data centers that make up IT systems are spread across the world. As a result, operational control and failure response for IT systems have become more complicated and larger in scale. Failure investigation therefore has to assume that system operation is complex, and the information needed for the investigation has to be collected accordingly.
As techniques of collecting information needed to carry out investigation into failures as described above, a collection tool and a collection agent are well known. The following describes the techniques with reference to the accompanying drawings.
As illustrated in
Known as a conventional technique related to the present invention are a data collection device, a data collection system and a data collection method that: calculate a usage amount rate of each factor for each to-be-collected data item; generate a schedule that enables data to be collected at a time when a calculated operating rate is low; and collect the predetermined data on the basis of the generated schedule (see Patent Document 1, for example).
[Patent Document 1] Japanese Laid-open Patent Publication No. 2006-79488
However, the information collection method of the conventional IT system has the following problems. Because of network communication delays and the differing loads that information collection operations place on the agents 904 of the servers 901 to 903, a time lag occurs between the pieces of information acquired by the collection tool 801. Because of this time lag, it is difficult to grasp the causal relationship between the pieces of information, and their reliability as circumstantial evidence is low. Moreover, the information collection technique of the conventional IT system collects information at predetermined intervals or at a predetermined time. It is therefore impossible to collect information immediately after a failure occurs or at any other arbitrary timing.
According to an aspect of the invention, there is provided an information collection device that acquires information from a plurality of devices connected through a network, including: a first addition unit that adds, for each of the plurality of devices, an information collection duration that is a duration required to collect information of the devices to a communication duration that is a duration required to communicate with the devices; a second addition unit that adds a maximum duration that is the largest among the durations added by the first addition unit to a predetermined time; a subtraction unit that subtracts the information collection duration of each of the plurality of devices from a first time that is a time obtained after the maximum duration is added by the second addition unit to the predetermined time; a first setting unit that sets a time obtained after the subtraction unit subtracts the information collection duration of each of the plurality of devices from the first time as a second time that is when information collection starts at each of the plurality of devices; and an acquisition unit that acquires information obtained by the information collection that starts at the second time set by the first setting unit.
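The timing relationship recited above can be written compactly. The symbols are introduced here for illustration only and do not appear in the original text: $d_i$ is the communication duration for device $i$, $c_i$ is its information collection duration, $T_0$ is the predetermined time, $T_1$ is the first time, and $T_{2,i}$ is the second time for device $i$.

```latex
\begin{align*}
s_i     &= c_i + d_i            && \text{(first addition unit)} \\
T_1     &= T_0 + \max_j s_j     && \text{(second addition unit)} \\
T_{2,i} &= T_1 - c_i            && \text{(subtraction unit and first setting unit)}
\end{align*}
```

Under these definitions, every device finishes collecting at the common time $T_1$, and $T_{2,i} - T_0 = \max_j s_j - c_i \ge d_i$, so the start time for a device is never earlier than the time needed to communicate with it.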
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
Hereinafter, an embodiment of the present invention will be described with reference to the accompanying drawings.
The following describes the configuration of an information collection device according to the present embodiment.
As illustrated in
The CMDB 102 stores information about resources of devices that make up a connected system, such as the server A 20 and server B 21. The CMDB 102 also stores various kinds of information, including collection history information, collection item information and device information.
Among information stored in the CMDB 102, the collection history information, the collection item information and the device information will be described. Incidentally, the items will be described later. As illustrated in
As illustrated in
A device list illustrated in
Incidentally, in the collection history information, the collection item information and the device information, the following commands are used to acquire information about the resources of the group of servers: ps, df, sar, and free (collection items or collection targets). ps is a command used to acquire a list of the processes running on a device. df is a command used to acquire the usage state of a disk. sar is a command used to acquire various kinds of performance information about a device (the duration of one observation and the number of observations may be specified). free is a command used to acquire memory usage information. dmesg, a command that outputs kernel messages, and fstab, the file system table file, are also used to acquire information about a device. Accordingly, from a practical perspective, the items represent the information acquired in the above manner. Incidentally, any information stored in the CMDB 102 other than the collection history information, the collection item information and the device information will be described later.
The data control unit 101 controls the CMDB 102. The calculation unit 105 calculates the frequency of change on the basis of the number of times an item has been acquired and the number of times the item has changed, both of which are recorded in the collection item information. The calculation unit 105 also calculates the estimated delay time in the device information, i.e. the delay time of each server, on the basis of the delay time recorded in the collection history information. The collection plan unit 103 makes a schedule of collection on the basis of the information calculated by the calculation unit 105. The collection execution unit 104 notifies the execution unit 201 of each server, which is described later, of the collection start time of each item in accordance with the schedule made by the collection plan unit 103. The collection execution unit 104 may collect data from each server in accordance with the schedule made by the collection plan unit 103.
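The text states only that the frequency of change is calculated "on the basis of" the two recorded counts. As a minimal sketch, assuming a plain ratio of changes to acquisitions (the function name, the ratio itself and the sample counts are hypothetical, not taken from the source), the calculation might look like this:

```python
def change_frequency(times_acquired: int, times_changed: int) -> float:
    """Fraction of acquisitions in which the item's value had changed.

    Assumption: a simple ratio; the source does not give the exact formula.
    """
    if times_acquired <= 0:
        return 0.0  # no history yet, so no observed change
    return times_changed / times_acquired


# Hypothetical counts for two collection items.
frequencies = {
    "ps": change_frequency(times_acquired=20, times_changed=18),
    "df": change_frequency(times_acquired=20, times_changed=2),
}
```

With these made-up counts, ps would be treated as changing far more often than df.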
The server A 20 and server B 21 are equipped with execution units 201. After being notified by the collection execution unit 104 of the information collection device 10, the execution units 201 collect the specified information. The client terminal 30 requests device information from the information collection device 10.
The following describes the operation of the information collection device according to the present embodiment.
First, the calculation unit 105 performs a delay calculation process to calculate the delay time and the frequency of change pertaining to the device specified by the client terminal 30 (S101). Then, on the basis of the delay time and frequency of change calculated by the calculation unit 105 in the delay calculation process, the collection plan unit 103 performs a scheduling process to make a schedule of collection (S102). After the schedule of collection is made by the collection plan unit 103 in the scheduling process, the collection execution unit 104 performs a collection instruction process to give the execution units 201 instructions to collect (S103). After the information about the devices is transmitted from all the execution units 201, the transmitted information is stored in the CMDB 102 via the data control unit 101 (S104; acquisition step). Incidentally, the operations of the delay calculation process, the scheduling process, and the collection instruction process will be described later in detail.
The following describes the delay calculation process performed by the calculation unit.
First, the calculation unit 105 makes reference to the device information and makes a determination as to whether there is an unselected device in the device information (S201).
When there is an unselected device in the device information (S201, YES), the calculation unit 105 selects the unselected device from the device information (S202) and calculates an estimated delay time (communication duration) for the device added to the device information (S203). The estimated delay time is calculated so that the estimated delay time represents the average delay time of all the collection items of the collection history information of each device. The calculated estimated delay time is linked to the device and the collection target in the device information.
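The averaging in S203 can be sketched as follows. The record layout, the device names and the delay values are hypothetical; only the rule "average the delay times of all collection items in the collection history of each device" comes from the text.

```python
from statistics import mean

# Hypothetical collection history: (device, collection item, delay in seconds).
collection_history = [
    ("server A", "ps", 0.4),
    ("server A", "df", 0.6),
    ("server B", "ps", 1.0),
    ("server B", "free", 2.0),
]


def estimated_delay(history, device):
    """Average the recorded delays over all collection items of one device."""
    delays = [delay for dev, _item, delay in history if dev == device]
    if not delays:
        raise ValueError(f"no collection history for {device!r}")
    return mean(delays)


estimates = {
    dev: estimated_delay(collection_history, dev)
    for dev in ("server A", "server B")
}
```

Raising an error for a device with no history is a choice made for this sketch; the source does not say how that case is handled.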
The calculation unit 105 then makes a determination as to whether there is an unselected collection target item for the selected device (S204).
When there is an unselected collection target item (S204, YES), the calculation unit 105 selects, in the collection item information, one unselected collection item from among collection targets in the device information as illustrated in
When there is no unselected collection target item, i.e. when all collection targets in the device information have been selected in the collection item information (S204, NO), the calculation unit 105 turns the collection history information of the collection item that is selected based on the collection target of the device information into a collection item list as illustrated in
When there is no unselected device in the device information at step S201 (S201, NO), the calculation unit 105 rearranges the device information in descending order of estimated delay time (S207) and stores the device information and the collection item list in the CMDB 102 through the data control unit 101 (S208).
The following describes the scheduling process performed by the collection plan unit.
First, the collection plan unit 103 refers to the device information stored in the CMDB 102 through the data control unit 101 to make a determination as to whether there is an unselected device in the device information (S301).
When there is an unselected device in the device information (S301, YES), the collection plan unit 103 selects one unselected device from the device information and refers to the collection item list of the selected device. The collection plan unit 103 makes a determination as to whether there is an unselected collection item in the collection item list (S303).
When there is an unselected collection item in the collection item list (S303, YES), the collection plan unit 103 selects one unselected collection item from the collection item list and makes a determination as to whether the collection duration t′ of the selected collection item is greater than the maximum collection duration t (S305). Incidentally, the maximum collection duration t is a variable that stores the longest collection duration found so far in the collection item list.
When the collection duration t′ of the selected collection item is greater than the maximum collection duration t (S305, YES), the collection plan unit 103 sets t=t′ (S306).
When the collection duration t′ of the selected collection item is less than or equal to the maximum collection duration t (S305, NO), the collection plan unit 103 makes a determination again as to whether there is an unselected collection item in the collection item list (S303).
When there is no unselected collection item in the collection item list at step S303 (S303, NO), the collection plan unit 103 adds the maximum collection duration t to the delay time recorded for the device in the device list and records the sum in the device list as the notification duration (S307; first addition step). The collection plan unit 103 then makes a determination again as to whether there is an unselected device in the device list (S301).
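The per-device loop of steps S303 to S307 amounts to finding the longest collection duration and adding the device's delay time to it. The following is a minimal sketch under stated assumptions: the dictionary layout, the sample values and the initial value of 0 for t are hypothetical, since the text does not say how t is initialized.

```python
def notification_duration(delay, collection_durations):
    """Return the device's delay time plus its longest collection duration."""
    t = 0.0  # maximum collection duration t (assumed to start at 0)
    for t_prime in collection_durations:  # S303: next unselected item
        if t_prime > t:                   # S305: is t' greater than t?
            t = t_prime                   # S306: t = t'
    return delay + t                      # S307: first addition step


# Hypothetical device list with delay times and collection durations (seconds).
device_list = [
    {"device": "server A", "delay": 0.5, "durations": {"ps": 1.0, "sar": 5.0}},
    {"device": "server B", "delay": 1.5, "durations": {"df": 2.0, "free": 1.0}},
]
for entry in device_list:
    entry["notification_duration"] = notification_duration(
        entry["delay"], entry["durations"].values()
    )
```

In this made-up example, server A ends up with the longer notification duration even though its delay time is shorter, because sar takes longest to collect.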
When there is no unselected device in the device list at step S301 (S301, NO), the collection plan unit 103 rearranges the device list in descending order of notification duration to create a notification schedule as illustrated in
The following describes a collection instruction process performed by the collection execution unit.
First, the collection execution unit 104 refers to the notification schedule, adds the longest notification duration (maximum duration) in the notification schedule, i.e. the notification duration linked to the top device, to the current time, and regards the result as a data acquisition time (first time) (S401; second addition step). Then, the collection execution unit 104 makes a determination as to whether there is an unselected device in the notification schedule (S402).
When there is an unselected device in the notification schedule (S402, YES), the collection execution unit 104 selects the device (S403) and makes a determination as to whether there is an unselected collection item in the collection item list of the selected device (S404).
When there is an unselected collection item in the collection item list (S404, YES), the collection execution unit 104 selects the collection item and sets the time calculated by subtracting the collection duration of the selected collection item from the data acquisition time as the time (second time) that is when collection starts in such a way that the time is linked to the collection item (S405; subtraction step and setting step). The collection execution unit 104 then makes a determination again as to whether there is an unselected collection item in the collection item list (S404).
When there is no unselected collection item in the collection item list (S404, NO), the collection execution unit 104 makes a determination again as to whether there is an unselected device in the notification schedule (S402).
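Steps S401 to S405 can be sketched as below. The data layout, the names and the sample values are hypothetical. The sketch uses `max` to find the longest notification duration, which gives the same result as reading the top entry of a schedule sorted in descending order.

```python
from datetime import datetime, timedelta


def collection_start_times(now, notification_schedule):
    """Return the data acquisition time and a start time per (device, item)."""
    # S401: data acquisition time (first time) = now + longest notification duration
    longest = max(dev["notification_duration"] for dev in notification_schedule)
    acquisition_time = now + timedelta(seconds=longest)

    starts = {}
    for dev in notification_schedule:                    # S402, S403
        for item, duration in dev["durations"].items():  # S404
            # S405: start time (second time) = acquisition time - collection duration
            starts[(dev["device"], item)] = acquisition_time - timedelta(seconds=duration)
    return acquisition_time, starts


# Hypothetical notification schedule (durations in seconds).
now = datetime(2008, 7, 4, 12, 0, 0)
notification_schedule = [
    {"device": "server A", "notification_duration": 5.5,
     "durations": {"ps": 1.0, "sar": 5.0}},
    {"device": "server B", "notification_duration": 3.5,
     "durations": {"df": 2.0, "free": 1.0}},
]
acquisition_time, starts = collection_start_times(now, notification_schedule)
```

Because each start time is the acquisition time minus that item's own collection duration, every item in this sketch finishes at the same data acquisition time.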
When there is no unselected device in the notification schedule at step S402 (S402, NO), the collection execution unit 104 regards the following list as what is illustrated as a collection schedule in
The following describes the operation of the execution units.
First, the execution units 201 start collecting each collection item recorded in the collection schedule transmitted from the collection execution unit 104 at the corresponding collection start time (S501). The execution units 201 wait until the collection end time (S502) and transmit the collected data of the collection items to the collection execution unit 104 (S503). Incidentally, at step S501, for collection items having the same collection start time, the execution units 201 start collecting in the order in which the collection items are arranged in the collection schedule, i.e. in descending order of the frequency of update.
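The ordering rule at step S501 can be illustrated with a short sketch. The field names and values are hypothetical, and an explicit sort stands in for the ordering that, according to the text, is already present in the collection schedule.

```python
def execution_order(collection_schedule):
    """Order items by start time; ties go to the more frequently updated item."""
    return sorted(
        collection_schedule,
        key=lambda entry: (entry["start"], -entry["update_frequency"]),
    )


# Hypothetical collection schedule; "start" is an offset in seconds.
collection_schedule = [
    {"item": "df", "start": 3.5, "update_frequency": 0.1},
    {"item": "free", "start": 4.5, "update_frequency": 0.6},
    {"item": "ps", "start": 4.5, "update_frequency": 0.9},
    {"item": "sar", "start": 0.5, "update_frequency": 0.8},
]
ordered_items = [entry["item"] for entry in execution_order(collection_schedule)]
```

Here ps and free share a start time, so ps, with the higher assumed update frequency, is started first.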
As described above, the information collection device 10 of the present embodiment collects information in accordance with the collection item for which the sum of the collection duration and the delay time of the device is largest. The following provides a more detailed description of the effects of the present invention with reference to the accompanying drawings.
As illustrated in
When the collection items have the same collection duration, the information collection device 10 of the present embodiment acquires information about the collection items in descending order of the frequency of update, i.e. in ascending order of update intervals. Therefore, as illustrated in
As described above, according to the present invention, for example, when the failure of a system consisting of a plurality of devices occurs, it is possible to synchronously collect device information as soon as possible at an arbitrary timing.
The present invention may be embodied in other various forms without departing from the spirit and essential characteristics thereof. The embodiments described therefore are to be considered in all respects as illustrative and not restrictive. The scope of the invention is indicated by the appended claims, not by the foregoing description. Furthermore, all variations, various modifications, alternatives and alterations which come within the meaning and range of equivalency of the claims are all intended to be embraced within the scope of the present invention.
A program that executes each of the above steps on a computer that makes up the information collection device may be provided as an information collection program. The above program is stored in a computer-readable storage medium so that the computer that makes up the information collection device may execute the program. The above computer-readable storage media include: an internal storage device installed in a computer, such as a ROM or RAM; a portable storage medium, such as a CD-ROM, flexible disk, DVD, magneto-optical disk or IC card; a database that stores computer programs; and another computer and a database thereof.
According to the present invention, it is possible to synchronously collect information from a plurality of devices at an arbitrary timing.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiment of the present invention has been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
This application is a continuation application, filed under 35 U.S.C. §111(a), of PCT Application No. PCT/JP2008/062189, filed Jul. 4, 2008, the disclosure of which is herein incorporated in its entirety by reference.
Number | Name | Date | Kind |
---|---|---|---|
5748098 | Grace | May 1998 | A |
5822535 | Takase et al. | Oct 1998 | A |
6212171 | LaFollette et al. | Apr 2001 | B1 |
6389549 | Murase | May 2002 | B1 |
6466007 | Prazeres da Costa et al. | Oct 2002 | B1 |
6718537 | Miles | Apr 2004 | B1 |
6886020 | Zahavi et al. | Apr 2005 | B1 |
7051026 | Berry et al. | May 2006 | B2 |
7890620 | Masuda et al. | Feb 2011 | B2 |
8214488 | Machida | Jul 2012 | B2 |
8589339 | Nakamura | Nov 2013 | B2 |
20050010667 | Moriki et al. | Jan 2005 | A1 |
20060080433 | Caselli et al. | Apr 2006 | A1 |
20060104220 | Yamazaki et al. | May 2006 | A1 |
20060259905 | Diao et al. | Nov 2006 | A1 |
20060265497 | Ohata et al. | Nov 2006 | A1 |
20070132477 | Balog et al. | Jun 2007 | A1 |
20070192473 | Fukuda et al. | Aug 2007 | A1 |
20080063216 | Sakata et al. | Mar 2008 | A1 |
20090141646 | Legg | Jun 2009 | A1 |
20090182534 | Loboz | Jul 2009 | A1 |

Number | Date | Country |
---|---|---|
2004-199410 | Jul 2004 | JP |
2004-355061 | Dec 2004 | JP |
2006-079488 | Mar 2006 | JP |
2006-344003 | Dec 2006 | JP |
2007-128122 | May 2007 | JP |
WO 2008056670 | May 2008 | WO |

Entry |
---|
English translation of the International Preliminary Report on Patentability issued in International App. No. PCT/JP2008/062189, issued Feb. 8, 2011. |
International Search Report issued in International App. No. PCT/JP2008/062189, mailed Aug. 12, 2008. |

Number | Date | Country |
---|---|---|
20110131323 A1 | Jun 2011 | US |

| | Number | Date | Country |
|---|---|---|---|
| Parent | PCT/JP2008/062189 | Jul 2008 | US |
| Child | 12979113 | | US |