This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2016-022700, filed on Feb. 9, 2016, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to a technology which classifies apparatuses in relation to a characteristic of a resource use.
For example, in an information system in which a plurality of virtual servers simultaneously operate, resources such as a central processing unit (CPU), a disk, and a network are assigned to the virtual servers. That is, each virtual server performs a process of the virtual server in a state where usage (hereinafter, referred to as permissible usage) which is permitted to each resource is secured.
However, in a case where the permissible usage of each of the resources assigned to the corresponding virtual server is inappropriate, the resources may be insufficient in some virtual servers, and processes in those virtual servers may be delayed.
Related technologies are disclosed in, for example, Japanese Laid-open Patent Publication No. 2015-11362, Japanese Laid-open Patent Publication No. 2010-277208, International Publication Pamphlet No. WO 2013/140524, Japanese Laid-open Patent Publication No. 2004-206495, and Japanese Laid-open Patent Publication No. 2014-191365.
According to an aspect of the invention, a non-transitory computer-readable storage medium stores a program for causing a computer to execute a process, the process including calculating, for each of a plurality of apparatuses, a first feature amount that indicates an association between resource uses according to a combination of resources based on first logs related to the resources which are respectively used by the plurality of apparatuses, performing first clustering on the first feature amount of each of the plurality of apparatuses, generating a first rule related to the association based on a first result of the first clustering, the first rule corresponding to a procedure that produces a substantially equal result to the first result of the first clustering, storing the first rule into a memory, calculating, based on the first logs, a second feature amount that indicates a resource usage in each time slot for each of the resources which are respectively used by the plurality of apparatuses, performing second clustering on the second feature amount of each of the plurality of apparatuses, generating a second rule related to the resource usage based on a second result of the second clustering, the second rule corresponding to a procedure that produces a substantially equal result to the second result of the second clustering, storing the second rule into the memory, performing third clustering on the plurality of apparatuses based on the first result of the first clustering and the second result of the second clustering, generating a third rule related to attributes based on a third result of the third clustering, the attributes indicating types of the plurality of apparatuses, and storing the third rule into the memory.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
In a case where assignment of the resources is controlled such that each of the virtual servers smoothly operates, it is desired to grasp a characteristic type, to which the virtual server belongs, in relation to a resource use. However, it is not easy for a person who is not well aware of process content and an operational form of each of the virtual servers to classify virtual servers according to a characteristic of the resource use.
An object of a technology disclosed in embodiments is to generate an apparatus classification rule which is more suitable for classification focused on a characteristic of a resource use.
It is assumed that a plurality of virtual servers are deployed in the physical servers 101. Each of the virtual servers uses common resources. That is, each of the virtual servers shares resources such as a CPU, a memory, a disk, and a network.
In addition, a management apparatus that manages the physical servers 101 is deployed in any one of the physical servers 101. The management apparatus manages permissible usage of each resource which is assigned to, for example, each of the virtual servers.
A status of a use of each resource in each virtual server is recorded as a log. The log is maintained in the management apparatus or the virtual server.
A classification apparatus 103 is an apparatus that classifies the virtual server according to the status of the use of the resource in each virtual server. The classification apparatus 103 operates through two phases.
In a first phase, the classification apparatus 103 performs clustering based on a log of each virtual server included in a sample set, and generates a rule for classifying a virtual server which is a classification target, similarly to the clustering.
In a second phase, the classification apparatus 103 classifies the virtual server which is the classification target according to the rule, based on the log of the virtual server which is the classification target. That is, in the embodiment, a virtual server other than the samples is classified in the second phase by using the classification rule generated in the first phase.
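The idea of the first phase, that is, deriving a rule which reproduces a clustering result, may be sketched as follows. The sketch is an illustration under assumptions and is not the generation process of the embodiment: it handles only one feature amount and two clusters, searches for a single threshold that separates the clusters, and uses the hypothetical name `find_threshold_rule` and hypothetical sample values.

```python
def find_threshold_rule(values, labels):
    """Search for one threshold on a single feature that reproduces a two-cluster result.

    values: {server_name: feature_value}   (hypothetical input layout)
    labels: {server_name: cluster_id}      (result of a clustering step)
    Returns (threshold, id_if_at_or_below, id_if_above), or None if no single
    threshold separates the two clusters.
    """
    if len(set(labels.values())) != 2:
        raise ValueError("this sketch handles exactly two clusters")
    ordered = sorted(values, key=lambda name: values[name])
    for i in range(1, len(ordered)):
        low_ids = {labels[name] for name in ordered[:i]}
        high_ids = {labels[name] for name in ordered[i:]}
        low_value = values[ordered[i - 1]]
        high_value = values[ordered[i]]
        # A usable split puts one pure cluster on each side of the threshold.
        if len(low_ids) == 1 and len(high_ids) == 1 and low_value < high_value:
            return (low_value + high_value) / 2, low_ids.pop(), high_ids.pop()
    return None


# Hypothetical correlation coefficients and cluster IDs for six sample servers.
sample_values = {"A": 0.1, "B": 0.25, "C": 0.9, "D": 0.75, "E": 0.8, "F": 0.95}
sample_labels = {"A": "1-1", "B": "1-1", "C": "1-2", "D": "1-2", "E": "1-2", "F": "1-2"}
rule = find_threshold_rule(sample_values, sample_labels)
```

A rule found in this manner may then be applied to a virtual server that was not part of the sample set, which corresponds to the role of the second phase.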
The first phase unit 201 performs a first phase process. The first classification rule storage unit 203 stores a first classification rule. The second classification rule storage unit 205 stores a second classification rule. The third classification rule storage unit 207 stores a third classification rule. The second phase unit 209 performs a second phase process. The first phase process, the second phase process, the first classification rule, the second classification rule, and the third classification rule will be described later. The output unit 211 outputs a result of classification.
The above-described first phase unit 201, the second phase unit 209, and the output unit 211 are realized using hardware resources (for example, see
The above-described first classification rule storage unit 203, the second classification rule storage unit 205, and the third classification rule storage unit 207 are realized using the hardware resources (for example, see
The first acquisition unit 301 acquires logs of respective resources in the virtual server which belongs to the sample set. The first feature amount calculation unit 303 performs a first feature amount calculation process. The first clustering unit 305 performs a first clustering process. The first generation unit 307 performs a first generation process. The second feature amount calculation unit 309 performs a second feature amount calculation process. The second clustering unit 311 performs a second clustering process. The second generation unit 313 performs a second generation process. The third clustering unit 315 performs a third clustering process. The third generation unit 317 performs a third generation process. Also, the first feature amount calculation process, the second feature amount calculation process, the first clustering process, the second clustering process, the third clustering process, the first generation process, the second generation process, and the third generation process will be described later.
The first log storage unit 321 stores the logs of the respective resources in the virtual server which belongs to the sample set.
The first feature amount storage unit 323 stores a correlation coefficient, as an example of an association amount, which is the first feature amount. Specifically, the first feature amount storage unit 323 stores a correlation coefficient table in which the correlation coefficient is set.
The first result storage unit 325 stores a result of first clustering. Specifically, the first result storage unit 325 stores a correlation cluster table in which an ID of a correlation cluster is set.
The second feature amount storage unit 327 stores resource usage for each time slot which is a second feature amount. Specifically, the second feature amount storage unit 327 stores a resource usage table (a CPU usage table, a disk usage table, and a network usage table) in which the resource usage for each resource is set for each time slot. Also, the time slot is a prescribed period from a start time to an end time in an arbitrary day.
The second result storage unit 329 stores a result of second clustering. Specifically, the second result storage unit 329 stores a time cluster table in which an ID of a time cluster (an ID of a CPU cluster, an ID of a disk cluster, and an ID of a network cluster) is set based on a temporal characteristic of the resource use.
The third result storage unit 331 stores a result of third clustering. Specifically, the third result storage unit 331 stores an integration cluster table in which an ID of an integration cluster is set based on the result of the first clustering and the result of the second clustering.
The above-described first acquisition unit 301, the first feature amount calculation unit 303, the first clustering unit 305, the first generation unit 307, the second feature amount calculation unit 309, the second clustering unit 311, the second generation unit 313, the third clustering unit 315, and the third generation unit 317 are realized using the hardware resources (for example, see
The above-described first log storage unit 321, the first feature amount storage unit 323, the first result storage unit 325, the second feature amount storage unit 327, the second result storage unit 329, and the third result storage unit 331 are realized using the hardware resources (for example, see
The acquired logs are stored in the first log storage unit 321. In the example, the log of the CPU, the log of the disk, and the log of the network are stored in the first log storage unit 321 for each of the virtual servers “server A”, “server B”, “server C”, “server D”, “server E”, and “server F”.
The first feature amount calculation unit 303 performs the first feature amount calculation process (S403). In the first feature amount calculation process, a correlation coefficient of the resource use is calculated in relation to a combination of two resources. In the example, a correlation coefficient between a CPU use and a network use (mentioned as a CPU-network correlation coefficient), a correlation coefficient between the CPU use and a disk use (mentioned as a CPU-disk correlation coefficient), and a correlation coefficient between the disk use and the network use (mentioned as a disk-network correlation coefficient) are calculated.
For example, in a case of a virtual server which performs a batch process, both the CPU use and the network use are large while the batch process is performed. In addition, while the batch process is not performed, both the CPU use and the network use are small. Accordingly, the correlation coefficient between the CPU use and the network use becomes a large positive value.
Also, in a case where the memory is also of interest, the first feature amount calculation unit 303 may calculate a correlation coefficient between the memory use and the CPU use, a correlation coefficient between the memory use and the network use, and a correlation coefficient between the memory use and the disk use. The correlation coefficients are stored in the first feature amount storage unit 323 in the form of the correlation coefficient table which will be described below.
The example illustrates that the virtual server “server A” and the virtual server “server B” have a strong correlation related to the CPU use and the disk use. In addition, the example illustrates that the virtual server “server C” and the virtual server “server F” have a strong correlation related to the CPU use and the network use.
Subsequently, the first feature amount calculation unit 303 calculates the correlation coefficient between the CPU use and the disk use based on the log of the CPU and the log of the disk for each virtual server which belongs to the sample set (S603).
Finally, the first feature amount calculation unit 303 calculates the correlation coefficient between the disk use and the network use based on the log of the disk and the log of the network for each virtual server which belongs to the sample set (S605). In a case where the first feature amount calculation process ends, the process returns to the calling first phase process.
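The calculation of the correlation coefficients for the combinations of two resources may be sketched as follows. The sketch assumes the Pearson correlation coefficient, which the description does not specify, and assumes that each log has already been reduced to a series of equally spaced measurements. The names `pearson` and `correlation_features` and the input layout are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlation_features(logs):
    """Correlation coefficient for every pair of resources of one virtual server.

    logs: {"cpu": [...], "disk": [...], "network": [...]}  (hypothetical layout)
    Returns {("cpu", "disk"): r, ("cpu", "network"): r, ("disk", "network"): r}.
    """
    return {
        (a, b): pearson(logs[a], logs[b])
        for a, b in combinations(sorted(logs), 2)
    }


# Hypothetical measurements for one virtual server.
features = correlation_features(
    {"cpu": [1, 2, 3, 4], "network": [2, 4, 6, 8], "disk": [4, 3, 2, 1]}
)
```

Adding a "memory" series to the input yields the additional memory-related pairs without any change to the functions.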
Returning to description of
The example illustrates that the virtual server “server A” and the virtual server “server B” belong to a correlation cluster which has an ID “1-1”. In addition, the example illustrates that the virtual server “server C”, the virtual server “server D”, the virtual server “server E”, and the virtual server “server F” belong to a correlation cluster which has an ID “1-2”.
Returning to description of
First, it is determined whether or not the correlation coefficient between the CPU use and the network use is equal to or less than 0.5. In a case where the correlation coefficient between the CPU use and the network use is equal to or less than 0.5, a virtual server which is the classification target belongs to the correlation cluster which has the ID “1-1”. In contrast, in a case where the correlation coefficient between the CPU use and the network use is larger than 0.5, the virtual server which is the classification target belongs to the correlation cluster which has the ID “1-2”.
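The first classification rule of the example may be written as a single comparison. The function name `classify_correlation_cluster` is hypothetical; the threshold 0.5 and the cluster IDs are those of the example.

```python
def classify_correlation_cluster(cpu_network_corr, threshold=0.5):
    """Return the correlation cluster ID for a CPU-network correlation coefficient."""
    # Equal to or less than the threshold: cluster "1-1"; larger: cluster "1-2".
    return "1-1" if cpu_network_corr <= threshold else "1-2"
```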
Returning to description of
For example, the CPU usage in the time slot from 0 am to 3 am is the CPU use rate in the time slot from 0 am to 3 am divided by the average of the CPU use rates over all time slots. Also, the CPU use rate in a time slot is a representative value (for example, an average value, a maximum value, or a median value) of the CPU use rates measured in the time slot.
For example, the disk usage in the time slot from 0 am to 3 am is the “disk I/O” in the time slot from 0 am to 3 am divided by the average of the “disk I/O” over all time slots. Also, the “disk I/O” in a time slot is the sum of write data volume and read data volume in the time slot.
For example, the network usage in the time slot from 0 am to 3 am is the “network I/O” in the time slot from 0 am to 3 am divided by the average of the “network I/O” over all time slots. Also, the “network I/O” in a time slot is the sum of transmission data volume and reception data volume in the time slot.
However, the usage may be a value which is not normalized. That is, the usage may be expressed using a relative value, or the usage may be expressed by an absolute value. In addition, in a case where the memory is also of interest, a memory usage in each time slot may be calculated.
The CPU usage in each time slot is stored in the second feature amount storage unit 327 in the format of the CPU usage table which will be described below. The disk usage in each time slot is stored in the second feature amount storage unit 327 in the format of the disk usage table which will be described below. The network usage in each time slot is stored in the second feature amount storage unit 327 in the format of the network usage table which will be described below.
The example illustrates that the virtual server “server A” and the virtual server “server B” have high CPU usage in a time slot corresponding to a part of the night. In addition, the virtual server “server C” and the virtual server “server F” have relatively high CPU usage in a time slot corresponding to daytime. Furthermore, the virtual server “server D” and the virtual server “server E” have stable CPU usage throughout the day.
The example illustrates that the virtual server “server A” and the virtual server “server B” have high disk usage in the time slot corresponding to a part of the night. In addition, the virtual server “server C”, the virtual server “server D”, the virtual server “server E”, and the virtual server “server F” have stable disk usage throughout the day.
The example illustrates that the virtual server “server A”, the virtual server “server B”, the virtual server “server D”, and the virtual server “server E” have stable network usage throughout the day. In addition, the example illustrates that the virtual server “server C” and the virtual server “server F” have relatively high network usage in the time slot corresponding to daytime.
The second feature amount calculation unit 309 calculates the disk usage in each time slot based on the log of the disk for each virtual server which belongs to the sample set (S1203). Specifically, the second feature amount calculation unit 309 performs the process described below on each virtual server. The second feature amount calculation unit 309 first acquires the sum of the write data volume and the read data volume in each time slot. Subsequently, the second feature amount calculation unit 309 acquires an average of the sums over all time slots. Furthermore, for each time slot, the second feature amount calculation unit 309 calculates a quotient (disk usage in the time slot) by dividing the sum in the time slot by the average of the sums.
The second feature amount calculation unit 309 calculates the network usage in each time slot based on the log of the network for each virtual server which belongs to the sample set (S1205). Specifically, the second feature amount calculation unit 309 performs the process described below on each virtual server. The second feature amount calculation unit 309 first acquires the sum of the transmission data volume and the reception data volume in each time slot. Subsequently, the second feature amount calculation unit 309 acquires an average of the sums over all time slots. Furthermore, for each time slot, the second feature amount calculation unit 309 calculates a quotient (network usage in the time slot) by dividing the sum in the time slot by the average of the sums. In a case where the second feature amount calculation process ends, the process returns to the calling first phase process.
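The calculation of the usage in each time slot may be sketched as follows. The sketch assumes eight time slots of three hours each and a log given as pairs of an hour of the day and a measured value; both are assumptions made for illustration, and the name `slot_usage` is hypothetical. For the disk and the network the values in a time slot are summed, and for the CPU a representative value such as the mean may be passed as `aggregate`.

```python
from statistics import mean


def slot_usage(samples, slot_hours=3, aggregate=sum):
    """Normalized usage per time slot for one resource of one virtual server.

    samples: iterable of (hour_of_day, value) with hour_of_day in 0..23
             (hypothetical log layout).
    aggregate: reduces the values of one slot to a single number
               (sum for disk or network I/O, e.g. mean for the CPU use rate).
    """
    n_slots = 24 // slot_hours
    buckets = [[] for _ in range(n_slots)]
    for hour, value in samples:
        if not 0 <= hour < 24:
            raise ValueError("hour_of_day must be in the range 0..23")
        buckets[int(hour) // slot_hours].append(value)
    per_slot = [aggregate(bucket) if bucket else 0.0 for bucket in buckets]
    average = sum(per_slot) / n_slots
    if average == 0:
        return [0.0] * n_slots  # assumption: no activity yields zero usage
    # Quotient of the value in each slot and the average over all slots.
    return [value / average for value in per_slot]


# Hypothetical disk I/O: heavy between 0 am and 3 am, light otherwise.
disk_usage = slot_usage([(h, 6.0 if h < 3 else 2.0) for h in range(24)])
# Hypothetical CPU use rate: constant all day, aggregated by the mean.
cpu_usage = slot_usage([(h, 50.0) for h in range(24)], aggregate=mean)
```

With this normalization, a value of 1.0 means that the time slot is at the daily average, so that thresholds such as 1.5 may be compared across virtual servers of different sizes.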
Returning to description of
A cluster generated by the second clustering based on the temporal characteristic of the resource use is referred to as the time cluster. In the example, a CPU cluster based on a temporal characteristic of the CPU use, a disk cluster based on a temporal characteristic of the disk use, and a network cluster based on a temporal characteristic of the network use correspond to the time cluster. Furthermore, the result of the second clustering is stored in the second result storage unit 329 as the time cluster table.
For example, a first record indicates that the virtual server “server A” belongs to a CPU cluster having an ID “2-1”, belongs to a disk cluster having an ID “3-1”, and, further, belongs to a network cluster having an ID “4-1”.
Returning to description of
The second clustering unit 311 finally performs a clustering process related to the network usage in each time slot (S1305). That is, the second clustering unit 311 generates clusters (network clusters), to which the virtual servers of the sample belong, using the network usage in each time slot as the feature amount. Furthermore, the ID of the network cluster, to which each of the virtual servers of the sample belongs, is set in the time cluster table. In a case where the second clustering process ends, the process returns to the calling first phase process.
Returning to description of
In the example, first, it is determined whether or not the CPU usage in the time slot from 0 am to 3 am is equal to or larger than 1.5. In a case where the CPU usage in the time slot from 0 am to 3 am is equal to or larger than 1.5, the virtual server which is the classification target belongs to the CPU cluster having the ID “2-1”. In contrast, in a case where the CPU usage in the time slot from 0 am to 3 am is smaller than 1.5, it is determined whether or not the CPU usage in the time slot from 9 am to 0 pm is equal to or larger than 1.5.
In a case where the CPU usage in the time slot from 9 am to 0 pm is equal to or larger than 1.5, the virtual server which is the classification target belongs to a CPU cluster having an ID “2-2”. In contrast, in a case where the CPU usage in the time slot from 9 am to 0 pm is smaller than 1.5, the virtual server which is the classification target belongs to a CPU cluster having an ID “2-3”.
In the example, it is determined whether or not the disk usage in the time slot from 0 am to 3 am is larger than 1.5. In a case where the disk usage in the time slot from 0 am to 3 am is larger than 1.5, the virtual server which is the classification target belongs to a disk cluster having an ID “3-1”. In contrast, in a case where the disk usage in the time slot from 0 am to 3 am is equal to or lower than 1.5, the virtual server which is the classification target belongs to a disk cluster having an ID “3-2”.
In the example, it is determined whether or not the network usage in the time slot from 9 am to 0 pm is smaller than 1.5. In a case where the network usage in the time slot from 9 am to 0 pm is smaller than 1.5, the virtual server which is the classification target belongs to a network cluster having an ID “4-1”. In contrast, in a case where the network usage in the time slot from 9 am to 0 pm is equal to or larger than 1.5, the virtual server which is the classification target belongs to a network cluster having an ID “4-2”.
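The second classification rules of the example may be sketched as three small functions. The thresholds and cluster IDs are those of the example; the function names and the slot indexes are assumptions, the latter corresponding to a usage list of eight three-hour time slots in which index 0 is the time slot from 0 am to 3 am and index 3 is the time slot from 9 am to 0 pm.

```python
NIGHT_SLOT = 0    # assumption: index of the time slot from 0 am to 3 am
MORNING_SLOT = 3  # assumption: index of the time slot from 9 am to 0 pm


def classify_cpu_cluster(cpu_usage):
    """CPU cluster ID from the CPU usage in each time slot."""
    if cpu_usage[NIGHT_SLOT] >= 1.5:
        return "2-1"
    if cpu_usage[MORNING_SLOT] >= 1.5:
        return "2-2"
    return "2-3"


def classify_disk_cluster(disk_usage):
    """Disk cluster ID from the disk usage in each time slot."""
    return "3-1" if disk_usage[NIGHT_SLOT] > 1.5 else "3-2"


def classify_network_cluster(network_usage):
    """Network cluster ID from the network usage in each time slot."""
    return "4-1" if network_usage[MORNING_SLOT] < 1.5 else "4-2"
```

Note that the comparisons differ at the boundary: a CPU usage of exactly 1.5 satisfies the CPU rule, whereas a disk usage of exactly 1.5 falls into the disk cluster having the ID "3-2".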
Returning to description of
The example illustrates that the virtual server “server A” and the virtual server “server B” belong to the integration cluster (ID: “5-1”). That is, the virtual server “server A” and the virtual server “server B” have the same or a similar characteristic related to the resource use.
In addition, the example illustrates that the virtual server “server C” and the virtual server “server F” belong to the same integration cluster (ID: “5-2”). That is, the virtual server “server C” and the virtual server “server F” have the same or a similar characteristic related to the resource use.
Furthermore, the example illustrates that the virtual server “server D” and the virtual server “server E” belong to the same integration cluster (ID: “5-3”). That is, the virtual server “server D” and the virtual server “server E” have the same or a similar characteristic related to the resource use.
Returning to description of
In the example, first, it is determined whether the ID of the correlation cluster is “1-1” or “1-2”. In a case where the ID of the correlation cluster is “1-1”, the virtual server which is the classification target belongs to an integration cluster having an ID “5-1”. In contrast, in a case where the ID of the correlation cluster is “1-2”, it is determined whether the ID of the CPU cluster is “2-2” or “2-3”.
In a case where the ID of the CPU cluster is “2-2”, the virtual server which is the classification target belongs to an integration cluster having an ID “5-2”. In contrast, in a case where the ID of the CPU cluster is “2-3”, the virtual server which is the classification target belongs to an integration cluster having an ID “5-3”.
In the example, the ID “5-1” of the integration cluster corresponds to a type of a virtual server which performs a batch process of writing data into a disk at night. The ID “5-2” of the integration cluster corresponds to a type of a virtual server which provides an on-line service in the daytime. The ID “5-3” of the integration cluster corresponds to a type of a virtual server which provides the on-line service all day. As described above, in a case where the type of the virtual server is specified, the ID of the integration cluster may be associated with a type name (for example, a “batch process type”, a “daytime on-line type”, and “all day on-line type”).
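The third classification rule of the example, together with the association between the ID of the integration cluster and a type name, may be sketched as follows. The names `classify_integration_cluster` and `TYPE_NAMES` are hypothetical. The example does not state a result for a combination other than those described, so the sketch raises an error in that case, which is an assumption.

```python
# Type names associated with the integration cluster IDs of the example.
TYPE_NAMES = {
    "5-1": "batch process type",
    "5-2": "daytime on-line type",
    "5-3": "all day on-line type",
}


def classify_integration_cluster(correlation_cluster, cpu_cluster):
    """Integration cluster ID from the correlation cluster ID and the CPU cluster ID."""
    if correlation_cluster == "1-1":
        return "5-1"
    if correlation_cluster == "1-2":
        if cpu_cluster == "2-2":
            return "5-2"
        if cpu_cluster == "2-3":
            return "5-3"
    # Assumption: combinations not covered by the example rule are rejected.
    raise ValueError(
        f"no rule for combination ({correlation_cluster}, {cpu_cluster})"
    )


integration_id = classify_integration_cluster("1-2", "2-3")
type_name = TYPE_NAMES[integration_id]
```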
Returning to description of
Subsequently, the second phase process will be described.
The second acquisition unit 2001 acquires a log of each resource in the virtual server which is the classification target. The third feature amount calculation unit 2003 performs a third feature amount calculation process. The first applying unit 2005 performs a first application process. The fourth feature amount calculation unit 2007 performs a fourth feature amount calculation process. The second applying unit 2009 performs a second application process.
The third applying unit 2011 performs a third application process. Also, the third feature amount calculation process, the fourth feature amount calculation process, the first application process, the second application process, and the third application process will be described below.
The second log storage unit 2021 stores the log of each resource in the virtual server which is the classification target.
The third feature amount storage unit 2023 stores a correlation coefficient which is the third feature amount. Specifically, the third feature amount storage unit 2023 stores a correlation coefficient table in which the correlation coefficient is set.
The fourth feature amount storage unit 2025 stores time-based resource usage which is a fourth feature amount. Specifically, the fourth feature amount storage unit 2025 stores resource usage tables (in the example, a CPU usage table, a disk usage table, and a network usage table) in which resource usage is set for each resource in each time slot.
The cluster storage unit 2027 stores an ID of a cluster to which the virtual server which is the classification target belongs. Specifically, the cluster storage unit 2027 stores the cluster table. The cluster table will be described later.
The above-described second acquisition unit 2001, the third feature amount calculation unit 2003, the first applying unit 2005, the fourth feature amount calculation unit 2007, the second applying unit 2009, and the third applying unit 2011 are realized using the hardware resources (for example, see
The above-described second log storage unit 2021, the third feature amount storage unit 2023, the fourth feature amount storage unit 2025, and the cluster storage unit 2027 are realized using the hardware resources (for example, see
The third feature amount calculation unit 2003 performs the third feature amount calculation process (S2103). In the third feature amount calculation process, a correlation coefficient is calculated in relation to a combination of two resources, similarly to the case of the first feature amount calculation process. In the example, a correlation coefficient between the CPU use and the network use, a correlation coefficient between the CPU use and the disk use, and a correlation coefficient between the disk use and the network use are calculated.
Subsequently, the third feature amount calculation unit 2003 calculates the correlation coefficient between the CPU use and the disk use based on the log of the CPU and the log of the disk for the virtual server which is the classification target (S2203).
Finally, the third feature amount calculation unit 2003 calculates the correlation coefficient between the disk use and the network use based on the log of the disk and the log of the network for the virtual server which is the classification target (S2205).
In a case where the memory is also of interest, the third feature amount calculation unit 2003 may calculate a correlation coefficient between the memory use and the CPU use, a correlation coefficient between the memory use and the network use, and a correlation coefficient between the memory use and the disk use. The correlation coefficients are stored in the third feature amount storage unit 2023 using a correlation coefficient table format which will be described below.
The example illustrates that there is a strong correlation related to the CPU use and the network use for the virtual server “server G” which is the classification target.
Returning to the description of
Returning to the description of
Returning to the description of
The CPU usage in each time slot is stored in the fourth feature amount storage unit 2025, in the format of the CPU usage table. The disk usage in each time slot is stored in the fourth feature amount storage unit 2025 in the format of the disk usage table. The network usage in each time slot is stored in the fourth feature amount storage unit 2025 in the format of the network usage table.
Subsequently, the fourth feature amount calculation unit 2007 calculates the disk usage in each time slot for the virtual server which is the classification target (S2503). A procedure of calculating the disk usage in each time slot is similar to the case of the second feature amount calculation process.
Finally, the fourth feature amount calculation unit 2007 calculates the network usage in each time slot for the virtual server which is the classification target (S2505). A procedure of calculating the network usage in each time slot is similar to a case of the second feature amount calculation process. In a case where the fourth feature amount calculation process ends, the process returns to the calling second phase process.
The example illustrates that the CPU usage in the time slot corresponding to daytime is relatively high in the virtual server “server G” which is the classification target.
The example illustrates that the disk usage is stable throughout the day in the virtual server “server G” which is the classification target.
The example illustrates that the network usage is relatively high in the time slot corresponding to daytime in the virtual server “server G” which is the classification target.
Returning to the description of
Subsequently, the third applying unit 2011 performs the third application process (S2111). In the third application process, the virtual server which is the classification target is classified by applying an ID of a correlation cluster, an ID of a CPU cluster, an ID of a disk cluster, and an ID of a network cluster, to which the virtual server which is the classification target belongs, to the third classification rule. Specifically, the third applying unit 2011 determines an integration cluster to which the virtual server, which is the classification target, belongs.
In the example, in a case where the ID of the correlation cluster “1-2” and the ID of the CPU cluster “2-2”, which are illustrated in
The output unit 211 outputs the ID of the integration cluster corresponding to the virtual server which is the classification target (S2113). The output unit 211 may output a type name corresponding to the ID of the integration cluster. In addition, the output unit 211 may output the ID of the correlation cluster and the ID of the time cluster.
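The second phase as a whole may be sketched by holding each classification rule as data and applying the rules in order. This representation of a stored rule as a nested dictionary, the names `apply_rule` and `classify`, the feature keys, and the feature values given for the virtual server "server G" are all assumptions made for illustration; only the thresholds and cluster IDs come from the example. As a simplification, the third rule in this sketch maps every combination other than the two it tests explicitly to the ID "5-3".

```python
import operator

OPS = {"<=": operator.le, "<": operator.lt, ">=": operator.ge,
       ">": operator.gt, "==": operator.eq}


def node(feature, op, value, then, otherwise):
    """One decision node; a leaf is simply a cluster ID string."""
    return {"feature": feature, "op": op, "value": value,
            "then": then, "else": otherwise}


FIRST_RULE = node("cpu_network_corr", "<=", 0.5, "1-1", "1-2")
CPU_RULE = node("cpu_0_3", ">=", 1.5, "2-1",
                node("cpu_9_12", ">=", 1.5, "2-2", "2-3"))
DISK_RULE = node("disk_0_3", ">", 1.5, "3-1", "3-2")
NETWORK_RULE = node("net_9_12", "<", 1.5, "4-1", "4-2")
THIRD_RULE = node("correlation_cluster", "==", "1-1", "5-1",
                  node("cpu_cluster", "==", "2-2", "5-2", "5-3"))


def apply_rule(rule, features):
    """Walk the decision nodes until a leaf (cluster ID) is reached."""
    while isinstance(rule, dict):
        matched = OPS[rule["op"]](features[rule["feature"]], rule["value"])
        rule = rule["then"] if matched else rule["else"]
    return rule


def classify(features):
    """Apply the first and second rules, then the third rule to their results."""
    clusters = {
        "correlation_cluster": apply_rule(FIRST_RULE, features),
        "cpu_cluster": apply_rule(CPU_RULE, features),
        "disk_cluster": apply_rule(DISK_RULE, features),
        "network_cluster": apply_rule(NETWORK_RULE, features),
    }
    clusters["integration_cluster"] = apply_rule(THIRD_RULE, clusters)
    return clusters


# Hypothetical feature amounts for the classification target "server G".
server_g = {"cpu_network_corr": 0.9, "cpu_0_3": 0.4, "cpu_9_12": 1.8,
            "disk_0_3": 1.0, "net_9_12": 1.9}
result = classify(server_g)
```

Holding the rules as data, rather than as program code, allows a rule generated in the first phase to be stored and later read by whichever apparatus performs the second phase.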
Here, utilization of a result of the classification according to the embodiment in optimization of resource distribution will be described. In a case of a virtual server that performs a process (for example, the batch process) of simultaneously using a plurality of resources with high frequency, the virtual server may not operate as expected even though sufficient permissible usage related to only one resource is secured.
In addition, for a virtual server which uses a specific resource with high frequency only in a specific time slot (for example, a server which provides an on-line service in daytime), it is inefficient to secure a large permissible usage of that resource at all times.
However, according to the embodiment, the virtual server is classified based on the correlation characteristic and the temporal characteristic of the resource use. In a case where assignment of the resource is controlled with reference to a result of the classification, the above-described problems are easily solved. That is, in a case where the type of the virtual server is estimated based on the result of the classification, each virtual server operates smoothly, which helps the resources to be used effectively.
As described above, according to the embodiment, it is possible to generate, by performing the first phase process, an apparatus classification rule which is more suitable for classification focused on the characteristic of the resource use.
Specifically, it is possible to generate a rule for classifying the virtual server by combining the correlation characteristic of the resource use with the temporal characteristic of each resource use.
In addition, it is possible to more accurately classify the virtual server in relation to the characteristic of the resource use through the second phase process.
In an embodiment, a form, in which a second phase process is performed in a classification apparatus 103 that is separate from the classification apparatus 103 which performs the first phase process, will be described.
The first phase unit 201 of the classification apparatus 103a performs a first phase process while using a virtual server, which is deployed in a physical server 101a of the first LAN, as a sample. A first classification rule, which is generated in the first phase process, is stored in the first classification rule storage unit 203. A second classification rule, which is generated in the first phase process, is stored in the second classification rule storage unit 205. A third classification rule, which is generated in the first phase process, is stored in the third classification rule storage unit 207.
Furthermore, the output unit 211 of the classification apparatus 103a outputs the first classification rule, the second classification rule, and the third classification rule which are generated in the first phase process. A form of the output includes, for example, transmission to a network or recording in a storage medium.
A classification apparatus 103b, which is coupled to a second LAN, performs the second phase process. As illustrated in
The reception unit 3101 receives the first classification rule, the second classification rule, and the third classification rule. A form of the reception includes, for example, reception from the network or reading from the storage medium. The first classification rule is stored in the first classification rule storage unit 203. The second classification rule is stored in the second classification rule storage unit 205. The third classification rule is stored in the third classification rule storage unit 207.
The second phase unit 209 of the classification apparatus 103b performs the second phase process while using the virtual server, which is deployed in the physical server 101b, as a classification target according to the first classification rule, the second classification rule, and the third classification rule. Furthermore, the output unit 211 outputs a cluster of the virtual server which is deployed in the physical server 101b.
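The hand-over of the three classification rules from the classification apparatus 103a to the classification apparatus 103b can be sketched as follows. This is a sketch under stated assumptions: the embodiment does not specify an exchange format, so JSON is chosen here only for illustration, the rules are assumed to be JSON-serializable structures, and the function names `export_rules` and `import_rules` are hypothetical.

```python
import json


def export_rules(first_rule, second_rule, third_rule):
    """Serialize the three classification rules into one text payload
    (assumed format: JSON) for transmission or recording."""
    return json.dumps(
        {"first": first_rule, "second": second_rule, "third": third_rule},
        sort_keys=True,
    )


def import_rules(payload):
    """Restore the three classification rules from a received payload."""
    data = json.loads(payload)
    return data["first"], data["second"], data["third"]
```

The payload returned by `export_rules` may be sent over a network or written to a storage medium, and `import_rules` corresponds to what the reception unit does before the rules are stored in the respective storage units.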
According to the embodiment, it is easy to apply a rule for classifying an apparatus in relation to the characteristic of the resource use.
In the examples of Embodiments 1 and 2, an example in which the virtual server becomes the classification target is described. However, classification may be performed on a physical server apparatus. In addition, a virtual information processing apparatus other than the server may become the classification target. Furthermore, a physical information processing apparatus other than the server may become the classification target.
Although the embodiments have been described above, the embodiments are not limited thereto. For example, there is a case where the above-described functional block configuration does not coincide with a program module configuration.
In addition, the above-described configuration of each storage area is an example, and the embodiments are not limited to the above-described configuration. Furthermore, in the flow of the process, as long as a result of the process is not changed, the order of the processes may be changed or a plurality of processes may be performed in parallel.
Meanwhile, the above-described classification apparatus 103 is a computer apparatus. As illustrated in
The above-described embodiments are summarized as below.
A generation method according to the embodiment includes (A) calculating, for each of a plurality of apparatuses, a first feature amount that indicates a correlation between resource uses according to a combination of resources which are used by the apparatus based on first logs related to the plurality of resources which are respectively used by the plurality of apparatuses; (B) performing first clustering on the plurality of apparatuses based on the first feature amount; (C) generating a first apparatus classification rule based on a first result of the first clustering; (D) calculating a second feature amount that indicates a resource usage in each time slot for each of the resources which are respectively used by the plurality of apparatuses based on the first logs; (E) performing second clustering on the plurality of apparatuses based on the second feature amount; (F) generating a second apparatus classification rule based on a second result of the second clustering; (G) performing third clustering on the plurality of apparatuses based on the first result of the first clustering and the second result of the second clustering; and (H) generating a third apparatus classification rule based on a third result of the third clustering.
In this manner, it is possible to generate an apparatus classification rule which is more suitable for classification focused on a characteristic of the resource use. Specifically, it is possible to generate a rule for classifying an apparatus by combining the correlation characteristic of the resource use and the temporal characteristic of each resource use according to the combination.
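The two feature amounts used in steps (A) and (D) can be sketched as follows. This is an illustration under assumptions, not the method of the embodiment: the Pearson correlation coefficient is assumed as the measure of correlation between resource uses, the per-slot average is assumed as the resource usage in each time slot, and the log layout and function names are hypothetical. The clustering and rule generation steps are not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series.
    Returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def correlation_feature(log):
    """First feature amount (assumed form): one correlation value per
    combination of two resources. `log` maps a resource name to its
    usage series, sampled at the same instants for every resource."""
    return {
        (a, b): pearson(log[a], log[b])
        for a, b in combinations(sorted(log), 2)
    }


def time_slot_feature(samples, slots=24):
    """Second feature amount (assumed form): average usage per time slot.
    `samples` is a list of (hour, usage) pairs for one resource."""
    totals = [0.0] * slots
    counts = [0] * slots
    for hour, usage in samples:
        totals[hour % slots] += usage
        counts[hour % slots] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]
```

Each apparatus then yields one correlation feature and one time-slot feature per resource, which serve as the inputs to the first clustering and the second clustering, respectively.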
Furthermore, in the third clustering, first cluster identification information according to the first clustering and second cluster identification information according to the second clustering may be used as attributes, and the third apparatus classification rule may include at least one of the first cluster identification information and the second cluster identification information as a judgment condition parameter.
In this manner, it is easy to classify the apparatus by combining the correlation characteristic of the resource use and the temporal characteristic of each resource use according to the combination.
Furthermore, the generation method may further include: (I) calculating a third feature amount that indicates a correlation between resource uses according to a combination of resources which are used by a classification target apparatus based on the second logs respectively related to the plurality of resources which are used by the classification target apparatus; (J) classifying the classification target apparatus by applying the third feature amount to the first apparatus classification rule; (K) calculating a fourth feature amount that indicates the resource usage in each time slot for each of the resources which are used by the classification target apparatus based on the second logs; (L) classifying the classification target apparatus by applying the fourth feature amount to the second apparatus classification rule; and (M) classifying the classification target apparatus by applying a result of third classification according to application of the first apparatus classification rule and a result of fourth classification according to application of the second apparatus classification rule to the third apparatus classification rule.
In this manner, it is possible to more accurately classify the apparatus in relation to the characteristic of the resource use.
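The chaining of steps (J), (L), and (M) can be sketched as follows, assuming the feature amounts of steps (I) and (K) have already been calculated. The three toy rules, their thresholds, and the cluster and type labels are hypothetical and serve only to make the sketch executable; in the embodiment the rules are those generated in the first phase process.

```python
def classify_target(features, first_rule, second_rules, third_rule):
    """Apply the three classification rules in sequence to one target."""
    # (J) apply the correlation feature to the first rule
    cluster_ids = {"correlation": first_rule(features["correlation"])}
    # (L) apply each resource's time-slot feature to its second rule
    for resource, rule in second_rules.items():
        cluster_ids[resource] = rule(features["time_slots"][resource])
    # (M) apply both classification results to the third rule
    return third_rule(cluster_ids)


def toy_first_rule(correlation):
    # Hypothetical threshold on the CPU-disk correlation.
    return "corr-high" if correlation[("cpu", "disk")] >= 0.5 else "corr-low"


def toy_time_rule(slots):
    # Hypothetical rule: compare mean usage in hours 9-17 with the rest.
    daytime = sum(slots[9:18]) / 9
    other = (sum(slots[:9]) + sum(slots[18:])) / 15
    return "day-peak" if daytime > 2 * other else "flat"


def toy_third_rule(cluster_ids):
    # Hypothetical rule using both kinds of cluster ID as conditions.
    if cluster_ids["correlation"] == "corr-low" and cluster_ids["cpu"] == "day-peak":
        return "type-online"
    return "type-other"
```

Passing the rules as arguments keeps the sequence independent of how each rule is represented, so rules received from another apparatus can be plugged in unchanged.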
Furthermore, a generation method executed by a computer that stores a first apparatus classification rule, a second apparatus classification rule, and a third apparatus classification rule which are generated in the above-described process may include: (I) calculating a third feature amount that indicates a correlation between resource uses according to a combination of resources which are used by a classification target apparatus based on the second logs respectively related to the plurality of resources which are used by the classification target apparatus; (J) classifying the classification target apparatus by applying the third feature amount to the first apparatus classification rule; (K) calculating a fourth feature amount that indicates the resource usage in each time slot for each of the resources which are used by the classification target apparatus based on the second logs; (L) classifying the classification target apparatus by applying the fourth feature amount to the second apparatus classification rule; and (M) classifying the classification target apparatus by applying a result of third classification according to application of the first apparatus classification rule and a result of fourth classification according to application of the second apparatus classification rule to the third apparatus classification rule.
In this manner, it is easy to apply the rule for classifying the apparatus in relation to the characteristic of the resource use.
Meanwhile, it is possible to prepare a program which causes a computer to perform a process according to the method, and the program may be stored in, for example, a computer-readable storage medium or a storage apparatus such as a flexible disk, a CD-ROM, a magneto-optic disk, a semiconductor memory, or a hard disk. Also, an intermediate processing result is temporarily stored in a storage apparatus such as a general main memory.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2016-022700 | Feb 2016 | JP | national |