It is sometimes desirable to approximate a performance characteristic of a device when only limited information about the device is known. Such an approximation may be useful, for example, when planning to maintain or upgrade an Information Technology (IT) infrastructure. When planning IT infrastructure, it may be helpful to know performance characteristics of devices that might be added to the infrastructure, such as soon-to-be-available disk drives and CPUs for which measured performance statistics are not yet available. Performance statistics are often unavailable when a producer has announced a new device but has not yet made it available for testing or benchmarking. Sometimes, devices that have been available for testing and benchmarking have not been tested or benchmarked because of the very large number of devices and the effort needed to perform the testing or benchmarking. Of course, performance statistics are also not available for projected or hypothetical future devices. While a hypothetical future device may be assumed to have some basic characteristics or parameters (e.g., disk RPM, processor clock speed, etc.), performance measures (e.g., benchmark test data) for such a device will not exist, making it difficult to plan IT changes around such a device. In sum, there is a need to approximate performance characteristics of devices (whether actual or hypothetical) when the device cannot be physically tested or when information about the device is incomplete.
The following summary is included only to introduce some concepts discussed in the Detailed Description below. This summary is not comprehensive and is not intended to delineate the scope of the claimed subject matter, which is set forth by the claims presented at the end.
Embodiments relate to determining a value of a type of performance parameter of a target device configuration that has known values of various types of configuration attributes. Reference device configurations can be obtained that respectively have known values for types of configuration attributes corresponding to the types of configuration attributes of the target device and that respectively have known values of the type of performance parameter whose value is to be determined for the target device. The performance parameter values of the reference device configurations can be weighted based on the reference device configurations' respective distances from the target device configuration in a space defined by the types of configuration attributes, where the types of configuration attributes correspond to respective dimensions of the space. The weighted performance parameter values of the reference device configurations can be used to determine the performance parameter value of the target device.
Many of the attendant features will be more readily appreciated by referring to the following detailed description considered in connection with the accompanying drawings.
Like reference numerals are used to designate like parts in the accompanying Drawings.
The following description will discuss several embodiments for approximating performance parameters for any type of device. This will be followed by discussion of a specific embodiment directed to approximation of disk parameters and discussion of a specific embodiment for approximating CPU parameters.
Because some types of devices have different critical parameters or performance parameters that depend on different combinations of types of configuration parameters, there may be a step of dividing 153 the configuration space into one or more subspaces. The dividing 153 results in a different approximation computation being performed for different target performance parameters, with each approximation being based on a different subset of configuration parameters of reference devices (and possibly on different subsets of reference devices).
Whether the problem space is divided 153 or not, the process proceeds to filter 154 reference devices from a reference device library 156 to obtain a set of devices that will be the basis for obtaining an approximation. The reference device library 156 may store different types of device configurations (see
If the filtering 154 produces a sufficient base of reference devices, then the approximation process moves on to compute 160 the distance from each reference device (that passed through the filtering 154). As discussed previously, this can be performed by taking some measure (e.g., Euclidean, Manhattan, etc.) of the distances between the reference devices and the target device in the n-space defined by the n configuration parameters relevant to the current subspace being processed (i.e., configuration parameters for which the target device has known values). The reference devices can also be filtered based on their distance, and the reference devices can be weighted according to their distance. In other words, a distance function can be used both to qualify and to weight reference devices in resultant approximations. As an example, distances can be used to compute normalized weights based on the inverse of the distances. If, for example, the distances from the target device to two reference devices were 3 and 4, the normalized weights would be ⅓/(⅓+¼)=0.57 and ¼/(⅓+¼)=0.43. Given weighted distances, the reference devices can be selected in any number of ways. The reference devices can be ordered by their weighted distances and a maximum of the first N devices can be selected. The first N devices with a weighted distance under a threshold can be selected. Or, the first N devices within a statistical range can be selected.
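The distance and weighting step can be illustrated with a minimal Python sketch. The function names are hypothetical, and the selection helper orders devices by raw distance, which is only one of the selection strategies mentioned above.

```python
import math

def euclidean(a, b):
    """Euclidean (2-norm) distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Manhattan (1-norm) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))

def normalized_inverse_weights(distances):
    """Weights proportional to 1/d, normalized to sum to 1.

    Assumes every distance is strictly positive.
    """
    inverses = [1.0 / d for d in distances]
    total = sum(inverses)
    return [v / total for v in inverses]

def select_nearest(distances, max_count, threshold=None):
    """Indices of up to max_count devices, nearest first.

    If a threshold is given, only devices closer than it qualify.
    """
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    if threshold is not None:
        order = [i for i in order if distances[i] < threshold]
    return order[:max_count]

# Reproduces the example in the text: distances 3 and 4.
weights = normalized_inverse_weights([3.0, 4.0])
print([round(w, 2) for w in weights])  # [0.57, 0.43]
```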
It may be desirable for some types of devices to scale 162 the performance parameters of the reference devices. This can allow greater emphasis to be given where needed. For example, if the target performance parameter type is a SPEC benchmark, it may be that CPU performance generally scales directly with a CPU's clock speed. In such a case, a post-filtering scale factor of (new device clock speed)/(reference device clock speed) would be applied to scale 162 a reference CPU's SPEC value.
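As a minimal sketch of the clock-speed scaling just described, the following Python function applies the ratio (new device clock speed)/(reference device clock speed) to a reference benchmark value. The function name and the numeric values are hypothetical and chosen only for illustration.

```python
def scale_by_clock(reference_spec, reference_clock, new_clock):
    """Scale a reference benchmark value by the ratio of clock speeds.

    Assumes performance scales directly (linearly) with clock speed.
    """
    return reference_spec * (new_clock / reference_clock)

# Hypothetical values: a reference CPU at 2.0 GHz with a benchmark of 40.0,
# scaled to a new CPU at 3.0 GHz.
print(scale_by_clock(40.0, 2.0, 3.0))  # 60.0
```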
After scaling 162, optional curve fitting 164 may be applied to the filtered reference devices' performance parameter values to obtain a function that is evaluated with the new device's various known configuration parameters (or at least those that are part of the current subspace if the problem has been subspace-divided).
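The form of the fitted function is not specified above. As one possible illustration, the sketch below assumes a single configuration parameter and a straight-line least-squares fit; the function names and sample values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def evaluate(intercept, slope, x):
    """Evaluate the fitted line at the new device's parameter value."""
    return intercept + slope * x

# Hypothetical reference devices: parameter value vs. performance value.
intercept, slope = fit_line([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(evaluate(intercept, slope, 4.0))  # 8.0
```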
Finally, the performance parameter values, as possibly altered in accordance with steps discussed above, are used to obtain 166 an approximation of the new device's performance parameter(s). For example, the approximation may be a weighted average of the performance parameter values of the reference devices. Other approaches will be discussed later with reference to CPU and disk examples. If there are multiple problem subspaces, then the approximation process may be repeated. Otherwise, the process is finished 168.
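A weighted average of this kind can be sketched in a few lines of Python. The function name and the sample values are hypothetical; the weights need not be pre-normalized because the sum is divided by the total weight.

```python
def weighted_average(values, weights):
    """Weighted average of reference performance values."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

# Hypothetical performance values for two reference devices,
# weighted 0.57 and 0.43.
estimate = weighted_average([100.0, 130.0], [0.57, 0.43])
print(round(estimate, 1))
```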
Following is a discussion of an embodiment for disk drives.
If the rotational speed is considered to be a particularly crucial performance parameter, then the reference configurations used in the approximation should be filtered by this parameter. Filtering can be accomplished by selecting disk configurations which are not approximated (i.e., those which have measured values for performance parameters $c_0$ and $c_1$) and which satisfy $\text{RotationalSpeed}_{\text{new}} = \text{RotationalSpeed}_{\text{existing}}$. In other words, from among disks 1 through 9, configurations are selected for which the rotational speed value matches (or perhaps is within some range of) the rotational speed value of new disk 10. This filtering reduces the dimensionality of the configuration subspace from three dimensions to two dimensions. This 2-dimensional subspace 200 is shown in
To help relate the subspace 190 to filtering steps,
In general, suppose the set of filtered disk configurations contains n such configurations, where $c_{0,1}, \ldots, c_{0,n}$ and $c_{1,1}, \ldots, c_{1,n}$ represent the benchmark constants $c_0$ and $c_1$, respectively, for configurations 1 . . . n. Further suppose that a new disk configuration is to be added to the library, but that its benchmarks $c_{0,\text{new}}$ and $c_{1,\text{new}}$ are unknown.
For each filtered configuration i, denote its seek time and transfer rate by $\text{seek}_i$ and $\text{trans}_i$, respectively. Then, given the selected disk configurations, the configuration points (vectors) in the subspace may be scaled according to:
where α and β are configuration parameter biases and maximum values are computed over all filtered configurations. This rescales parameter values to unitless quantities and enables subsequent control of parameter value biasing in the subspace approximation.
It can be shown that the mapping of trans into its inverse leads to a more accurate approximation. Also, this inversion more consistently orders points in the subspace so that the configuration distance from the origin (norm) is (inversely) related to the configuration performance. More specifically, each coordinate becomes a measure of latency.
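The scaling expression itself is not reproduced above, so the following Python sketch is only one plausible reading of the surrounding description, not the source's formula. It assumes each seek time is divided by the maximum seek time, each transfer rate is inverted and divided by the maximum inverse transfer rate, and the results are multiplied by the biases α and β. The function name and sample values are hypothetical.

```python
def rescale(configs, alpha=1.0, beta=1.0):
    """Map (seek_time, transfer_rate) pairs to unitless, biased coordinates.

    Assumed form (not taken from the source):
      seek~  = alpha * seek / max(seek)
      trans~ = beta * (1 / trans) / max(1 / trans)
    Maxima are computed over all configurations passed in.
    """
    max_seek = max(seek for seek, _ in configs)
    max_inv_trans = max(1.0 / trans for _, trans in configs)
    return [
        (alpha * seek / max_seek, beta * (1.0 / trans) / max_inv_trans)
        for seek, trans in configs
    ]

# Hypothetical disks: (seek time, transfer rate).
points = rescale([(4.0, 100.0), (8.0, 50.0)])
print(points)  # [(0.5, 0.5), (1.0, 1.0)]
```

With the transfer rate inverted, both coordinates grow as the disk gets slower, which matches the observation that each coordinate becomes a measure of latency.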
Configuration distances to the new disk may then be calculated. For each filtered configuration $x_i$ in the mapped configuration subspace, the distance $d_i$ to the new configuration $x_{\text{new}}$ is calculated. For example, $d_i$ may be computed using the 2-norm $\|\cdot\|_2$. That is,
$d_i = \|x_i - x_{\text{new}}\|_2$
where $x = (\widetilde{\text{seek}}, \widetilde{\text{trans}})$ is a point (vector) in the mapped configuration subspace.
To help relate the subspace 200 to approximation steps,
Performance parameter values may then be weighted. For each filtered configuration xi in the mapped configuration subspace, performance weights are defined as
where $w_{0,i}$ are the weights used in the approximation for $c_{0,\text{new}}$, and $w_{1,i}$ are the weights used in the approximation for $c_{1,\text{new}}$. These weights amplify performance contributions of nearby neighbor configurations. Although it may be assumed that $d = \|x_i - x_{\text{new}}\|_2$, other distance metrics may be used. Furthermore, although it may be assumed that $\alpha_0 = \beta_0 = 1.0$ and $\alpha_1 = \beta_1 = 1.0$, other approaches may be used if the correlations between $c_{0,\text{new}}$ and $c_{1,\text{new}}$ and SeekTime and TransferRate are known. For example, $\alpha_0 = 1.0$, $\beta_0 = 0.1$ would favor SeekTime in the approximation for $c_{0,\text{new}}$, and $\alpha_1 = 0.1$, $\beta_1 = 1.0$ would favor TransferRate in the approximation for $c_{1,\text{new}}$. Note that this biasing creates two 2-dimensional configuration subspaces. Note also that if a filtered configuration exists which is identical to the new configuration in the critical subspace, then the amplification is infinite. In this case, performance values are also taken to be identical and the approximation process can terminate.
Finally, performance parameters (in this case, benchmarks) for the target device can be approximated. Each benchmark of the new configuration is calculated as a weighted average: $c_{0,\text{new}} = \frac{1}{w_0} \sum_{i=1}^{n} c_{0,i} \, w_{0,i}$ and $c_{1,\text{new}} = \frac{1}{w_1} \sum_{i=1}^{n} c_{1,i} \, w_{1,i}$, where $w_0 = \sum_{i=1}^{n} w_{0,i}$ and $w_1 = \sum_{i=1}^{n} w_{1,i}$. These calculations are performed for each combination of IoOperation and IoPattern.
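The biased weighting and weighted averaging can be sketched as follows. The weight definition is not reproduced above, so this sketch assumes weights equal to the inverse of the distance, which is consistent with the description that nearby configurations are amplified and that an identical configuration yields infinite amplification. All names and values are hypothetical. Points are given in unbiased mapped coordinates, and the biases are applied to the coordinate differences.

```python
import math

def biased_distance(p, q, alpha, beta):
    """2-norm distance with the seek axis scaled by alpha and the trans axis by beta."""
    return math.hypot(alpha * (p[0] - q[0]), beta * (p[1] - q[1]))

def approximate_benchmark(new_point, references, alpha=1.0, beta=1.0):
    """Inverse-distance weighted average of reference benchmark values.

    references: list of (point, benchmark_value) pairs.
    Assumes w_i = 1 / d_i (an assumption, not the source's definition).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, benchmark in references:
        d = biased_distance(point, new_point, alpha, beta)
        if d == 0.0:
            # Identical configuration: take its benchmark and stop.
            return benchmark
        w = 1.0 / d
        weighted_sum += w * benchmark
        weight_total += w
    return weighted_sum / weight_total

new_disk = (0.5, 0.5)
refs = [((0.5, 0.9), 10.0),   # same seek coordinate, different trans coordinate
        ((0.9, 0.5), 20.0)]   # different seek coordinate, same trans coordinate

unbiased = approximate_benchmark(new_disk, refs)                  # alpha = beta = 1.0
seek_favored = approximate_benchmark(new_disk, refs, 1.0, 0.1)    # favors SeekTime
print(round(unbiased, 3), round(seek_favored, 3))
```

With equal biases the two references are equidistant and the estimate falls midway between them. With β = 0.1 the reference that matches on the seek coordinate becomes much closer, so the estimate moves toward its benchmark value.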
Following is a discussion of an embodiment for CPUs.
Existing CPU configurations are filtered. Filtering is done by selecting from the library of CPU configurations 220 those CPU configurations which are not approximated and which satisfy $\text{Manufacturer}_{\text{new}} = \text{Manufacturer}_{\text{existing}}$ and $\text{Model}_{\text{new}} = \text{Model}_{\text{existing}}$. These parameters are preferably identical for existing configurations to be considered as candidates in the approximation. If no existing configurations satisfy this filter, then the approximation attempt is aborted. If additional parameters are also identical, then the approximation can be improved. In the limiting case that all parameters are identical, the configurations are considered identical, including their benchmarks. A filter relaxation scheme, discussed below, can be used to attempt to further restrict existing configurations considered in the approximation.
The ordering of bit-masks in the filtration precedence array 240 specifies the filter relaxation (back-off) sequence, or precedence. For example, (1, 1, 1) is a preferred filter compared to (1, 1, 0), and so on. Preferably, the relaxation sequence terminates the first time at least one matching configuration is found. All configurations corresponding to the first nonempty bit-mask are the configurations used in the approximation.
To illustrate the filter relaxation scheme, suppose there are no existing configurations which satisfy bit-masks (1, 1, 1) and (1, 1, 0), but there are three existing configurations that satisfy bit-mask (1, 0, 0). Then only these three existing configurations pass the filtration and are considered in the approximation. It should be noted that in one embodiment, the relaxation scheme determines weights of the various distances, which are discussed below.
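The relaxation sequence can be sketched in Python as a loop over bit-masks in precedence order. Which parameters the mask bits refer to is not stated above; the sketch assumes, purely for illustration, that they correspond to BusSpeed, L3CacheSize, and L2CacheSize. The function name and sample data are hypothetical.

```python
def relax_filter(target, candidates, params, precedence):
    """Return the first mask with at least one match, plus the matches.

    A 1 bit means the corresponding parameter must equal the target's value.
    Masks are tried in the order given (strictest first).
    """
    for mask in precedence:
        required = [p for p, bit in zip(params, mask) if bit]
        matches = [
            c for c in candidates
            if all(c.get(p) == target.get(p) for p in required)
        ]
        if matches:
            return mask, matches
    return None, []

# Assumed mapping of mask bits to parameters (not stated in the source).
params = ("BusSpeed", "L3CacheSize", "L2CacheSize")
precedence = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]

target = {"BusSpeed": 800, "L3CacheSize": 8, "L2CacheSize": 2}
candidates = [
    {"BusSpeed": 800, "L3CacheSize": 4, "L2CacheSize": 1},
    {"BusSpeed": 800, "L3CacheSize": 2, "L2CacheSize": 2},
    {"BusSpeed": 800, "L3CacheSize": 4, "L2CacheSize": 2},
    {"BusSpeed": 533, "L3CacheSize": 8, "L2CacheSize": 2},
]

mask, matches = relax_filter(target, candidates, params, precedence)
print(mask, len(matches))  # (1, 0, 0) 3
```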
The next stage in the SPEC approximation process is to define the configuration subspace. The de facto configuration subspace corresponding to the filtering above is defined by ProcessorCount, CoresPerProcessorCount, HyperthreadsPerCoreCount, and ProcessorSpeed. Note that if multi-core scalability is assumed to be sufficiently close to multi-processor scalability, then the configuration subspace can be reduced further to ProcessorSpeed, CoreCount, and HyperthreadsPerCoreCount, where CoreCount=ProcessorCount·CoresPerProcessorCount.
Configuration subspace rescaling and configuration distance calculation can be performed as with the disk configuration rescaling and distancing discussed in the example above. Note that BusSpeed, L3CacheSize, and L2CacheSize may have null values. In such cases, the distance metric is affected as shown in
The next step of SPEC approximation is to select the closest configuration. That is, the process finds the filtered configuration in the computed subspace with the minimum distance to the new configuration. This selected configuration is then used as the reference configuration in the rescaling of the SPEC benchmark for the new configuration.
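Selecting the closest configuration can be sketched as a minimum over distances. How null parameter values affect the distance metric is not reproduced above; the sketch below simply skips any dimension in which either value is missing, which is an assumption made for illustration. Names and coordinates are hypothetical.

```python
import math

def distance_skipping_nulls(a, b):
    """2-norm distance over the dimensions where both points have values.

    Skipping null dimensions is an assumption, not the source's rule.
    """
    total = 0.0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue
        total += (x - y) ** 2
    return math.sqrt(total)

def closest(target, references):
    """Name of the reference configuration nearest to the target.

    references: dict mapping configuration name to its point.
    """
    return min(
        references,
        key=lambda name: distance_skipping_nulls(references[name], target),
    )

# Hypothetical rescaled coordinates; None marks a null parameter value.
target = (1.0, 1.0, 1.0)
references = {
    "A": (1.0, 0.5, None),
    "B": (0.9, 1.0, 1.0),
}
print(closest(target, references))  # B
```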
The SPEC benchmarks (performance parameter values) of the reference configurations are then rescaled. To this end, the SPEC benchmark for the new configuration, $\text{spec}_{\text{new}}$, is related to the SPEC benchmark for the reference configuration, $\text{spec}_{\text{ref}}$, as follows: $\text{spec}_{\text{new}} = \text{spec}_{\text{ref}} \cdot \text{scale}_{\text{speed}} \cdot \text{scale}_{\text{processor}} \cdot \text{scale}_{\text{core}} \cdot \text{scale}_{\text{thread}}$, where $\text{scale}_{\text{speed}}$ is a function parameterizing speed rescaling, $\text{scale}_{\text{processor}}$ is a function parameterizing processor scalability, $\text{scale}_{\text{core}}$ is a function parameterizing multi-core scalability, and $\text{scale}_{\text{thread}}$ is a function parameterizing hyper-threading (tm) scalability. For purposes of this example, it may be assumed that scale
The processor scaling factor $\text{factor}_{\text{processor}}$ is computed by considering configurations which are identical with respect to all parameter values except ProcessorCount. Technically, this filter could be relaxed to admit system configurations which also differ in ProcessorSpeed, but this would introduce an additional rescaling step. Note that $n_{1 \to 2}$ is the number of samples used to compute the average ratio between 2-processor and 1-processor SPEC benchmarks, and $n_{2 \to 4}$ is the number of samples used to compute the average ratio between 4-processor and 2-processor SPEC benchmarks.
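The expression that combines these averages into the processor scaling factor is not reproduced above. The sketch below only illustrates the per-regime average ratios it refers to, computed from pairs of configurations that differ only in ProcessorCount. The function name and benchmark values are hypothetical.

```python
def average_ratio(pairs):
    """Average of higher-count / lower-count benchmark ratios.

    pairs: list of (lower_count_spec, higher_count_spec) for configurations
    that are identical except for ProcessorCount.
    """
    ratios = [higher / lower for lower, higher in pairs]
    return sum(ratios) / len(ratios)

# Hypothetical benchmark pairs.
one_to_two = [(10.0, 18.0), (20.0, 38.0)]   # sample count n_1->2 = 2
two_to_four = [(20.0, 34.0)]                # sample count n_2->4 = 1

ratio_1_2 = average_ratio(one_to_two)
ratio_2_4 = average_ratio(two_to_four)
print(round(ratio_1_2, 2), round(ratio_2_4, 2))
```

Keeping the two regimes separate, as here, also corresponds to the regime-specific scaling factors suggested later as a possible refinement.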
The processor core scaling factor $\text{factor}_{\text{core}}$ can be taken to be the same as $\text{factor}_{\text{processor}}$ if there is sparse availability of benchmark data and/or good scalability is observed for multi-core configurations.
The constant 1.22 is from the WebBench benchmark. Currently, $\text{factor}_{\text{thread}}$ reduces to 1.22 since hyper-threading (tm) technology only supports two threads per physical processor. In the future, if hyper-threading (tm) supports more threads, then ideally this benchmark should be updated and one or more new constants introduced, although this is not strictly required.
The WebBench benchmark is used since the inventory of SPEC benchmarks for hyper-threaded configurations is very limited. As more SPEC benchmarks for hyper-threaded configurations become available, reliance on WebBench will become unnecessary.
In the future, to improve scaling accuracy it would be worthwhile to consider introducing scaling factors which are a function of the scaling regime, e.g., $\text{factor}_{\text{processor}}$ is calculated for scaling from 1 to 2 processors only, is separately calculated for scaling from 2 to 4 processors only, and so on.
As discussed above, processes for approximating device parameters can be embodied in any variety of computation systems or media for enabling computation systems to perform such processes. Furthermore, some portions of such processes may actually be performed manually or via operator input to a computation system. For example, for an approximation, an operator might determine whether filtration should occur or what filtration criteria should be used. An operator might also select which types of parameters should be used for the subspace(s) of the selected or filtered pool of reference device configurations.
In conclusion, those skilled in the art will realize that storage devices used to store program instructions for implementing embodiments described above can be distributed across a network. For example, a remote computer may store an example of a process described as software. A local or terminal computer may access the remote computer and download a part or all of the software to run the program. Alternatively, the local computer may download pieces of the software as needed, or distributively process by executing some software instructions at the local terminal and some at the remote computer (or computer network). Those skilled in the art will also realize that, by utilizing conventional techniques, all or a portion of the software instructions may be carried out by a dedicated circuit, such as a DSP, programmable logic array, or the like.
All of the embodiments and features discussed above can be realized in the form of information stored in volatile or non-volatile computer or device readable media. This is deemed to include at least media such as CD-ROM, magnetic media, flash ROM, etc., storing machine executable instructions, or source code, or any other information that can be used to enable or configure computing devices to perform the various embodiments discussed above. This is also deemed to include at least volatile memory such as RAM storing information such as CPU instructions during execution of a program carrying out an embodiment.