Various embodiments described herein relate to computer software, and in particular to systems and methods for generating simulation data for predicting system performance and capacity of computing systems.
Information technology (IT) systems include a large number of components, such as servers, storage devices, routers, gateways, and other equipment. When an IT system is designed, an architecture is specified to meet various functional requirements, such as capacity, throughput, availability, and redundancy. In order to determine if a proposed system architecture can meet the functional performance requirements, it is desirable to simulate operation of the system before it is built, as building and testing an IT system before deployment may be cost prohibitive, particularly if a production-like test environment is built. This process is sometimes referred to as performance modeling, which refers to creating a computer model that emulates the performance of a computer system.
Performance modeling may be used to test the performance of an IT system before it is built. In general, capacity management requires predicting future needs based on historical results. This approach requires having performance data for the system available in order to calibrate the model. The accuracy of the modeling results depends on the availability of reliable and plausible simulation data.
Performance modeling can also be used as part of capacity planning to plan for future growth of current systems. Today most data centers are under-utilized, and server over-provisioning is often used as an expensive means to ensure fulfillment of service level agreements (SLAs) in order to keep up with increasing business demands for faster delivery of IT services. Data center growth can cause significant strain on IT budgets and management overhead. IT organizations bear the capital expenditure and operating costs of this equipment and are looking for safe, predictable and cost-effective ways to consolidate and optimize their data center infrastructure. Many organizations have turned to virtualization to consolidate servers and reclaim precious data center space in the hope of realizing higher utilization rates and increased operational efficiency. Without proper tools and processes, IT organizations are experiencing “VM sprawl,” increasing software license costs and complexity.
As those skilled in the art will appreciate, performance modeling can be used to predict and analyze the effect of various factors on the modeled system. These factors include changes to the input load, or to the configuration of hardware and/or software. Indeed, performance modeling has many benefits, including performance debugging (identifying which, if any, system components are performing at unacceptable levels, and why they are underperforming), capacity planning (applying projected loads to the model to analyze what hardware or configurations would be needed to support the projected load), prospective analysis (the ability to test “what if” scenarios with respect to the system, its configuration, and its workload), and system “health” monitoring (determining whether the computer system is operating according to expected behaviors and levels).
While performance modeling provides tremendous benefits, good performance modeling is currently difficult to obtain. More particularly, it is very difficult to accurately and adequately create a performance model for a typical system in all its complexity. As such, generating performance models has largely been the purview of consultants and others with specialized expertise in this arena. Moreover, performance modeling is currently the product of analysis in a controlled laboratory environment. As such, even the best performance models only approximate what actually occurs in the “live,” deployed and operating system.
A method according to some embodiments includes providing a hierarchy of pattern definitions, wherein each pattern definition in the hierarchy of pattern definitions is associated with a parameter that is used to simulate operation of a computer system, and wherein each pattern definition in the hierarchy of pattern definitions comprises at least a value producer and a time interval, and traversing the hierarchy of pattern definitions for each parameter. Traversing the hierarchy of pattern definitions includes repeating, until a final pattern definition is selected, steps of: (a) retrieving a first pattern definition, (b) determining if a simulation time falls within the time interval associated with the first pattern definition, (c) in response to determining that the simulation time falls within the time interval associated with the first pattern definition, determining if the first pattern definition is overridden by a subsequent pattern definition in the hierarchy of pattern definitions, (d) in response to determining that the first pattern definition is overridden by a subsequent pattern definition, retrieving the subsequent pattern definition, and (e) in response to determining that the first pattern definition is not overridden by a subsequent pattern definition, selecting the first pattern definition as the final pattern definition. The method further includes generating an event associated with the parameter in accordance with the value producer of the selected pattern definition, and transmitting the event to a system testing platform.
The method may further include sequentially selecting a system element from a plurality of system elements in the computer system, and generating events related to the selected system element.
The method may further include generating a plurality of events associated with the parameter in accordance with the value producer, and transmitting the plurality of events to the system testing platform.
The hierarchy of pattern definitions for a given parameter may include a list of pattern definitions arranged in hierarchical order, and traversing the hierarchy of pattern definitions may comprise processing the list sequentially until a pattern definition is found that is not overridden by a subsequent pattern definition.
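The sequential traversal described above can be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: the `PatternDefinition` class, its field names, and `select_pattern` are hypothetical, and the sketch assumes that a later pattern definition overrides an earlier one whenever the simulation time falls within the later definition's time interval.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional


@dataclass
class PatternDefinition:
    """Hypothetical rule: a name, a time-interval test, and a value producer."""
    name: str
    applies_at: Callable[[datetime], bool]   # stands in for the time interval
    produce: Callable[[datetime], float]     # stands in for the value producer


def select_pattern(hierarchy: List[PatternDefinition],
                   sim_time: datetime) -> Optional[PatternDefinition]:
    """Process the list in order and return the last applicable definition.

    A definition that applies at sim_time is overridden by any subsequent
    definition that also applies, so the last applicable one is final.
    """
    selected = None
    for pattern in hierarchy:
        if pattern.applies_at(sim_time):
            selected = pattern
    return selected


# Example hierarchy: a baseline rule and an evening rule that overrides it.
hierarchy = [
    PatternDefinition("baseline", lambda t: True, lambda t: 10.0),
    PatternDefinition("evening", lambda t: 18 <= t.hour < 21, lambda t: 35.0),
]
```

In this sketch, a simulation time of 7 PM selects the "evening" definition, while 10 AM falls back to "baseline" because no later definition applies.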
The value producer may define a type of value produced and a range of value produced. In particular embodiments, the value producer may include one of a linear deterministic value producer and a nonlinear deterministic value producer. The value producer may include a random value producer.
The value producer may include a random value producer and a deterministic value producer, wherein the value is produced as a sum of the output of the random value producer and the deterministic value producer.
Related systems and computer program products are provided.
It is noted that aspects described herein with respect to one embodiment may be incorporated in different embodiments although not specifically described relative thereto. That is, all embodiments and/or features of any embodiments can be combined in any way and/or combination. Moreover, other systems, methods, and/or computer program products according to embodiments will be or become apparent to one with skill in the art upon review of the following drawings and detailed description. It is intended that all such additional systems, methods, and/or computer program products be included within this description, be within the scope of the present disclosure, and be protected by the accompanying claims.
Aspects described herein are illustrated by way of example and are not limited by the accompanying figures, with like references indicating like elements.
Some embodiments of the inventive concepts provide systems and methods that generate simulation data for testing IT system architectures in the design, planning or production phase. Testing the quality of predicted performance requires a flexible tool that can produce controlled data. Moreover, testing features such as business hours requires the ability to generate sophisticated data patterns. Accordingly, to generate simulated data, such as network traffic levels, processor utilization, etc., for an IT system, it is desirable to have a tool that generates realistic and controlled data.
Some embodiments described herein provide a flexible method for generating time series data based on hierarchical rule based configuration with pluggable value producers. In particular, various embodiments described herein provide a method and a tool for generating time series data that can be used for testing an IT system.
Various embodiments employ a hierarchical rule configuration with pluggable value producers. The value producers may have a configurable randomization level that provides a way to define sophisticated data patterns.
The embodiments described herein can make the process of performance modeling faster and/or more efficient by providing ways to model very complicated dependencies quickly and with minimal configuration.
Capacity manager testing requires months of metric data. Collecting real-time data can take a prohibitively long time, and data collected in real time is often unpredictable and uncontrolled. Moreover, real-time data may not be appropriate for stressing an IT system model in the ways that system planners would like to see the system stressed. For example, real-time data that reflects ordinary system loading may not stress the IT system in a way that adequately reveals the system's ability to deal with extraordinary system loading.
In general, when generating metric data for use in performance modeling, it is extremely useful to be able to generate metrics with an arbitrary level of daily and hourly behavior, as it is desirable to be able to test the system against a variety of time series data patterns.
A capacity management product, such as CA Capacity Manager by Computer Associates, Inc., Islandia, N.Y., predicts future needs based on historical results. Testing the quality of predictions requires a flexible tool that can produce controlled data. Moreover, the testing of features that depend on business hours requires the ability to generate sophisticated data patterns. For demonstration and/or planning purposes, it is desirable to have a tool that can feed the capacity management product with realistic and controlled data that can help highlight various product features.
The events generated by the event generator 180 or derived from the system under test 200 may include, for example, network trace data 210, web log data 220 and/or resource utilization data 230, such as CPU usage, memory usage, throughput, communication link bandwidth usage, etc.
The event generator 180 includes a database 120 that stores at least one discrete event model 130 that is used to generate simulation data according to various embodiments described herein. The event generator 180 further includes a discrete event simulator 110 that generates the simulation data by processing and applying the discrete event model 130.
The performance modeling system 100 may include at least a data collection module 140 that collects event data from the event generator 180 and/or a system under test 200, and a performance modeling module 150 that applies the event data to a system model 145 stored or accessible by the performance modeling system 100. The performance modeling system 100 may provide network element information to the event generator 180. In response, the event generator 180 may generate simulation data in the form of simulated events for the identified network elements and transmit the simulated events back to the performance modeling system 100.
There may be a number of parameters associated with each network element. For each network element, the event generator may generate events associated with each parameter of the network element. Thus, in order to generate simulation data for an IT system, the event generator 180 may generate a series of simulated events for each parameter of each network element in the system.
The parameters for various types of network elements may differ based on the type of network element in question. For example, a processor may have as its parameters CPU utilization, cache utilization, thread usage, etc.
According to some embodiments, a metric configuration is provided for each parameter which includes a nested set of pattern definitions (rules). For each parameter, the configuration hierarchy is processed top to bottom, with the next rule overriding the previous rule for the time period where the rules overlap.
Each pattern definition is a rule that supports a value producer and defines logical expressions describing the time period during which the rule should be applied. While multiple value producers can be implemented within the tool (e.g., Linear, Random, Sine, ArcTan) the tool also provides an open architecture to which custom value producers can be added.
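One way to picture an open architecture for value producers is a name-to-factory registry, sketched below. All names here (`PRODUCERS`, `register_producer`, the factory parameters) are hypothetical, and the sketch assumes each producer maps a progress fraction between 0 and 1 within the rule's time period to a metric value; the actual producer interface is not specified in this description.

```python
import math
import random
from typing import Callable, Dict

# Assumed interface: a value producer maps progress in [0, 1] to a value.
ValueProducer = Callable[[float], float]

# Hypothetical registry of producer factories, keyed by producer name.
PRODUCERS: Dict[str, Callable[..., ValueProducer]] = {}


def register_producer(name: str):
    """Decorator that plugs a producer factory into the registry."""
    def decorator(factory):
        PRODUCERS[name] = factory
        return factory
    return decorator


@register_producer("Linear")
def linear(start: float, end: float) -> ValueProducer:
    return lambda x: start + (end - start) * x


@register_producer("Random")
def uniform_random(low: float, high: float, seed=None) -> ValueProducer:
    rng = random.Random(seed)
    return lambda x: rng.uniform(low, high)


@register_producer("Sine")
def sine(offset: float, amplitude: float, periods: float = 1.0) -> ValueProducer:
    return lambda x: offset + amplitude * math.sin(2 * math.pi * periods * x)


@register_producer("ArcTan")
def arctan(offset: float, scale: float, steepness: float = 1.0) -> ValueProducer:
    return lambda x: offset + scale * math.atan(steepness * x)


# A custom producer is added the same way, without touching the built-ins.
@register_producer("Step")
def step(low: float, high: float, threshold: float = 0.5) -> ValueProducer:
    return lambda x: high if x >= threshold else low
```

With such a registry, a configuration file would only need to name the producer and supply its arguments, for example `PRODUCERS["Linear"](10, 15)`.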
In some embodiments, the value producer may include one of a linear deterministic value producer and a nonlinear deterministic value producer. The value producer may include a random value producer.
In some embodiments, the value producer may include both a random value producer and a deterministic value producer, wherein the value is produced as a sum of the output of the random value producer and the deterministic value producer.
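A combined producer of this kind might be sketched as below. The function name `with_randomness` is hypothetical, and the sketch assumes that the randomization level is a fraction (0.10 for 10%) and that the random term is drawn uniformly from plus or minus that fraction of the deterministic value; other interpretations of the randomization level are equally possible.

```python
import random
from typing import Callable


def with_randomness(deterministic: Callable[[float], float],
                    randomness: float,
                    rng: random.Random) -> Callable[[float], float]:
    """Return a producer whose value is deterministic(x) plus a random term.

    Assumption: the random term is uniform in
    [-randomness * |base|, +randomness * |base|].
    """
    def produce(x: float) -> float:
        base = deterministic(x)                                 # deterministic output
        noise = rng.uniform(-randomness, randomness) * abs(base)  # random output
        return base + noise                                     # value is the sum
    return produce


# Linear growth from 10 to 15 with a 10% randomization level.
noisy_linear = with_randomness(lambda x: 10 + 5 * x, 0.10, random.Random(0))
```

Seeding the random number generator, as done here, keeps the generated data controlled and repeatable between runs.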
A pattern definition may be stated, for example, in XML (extensible markup language) for ease of application. However, the inventive concepts are not limited thereto, and other formats, such as key/value files, can be used to arrange the pattern definitions.
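As an illustration only, the following sketch parses a small XML metric configuration into plain dictionaries. The element and attribute names (`metric`, `pattern`, `valueProducer`, `hours`, and so on) are invented for this example and are not the fields of the disclosed configuration format.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout; every element and attribute name is an assumption.
EXAMPLE_XML = """
<metric name="Total CPU Utilization">
  <pattern>
    <valueProducer type="Linear" start="10" end="15" randomness="0.10"/>
  </pattern>
  <pattern hours="18-21">
    <valueProducer type="Random" min="30" max="40"/>
  </pattern>
</metric>
"""


def load_patterns(xml_text: str):
    """Return the metric name and its pattern definitions in document order."""
    root = ET.fromstring(xml_text)
    patterns = []
    for node in root.findall("pattern"):
        producer = node.find("valueProducer")
        patterns.append({
            "hours": node.get("hours"),  # None means the rule always applies
            "producer_type": producer.get("type"),
            "producer_args": {key: float(value)
                              for key, value in producer.attrib.items()
                              if key != "type"},
        })
    return root.get("name"), patterns
```

Document order is preserved by `findall`, which matters because later pattern definitions override earlier ones.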
Each pattern definition may include, for example, the fields shown in Table 1, below:
The value producer field may have a number of sub-fields, such as those shown in Table 2, below.
Table 2 shows an example configuration file according to some embodiments that defines a metric, referred to as “Total CPU Utilization,” that specifies a CPU utilization pattern with three rules. The first rule specifies that CPU usage grows linearly from 10 to 15 with a 10% randomness factor. The second rule specifies that every day between 6 PM and 9 PM the CPU usage value is random within the range of 30 to 40. Thus, the second rule overrides the first rule every day between 6 PM and 9 PM. The third rule specifies that on Thursdays, the CPU usage equals 60 between 12 PM and 2 PM. The third rule overrides both the first and second rules on Thursdays between 12 PM and 2 PM.
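The three rules just described can be mimicked in a few lines, as a rough sketch rather than the tool's actual behavior. The simulation window (`START`, `END`), the half-open treatment of the hour ranges, and the reading of the 10% randomness factor as a uniform deviation of plus or minus 10% are all assumptions made for the example.

```python
import random
from datetime import datetime

rng = random.Random(42)        # seeded so the generated data is repeatable
START = datetime(2024, 1, 1)   # assumed start of the simulated period
END = datetime(2024, 1, 31)    # assumed end of the simulated period


def rule1(t: datetime) -> float:
    """Linear growth from 10 to 15 over the period, with 10% randomness."""
    fraction = (t - START) / (END - START)
    base = 10 + 5 * fraction
    return base * (1 + rng.uniform(-0.10, 0.10))


def rule2(t: datetime) -> float:
    """Random value in the range 30 to 40."""
    return rng.uniform(30, 40)


def rule3(t: datetime) -> float:
    """Constant value of 60."""
    return 60.0


# (time test, value producer) pairs in hierarchy order; later rules override.
RULES = [
    (lambda t: True, rule1),
    (lambda t: 18 <= t.hour < 21, rule2),                       # daily, 6 PM to 9 PM
    (lambda t: t.weekday() == 3 and 12 <= t.hour < 14, rule3),  # Thursday, 12 PM to 2 PM
]


def total_cpu_utilization(t: datetime) -> float:
    """Evaluate the last rule whose time test matches t."""
    producer = None
    for applies, candidate in RULES:
        if applies(t):
            producer = candidate
    return producer(t)
```

For instance, January 4, 2024 is a Thursday, so `total_cpu_utilization(datetime(2024, 1, 4, 13, 0))` evaluates the third rule, while a call for 7 PM on any day evaluates the second.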
Configuration files containing rule definitions may be stored as models associated with particular network elements or types of network elements in the database 120 (
Referring to
The performance modeling system 100 may then transmit a name/ID/type of network element to the event generator 180. Alternatively or additionally, the performance modeling system 100 may transmit a model or a model name to the event generator 180 for use in generating the events (block 515).
For each network element, the performance modeling system 100 then requests the event generator 180 to generate a set of events (block 520) as needed for simulation. For example, the performance modeling system 100 may send a model or model name associated with a CPU to the event generator 180 and request the event generator 180 to generate events in accordance with the model for a first period of time. The event generator 180 generates the requested events and provides them to the performance modeling system 100 in accordance with the methods described herein.
The performance modeling system 100 then checks at block 530 to see if more events are needed, such as events for a second period of time, and if more events are needed, operations return to block 520 where the performance modeling system 100 obtains a further set of events associated with the selected network element from the event generator 180.
If there are no more events needed for the selected network element, operations proceed to block 540, where the performance modeling system 100 stores the event data for the selected network element. The performance modeling system 100 then checks at block 550 to see if there are more network elements for which events need to be generated. If so, operations return to block 510, and the performance modeling system 100 selects the next network element from the system model 145.
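The collection loop of blocks 510 through 550 might look roughly like the following sketch. The `Event` tuple layout, the `collect_events` function, and the `fake_generator` stand-in for the event generator 180 are hypothetical and exist only to make the example runnable.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical event layout: (element, parameter, time step, value).
Event = Tuple[str, str, int, float]
Period = Tuple[int, int]


def collect_events(elements: Iterable[str],
                   periods: List[Period],
                   request_events: Callable[[str, Period], List[Event]]
                   ) -> Dict[str, List[Event]]:
    """Gather events per network element, one time period at a time."""
    store: Dict[str, List[Event]] = {}
    for element in elements:              # select the next element (blocks 510, 550)
        events: List[Event] = []
        for period in periods:            # request more events while needed (blocks 520, 530)
            events.extend(request_events(element, period))
        store[element] = events           # store the element's event data (block 540)
    return store


def fake_generator(element: str, period: Period) -> List[Event]:
    """Stand-in for the event generator: one constant CPU event per time step."""
    return [(element, "cpu", step, 1.0) for step in range(period[0], period[1])]


collected = collect_events(["server-a", "server-b"], [(0, 3), (3, 5)], fake_generator)
```

Passing the generator in as a callable keeps the loop independent of whether the events come from a separate event generator or from a module inside the performance modeling system.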
If event data has been obtained for all network elements, operations proceed to block 560, where the performance modeling system 100 models performance of the IT system using the generated events. In some embodiments, the performance modeling system 100 models performance of the IT system using both the generated events and real events derived from a system under test 200 (
Operations of an event generator 180 according to some embodiments are illustrated in
For the selected parameter, the event generator 180 selects a next pattern definition from the hierarchy of pattern definitions associated with the parameter (block 420). Based on the simulation time, the event generator 180 then checks the hierarchy of pattern definitions to determine if the selected pattern definition has been overridden (block 430). If so, the event generator 180 repeats the selection of a next pattern definition from the hierarchy of pattern definitions until a pattern definition that has not been overridden is found.
Once a pattern definition that has not been overridden has been selected, the event generator 180 generates an event according to the selected pattern definition (block 440). The event generator 180 stores the event at block 450, and then checks to see if there are any further parameters that need to be simulated for the selected network element in block 460. If so, operations return to block 410 and the next parameter is selected. If not, then at block 470 the generated events are transmitted to the performance modeling system 100.
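The per-parameter loop of blocks 410 through 470 can be approximated as below. This is a sketch under assumptions: the function name `generate_events`, the dictionary-based event format, and the representation of each pattern definition as a (time test, value producer) pair are all hypothetical.

```python
from datetime import datetime
from typing import Callable, Dict, List, Tuple

# Assumed representation of a pattern definition: (time test, value producer).
Rule = Tuple[Callable[[datetime], bool], Callable[[datetime], float]]


def generate_events(element: str,
                    metric_configs: Dict[str, List[Rule]],
                    sim_time: datetime) -> List[dict]:
    """Generate one event per parameter of a network element."""
    events = []
    for parameter, hierarchy in metric_configs.items():  # next parameter (blocks 410, 460)
        selected = None
        for applies, producer in hierarchy:              # next definition (blocks 420, 430)
            if applies(sim_time):
                selected = producer                      # later definitions override earlier ones
        if selected is None:
            continue                                     # no definition covers this time
        events.append({                                  # generate and store (blocks 440, 450)
            "element": element,
            "parameter": parameter,
            "time": sim_time,
            "value": selected(sim_time),
        })
    return events                                        # handed off for transmission (block 470)


configs = {
    "cpu_utilization": [
        (lambda t: True, lambda t: 12.0),
        (lambda t: 18 <= t.hour < 21, lambda t: 35.0),
    ],
    "thread_usage": [(lambda t: True, lambda t: 4.0)],
}
evening_events = generate_events("cpu-0", configs, datetime(2024, 1, 1, 19, 0))
```

Here the evening call yields a CPU utilization event of 35.0, because the second definition overrides the first at 7 PM, and a thread usage event of 4.0.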
It will be appreciated that many different implementations are possible within the scope of the inventive concepts. For example, although illustrated as separate entities, the event generator 180 may be implemented within the performance modeling system 100, such as in the form of a functional module within the performance modeling system 100. Moreover, many of the elements and functions illustrated as belonging to the event generator 180 and the performance modeling system 100 may be implemented in other ways. For example, the discrete event models 130 could be stored within the performance modeling system 100, and/or the system model 145 could be provided to the event generator 180. In some embodiments, the event generator 180 may store all events associated with a particular network element and transmit them together to the performance modeling system 100, while in other embodiments the event generator may transmit the events to the performance modeling system 100 as they are generated.
The storage system 910 may include, for example, a hard disk drive or a solid state drive, and may include a data storage 952 for storing generated events and a model storage 954 for storing the event models.
The storage system 1010 may include, for example, a hard disk drive or a solid state drive, and may include a data storage 1052 for storing events received from the event generator 180.
In the above description of various embodiments, various aspects may be illustrated and described herein in any of a number of patentable classes or contexts including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, various embodiments described herein may be implemented entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.) or by combining software and hardware implementation that may all generally be referred to herein as a “circuit,” “module,” “component,” or “system.” Furthermore, various embodiments described herein may take the form of a computer program product comprising one or more computer readable media having computer readable program code embodied thereon.
Any combination of one or more computer readable media may be used. The computer readable media may be a computer readable signal medium or a non-transitory computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an appropriate optical fiber with a repeater, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible non-transitory medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python or the like, conventional procedural programming languages, such as the “C” programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a Software as a Service (SaaS).
Various embodiments were described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), devices and computer program products according to various embodiments described herein. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable instruction execution apparatus, create a mechanism for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a non-transitory computer readable medium that when executed can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions when stored in the computer readable medium produce an article of manufacture including instructions which when executed, cause a computer to implement the function/act specified in the flowchart and/or block diagram block or blocks. The computer program instructions may also be loaded onto a computer, other programmable instruction execution apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatuses or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various aspects of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terminology used herein is for the purpose of describing particular aspects only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items and may be designated as “/”. Like reference numbers signify like elements throughout the description of the figures.
The description herein has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. The aspects of the disclosure herein were chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure with various modifications as are suited to the particular use contemplated.