The present invention relates to evaluating system performance.
Information services and data processing industries in general have rapidly expanded as a result of the need for computer systems to manage and store large amounts of data. As an example, financial service companies such as banks, mutual fund companies and the like now, more than ever before, require access to many hundreds of gigabytes or even terabytes of data and files stored in high capacity data storage systems. Other types of service companies have similar needs for data storage.
Data storage system developers have responded to the increased need for storage by integrating high capacity data storage systems, data communications devices (e.g., switches), and computer systems (e.g., host computers or servers) into so-called “storage networks” or “Storage Area Networks” (SANs).
In general, a storage area network is a collection of data storage systems that are networked together via a switching fabric to a number of host computer systems operating as servers. The host computers access data stored in the data storage systems (of a respective storage area network) on behalf of client computers that request data from the data storage systems. For example, according to conventional applications, upon receiving a storage access request, a respective host computer in the storage area network accesses a large repository of storage through the switching fabric of the storage area network on behalf of the requesting client. Thus, a client has access to the shared storage system through the host computer (e.g., server). In many applications, storage area networks support high-speed acquisitions of data so that the host servers are able to promptly retrieve data from, and store data to, the data storage system.
Today's computer systems can be complex. Effectively evaluating the performance of computer systems, including data storage systems, helps ensure acceptable performance of these complex systems. Towards this goal, many tools have been developed to monitor system resources, system performance, and application performance. For example, a tool may be used to determine the cause of a bottleneck or performance issue.
A method and system for use in evaluating system performance is disclosed. In at least one embodiment, the method and system comprises collecting system performance, management operations, and system events data for a computer system; correlating the management operations and the system events data with the performance data; and based on the correlation, providing a graphical user interface for enabling performance evaluations of the computer system by graphically displaying the management operations and the system events data overlaying the performance data.
Features and advantages of the present invention will become more apparent from the following detailed description of exemplary embodiments thereof taken in conjunction with the accompanying drawings in which:
Described below is a technique for use in evaluating system performance. In at least one embodiment, the technique helps correlate management operations and system events with system performance using a graphical user interface. The correlation may help a user improve system performance in various ways. For example, the correlation may enable a user to more easily pinpoint causes of system bottlenecks and errors, and may drive future user behavior that results in improved system performance.
Referring now to
Each of the host systems 14a-14n and the data storage system 12 included in the computer system 10 may be connected to the communication medium 18 by any one of a variety of connections as may be provided and supported in accordance with the type of communication medium 18. Similarly, the management system 16 may be connected to the communication medium 20 by any one of a variety of connections in accordance with the type of communication medium 20. The processors included in the host computer systems 14a-14n and management system 16 may be any one of a variety of proprietary or commercially available single-processor or multi-processor systems, such as an Intel-based processor, or other type of commercially available processor able to support traffic in accordance with each particular embodiment and application.
It should be noted that the particular examples of the hardware and software that may be included in the data storage system 12 are described herein in more detail, and may vary with each particular embodiment. The host computers 14a-14n, the management system 16, and the data storage system 12 may all be located at the same physical site or, alternatively, may be located in different physical locations. In connection with communication mediums 18 and 20, a variety of different communication protocols may be used such as, e.g., SCSI, FC, and iSCSI. Some or all of the connections by which the hosts, management system, and data storage system may be connected to their respective communication medium may pass through other communication devices, such as a Connectrix or other switching equipment, a phone line, a repeater, a multiplexer, or even a satellite. In one embodiment, the hosts may communicate with the data storage system over an iSCSI or fibre channel connection and the management system may communicate with the data storage systems over a separate network connection using TCP/IP. It should be noted that although
Each of the host computer systems may perform different types of data operations in accordance with different types of tasks. In the embodiment of
The management system 16 may be used in connection with management of the data storage system 12. The management system 16 may include hardware and/or software components. The management system 16 may include one or more computer processors connected to one or more I/O devices such as, for example, a display or other output device, and an input device such as, for example, a keyboard, mouse, and the like. A data storage system manager may, for example, view information about a current storage volume configuration on a display device of the management system 16.
An embodiment of the data storage system 12 may include one or more data storage systems. Each of the data storage systems may include one or more data storage devices, such as disks. One or more data storage systems may be manufactured by one or more different vendors. Each of the data storage systems that may be included in 12 may be inter-connected (not shown). Additionally, the data storage systems may also be connected to the host systems through any one or more communication connections that may vary with each particular embodiment and device in accordance with the different protocols used in a particular embodiment. The type of communication connection used may vary with certain system parameters and requirements, such as those related to bandwidth and throughput required in accordance with a rate of I/O requests as may be issued by the host computer systems, for example, to the data storage systems 12.
It should be noted that each of the data storage systems may operate stand-alone, or may also be included as part of a storage area network (SAN) that includes, for example, other components such as other data storage systems.
Data storage system 12 may include a plurality of disk devices or volumes. The particular data storage system and examples as described herein for purposes of illustration should not be construed as a limitation. Other types of commercially available data storage systems, as well as processors and hardware controlling access to these particular devices, may also be included in an embodiment.
Servers or host systems, such as 14a-14n, provide data and access control information through channels to the storage system 12, and the storage system 12 may also provide data to the host systems through the channels. The host systems do not address the disk drives of the storage system 12 directly; rather, access to data may be provided to one or more host systems from what the host systems view as a plurality of logical devices or logical volumes (LVs). The LVs may or may not correspond to the actual disk drives. For example, one or more LVs may reside on a single physical disk drive. Data in a storage system may be accessed by multiple hosts, allowing the hosts to share the data residing therein. An LV or LUN (logical unit number) may be used to refer to one of the foregoing logically defined devices or volumes.
In accordance with an embodiment of the current technique, management system 16 may provide a graphical user interface (GUI) that allows a user to visually monitor and analyze the performance of computer system 10. Generally, performance may be thought of as the amount of work accomplished by a system compared to the time and resources used. Some common terms used when measuring system performance include bandwidth, throughput, response time, availability, capacity, and recovery time.
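By way of illustration only, the following minimal sketch shows how three of the terms above (throughput, bandwidth, and response time) might be computed from a set of completed I/O requests. The `IoSample` record layout and the `summarize` function are hypothetical names introduced for this example and are not part of the described embodiments.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class IoSample:
    """One completed I/O request (hypothetical record layout)."""
    start_s: float    # time the request was issued, in seconds
    end_s: float      # time the request completed, in seconds
    bytes_moved: int  # payload size in bytes


def summarize(samples: list[IoSample]) -> dict[str, float]:
    """Return throughput (requests/s), bandwidth (bytes/s), and mean response time (s)."""
    if not samples:
        raise ValueError("at least one sample is required")
    # The observation window runs from the earliest issue to the latest completion.
    window = max(s.end_s for s in samples) - min(s.start_s for s in samples)
    if window <= 0:
        raise ValueError("samples must span a positive time window")
    return {
        "throughput_iops": len(samples) / window,
        "bandwidth_bps": sum(s.bytes_moved for s in samples) / window,
        "mean_response_s": sum(s.end_s - s.start_s for s in samples) / len(samples),
    }
```

In this sketch, work accomplished (requests and bytes) is divided by the elapsed time, which mirrors the general definition of performance given above; resource usage is not modeled.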
In at least one embodiment, a GUI may present information derived from system logs that may track, for example, historical system performance pertaining to the storage system 12, the hosts 14a-14n, and the storage network as a whole. For example, the logs may comprise metrics data related to CPUs, memory, and I/O resources associated with one or both of storage system 12 and hosts 14a-14n, and the storage devices of storage system 12. In some embodiments, the metrics data may be collected by hardware and software located on one or more of data storage system 12, management system 16, and hosts 14a-14n. For example, at least some of the data may be obtained using a performance analysis software tool installed on management system 16 or a host (e.g., host 14a). The tool may gather the data stored within storage system 12 that is needed to conduct performance evaluations.
In some embodiments, management operations data may also be tracked in the system logs. Example management operations include, without limitation, configuring and provisioning storage in a data storage system for use with a particular application; backing up, moving, reorganizing, protecting, analyzing, modifying, and repairing objects stored within a data storage system; and upgrading software associated with the data storage system. The logs may also include data relating to any applications that are associated with the various management operations as well as performance metrics data collected before, during, and after execution of a management operation that may be associated with any applications, management operations, and data storage systems. Further, the system logs may also include data relating to the level of expertise, role, and permissions of users invoking and carrying out management operations.
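As one possible illustration of performance metrics collected before, during, and after execution of a management operation, the sketch below splits a series of timestamped metric samples into three phases and averages each. The function name `phase_averages`, the `(timestamp, value)` sample layout, and the use of a simple arithmetic mean are assumptions made for this example only.

```python
from statistics import mean


def phase_averages(samples, op_start, op_end):
    """Average a metric before, during, and after an operation.

    samples  -- iterable of (timestamp, value) pairs (hypothetical layout)
    op_start -- timestamp at which the operation was initiated
    op_end   -- timestamp at which the operation completed
    Returns a dict with keys 'before', 'during', 'after'; a phase with
    no samples maps to None.
    """
    phases = {"before": [], "during": [], "after": []}
    for ts, value in samples:
        if ts < op_start:
            phases["before"].append(value)
        elif ts <= op_end:
            phases["during"].append(value)
        else:
            phases["after"].append(value)
    return {name: (mean(vals) if vals else None) for name, vals in phases.items()}
```

Comparing the 'before' and 'after' averages gives a rough indication of whether a metric changed around the time of an operation; it does not by itself establish that the operation caused the change.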
The system logs may also track system events that may or may not be user-invoked. For example, the logs may include data pertaining to system- or application-invoked operations, a user log-in or log-out, system or application alerts, and software or hardware failures.
Depending on the embodiment, the data contained within the system logs may be used, combined, analyzed, and displayed within a GUI for visual inspection by a user. In at least one embodiment, using the system logs, management operations and events may be correlated with system performance metrics over time to enable the visual display of the management operations and events overlaying the system performance metrics data. In this embodiment, users may be able to better determine the impact of management operations and events on system performance.
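The correlation over time described above might, in one simplified form, be sketched as follows: each management operation or event is matched to the most recent performance sample at or before its timestamp, so that a marker can be drawn on top of the performance curve at that point. The `overlay` function and the tuple layouts are hypothetical and serve only to illustrate the idea; an actual embodiment may correlate the data differently.

```python
from bisect import bisect_right


def overlay(perf, events):
    """Place events on a performance time series.

    perf   -- list of (timestamp, value) pairs sorted by timestamp
    events -- list of (timestamp, label) pairs in any order
    Returns a list of (timestamp, label, value) sorted by timestamp, where
    value is the latest performance sample at or before the event, or None
    if the event precedes every sample.
    """
    times = [ts for ts, _ in perf]
    placed = []
    for ts, label in sorted(events):
        # Index of the last sample whose timestamp is <= the event timestamp.
        i = bisect_right(times, ts) - 1
        value = perf[i][1] if i >= 0 else None
        placed.append((ts, label, value))
    return placed
```

A binary search is used here because performance logs are typically much longer than the list of operations and events; a linear scan would also work for small inputs.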
Referring now to
In the embodiment of
The management operations and events overlaying system performance metrics data may enable a user to more easily determine how particular operations or events impact the system. For example, with an inspection of GUI 200 of
In some embodiments, a user of GUI 200 may be presented with more information pertaining to a management operation or event by, for example, hovering over a point with a mouse pointer or selecting a point. For example, selecting point 205 may display pop-up window 215, which displays the name of the shared folder, the date the operation was initiated, and the user who initiated the operation. Selecting point 205 or 210 may display pop-up window 220, corresponding to the point in time at which the operation completed, as shown by point 210. Pop-up window 220 may display the same information as pop-up window 215 but may also display other information pertaining to the operation such as, for example, the time needed to complete the operation. Selecting point 205 or point 210 may also display more detailed information in area 225 such as, for example, the time the operation was initiated or completed, the user role of the user who initiated the operation, and a brief description of the operation.
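The information shown in the pop-up windows described above could, as one hedged example, be derived from a single operation record. In the sketch below, the `Operation` class, its field names, and the two helper functions are hypothetical; they simply show that the completion pop-up may carry the same fields as the initiation pop-up plus the time needed to complete the operation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Operation:
    """A logged management operation (hypothetical record layout)."""
    name: str            # e.g. "create shared folder"
    target: str          # e.g. the name of the shared folder
    user: str            # user who initiated the operation
    started: datetime    # when the operation was initiated
    completed: datetime  # when the operation completed

    @property
    def duration(self) -> timedelta:
        return self.completed - self.started


def start_popup(op: Operation) -> dict:
    """Fields for the pop-up at the point where the operation was initiated."""
    return {
        "target": op.target,
        "initiated": op.started.date().isoformat(),
        "user": op.user,
    }


def completion_popup(op: Operation) -> dict:
    """Same fields as the start pop-up, plus the time needed to complete."""
    info = start_popup(op)
    info["duration_s"] = op.duration.total_seconds()
    return info
```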
It should be noted that different embodiments may provide additional features in accordance with the current technique. One such feature is the ability to zoom in on a GUI such as that illustrated in
Using the icons in GUI 300 a user may notice that system performance declined over a period of several days following the creation of a virtual machine on March 11, as indicated by icon 310, and following a provisioning operation on April 1, as indicated by icon 320. Following the creation of a shared folder on March 21, as indicated by icon 315, a user may also notice that system performance slightly declined before improving considerably over a three or four day period.
The zoom-in feature may allow evaluation of system performance over a short time period. The zoom-out feature may display the management operations and events that have most affected system performance over a longer period. With each zoom, additional management operations and events may be added and displayed. Each added and displayed operation and event may have affected system performance less than the operations and events displayed at the previous level. The zoom feature may also display sub-steps that comprise a management operation or event. In this example, a user may be able to more specifically determine what step of a management operation or event caused a particular system response.
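One way the zoom behavior above could be approximated is to rank the operations and events in the visible time window by an impact score and reveal more of them at each zoom level. The sketch below assumes such a score is already available; how impact is measured is not specified here, and the `visible_events` function, the `per_level` parameter, and the tuple layout are hypothetical.

```python
def visible_events(events, window_start, window_end, zoom_level, per_level=2):
    """Select which events to draw at a given zoom level.

    events     -- list of (timestamp, label, impact) tuples; impact is an
                  assumed, precomputed score (higher means larger effect)
    zoom_level -- 1 for the most zoomed-out view; each further level
                  reveals per_level additional, lower-impact events
    Returns the selected events sorted by timestamp.
    """
    if zoom_level < 1:
        raise ValueError("zoom_level must be at least 1")
    in_window = [e for e in events if window_start <= e[0] <= window_end]
    # Highest-impact events first, so deeper zoom levels add lesser ones.
    ranked = sorted(in_window, key=lambda e: e[2], reverse=True)
    shown = ranked[: per_level * zoom_level]
    return sorted(shown, key=lambda e: e[0])
```

With this scheme, every event shown at one level remains visible at the next, and each newly added event has an impact no greater than those already displayed.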
Similar to GUI 200 of
Other features that may be provided in one or more embodiments include:
While the invention has been disclosed in connection with preferred embodiments shown and described in detail, modifications and improvements thereon will become readily apparent to those skilled in the art. For example, the technique described herein may be applied to any computer system. Accordingly, the spirit and scope of the present invention should be limited only by the following claims.