The subject matter of this disclosure is generally related to data storage systems.
Electronic data storage is a critical infrastructure for organizations that rely on software for business processes. A typical datacenter includes clusters of server computers and data storage nodes that are interconnected via network switches. The data storage nodes may include, or be part of, storage arrays, storage area networks (SANs), and network-attached storage (NAS), for example, and without limitation. The servers run instances of host applications that support organizational processes such as email, accounting, inventory control, e-business, and engineering. Host application data is maintained by the storage nodes. Input-output (IO) commands are sent by the servers to the storage nodes to access storage objects on which the host application data is logically stored. Storage node performance can be measured in terms of IO access latency, which is the elapsed time between receipt of an IO command from a server and transmission of a corresponding response (data or ACK) to the server.
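As a minimal illustrative sketch, and without limitation, IO access latency could be computed from timestamps captured when an IO command is received and when the corresponding response is transmitted; the timestamping shown here is hypothetical and not specific to any particular storage node implementation.

```python
import time

def io_access_latency(receipt_ts: float, response_ts: float) -> float:
    """IO access latency: elapsed time between receipt of an IO command
    and transmission of the corresponding response (data or ACK)."""
    return response_ts - receipt_ts

# Hypothetical usage: timestamp the command on arrival and the response on send.
t_receipt = time.monotonic()
# ... storage node services the IO ...
t_response = time.monotonic()
latency_seconds = io_access_latency(t_receipt, t_response)
```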
A method in accordance with some implementations comprises: monitoring per-thread utilization of central processing unit (CPU) cycles in a storage system that comprises at least one storage director with CPU complexes configured to run threads comprising threads of emulations that perform storage-related tasks; identifying clusters of the threads that are most highly correlated with utilizations of code paths comprising sets of multiple thread types that function together to perform a specific task; and using the identified clusters of threads to identify at least one of the threads as a cause of a performance problem of the storage system.
An apparatus in accordance with some implementations comprises: a storage system comprising at least one storage director with central processing unit (CPU) complexes configured to run threads comprising threads of emulations that perform storage-related tasks, and a process configured to: monitor per-thread utilization of CPU cycles; identify clusters of the threads that are most highly correlated with utilizations of code paths comprising sets of multiple thread types that function together to perform a specific task; and use the identified clusters of threads to identify at least one of the threads as a cause of a performance problem of the storage system.
In accordance with some implementations, a non-transitory computer-readable storage medium stores instructions that when executed by a computer perform a method comprising: monitoring per-thread utilization of central processing unit (CPU) cycles in a storage system that comprises at least one storage director with CPU complexes configured to run threads comprising threads of emulations that perform storage-related tasks; identifying clusters of the threads that are most highly correlated with utilizations of code paths comprising sets of multiple thread types that function together to perform a specific task; and using the identified clusters of threads to identify at least one of the threads as a cause of a performance problem of the storage system.
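By way of non-limiting illustration, the clustering and attribution steps described above might be sketched as follows. The sketch assumes that per-interval samples of per-thread CPU utilization and per-code-path invocation counts are already available as equal-length arrays; the function names, the use of Pearson correlation, and the ranking heuristic are assumptions made for illustration, not a definitive implementation of the claimed method.

```python
import numpy as np

def cluster_threads_by_code_path(thread_util: dict, path_counts: dict) -> dict:
    """Assign each thread to the code path whose invocation-count time series
    is most highly correlated with the thread's CPU utilization samples."""
    clusters = {path: [] for path in path_counts}
    for thread, util in thread_util.items():
        best_path, best_corr = None, -np.inf
        for path, counts in path_counts.items():
            # Pearson correlation between per-interval thread CPU utilization
            # and per-interval code path invocation counts.
            corr = float(np.nan_to_num(np.corrcoef(util, counts)[0, 1], nan=-1.0))
            if corr > best_corr:
                best_path, best_corr = path, corr
        clusters[best_path].append((thread, best_corr))
    return clusters

def suspect_threads(clusters: dict, top_n: int = 3) -> dict:
    """Within each cluster, rank threads by correlation so the most strongly
    implicated threads can be examined as candidate causes of a performance
    problem affecting that code path."""
    return {path: sorted(members, key=lambda m: m[1], reverse=True)[:top_n]
            for path, members in clusters.items()}
```

In this sketch, each thread is grouped with the code path with which its CPU utilization is most highly correlated, and the most strongly correlated threads in each cluster are surfaced as candidates for further analysis.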
This summary is not intended to limit the scope of the claims or the disclosure. Other aspects, features, and implementations will become apparent in view of the detailed description and figures. Moreover, all the examples, aspects, implementations, and features can be combined in any technically possible way.
The terminology used in this disclosure is intended to be interpreted broadly within the limits of subject matter eligibility. The terms “disk,” “drive,” and “disk drive” are used interchangeably to refer to non-volatile storage media and are not intended to refer to any specific type of non-volatile storage media. The terms “logical” and “virtual” are used to refer to features that are abstractions of other features, for example, and without limitation, abstractions of tangible features. The term “physical” is used to refer to tangible features that possibly include, but are not limited to, electronic hardware. For example, multiple virtual computers could operate simultaneously on one physical computer. The term “logic” is used to refer to special purpose physical circuit elements, firmware, software, computer instructions that are stored on a non-transitory computer-readable medium and implemented by multi-purpose tangible processors, and any combinations thereof. Aspects of the inventive concepts are described as being implemented in a data storage system that includes host servers and storage arrays. Such implementations should not be viewed as limiting. Those of ordinary skill in the art will recognize that there are a wide variety of implementations of inventive concepts in view of the teachings of the present disclosure.
Some aspects, features, and implementations described herein may include machines such as computers, electronic components, optical components, and processes such as computer-implemented procedures and steps. It will be apparent to those of ordinary skill in the art that the computer-implemented procedures and steps may be stored as computer-executable instructions on a non-transitory computer-readable medium. Furthermore, it will be understood by those of ordinary skill in the art that the computer-executable instructions may be executed on a variety of tangible processor devices, i.e., physical hardware. For practical reasons, not every step, device, and component that may be part of a computer or data storage system is described herein. Those of ordinary skill in the art will recognize such steps, devices, and components in view of the teachings of the present disclosure and the knowledge generally available to those of ordinary skill in the art. The corresponding machines and processes are therefore enabled and within the scope of the disclosure.
The CPU complexes simultaneously run thousands of threads. Among the emulation threads, consumption of CPU cycles is a function of a wide variety of characteristics of the IO workloads. For example, CPU utilization by the various emulation threads varies based on IO type, IO size, and IOs per second (IOPS) received, among other characteristics. Moreover, CPU utilization is not necessarily equal across all instances of a given thread type, and thread CPU utilization varies over time. Still further, CPU utilization by different types of threads may be related or interdependent. Consequently, it is difficult to identify specific threads as likely sources of storage array performance problems.
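As a purely illustrative sketch, and assuming the CPU complexes run a Linux-like operating system that exposes per-thread CPU time through the /proc filesystem (the actual director instrumentation may differ), cumulative per-thread CPU time could be sampled as shown below; differencing successive samples over a known interval yields per-interval, per-thread utilization.

```python
import os

CLK_TCK = os.sysconf("SC_CLK_TCK")  # clock ticks per second

def sample_thread_cpu_seconds(pid: int) -> dict:
    """Return cumulative CPU seconds (user + system) per thread of a process.
    Assumes a Linux-style /proc; fields 14 and 15 of
    /proc/<pid>/task/<tid>/stat are utime and stime in clock ticks."""
    samples = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        with open(f"/proc/{pid}/task/{tid}/stat") as f:
            # Strip the "pid (comm)" prefix, then split the remaining fields.
            fields = f.read().rsplit(")", 1)[1].split()
        utime, stime = int(fields[11]), int(fields[12])  # fields 14 and 15 overall
        samples[int(tid)] = (utime + stime) / CLK_TCK
    return samples
```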
In order to use thread CPU utilization patterns to analyze performance problems, the threads running on the CPU complexes 200 are monitored as IO workloads 250 are serviced by the storage array. Specifically, individual threads are monitored to compute per-thread CPU utilization statistics 252 and counts 254 of utilizations of code paths. A code path is a procedure that uses multiple threads in a coordinated way to perform a specific task. For example, one code path includes three distinct worker threads: a front-end emulation thread receives an IO command, a data services emulation thread helps to service the IO by locating the associated metadata that maps the logical block addresses of the IO, and a back-end emulation thread then helps to service the IO by fetching the corresponding data from the managed drives. Other examples of code paths have already been described above. Each code path may be characterized by a set of thread types that work together in a predetermined order to perform a specific task. A given use of a code path does not necessarily involve the same worker thread instances each time the code path is used, but it does involve instances of the same types of worker threads. A separate counter is created for each code path. The counter for a code path is incremented whenever the associated code path is invoked, e.g., by enqueueing an IO to be inputted to the code path. The counters may be polled periodically and reset after storing the monitored counts 254.
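The code path counters described above could be implemented, for example and without limitation, as a simple in-memory counter table; the class name, method names, and code path identifiers below are illustrative only.

```python
import threading

class CodePathCounters:
    """One counter per code path, incremented whenever the code path is
    invoked (e.g., when an IO is enqueued to the code path), and polled
    periodically with a reset so each poll yields per-interval counts."""

    def __init__(self, code_paths):
        self._lock = threading.Lock()
        self._counts = {path: 0 for path in code_paths}

    def record_invocation(self, code_path: str) -> None:
        with self._lock:
            self._counts[code_path] += 1

    def poll_and_reset(self) -> dict:
        with self._lock:
            snapshot = dict(self._counts)
            for path in self._counts:
                self._counts[path] = 0
        return snapshot

# Illustrative usage with hypothetical code path identifiers.
counters = CodePathCounters(["read_hit", "read_miss", "write"])
counters.record_invocation("read_miss")
interval_counts = counters.poll_and_reset()  # e.g., stored as monitored counts 254
```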
Specific examples have been presented to provide context and convey inventive concepts. The specific examples are not to be considered as limiting. A wide variety of modifications may be made without departing from the scope of the inventive concepts described herein. Moreover, the features, aspects, and implementations described herein may be combined in any technically possible way. Accordingly, modifications and combinations are within the scope of the following claims.