The systems and methods generally relate to the field of data traffic load analysis in storage networks.
A SAN or storage area network, sometimes called a storage network environment, is a network dedicated to enabling multiple applications on multiple hosts to access, i.e., read and write, data stored in consolidated shared storage infrastructures. A SAN consists of SAN devices, for example, different types of switches, which are interlinked, and is based on a number of possible transfer protocols such as Fibre Channel and iSCSI. Each server is connected to the SAN with one or more network cards, for example, an HBA. Application data is stored as data objects on storage devices in storage units, e.g., LUNs. The storage device may be used to store data related to the applications on the host.
Storage network environments enable applications to be executed on hosts and to communicate with the storage environment components to gain read and write access to their data objects. Thus, a storage network environment may comprise various types of components such as interconnected network components and storage device components. The storage environment may also comprise storage environment components for storing data objects (such as storage devices of different types) and for enabling read and write access to stored data objects (such as switches and directors of different types and other types of storage networking devices).
Enterprises are deploying increasingly large-scale SANs to gain economies-of-scale business benefits, and are performing and planning massive business-critical migration processes to these new environments. These SANs are increasingly large and complex. A typical SAN environment in a Fortune 500 company may contain hundreds or thousands of servers and tens or hundreds of switches and storage devices of different types. Furthermore, these SAN environments are undergoing a large amount of change and growth.
The large size and rate of growth of SANs has led to added complexity. The number of components and links which may be associated with the data transfer from each given application and one or more of its data units may increase exponentially with the size of the SAN.
In turn, this complexity leads to difficulties in managing and configuring resources in the SAN. Compounded by the heterogeneity of the SAN devices, this complexity leads to high risk and inefficiency. The associated risk of failures in SANs is high, and the consequences of failures may be crippling. To this end, there is a significant need to tie the level of storage service provided to applications in storage environments to the amount of resources required to provide that level of service. In addition, there is a need to consider the quality levels of the individual components and the joint attributes along data flow paths, thereby allowing for better service levels as well as more efficient resource consumption.
The complexity in SANs could also lead to large imbalances in data traffic flow through the SAN. Traffic load imbalances may lead to failures in the SAN. To this end, there is a need to consider the traffic load through a given point in the storage network environment. Traffic load is the amount of data transferred through a point in the network, e.g., a given port of a given component in the network, over a specified interval of time. This interval of time may be fixed, such that the specified intervals occur periodically in time.
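As an illustration of this definition of traffic load, per-port loads over fixed periodic intervals can be sketched as follows. This is a minimal sketch in Python; the record format, function name, and interval bucketing are illustrative assumptions, not part of the disclosed system:

```python
from collections import defaultdict

def aggregate_port_loads(transfer_records, interval_seconds):
    """Sum the bytes transferred through each port, bucketed into
    fixed-length periodic intervals.  Each record is a tuple
    (port_id, timestamp_seconds, n_bytes)."""
    loads = defaultdict(int)  # (port_id, interval_index) -> total bytes
    for port_id, timestamp, n_bytes in transfer_records:
        bucket = int(timestamp // interval_seconds)
        loads[(port_id, bucket)] += n_bytes
    return dict(loads)
```

For example, with a 60-second interval, two transfers through the same port at t=5 and t=55 fall into the same bucket, while a transfer at t=65 starts a new one.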
In storage infrastructure environments, frequent mismatches occur between actual data traffic load and projected data traffic load. These imbalances or mismatches occur as a result of either congestion in the network environment or a hardware, software, or configuration problem within one or more of the network components. This may be because typical data traffic monitoring approaches are too resource-specific or point-in-time oriented. Hence, these approaches cannot consider, in a time-consistent and application end-to-end fashion, the actual status of data traffic flow through storage networks. These approaches also cannot account for the complete relationship among network applications, storage service levels, traffic load levels, and resource capacity.
Note that an access path or a logical access path will encompass a logical channel between a given application and a given data object. It may include several components, each of which must be configured appropriately for data traffic to flow through the component.
Because of the potentially very large number of components in the storage network environment, the very frequent changes to that environment, and the large amount of local state information at each component, and because of the complexity of correlating this information and analyzing the end-to-end access paths and attributes, any data traffic load monitoring approach needs to be very efficient to manage data traffic loads and resources in SANs effectively in realistic environments.
Currently, there are no adequate technological solutions to assist SAN administrators in managing data traffic load in storage environments. There are no solutions that consider the end-to-end service levels of applications, the end-to-end access paths for data flow, and the tier levels of resources and combinations of resources. As such, there exists a need for systems and methods capable of providing dynamic traffic load monitoring and/or management. In particular, there is a need for a solution to the problem of efficiently managing the data traffic load through components in storage area network environments and mapping these loads to access paths and storage service levels for applications and/or hosts.
The systems and methods described herein include, among other things, processes for periodically analyzing and storing the data traffic load associated with applications in a storage network environment. The systems and methods presented include collecting, analyzing, and presenting traffic loads in each part of a storage area network. These methods and systems account for various resource types, logical access paths, and relationships among different storage environment components. Data traffic flow may be managed in terms of resource planning and consumption. The aggregated information is stored, and may be used to estimate future data traffic loads or determine deviations between projected and actual traffic load status from which adjustments may be made to better predict and manage future data traffic load.
In one aspect, the invention provides methods for analyzing the data traffic loads associated with applications in a storage network environment. In one embodiment, this method includes storing a data traffic load policy, collecting current state configuration information and current data traffic flow information from sources in the network environment, correlating this information, and deriving access paths in the network environment. In a further embodiment, the method includes standardizing formats of the current state configuration information and the current data traffic flow information and reconciling conflicts in the formats, and storing the current state configuration information and the current data traffic flow information. In another embodiment, the method includes processing the collected information, computing hierarchical traffic load distributions over a pre-selected period of time, and providing notification messages about deviations between the processed information and the data traffic load policy.
In some embodiments, processing the collected information comprises comparing the current state configuration information to a previously-stored state configuration information, identifying logical access paths in the network environment, comparing the current data traffic flow information to a previously-stored data traffic flow information, validating the current data traffic flow information against the data traffic load policy, and identifying any data traffic load policy discrepancies or violations.
In some embodiments, the hierarchical traffic load distributions include a computation of absolute and relative traffic loads through each port of a first network environment component over the pre-selected period of time. In certain embodiments, the hierarchical traffic load distributions over a pre-selected period of time include a computation of absolute and relative traffic loads between a first network environment component and a second network environment component in the network environment over the pre-selected period of time. In other embodiments, the hierarchical traffic load distributions over a pre-selected period of time include a computation of absolute and relative traffic loads between network environment components on a logical access path in the network environment over the pre-selected period of time. In some embodiments, the hierarchical traffic load distributions over a pre-selected period of time include a computation of absolute and relative traffic loads between a group of associated network environment components in the network environment over the pre-selected period of time.
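A hierarchical load distribution of the kind described above can be sketched as follows. This is an illustrative sketch only: the input shape (per-port byte counts keyed by component and port) and the two levels of the hierarchy (per-port relative to the component, per-component relative to the environment total) are assumptions made for demonstration:

```python
from collections import defaultdict

def hierarchical_loads(port_loads):
    """Given absolute loads {(component, port): bytes} over one
    pre-selected period, compute absolute and relative loads at two
    levels of the hierarchy: per port (relative to its component's
    total) and per component (relative to the grand total)."""
    component_totals = defaultdict(int)
    for (component, _port), n_bytes in port_loads.items():
        component_totals[component] += n_bytes
    grand_total = sum(component_totals.values())
    per_port = {
        key: {"absolute": n_bytes,
              "relative": n_bytes / component_totals[key[0]]
              if component_totals[key[0]] else 0.0}
        for key, n_bytes in port_loads.items()}
    per_component = {
        component: {"absolute": total,
                    "relative": total / grand_total if grand_total else 0.0}
        for component, total in component_totals.items()}
    return per_port, per_component
```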
In an embodiment, the data traffic load policy includes a description of expected absolute and relative traffic loads through each port of a first network environment component over the pre-selected period of time. In other embodiments, the data traffic load policy includes a description of expected absolute and relative traffic loads between a first network environment component and a second network environment component in the network environment over the pre-selected period of time. In certain embodiments, the data traffic load policy includes a description of expected absolute and relative traffic loads between network environment components on a logical access path in the network environment over the pre-selected period of time. In other embodiments, the data traffic load policy includes a description of expected absolute and relative traffic loads between a group of associated network environment components in the network environment over the pre-selected period of time.
In some embodiments, collecting the current data traffic flow information includes computing for each access path the total amount of data traffic load associated with the access path. In other embodiments, collecting the current data traffic flow information includes collecting information about all the data traffic loads for each host application in the network environment.
In an embodiment, the deviations between the processed information and the data traffic load policy include the computation of a traffic load associated with a network component that exceeds a first pre-selected threshold. In another embodiment, the deviations between the processed information and the data traffic load policy include the computation of a variance between traffic loads based on the hierarchical traffic load distributions that exceeds a second pre-selected threshold. In certain embodiments, the deviations between the processed information and the data traffic load policy include a computation of a traffic load associated with a logical access path that exceeds a third pre-selected threshold, or the discovery of a zero traffic load associated with a logical access path.
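The three kinds of deviations described in the preceding paragraphs can be sketched together. This is a minimal sketch under stated assumptions: the threshold names, the use of population variance as the load-variance measure, and the violation-tuple format are all illustrative choices, not the patent's prescribed method:

```python
from statistics import pvariance

def find_deviations(component_loads, path_loads,
                    max_component_load, max_variance, max_path_load):
    """Flag policy deviations: a component load exceeding a first
    threshold, load variance across components exceeding a second
    threshold, and a logical-access-path load exceeding a third
    threshold or equal to zero."""
    violations = []
    for component, load in component_loads.items():
        if load > max_component_load:
            violations.append(("component_overload", component, load))
    if len(component_loads) > 1:
        variance = pvariance(component_loads.values())
        if variance > max_variance:
            violations.append(("load_imbalance", None, variance))
    for path, load in path_loads.items():
        if load > max_path_load:
            violations.append(("path_overload", path, load))
        elif load == 0:
            violations.append(("dead_path", path, load))
    return violations
```

Each returned tuple could then drive one of the notification messages mentioned above.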
In another aspect, the invention relates to a system for periodically analyzing the data traffic loads associated with applications in a storage network environment. In an embodiment, the system includes one or more hosts, one or more switches in communication with the hosts, one or more data storage devices in communication with the hosts, a user interface, a display, a memory for storing computer executable instructions; and a processor in communication with the user interface, the hosts, and the data storage devices. In some embodiments, the processor is configured to receive a user input, store a user-defined data traffic load policy, collect current state configuration information and current data traffic flow information from sources in the network environment, correlate the information and derive access paths in the network environment, standardize formats of the current state configuration information and the current data traffic flow information and reconcile conflicts, store the current state configuration information and the current data traffic flow information, process the collected information to compute hierarchical traffic load distributions over a pre-selected period of time, display different views and capabilities of the computed hierarchical traffic load distributions, and display notification messages about deviations between the processed information and the data traffic load policy on the display.
In an embodiment, the processor processes the collected information by comparing the current state configuration information to previously-stored state configuration information, identifying logical access paths in the network environment, comparing the current data traffic flow information to previously-stored data traffic flow information, validating the current data traffic flow information against the data traffic load policy, and identifying any data traffic load policy discrepancies or violations.
In another embodiment, the processor computes hierarchical traffic load distributions over a pre-selected period of time including absolute and relative traffic loads through each port of a first network environment component over the pre-selected period of time. In a further embodiment, the processor computes hierarchical traffic load distributions over a pre-selected period of time including absolute and relative traffic loads between a first network environment component and a second network environment component in the network environment over the pre-selected period of time. In another embodiment, the processor computes hierarchical traffic load distributions over a pre-selected period of time including absolute and relative traffic loads between network environment components on a logical access path in the network environment over the pre-selected period of time. In certain embodiments, the processor computes hierarchical traffic load distributions over a pre-selected period of time including absolute and relative traffic loads between a group of associated network environment components in the network environment over the pre-selected period of time.
In an embodiment, the processor computes for each access path the total amount of data traffic load associated with the access path. In a further embodiment, the processor collects information about all the data traffic loads for each host application in the network environment. In an embodiment, the user defines the pre-selected thresholds in the stored data traffic load policy.
In certain embodiments, the different views and capabilities of the computed hierarchical traffic load distributions include displaying traffic load information grouped by component connectivity, displaying traffic load information grouped by the associated logical access path, displaying traffic load information as it correlates to communication errors in the network, displaying traffic load information for network environment components, displaying notification messages, displaying correlation information, displaying traffic load information on different time scales, and displaying traffic load summary information.
The above and other aspects and advantages of the embodiments will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, which may not be drawn to scale, and in which:
The systems and methods, in various embodiments, provide, among other things, processes for data traffic load management. Although the invention is described below with reference to a Storage Area Network (SAN), the description does not limit the invention, and the various embodiments set out below and depicted in the figures are merely provided for the purposes of illustrating certain embodiments of these systems and methods and for describing examples of such systems and methods. It will be apparent to those of skill in the art that the systems and methods described herein may, in certain forms, be employed in other types of storage infrastructure environments or any other networks for which access paths are defined and monitored. Thus, the scope of the invention is at least the scope defined by the appended claims and equivalents.
System 100 displays the processed information on display 108. Display 108 may be any display device capable of interfacing with processor 106, e.g., an LCD display or a CRT monitor. One or more human users may interact with display 108 via user interface 102. For instance, system 100 could receive user input via user interface 102 from devices such as a mouse and a keyboard. The user input could also originate from devices connected to user interface 102 remotely, e.g., via a network connection.
System 100 can be used to implement a method for analyzing data traffic loads associated with applications in a storage network. Data traffic or data traffic load is the amount of data transferred through a point in the network, e.g., a given port of a given component in the network, over a specified interval of time, as will be described in reference to
Processor 106 in system 100 is configured to operate on information 112 from the storage network 104. In particular, processor 106 is configured to communicate with storage network 104 to identify logical access paths in the storage network infrastructure 104, as will be described below in reference to
SAN 200 in
Each SAN component in SAN 200 has a certain type which defines its category (e.g., disk storage device, tape storage device, etc.), its manufacturer (e.g., vendor name, such as EMC, IBM, NetApp, etc.), its product name (e.g., Symmetrix, CLARiiON, Shark, etc.), and its model (e.g., its version, add-ons, etc.).
Each storage network component in SAN 200 also has an internal state. The internal state of each storage network environment component at each point of time contains values for various execution state variables (such as, for example, the amount of data that flowed through a certain port in a recent interval, or the data stored at a particular location) as well as configuration state variables (such as which ports are enabled, which other component is connected via each port, what the set transfer rates are, which zones are defined, which components are members of each zone, etc.). Changes to execution state variables occur as a result of data flow related activities, whereas changes to the configuration state variables occur as a result of planned or unplanned configuration actions.
Each storage network component in SAN 200 may have multiple attributes associated with it that characterize various aspects of the functionality of that component. For example, the attributes of a switch may include, among others, the maximum number of ports, the maximum data transfer rates, etc. The attributes of a storage device component may include, among others, the maximum capacity, the maximum rate of data reads or writes, the RAID level, etc. The values of some of these attributes can be obtained by querying the component directly, whereas the values of other attributes can be deduced from the component type (that is, from the information about manufacturer, product, model, etc.).
An access path or a logical access path in the SAN 200 encompasses a logical channel between a given application and a given data object, e.g., a LUN, along which data may flow. In other words, a logical access path is typically, although not exclusively, a sequence of components starting with a specific application on a specific server via, for example, an HBA, and a sequence of one or more switches and physical links leading to a storage controller and a storage device containing a data object, e.g., a LUN. The logical or configuration state of each component along the way in that sequence, for example, the HBA, the storage controller, or the switches, is set so as not to disable data flow between that specific application and that specific data object along that specific sequence.
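A logical access path of this kind can be represented as an ordered component sequence, with the flow-enabled condition checked against each component's configuration state. The class name, field names, and the boolean per-component state are hypothetical simplifications for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessPath:
    """One logical access path: an application, the ordered component
    sequence (e.g., HBA, switches, storage controller), and the data
    object (e.g., a LUN) at the other end."""
    application: str
    components: tuple
    data_object: str

def path_enabled(path, configuration_state):
    """Data may flow along the path only if the configuration state of
    every component in the sequence does not disable the flow.  Here
    configuration_state maps component name -> flow-enabled boolean."""
    return all(configuration_state.get(c, False) for c in path.components)
```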
Access paths in SAN 200 and their related access characteristics actually need to be realized by setting up multiple underlying devices of different types. These underlying operations include multiple physical and logical basic set up actions which need to be set up in different locations and device types and with mutual consistency. Nonetheless, the end-points in SAN flows generally have a relatively strong exclusive access relationship. That is, each application on a SAN-connected host typically requires access, and often exclusive access, only to some specific SAN data objects (LUNs). Consequently, in storage area networks each source end point, i.e., the application on a host, will typically need to interact only, and often exclusively, with a specific, small number of target end points, e.g., the LUNs on the network storage devices.
In preferred embodiments, the sequence of components between an application on a host and one of its data objects stored on a storage device, their types, attributes, state set up, and the connectivity between them determine the level of storage service provided to that application. That level of service includes, for example, aspects of performance and availability for data flow. An access path between an application on a host and a data object on a storage device may be a sequence of components as described above which are set to enable information flow between the application on the host and the data object on the storage device. Attributes associated with each such end-to-end access path determine the level of storage service provided to that application.
Part of the internal configuration state of each component in SAN 200 contains information about the allocation of each resource, or set of resources, of that component for the exclusive use of one or more external entities, such as an application, a set of applications, other components, etc.
Resources of a component in SAN 200 which are not allocated are considered available. Allocated resources at a component can be de-allocated and the internal configuration state updated accordingly; afterwards they can be allocated again to particular applications or components.
A resource is allocated to an access path in SAN 200 if it is allocated either to an application or to a component which is part of that access path. A resource is associated with an application if it is allocated to that application or to a component on an access path associated with that application.
For instance, in the exemplary embodiment in
To discover all the access paths in the storage network 200, compute their end-to-end attributes, and establish that they are consistent with the set policy requirements, information needs to be obtained from the different components regarding the types, state, and connectivity. These aspects, among others, are described in commonly-assigned U.S. patent application Ser. Nos. 10/693,632, 11/529,748, 12/006,125 and 11/965,392, the contents of which are hereby incorporated herein in their entirety.
In certain embodiments, the information on end-to-end attributes of access paths in SAN 200 is correlated and analyzed by mapping to an abstract graph-model representation in which each node represents a component and links between nodes represent connectivity between components and internal or configuration state information in each component. Data flow between two nodes in the graph is deemed possible if and only if there exists an access path between the two nodes in the model representation, and the attributes of that data flow are determined by the attributes of the different nodes and links associated with that path. If an access path exists between two nodes in the graph, or, if it is desired that an access path exist between two nodes in a graph, these two nodes may be called end nodes. Thus, logical access paths may be derived or identified in this manner and an abstract graph representation of the SAN may be constructed. The connection and configuration state information from each of the devices may be used in an aggregated process to generate an abstract graph representation of the network representing the logical access paths in the SAN.
For instance, each SAN device in SAN 200 may be represented as a node in the graph. End-nodes represent applications/servers (source end-points) and storage/data objects, e.g., volumes or LUNs (target end-points). In the first part of the abstract graph construction, each edge between nodes represents an existing physical link between SAN devices (or between a SAN device and a SAN end-point). In the next part of the construction, edges are eliminated in each case of a logical constraint, as defined in a device configuration, which disables flows on that link. The result of this iterative construction is an abstraction in which a logical access path between one application on a server and a data object, e.g., a volume or LUN, on a storage device exists if and only if a path exists in the abstract graph between the corresponding end nodes. An intermediate node is a node that is connected to two or more end nodes.
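The two-stage construction just described, building the graph from physical links and then eliminating edges disabled by logical constraints, can be sketched as follows. The representation of links as node pairs and of constraints as a set of disabled edges is an illustrative simplification, not the disclosed data model:

```python
from collections import defaultdict, deque

def derive_access_paths(physical_links, disabled_edges, sources, targets):
    """Build an undirected graph from physical links, drop edges that a
    logical constraint (e.g., zoning or LUN masking) disables, then
    report which (source end-node, target end-node) pairs remain
    connected, i.e., have a logical access path in the abstract graph."""
    adjacency = defaultdict(set)
    for a, b in physical_links:
        if (a, b) not in disabled_edges and (b, a) not in disabled_edges:
            adjacency[a].add(b)
            adjacency[b].add(a)
    paths = []
    for source in sources:
        # Breadth-first search over the pruned graph from each source.
        seen = {source}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        paths.extend((source, t) for t in targets if t in seen)
    return paths
```

Disabling a single edge can thus remove an access path even though the physical link still exists, mirroring the constraint-elimination step above.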
For the sake of process efficiency, for SAN 200 the iterative step of graph edge elimination or pruning based on logical constraints implied by device configuration set-up is performed in an order designed to achieve as much pruning as early as possible. For that purpose, SAN semantics are utilized to determine the order in which device constraints are considered. For example, a LUN masking constraint on one device, which constrains most of the potential data traffic flows along the physical paths, may be used to prune the graph before a zoning constraint on another device, which restricts a smaller number of data traffic flows.
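The ordering heuristic can be sketched with a simple sort; assuming, purely for illustration, that each constraint carries an estimate of how many potential flows it eliminates:

```python
def constraint_order(constraints):
    """Order device constraints so that the most restrictive ones, those
    estimated to eliminate the most potential flows, are applied first,
    shrinking the graph as early as possible.  Each constraint is a dict
    with hypothetical keys 'name' and 'flows_restricted'."""
    return sorted(constraints,
                  key=lambda c: c["flows_restricted"], reverse=True)
```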
Access path attributes for the access paths in SAN 200 may be computed for each of the existing logical access paths. The attribute values include, inter alia: level of end-to-end redundancy; type of redundancy; number of hops; and number of allocated ports.
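Some of the listed attribute values can be computed directly from the derived paths. The sketch below assumes, for illustration only, that the input is the set of component sequences connecting one application/data-object pair and the set of ports allocated to that path; the function and key names are hypothetical:

```python
def access_path_attributes(component_sequences, allocated_ports):
    """Compute illustrative end-to-end attribute values for one
    application/data-object pair: hop count from the shortest component
    sequence, redundancy level from the number of independent sequences,
    and the count of allocated ports."""
    return {
        "number_of_hops": min(len(seq) - 1 for seq in component_sequences),
        "level_of_redundancy": len(component_sequences),
        "number_of_allocated_ports": len(allocated_ports),
    }
```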
System 100 of
Specifically, system 100 of
System 100 of
Once the logical access paths have been identified in step 302, system 100 of
Step 306 in process 300 of
Once step 306 has been completed, system 100 of
Steps 312, 314, and 316 in process 300 of
System 100 of
Processor 106 of system 100 in
While the invention has been disclosed in connection with the embodiments shown and described in detail, various modifications and improvements may be made thereto without departing from the spirit and scope of the invention. By way of example, although the illustrative embodiments are depicted with reference to a storage area network (SAN), this need not be the case. The principles of the invention can also be applied in a similar way to additional types of networks and infrastructures. For example, a similar analysis can be applied to storage arrays or networks in which data is replicated. Likewise, other storage infrastructures with defined access paths may employ the method of the invention, and the network fabric may include any type of device that provides the described connectivity between storage environment components. Accordingly, the spirit and scope of the present invention is to be limited only by the following claims.
This application claims the benefit under 35 U.S.C. §119(e) of U.S. Provisional Application No. 60/922,264 filed Apr. 6, 2007, which is hereby incorporated herein by reference in its entirety.
Number | Name | Date | Kind |
---|---|---|---|
5043866 | Myre, Jr. et al. | Aug 1991 | A |
5280611 | Mohan et al. | Jan 1994 | A |
5327556 | Mohan et al. | Jul 1994 | A |
5381545 | Baker et al. | Jan 1995 | A |
5684967 | McKenna et al. | Nov 1997 | A |
5774337 | Lee et al. | Jun 1998 | A |
5825772 | Dobbins et al. | Oct 1998 | A |
5940819 | Beavin et al. | Aug 1999 | A |
6014673 | Davis et al. | Jan 2000 | A |
6223176 | Ricard et al. | Apr 2001 | B1 |
6233240 | Barbas et al. | May 2001 | B1 |
6240463 | Benmohamed et al. | May 2001 | B1 |
6327598 | Kelley et al. | Dec 2001 | B1 |
6347335 | Shagam et al. | Feb 2002 | B1 |
6434626 | Prakash et al. | Aug 2002 | B1 |
6535517 | Arkko et al. | Mar 2003 | B1 |
6636981 | Barnett et al. | Oct 2003 | B1 |
6691169 | D'Souza | Feb 2004 | B1 |
6751228 | Okamura | Jun 2004 | B1 |
6792503 | Yagi et al. | Sep 2004 | B2 |
6795399 | Benmohamed et al. | Sep 2004 | B1 |
6801949 | Bruck et al. | Oct 2004 | B1 |
6816927 | Bouchet | Nov 2004 | B2 |
6904143 | Peterson et al. | Jun 2005 | B1 |
6909700 | Benmohamed et al. | Jun 2005 | B1 |
7051029 | Fayyad et al. | May 2006 | B1 |
7058702 | Hogan | Jun 2006 | B2 |
7062559 | Yoshimura et al. | Jun 2006 | B2 |
7069480 | Lovy et al. | Jun 2006 | B1 |
7103653 | Iwatani | Sep 2006 | B2 |
7103712 | Mizuno | Sep 2006 | B2 |
7120654 | Bromley | Oct 2006 | B2 |
7127633 | Olson et al. | Oct 2006 | B1 |
7149886 | Fujibayashi et al. | Dec 2006 | B2 |
7194538 | Rabe et al. | Mar 2007 | B1 |
7216263 | Takaoka et al. | May 2007 | B2 |
7260628 | Yamamoto et al. | Aug 2007 | B2 |
7376937 | Srivastava et al. | May 2008 | B1 |
7380239 | Srivastava et al. | May 2008 | B1 |
7512954 | Srivastava et al. | Mar 2009 | B2 |
7546333 | Alon et al. | Jun 2009 | B2 |
7617320 | Alon et al. | Nov 2009 | B2 |
7656812 | Tadimeti et al. | Feb 2010 | B2 |
20020145981 | Klinker et al. | Oct 2002 | A1 |
20030005119 | Mercier et al. | Jan 2003 | A1 |
20030018619 | Bae et al. | Jan 2003 | A1 |
20030055932 | Brisse | Mar 2003 | A1 |
20030131077 | Hogan | Jul 2003 | A1 |
20030191992 | Kaminsky et al. | Oct 2003 | A1 |
20030208589 | Yamamoto | Nov 2003 | A1 |
20030237017 | Jibbe | Dec 2003 | A1 |
20040019833 | Riedl | Jan 2004 | A1 |
20040030768 | Krishnamoorthy et al. | Feb 2004 | A1 |
20040049564 | Ng et al. | Mar 2004 | A1 |
20040075680 | Grace et al. | Apr 2004 | A1 |
20040103254 | Satoyama et al. | May 2004 | A1 |
20040215749 | Tsao | Oct 2004 | A1 |
20040243699 | Koclanes et al. | Dec 2004 | A1 |
20050010682 | Amir et al. | Jan 2005 | A1 |
20050044088 | Lindsay et al. | Feb 2005 | A1 |
20050055436 | Yamada et al. | Mar 2005 | A1 |
20050097471 | Faraday et al. | May 2005 | A1 |
20050114403 | Atchison | May 2005 | A1 |
20050160431 | Srivastava et al. | Jul 2005 | A1 |
20050262233 | Alon et al. | Nov 2005 | A1 |
20060004830 | Lora et al. | Jan 2006 | A1 |
20060106938 | Dini et al. | May 2006 | A1 |
20060143492 | LeDuc et al. | Jun 2006 | A1 |
20060161883 | Lubrecht et al. | Jul 2006 | A1 |
20060161884 | Lubrecht et al. | Jul 2006 | A1 |
20060218366 | Fukuda et al. | Sep 2006 | A1 |
20060265497 | Ohata et al. | Nov 2006 | A1 |
20070050684 | Takaoka et al. | Mar 2007 | A1 |
20070088763 | Yahalom et al. | Apr 2007 | A1 |
20070094378 | Baldwin et al. | Apr 2007 | A1 |
20070112883 | Asano et al. | May 2007 | A1 |
20070169177 | MacKenzie et al. | Jul 2007 | A1 |
20070198722 | Kottomtharayil et al. | Aug 2007 | A1 |
20070206509 | Vedanabhatla et al. | Sep 2007 | A1 |
20070294562 | Takamatsu et al. | Dec 2007 | A1 |
20080025322 | Tadimeti et al. | Jan 2008 | A1 |
20090172666 | Yahalom et al. | Jul 2009 | A1 |
20090313367 | Alon et al. | Dec 2009 | A1 |
Number | Date | Country |
---|---|---|
WO-0182077 | Nov 2001 | WO |
WO-02088947 | Nov 2002 | WO |
WO-03054711 | Jul 2003 | WO |
WO-2004111765 | Dec 2004 | WO |
Entry |
---|
“Storage Management and the Continued Importance of CIM,” White Paper, Data Mobility Group (Jan. 2004). |
“Softek SANView: Simplify the discovery and management of multi-vendor SANs,” Fujitsu Softek (May 2002). |
“Information Lifecycle Management: An Automated Approach,” Technical White Paper, EMC2 (Dec. 8, 2003). |
“Kasten Chase Unveils Advanced Security Architecture,” GRIDtoday, v.1, n. 18; www.gridtoday.com/02/101/100546.html, (Oct. 14, 2002), printed from Internet on Oct. 16, 2003. |
“Assurency: Comprehensive, Persistent Security for Storage Area Networks,” Kasten Chase (2002). |
“Radiant Data Server Technology Overview,” White Paper, Radiant Data Corporation (2003). |
Lee et al., “Storage Network Management Software—The Critical Enabler of Maximum ROI,” Storage Consulting Group (Dec. 16, 2002). |
Number | Date | Country | |
---|---|---|---|
60922264 | Apr 2007 | US |