1. Field of the Invention
The invention relates generally to the field of data storage in computer systems and, more specifically, to a technique for enabling a client to access the data storage resources of different servers without having specific knowledge of which server owns which resources.
2. Description of the Related Art
Computer storage devices such as storage servers have high-capacity disk arrays to back up data from external host systems, such as host servers. For example, a large corporation or other enterprise may have a network of servers that each store data for a number of workstations used by individual employees. Periodically, the data on the host servers is backed up to the high-capacity storage server to avoid data loss if the host servers malfunction. A storage server may also back up data from another storage server, such as at a remote site. Furthermore, it is known to employ redundant server clusters in a data storage system to provide additional safeguards against data loss. The IBM Enterprise Storage Server (ESS) is an example of such a data storage system.
A problem occurs in a client-server environment where requests are sent from the client to multiple servers. The requests include operations to be performed at the servers using the servers' resources. Each server owns a specific set of resources and is responsible for the work performed on those resources. In one approach, the client sends a separate request to each server involving that server's resources, such as a request to perform copy operations among different volumes, and waits for a response from each server. However, this requires the client to know which server owns which resources, and results in reduced performance since multiple, different requests are generated. Moreover, difficulties arise when the client requires access to the resources of a failed server whose work has been taken over by another server, such as in a dual cluster system, when the client does not know of the failure.
To overcome these and other deficiencies in the prior art, the present invention describes a technique for enabling a client to perform operations involving the resources of different servers without having specific knowledge of which server has which resources.
In one aspect of the invention, at least one program storage device tangibly embodies a program of instructions executable by at least one processor to perform a method at a server for accessing an associated data storage resource. The method includes receiving a copy of a request, sent from a client, that identifies at least one operation to be performed, processing the request to determine whether the at least one operation requires access to the associated data storage resource, and accessing the associated data storage resource to perform the at least one operation if the at least one operation requires access to the associated data storage resource.
In another aspect of the invention, a method is provided for accessing a plurality of data storage resources at a plurality of servers, wherein each server is associated with at least one of the plurality of data storage resources. The method includes receiving, at each server, a copy of a request from a client that identifies at least one operation to be performed, at each server, processing the request to determine whether the at least one operation requires access to the associated data storage resource, and at each server for which the at least one operation requires access to the associated data storage resource, accessing the associated data storage resource to perform the at least one operation.
In a further aspect of the invention, at least one program storage device tangibly embodies a program of instructions executable by a machine to perform a method at a client for communicating with a plurality of servers, wherein each server has an associated data storage resource. The method includes generating multiple copies of a request that identifies at least one operation to be performed, sending a copy of the request to each server, wherein the servers access the associated data storage resources, and at least one of the servers accesses its data storage resource to perform the at least one operation, and sends a response to the client indicating that the at least one operation has been performed, and receiving the response.
Related computer-implemented methods, systems and program storage devices may be provided.
These and other features, benefits and advantages of the present invention will become apparent by reference to the following text and figures, with like reference numbers referring to like structures across the views, wherein:
The present invention describes a technique for enabling a client to access the resources of different servers without having specific knowledge of which server owns which resources. The invention solves the problem at the server level as opposed to the client level so that the client does not need to be concerned with which resources are owned by which servers. In particular, the invention works by replicating a request and providing it to all servers involved, instead of breaking up the request into smaller requests that are tailored for each server. Upon receipt of a request, the server acts only upon those resources identified in the request for which it is the owner. If the server has no work to do, e.g., it does not own any of the identified resources, it immediately sends an empty response to the client.
Any server that has work to do performs the work by accessing its resources, and sends the client a corresponding response indicating the work performed. The client then merges all responses from the different servers to determine that the request has been fulfilled. The invention is also applicable to the case where one server takes over the responsibilities of another, paired server, such as in a dual-cluster system. In this case, the two paired servers communicate with one another so that one server is informed when the other server fails, and each server knows the other's resources. When one server fails and the other takes over its work, the surviving server will execute more of the actions in the client's request because it owns more of the resources. Advantageously, performance is improved at the client side because the client can invoke a single request that impacts resources on several different servers.
Each of the servers 150, 160 includes a network interface 158, 168 such as a network interface card for communicating with the client host 100, such as to receive requests from the client host 100 and to provide responses to the client host 100. Note that these requests and responses may be provided using any type of network communication protocol. A processor 154, 164 with memory 156, 166 coordinates the communications via the network interfaces 158, 168 and handles reading and writing of data from and to respective data storage resources 152, 162. In particular, the data storage resources 152, 162 may comprise arrays of disks or other storage media. In the dual-cluster data storage system 140, each server cluster 150, 160 owns particular storage resources. In normal operations, with both clusters 150, 160 functional, each server cluster has write access only to the storage resources it owns, but has read access to all storage resources in the device 140. In the event of a cluster failure, the surviving cluster assumes ownership of the storage resources of the failed cluster. For example, the dashed line 170 indicates that server A 150 can assume ownership of the data storage resource B 162 when server B 160 fails.
Furthermore, the data storage resources 152, 162 may be arranged in logical subsystems (LSSs), which are composed of volumes. The LSS is a topological construct that includes a group of logical devices such as logical volumes, which represent some amount of usable space, most likely spread across multiple physical disks. For example, a logical volume in a RAID array may be spread over different tracks in the disks in the array. Each cluster 150, 160 may therefore own a number of logical volumes as its data storage resource. In the normal, dual-cluster mode, when both clusters 150, 160 are functional, ownership of the volumes or LSSs can be evenly divided between the clusters. When one of the clusters 150 or 160 fails, the data storage system 140 will operate in a fail-safe, single-cluster mode, by assigning ownership of all volumes or LSSs to the surviving cluster. The fail-safe mode reduces the chance of data loss and downtime. Moreover, as mentioned, the invention may also be carried out in servers 150, 160 that are independent, and do not have the ability to access each other's data storage resources.
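The two ownership modes described above can be illustrated with a minimal sketch. The function names, the volume and cluster labels, and the round-robin division are assumptions made for illustration only; the description says only that ownership can be evenly divided and that the surviving cluster is assigned all volumes or LSSs.

```python
from typing import Dict, List


def assign_ownership(volumes: List[str], clusters: List[str]) -> Dict[str, str]:
    """Dual-cluster mode: divide the volumes evenly among the clusters.

    Round-robin is an assumed policy; any even division would do.
    """
    return {vol: clusters[i % len(clusters)] for i, vol in enumerate(volumes)}


def fail_over(ownership: Dict[str, str], failed: str, survivor: str) -> Dict[str, str]:
    """Single-cluster mode: the survivor takes over every volume the failed cluster owned."""
    return {vol: (survivor if owner == failed else owner)
            for vol, owner in ownership.items()}


# Hypothetical volume and cluster names.
ownership = assign_ownership(["vol0", "vol1", "vol2", "vol3"], ["A", "B"])
single_cluster = fail_over(ownership, failed="B", survivor="A")
```

Because a server consults only its own current ownership map, a client request that names volumes formerly owned by the failed cluster needs no change after failover.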
The general operation and configuration of the memories 112, 156 and 166, processors 110, 154 and 164, and network interfaces 120, 158 and 168 are well known in the art and are therefore not described in detail. The functionality described herein can be achieved by configuring the hosts 100, 150 and 160 with appropriate instructions, e.g., software, firmware or microcode, in the memories 112, 156 and 166, for execution by the respective processors 110, 154 and 164. The memories 112, 156 and 166 may therefore be considered to be program storage devices for carrying out a method for achieving the functionality described herein.
Appropriate user interfaces may also be provided to allow a user to interact with the client 100 and servers 150 and 160 such as by entering commands and viewing status information.
At block 210, the client replicates the request, for example, to provide two copies of the request, one for each of the clusters 150 and 160. At block 220, the client transmits a separate copy of the request to each server 150 and 160. The client only has to know which group of servers to send the request to. It may do this by using a unique serial number that identifies each data storage system, for example. This serial number is provided in each request. Once the client knows the serial number, code at the client handles sending the request to both servers in the specified data storage system. The request need not be transmitted to other servers or data storage systems with which the client may have the ability to communicate. In this manner, it is not necessary for the client to know which server 150, 160 owns the data storage resource or resources that are involved in carrying out the request.
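The replication and transmission steps of blocks 210 and 220 might be sketched as follows. The directory of serial numbers, the server addresses, the request layout and the injected `send` callable are all hypothetical; the description does not specify a transport or a request format.

```python
import copy
from typing import Any, Callable, Dict, List

# Hypothetical directory: storage-system serial number -> addresses of its servers.
SYSTEMS: Dict[str, List[str]] = {
    "SN-0001": ["server-a", "server-b"],
    "SN-0002": ["server-c", "server-d"],
}


def broadcast(request: Dict[str, Any],
              send: Callable[[str, Dict[str, Any]], None]) -> List[str]:
    """Send an independent copy of the request to every server in the target system.

    The client looks up servers by serial number only; it never consults
    any record of which server owns which volume.
    """
    targets = SYSTEMS[request["serial"]]
    for address in targets:
        # Block 210 (replicate) and block 220 (transmit) for one server.
        send(address, copy.deepcopy(request))
    return targets


# Usage with a stand-in transport that simply records what was sent.
sent = []
request = {"serial": "SN-0001", "operations": [("copy", "A", "B")]}
targets = broadcast(request, lambda addr, req: sent.append((addr, req)))
```

Servers of the other data storage system in the directory receive nothing, which mirrors the statement that the request need not be transmitted to other systems the client can reach.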
At block 230, each server that receives a copy of the request processes it to determine whether the operations identified in the request require access to the server's associated storage resource. This may involve, e.g., comparing identifiers of the volumes involved in a requested copy operation with a list of volumes that the server owns. The identifiers of the involved volumes may be included in the request, for instance. If access is not required (block 260), the server sends an empty response to the client. If access is required (block 240), the server accesses its data storage resource to perform at least one operation, and (block 250) sends a response to the client indicating that the at least one operation has been performed. It is possible for a single server to perform all of the necessary operations identified in a request if access to the data storage resource of another server is not required. Or, each server may act on part of the request.
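A minimal sketch of the server-side processing in blocks 230 through 260 follows. It assumes that a copy operation is a (source, target) pair of volume identifiers and that a server acts on an operation only when it owns both volumes; that ownership rule, the function name and the response layout are illustrative assumptions, not details taken from the description.

```python
from typing import Dict, List, Set, Tuple

CopyOp = Tuple[str, str]  # (source volume, target volume); assumed request layout


def handle_request(owned: Set[str], operations: List[CopyOp]) -> Dict[str, List[CopyOp]]:
    """Process one copy of a request at a server that owns the given volumes."""
    performed: List[CopyOp] = []
    for source, target in operations:
        # Block 230: compare the volume identifiers with the owned-volume list.
        if source in owned and target in owned:
            # Block 240: the actual copy on the storage resource would happen here.
            performed.append((source, target))
    # Block 250 when work was done; block 260 (empty response) otherwise.
    return {"performed": performed}


operations = [("A", "B"), ("C", "D")]
response_with_work = handle_request({"A", "B"}, operations)
empty_response = handle_request({"X", "Y"}, operations)
```

Every server runs the same check against the same request, so no server needs to know what the others own.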
A request can be complicated, involving more than one operation. For example, a request may be to copy volume A to volume B, volume C to volume D, and volume E to volume F. Assume a first server owns volumes A and B, and a second server, within the same data storage system, owns volumes C through F. At the client, the request is duplicated and sent to both servers. The first server looks through the entire request and sees it can perform the copy from volume A to B. The second server looks through the same request and sees that it can perform the copy from volume C to D, and from volume E to F. Both servers thus can do part of the work involved in a request and send a corresponding response back to the client when the work is completed. For example, the first server can send a response indicating that it has performed the copy from volume A to B, and the second server can send a response indicating that it has performed the copy from volume C to D, and from volume E to F. The two responses can then be merged at the client (block 270) to enable the client to ascertain that the entire request has been fulfilled.
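The merge at block 270 can be illustrated with the volume A through F example. This is a sketch under assumed names and an assumed response layout (a list of performed copy operations per server); the description does not prescribe how responses are encoded or merged.

```python
from typing import Dict, List, Tuple

CopyOp = Tuple[str, str]  # (source volume, target volume)


def merge_responses(requested: List[CopyOp],
                    responses: List[Dict[str, List[CopyOp]]]) -> List[CopyOp]:
    """Return the requested operations that no server reported as performed."""
    done = {op for response in responses for op in response["performed"]}
    return [op for op in requested if op not in done]


requested = [("A", "B"), ("C", "D"), ("E", "F")]
responses = [
    {"performed": [("A", "B")]},               # first server owns volumes A and B
    {"performed": [("C", "D"), ("E", "F")]},   # second server owns volumes C through F
]
outstanding = merge_responses(requested, responses)
```

An empty `outstanding` list tells the client that the entire request has been fulfilled. An empty response from a server with no work simply contributes nothing to the merge.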
The invention thus alleviates the need for the client to prepare a first request for the first server involving the copy from volume A to B, and a separate, second request for the second server involving the copy from volume C to D, and from volume E to F.
While the invention has been illustrated in terms of a dual cluster storage server, it is applicable as well to multi-cluster systems having higher levels of redundancy, as well as to individual servers that are operatively connected or independent.
The invention has been described herein with reference to particular exemplary embodiments. Certain alterations and modifications may be apparent to those skilled in the art, without departing from the scope of the invention. The exemplary embodiments are meant to be illustrative, not limiting of the scope of the invention, which is defined by the appended claims.