The present application claims priority from Japanese Patent Application No. JP 2005-65717 filed on Mar. 9, 2005, the content of which is hereby incorporated by reference into this application.
The present invention relates to a storage network system, and more particularly to a technology which is effective when applied to a network system having a Frame relay device between an upper level apparatus (e.g. host machine) and a storage device.
In a storage network system (e.g. a Storage Area Network (SAN) or a storage network adopting Internet Protocol (IP)) which can be connected to a plurality of host machines, improved access response to the plural host machines is required with the rapid expansion of storage networks today. In most cases, the connection style of the above-described system is of the “client to server system” type, and the storage apparatus usually receives many more read requests than write requests, like a Web server system.
On the other hand, to meet such demand under high loads, the storage apparatus contains a high-speed cache memory, but the cache memory capacity is limited. Further, because it increases apparatus cost, the cache memory capacity is generally not very large except in some high-end apparatuses. As another factor, disk capacity and mounting density in the storage device are increasing rapidly, so the ratio of cache memory capacity to disk capacity drops, the cache hit rate drops with it, and consequently deterioration of response is feared.
Such an expansion of the network represented by the SAN increases the number of host machines capable of accessing the storage system. Because each host machine accesses the storage apparatus to request different data (files), it is impossible to store all of those files in the cache memory, and the low cache hit rate slows down the response.
Further, even if the storage device includes a large capacity cache memory, the data transmission band of its interface is limited. Thus, to improve the band performance, the storage device needs to include a plurality of interface ports combined with load distribution (load balance) control or the like, which is a large cost burden.
Thus, as described in patent document 1 (Japanese Patent Application Laid-Open Publication No. 11-112541), there is a method in which a Frame relay device having a cache memory is provided and data is cached at the decision of the Frame relay device itself. Further, as described in patent document 2 (Japanese Patent Application Laid-Open Publication No. 2000-353125), there is a technology for unified control of cache memories distributed over respective hierarchized control units.
According to the technology described in the patent document 1, the Frame relay device itself monitors packets or the like and caches them. Thus, if Frame relay devices of the same type exist on the same route, each relay device caches the same data in its own cache memory, which is inefficient. Further, in some cases only part of the data that a host machine requires may be cached on the Frame relay device. If only part of the data demanded by the host machine is cached, the Frame relay device needs to perform complicated processing, such as converting commands issued from the host machine into commands for the storage apparatus in order to send the remaining data, and excessively large-scale hardware is needed for such complicated processing.
Thus, the present invention has been made to solve the above-mentioned problem. A first object of the present invention is to provide a technology for simplifying the hardware (circuit scale) of the Frame relay device and its processing in a storage network system which is constructed of the Frame relay devices, the host machines and the storage apparatus.
According to the technology described in the patent document 2, the purpose is to improve the cache hit rate by increasing the cache memory capacity with the I/O cache memory of an I/O control unit, which is a memory resource existing inside the storage apparatus, the controller cache memory of the storage controller, and the device cache memory of the storage device. However, reducing consumption of the data transmission band is not considered.
In other words, when performances are compared in terms of the maximum effect without considering the overhead and the transmission band limit of each I/F bus, the storage apparatus discussed in document 2 is equivalent to a conventional storage apparatus whose cache memory capacity equals the sum of the respective cache memory capacities. Thus, because the storage apparatus discussed in document 2 is superior only in terms of cost, it is as difficult for it as for the conventional apparatus to meet the demand, which the present invention addresses, for quick response when many accesses from plural host machines are concentrated.
Comparing further in a realistic way, the technology discussed in document 2 accesses each memory other than the controller cache memory through an interface shared with the I/O control unit or device. Thus, the transmission band of each interface bus is consumed by both memory access and I/O access. If the occupation ratio of the interface bus increases, it becomes difficult to multiplex commands and execute them in parallel, which lowers performance, because data transmission exceeding the transmission band of the interface bus is impossible. Thus, a performance bottleneck is likely to occur. To avoid these disadvantages, it is necessary to add interface buses or widen the band. In such a situation, it cannot be said that this technology is always superior to the conventional apparatus in terms of cost.
Accordingly, the present invention has been made to solve the above-mentioned problem. A second object of the invention is to provide a technology which improves the response to a request from the host machine without consuming the data transmission band, by preventing the storage apparatus itself from becoming a bottleneck in the system performance.
Typical aspects of the invention disclosed in this patent application are as follows.
The present invention is applied to a storage network system which includes Frame relay devices for relaying data read/write requests from a host machine to a storage apparatus. The Frame relay device is provided between the host machine and the storage apparatus that comprises a storage device for storing data and a storage controller for controlling the data read/write to the storage device, and has the following features.
(1) The Frame relay device contains a cache memory for storing the read data transmitted from the storage apparatus. When receiving a data read request from the host machine, the storage apparatus issues, before sending the data to the Frame relay device, a command for storing the data in the cache memory of the Frame relay device. When any host machine then issues a read request for the same data again, the storage apparatus issues a command to the Frame relay device to send that data to a specified host machine. In response, the Frame relay device sends the data stored in its own cache memory to the specified host machine.
(2) The Frame relay device contains a battery for supplying voltage at the time of power failure.
(3) The Frame relay device stores only part of the data when it caches the data. Then, the storage apparatus issues a command to the Frame relay device for transmitting that part of the data, reads the remaining data from the storage device while the Frame relay device is transmitting the part of the data to a specified host machine, and sends the read data.
(4) The Frame relay devices are provided in plural numbers and the respective Frame relay devices are disposed between the storage apparatus and the host machine on plural routes.
(5) The storage apparatus includes a means for selecting the Frame relay device and a table for controlling position information of the Frame relay device in order to issue a command to the Frame relay device.
(6) Each of the respective Frame relay devices on plural routes executes processing of storing different data.
(7) Processing of storing specified same data in plural Frame relay devices on the same route within the plural routes is carried out.
(8) A first Frame relay device specified by a specification of the storage controller sends data to a second Frame relay device. The second Frame relay device verifies the data.
(9) Each Frame relay device carries out processing of destroying cached data in any of the following cases: where no transmission request is made for specific data over a predetermined time, where the entry of the storage apparatus is deleted from a route table, where the remaining capacity in the cache memory is smaller than the data requested to be cached, or where an abnormality is detected at the time of data verification.
(10) Each Frame relay device carries out processing of destroying the cached data according to a command of the storage apparatus.
(11) Each Frame relay device assigns a cache area of a specified volume to each storage controller in advance. The storage controller grasps the capacity assigned to itself by inquiry and performs control so that cache requests to each Frame relay device do not exceed the assigned capacity.
(12) Each Frame relay device, when the area that can be assigned as cache area changes by more than a specified level as a result of deleting cached data, carries out processing of sending a message to the storage controller.
The storage controller carries out processing of grasping a capacity assigned to itself by inquiry in accordance with receipt of the message.
(13) The storage apparatus selects a first port for each Frame relay device and if data transmission load to the first port is over a specified volume, the storage apparatus sets permission for assignment of data to the second port corresponding to the first port.
(14) Each Frame relay device automatically controls to output a part of the data to the second port corresponding to the first port when the data transmission load to the first port is over a threshold set by the storage apparatus.
(15) Of the plural Frame relay devices, a third Frame relay device controls different interfaces of two or more kinds by independent switch functions. The cache memory provided on the third Frame relay device is accessible from each of the different interfaces.
(16) The third Frame relay device is disposed on the back end side of the storage apparatus and connected to a device in the storage apparatus.
(17) The third Frame relay device transfers two or more duplicated data to plural devices specified by a specification of the storage controller.
(18) The third Frame relay devices are provided in plural numbers. Of the plural third Frame relay devices, one Frame relay device specified by a specification of the storage controller sends data to a redundant Frame relay device of another system. The Frame relay device of the other system verifies the data.
(19) The third Frame relay devices are provided in plural numbers. Of the plural third Frame relay devices, a fourth Frame relay device has a protocol converting unit for converting between different interfaces of two or more kinds, and executes mutual data exchange between the different interfaces.
(20) The protocol converting unit converts an ordered set with control code in transmission/receiving. The control code after conversion is transmitted as data having a different identifier.
The effects attained by the typical aspects of the invention disclosed by this application are as follows.
According to the present invention, by simplifying the processing of the Frame relay device in the storage network system having Frame relay devices provided between the host machines and storage apparatus, the hardware (circuit scale) of the Frame relay device can be constructed relatively simply.
Further, according to the present invention, by preventing the storage apparatus itself from becoming a bottleneck in the system performance, the response to a request from the host machine can be improved without consuming the data transmission band.
<Concept of the Present Invention>
The storage network system of the present invention has a Frame relay device between a host machine and a storage apparatus. The Frame relay device contains a cache memory. When receiving a data read request from the host machine, the storage apparatus sends the data together with a command for storing that data in the cache memory of the Frame relay device. If a host machine issues an access request for the same data again, the storage apparatus transmits to the Frame relay device a command for sending the data to a specified host machine. Then, the Frame relay device sends the data cached in its own cache memory to the host machine. Consequently, the response to data read requests from the host machine is sped up. Further, because the Frame relay device only executes a specific response based on a command from the storage apparatus side, its hardware (circuit scale) can be formed in a relatively simple structure.
In this processing, for large data, the Frame relay device caches, for example, only part of the large data from the top. While the Frame relay device is sending that part to the host machine, the storage apparatus accesses the remaining data in the storage device such as an incorporated hard disk drive (HDD). Consequently, the overhead is reduced.
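As an illustration only, the overlap described above can be put into numbers. The following Python sketch assumes that the disk access is fully hidden when the transmission of the cached leading part lasts at least as long as the disk access, and that the cached part is rounded up to a whole number of blocks. The function name, the 512-byte block length and the figures in the comment are hypothetical and do not appear in this specification.

```python
def min_prefix_bytes(disk_access_us: int, link_bytes_per_s: int,
                     block_bytes: int = 512) -> int:
    """Smallest block-aligned leading part whose transmission lasts at least
    as long as the (assumed) disk access time, so that the two can overlap."""
    # Bytes the link carries during the disk access, rounded up (integer ceil).
    covered = -(-disk_access_us * link_bytes_per_s // 1_000_000)
    # Round up to a whole number of blocks (assumed block length).
    blocks = -(-covered // block_bytes)
    return blocks * block_bytes


# Hypothetical figures: 10 ms disk access time, 100 MB/s link.
prefix = min_prefix_bytes(10_000, 100_000_000)
```

Under these assumptions, a leading part of about 1 MB would keep the link busy while the storage apparatus fetches the remainder from the HDD.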
Plural Frame relay devices are disposed with respect to plural host machines on plural routes. The storage apparatus makes the respective Frame relay devices store different data. Consequently, the cache memory capacity can be increased easily.
The method of the present invention can be widely applied to network Frame relay devices such as a network switch or router in network attached storage (NAS), a fiber channel switch in the storage area network (SAN), an expander of serial attached SCSI (SAS) and the like.
For example, if it is designed that the cache function can be utilized through such a frame converting unit as fiber channel to SAN (FC-SAN), internet protocol to SAN (IP-SAN) and the like, an objective network coverage area can be expanded further. For example, an application like near line storages such as SAS can be connected through wide area network (WAN).
The present invention can be applied to a switch unit which is disposed between a storage controller unit in the storage apparatus and the case of a storage device such as HDD. In this case, the switch unit can not only be incorporated in the storage apparatus but can also constitute a switch under a blade server such as NAS header.
Further, the present invention can be applied to automatic load balance control by recognizing a Frame relay device and port based on a bridge ID or the like without necessity of support by host program or the like.
Hereinafter, the embodiments and application examples of the present invention will be described in detail with reference to the accompanying drawings. The same reference numerals are attached to the same components in all drawings for explaining the embodiments and application examples, and repeated description thereof is omitted.
[Model of Storage Network System]
An example of the model of the storage network system according to an embodiment of the present invention will be described with reference to
The model of the storage network system of this embodiment comprises a storage apparatus 10 composed of a disk array or the like, eight Frame relay devices A-H (30) which relay data write and read requests from the host machines to this storage apparatus 10, and eight host machines A-H (50) connected to these Frame relay devices 30.
If the technology of the patent document 1 described previously is applied to the model of the storage network system, the following occurs. For example, if a host machine A reads data from the storage apparatus 10 while nothing is cached in the Frame relay devices 30, the Frame relay devices located on the route relay the read data to the host machine A and at the same time cache it themselves. As a result, the Frame relay devices A, B, C store the same data. This means that the cache memory of the Frame relay devices 30 cannot be used efficiently.
Further, the Frame relay device C responds to a read request for the same data from the host machine A out of the cache memory of the Frame relay device C. At this time, if a host machine D whose route does not pass through the Frame relay device C has updated that data, the Frame relay device C cannot sense the update. Thus, the Frame relay device C responds with old data from its own cache memory, and there are cases where proper data cannot be obtained.
According to this embodiment, although described in detail later, the Frame relay device 30 caches only read data of the host machine 50 according to a command of the storage apparatus 10. By providing the Frame relay device 30 with this cache memory, the only data flowing at a point A is data at the time of a cache miss. By controlling the cache state of each Frame relay device 30 from the storage apparatus 10 side, a system ensuring improved memory usage efficiency can be built. The Frame relay device 30 does not cache any write data from the host machine 50, in order to prevent data from being lost on the route. Generally, in accesses to the storage apparatus 10, data reads outnumber data writes (write:read = about 1:5). In the case of Web data and the like, it is not rare that 90% or more of the requests are reads. From those viewpoints, the response performance can be improved by caching only read data.
[Appearance of Frame Relay Device]
An example of the stand-alone type Frame relay device in the storage network system of this embodiment will be described with reference to
The Frame relay device 30 is of, for example, a stand-alone type and comprises a power unit composed of an AC/DC power supply, an uninterruptible power supply (UPS) composed of a battery, a control unit composed of a CPU, a data controller, a memory and a director, a cache memory module unit composed of a cache memory, and the like, and is housed in a module unit. In this structure, two power units are housed so as to form a duplex configuration.
The cache memory module units can be installed additionally, if needed. This cache memory module unit houses plural memory modules. This cache memory module unit is connected with a connector such as peripheral component interconnect (PCI)-express or the like, and is preferably removable. Further, the cache memory module unit can be replaced with an I/F unit which is a different type of unit.
An I/F connector port connected to the director is provided on a front face of this Frame relay device 30. A power unit switch, a fan and an AC connector to which an AC cable is to be connected are provided on a rear face of the Frame relay device 30.
[Configuration of Storage Network System]
An example of the storage network system of this embodiment will be described with reference to
The storage network system shown in
The storage apparatus 10 comprises a storage device 11 for storing data such as HDD, and a storage controller unit 12 for controlling data write/read with respect to this storage device 11. In this example, an expansion chassis is provided as well as a base chassis so that the memory capacity of the storage device 11 is increased.
The storage controller unit 12 comprises a channel control unit 13 which receives data input/output request from the host machine 50, a disk control unit 14 for controlling data write/read with respect to the storage device 11, a data controller 15 for controlling data transmission between the channel control unit 13 and the disk control unit 14, a CPU 16 for controlling the storage controller unit 12, a cache memory 17 in which data exchanged between the channel control unit 13 and the disk control unit 14 is stored temporarily, a memory 18 for storing control information and the like. Particularly, the cache memory 17 is provided with various kinds of tables 17a such as a route table, an external cache table and the like, although their details will be described later.
The Frame relay device 30 comprises an AC/DC power supply 31 for converting AC power to DC power, a battery 32 for use when the supply of AC power fails, a CPU 33 for controlling the Frame relay device 30, a data controller 34 for controlling data transmission, a cache memory 35 for temporarily storing data exchanged between the host machine 50 and the storage apparatus 10, a memory 36 for storing control information, a director 37 composed of an I/F, and the like. Particularly, although described later, the cache memory 35 is provided with various kinds of tables 35a such as the route table, the cache control table and the like.
[Route Table and External Cache Table in the Storage Apparatus and Creation Methods Thereof]
Examples of the route table and external cache table in the storage apparatus will be described with reference to
(1) Route Table
As shown in
If plural host machines 50 exist on the same route, they may be controlled altogether as a node group. In this case, as shown in
(2) External Cache Table
As shown in
(3) Creation Method of Table
First, when linkage between each port of the storage apparatus 10 and the Frame relay device 30 is established, the storage apparatus 10 recognizes the Frame relay device 30 which is connected directly to each port. Next, when the host machine 50 of the node A is initialized, the storage device 11 is scanned with inquiry command or the like.
Further, when the storage apparatus 10 receives an initial command from the host machine 50, the storage apparatus 10 responds to that command and makes an upper level Frame relay device 30 report a Frame relay device identifier (bridge ID, port ID) to the node. Then, the storage apparatus 10 stores the identifiers reported by each Frame relay device 30 in its table and assigns an arbitrary handle to each identifier. At this time, by assigning the handles in ascending order from the side near the storage apparatus 10 for each node, the positions of the Frame relay devices 30 are managed together with the identifiers.
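As an illustration only, the handle assignment described above can be sketched in Python as follows. The class and method names are hypothetical, and the sketch assumes that handles are numbered per node starting from 1 at the Frame relay device nearest the storage apparatus; the specification only requires that the handles ascend from the storage apparatus side.

```python
from dataclasses import dataclass, field


@dataclass
class RouteTable:
    # node name -> {handle: (bridge_id, port_id)}
    # Handle 1 is the Frame relay device nearest the storage apparatus (assumption).
    routes: dict = field(default_factory=dict)

    def register(self, node, reported_ids):
        """Store the identifiers reported for `node`, given nearest-to-storage first."""
        self.routes[node] = {
            handle: ident for handle, ident in enumerate(reported_ids, start=1)
        }

    def lookup(self, node, handle):
        """Return the (bridge_id, port_id) registered under `handle` for `node`."""
        return self.routes[node][handle]

    def nearest_to_host(self, node):
        """Return (handle, identifier) of the device farthest from the storage apparatus."""
        handle = max(self.routes[node])
        return handle, self.routes[node][handle]


table = RouteTable()
table.register("node A", [("bridge-A", 1), ("bridge-B", 2), ("bridge-C", 1)])
```

Because the handle encodes the hop position, the storage apparatus can pick a Frame relay device by position on the route (for example, the one closest to the host machine) without further lookups.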
A load balance control method realized by expanding this function will be described in application example 1 later. Further, by using the information about communication (link) cost managed by the Frame relay device 30, the assignment of which Frame relay device 30 is made to cache data can easily be changed in order to reduce the data volume passing through a Frame relay device having a slow transmission speed.
[Route Table and Cache Control Table in Frame Relay Device and Cache Relating Operation]
An example of the cache control table in the Frame relay device will be described with reference to
(1) Route Table
Each Frame relay device 30 first executes a discovery process that selects the network path to be used for each segment by the spanning tree algorithm (STA). In the case of a LAN switch, this information is shared by every switch using a special network frame called a bridge protocol data unit (BPDU). Each switch holds a BPDU table and updates it continuously.
(2) Cache Control Table
As shown in
(3) Destroying of Data Cached in Frame Relay Device
The Frame relay device 30 destroys (deletes) the cached data in the following cases (a) to (c).
(a) Data has not been requested to be transferred for longer than a specific time. (Only the corresponding data is deleted; however, if there is remaining capacity in the cache memory, this does not always need to be executed.)
(b) The storage apparatus becomes non-responsive because its power is turned off or the like, and its entry is deleted from the route table. (All data of the corresponding storage apparatus are deleted.)
(c) The remaining capacity of the cache memory is smaller than the data requested to be cached. In this case, data (or plural data) having low access frequency is deleted for each storage controller unit; alternatively, if cache memory capacity is assigned to each storage in accordance with its access frequency, data (or plural data) having low access frequency is deleted without restriction to any particular storage controller unit.
Consequently, an operation for increasing the cache hit efficiency is carried out.
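As an illustration only, the three deletion cases (a) to (c) can be sketched in Python as follows. The class, field and method names are hypothetical, the access frequency is approximated by a simple hit counter, and case (c) is shown in the variant that deletes the least-accessed data regardless of which storage controller unit it belongs to.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    storage_id: str      # storage apparatus that owns the data
    size: int            # bytes held in the cache memory
    last_access: float   # time of the last transfer request (seconds)
    hits: int = 0        # access count, used here as the access frequency


class RelayCache:
    def __init__(self, capacity, idle_limit):
        self.capacity = capacity
        self.idle_limit = idle_limit
        self.entries = {}            # identification tag -> Entry

    def free(self):
        return self.capacity - sum(e.size for e in self.entries.values())

    def _drop(self, tags):
        for tag in tags:
            del self.entries[tag]
        return tags

    def expire_idle(self, now):
        """Case (a): delete data not requested for longer than idle_limit."""
        return self._drop([t for t, e in self.entries.items()
                           if now - e.last_access > self.idle_limit])

    def drop_storage(self, storage_id):
        """Case (b): delete all data of a storage apparatus removed from the route table."""
        return self._drop([t for t, e in self.entries.items()
                           if e.storage_id == storage_id])

    def make_room(self, needed):
        """Case (c): delete least-accessed data until `needed` bytes are free."""
        evicted = []
        for tag in sorted(self.entries, key=lambda t: self.entries[t].hits):
            if self.free() >= needed:
                break
            evicted.extend(self._drop([tag]))
        return evicted
```

Each method returns the deleted identification tags, so that they could be reported to the storage controller unit and removed from its cache log.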
(4) Report of Cache Memory Capacity Which can be Assigned to a Storage Controller Unit and Sending of Capacity Confirmation Request Message
In the case where plural storage controller units 12 are connected under the Frame relay device 30, if the cached data on other storage controller unit is constantly destroyed in accordance with a new cache request, the overhead increases and the response may slow down.
Therefore, it is possible to provide a function that assigns a cache area of a specific capacity to each storage controller unit 12 in advance. The storage controller unit 12 grasps the capacity assigned to itself by inquiry and, to optimize its algorithm, performs control so as not to make more cache requests to the Frame relay device 30 than necessary.
If there are not many subject storage controller units 12, or if capacity becomes available in the assignment area because an area is released by power OFF or the like, a confirmation request about the assigned capacity is sent from the Frame relay device 30 to the storage controller unit 12. Then, the storage controller unit 12 inquires of the Frame relay device 30 about the capacity that can be assigned, and readjusts the cache control algorithm on the side of the storage controller unit 12.
When a new storage controller unit 12 is initialized, cache memory capacity is newly assigned thereto. If it becomes necessary to reduce the capacity of a storage controller unit 12 that has already been connected, the confirmation request for the assigned capacity is also sent from the Frame relay device 30 to that storage controller unit 12. Then, the storage controller unit 12 inquires again, and at the same time the identification tags and the like of the cached data deleted when the area was released are reported in response to a request from the storage controller unit 12.
For this function, the storage controller unit 12 determines whether or not the Frame relay device 30 is equipped with it. Additionally, the presence of the function can be verified using switch management protocol (SMP) or the like.
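As an illustration only, the capacity assignment and confirmation request described in this section can be sketched in Python as follows. The class names are hypothetical, and the equal split of the cache memory among the connected storage controller units is an assumption made for the sketch; the specification does not prescribe how the capacity is divided.

```python
class RelayQuota:
    """Frame relay device side: assigns cache capacity to storage controller units."""

    def __init__(self, total):
        self.total = total
        self.assigned = {}   # controller id -> assigned bytes
        self.notices = []    # controller ids that were sent a confirmation request

    def _rebalance(self, newcomer=None):
        if not self.assigned:
            return
        share = self.total // len(self.assigned)   # equal split (assumption)
        for cid, old in list(self.assigned.items()):
            self.assigned[cid] = share
            # Already connected units whose capacity changed are asked to inquire again.
            if old != share and cid != newcomer:
                self.notices.append(cid)

    def attach(self, cid):
        self.assigned[cid] = 0
        self._rebalance(newcomer=cid)

    def detach(self, cid):
        del self.assigned[cid]
        self._rebalance()

    def inquire(self, cid):
        return self.assigned[cid]


class Controller:
    """Storage controller unit side: never requests more than its assigned capacity."""

    def __init__(self, cid, relay):
        self.cid, self.relay, self.cached = cid, relay, 0
        self.limit = relay.inquire(cid)

    def may_request_cache(self, nbytes):
        return self.cached + nbytes <= self.limit

    def on_confirmation_request(self):
        self.limit = self.relay.inquire(self.cid)
```

In this sketch the Frame relay device only records that a confirmation request was sent; the storage controller unit then inquires again and readjusts its own limit, which keeps the decision logic on the storage controller side.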
[Procedure of Processing from the Storage Apparatus to the Frame Relay Device (Linkage Operation Between the Storage Apparatus and the Frame Relay Device)]
<1> Processing for Making the Frame Relay Device Cache
An example of processing for making the Frame relay device cache will be described with reference to
This processing is carried out as the operations of the storage apparatus 10 (storage controller unit 12) and the Frame relay device 30 at the time of data read from the host machine 50.
(1) The storage controller unit 12 receives a read command based on a read request from the host machine 50.
(2) The storage controller unit 12 searches a cache table in the Frame relay device 30 on a route of the host machine 50 in order to recognize the route of the host machine 50.
(3) The storage controller unit 12 issues a cache command (data tag, memory byte count and the like) to the Frame relay device 30 on the route of a target host machine 50, prior to data transmission, in order to have the subsequent data cached. At this time, the command is transmitted to the Frame relay device 30 by specifying an ID of the Frame relay device 30. Further, for example, while the data transmission is carried out with fiber channel or the like, the command may be sent or received over another I/F, for example, LAN. In the case of large data, the storage controller unit 12 covers its own access response time by commanding the Frame relay device 30 to cache part of that data.
(4) The storage controller unit 12 transmits data to the Frame relay device 30.
(5) The Frame relay device 30 transmits data to the host machine 50 and caches the specified data in association with a tag. Part of the frame header and the like at this time is stored as a template for reuse. At this time, in response to the cache request from the storage controller unit 12, the Frame relay device 30 reports a status to the storage controller unit 12. If the cache process is successful, the identification tag created by the Frame relay device 30 is reported to the storage controller unit 12 together with the data tag from the storage controller unit 12.
If the cache process is not successful, there are two modes (a), (b) as follows (these may also occur at times other than a failure).
(a) A status of cache failure is sent back. This covers the case where only less data than the byte count requested by the storage controller unit 12 could be stored as well as the case where nothing could be stored. If partial storage succeeds, information containing the number of bytes stored by the Frame relay device 30 is reported at the same time. In this case, for block-type SAN storage, the byte count to be stored is preferably an integral multiple of the block length handled by the storage controller unit 12. For file-type NAS, it is preferably in units of bytes or an integral multiple of the data length excluding the header of the IP frame.
(b) Old data corresponding to an ID of the storage controller unit 12 is cleared to cache new data and the identification tag of the new data and the deleted identification tag (or plural tags) of the old data are reported. The storage controller unit 12 deletes the corresponding entry from an internal table (deletes a cache log) in order to indicate that the corresponding data is not cached based on information of the tag deleted from the cache memory 35 by the Frame relay device 30.
(6) If the cache request to the Frame relay device 30 is completed normally, the storage controller unit 12 destroys, from the cache memory 17 of the storage controller unit 12, the data that the Frame relay device 30 was requested to cache, releases the data area, and then stores only a cache log (data identification tag, data length and the like).
(7) The storage controller unit 12 reports a normal end status to the host machine 50.
Consequently, the processing for caching in the Frame relay device 30 is completed.
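As an illustration only, steps (1) to (7) above can be sketched in Python as follows. The class names, the 512-byte block length and the "relay-N" identification tag format are hypothetical. The sketch models mode (a) (a failure status carrying the number of bytes actually stored) but not mode (b), and it replaces the network transfer by a simple list of delivered data.

```python
import itertools

BLOCK = 512  # assumed block length handled by the storage controller unit


class Relay:
    """Frame relay device: forwards data and caches it on command."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = {}                  # identification tag -> cached bytes
        self.delivered = []              # data forwarded to the host machine
        self._ids = itertools.count(1)

    def cache_and_forward(self, data_tag, data, byte_count):
        self.delivered.append(data)      # step (5): relay the data to the host
        free = self.capacity - sum(len(v) for v in self.cache.values())
        stored = min(byte_count, free) // BLOCK * BLOCK   # whole blocks only
        status = {"data_tag": data_tag, "stored": stored,
                  "ok": stored == byte_count}
        if stored:
            ident = f"relay-{next(self._ids)}"
            self.cache[ident] = data[:stored]
            status["ident"] = ident      # tag created by the relay device
        return status


class Controller:
    """Storage controller unit: issues the cache command and keeps a cache log."""

    def __init__(self, relay):
        self.relay = relay
        self.local_cache = {}            # data tag -> bytes (cache memory 17)
        self.cache_log = {}              # data tag -> (relay tag, length)

    def handle_read(self, data_tag, data):
        self.local_cache[data_tag] = data
        want = len(data) // BLOCK * BLOCK          # steps (3), (4)
        status = self.relay.cache_and_forward(data_tag, data, want)
        if want and status["ok"]:                  # step (6)
            rest = data[want:]
            if rest:
                self.local_cache[data_tag] = rest  # keep only the uncached tail
            else:
                del self.local_cache[data_tag]
            self.cache_log[data_tag] = (status["ident"], status["stored"])
        return "normal end"                        # step (7)
```

After a successful cache command the storage controller unit keeps only the log entry, which is what frees its own cache memory 17 for other data.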
<2>Processing in Case for Transmitting the Data Cached in the Frame Relay Device to the Host Machine
An example of the processing for making the Frame relay device transmit the cached data will be described with reference to
This processing is executed as operations of the storage apparatus 10 (storage controller unit 12) and the Frame relay device 30 at the time of data read from the host machine 50.
(1) The storage controller unit 12 receives a read command based on a read request from the host machine 50.
(2) The storage controller unit 12 searches the cache table of the Frame relay device 30 on a route to the host machine 50 to recognize the route to the host machine 50.
(3) If corresponding data exists in the cache memory 35 of the Frame relay device 30, the storage controller unit 12 issues a cache data transmission command (host address, data identification tag, transmission byte count and the like) to the Frame relay device 30. At this time, if trouble has occurred in the cache memory 35 of the Frame relay device 30, the Frame relay device 30 responds to the data transmission instruction from the storage controller unit 12 that execution is impossible; the storage controller unit 12 then executes the data transmission itself and can instruct the Frame relay device 30 to cache the corresponding data. Likewise, if the corresponding data has been lost (volatilized) due to power OFF/ON of the Frame relay device 30, the Frame relay device 30 reports that execution is impossible. In this case, the storage controller unit 12 can instruct the Frame relay device 30 to cache again the data sent by the storage controller unit 12.
(4) If data exists in the cache memory 35 of the Frame relay device 30, the Frame relay device 30 transmits that data to the host machine 50 (part of the header portion and the like is automatically generated).
(5) If data not existing in the cache memory 35 of the Frame relay device 30 is included, the storage controller unit 12 accesses remaining data ((2), (3) are executed at the same time).
(6) If transmission of all data in the Frame relay device 30 is completed, the Frame relay device 30 reports data transmission termination to the storage controller unit 12 or a status for announcing the termination in advance.
(7) If there exists remaining data, the storage controller unit 12 transmits the remaining data to the host machine 50 subsequent to data generated by the Frame relay device 30.
(8) The storage controller unit 12 reports normal end status to the host machine 50. Consequently, the processing of transmitting the data cached in the Frame relay device 30 to the host machine 50 is completed.
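The read path in steps (1) to (8) above can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the class names `RelayDevice` and `StorageController`, the `read` method and the use of a dictionary for the cache memory 35 are hypothetical, and the sketch assumes that data not yet cached is cached in the Frame relay device on its first read. It shows the cache hit case, and the fallback in which the Frame relay device reports that execution is impossible (for example after power OFF/ON) and the storage controller unit sends the data itself and has it cached again.

```python
from dataclasses import dataclass, field


@dataclass
class RelayDevice:
    """Hypothetical model of the Frame relay device 30 and its cache memory 35."""
    cache: dict = field(default_factory=dict)  # data identification tag -> data

    def power_cycle(self):
        # Cached data is volatilized by power OFF/ON.
        self.cache.clear()

    def store(self, tag, data):
        # Cache command from the storage controller unit.
        self.cache[tag] = data

    def transmit_cached(self, tag, host_inbox):
        # Cache data transmission command; False means "impossible to execute".
        if tag not in self.cache:
            return False
        host_inbox.append(self.cache[tag])
        return True


@dataclass
class StorageController:
    """Hypothetical model of the storage controller unit 12."""
    disk: dict
    cache_table: set = field(default_factory=set)  # tags believed cached in the relay

    def read(self, tag, relay, host_inbox):
        # Cache hit: only a short command goes to the relay, which sends the data.
        if tag in self.cache_table and relay.transmit_cached(tag, host_inbox):
            return "relay"
        # Miss, or relay reported failure: send the data and have it cached again.
        data = self.disk[tag]
        host_inbox.append(data)
        relay.store(tag, data)
        self.cache_table.add(tag)
        return "storage"
```

In this sketch the return value only records which side sent the data; the normal end status reported to the host machine is omitted.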
<3> Processing for Making the Frame Relay Device Destroy Cached Data (at the Time of Data Updating)
An example of the processing for destroying data cached by the Frame relay device (updating data) will be described with reference to
This processing is executed as operations of the storage device 10 (storage controller unit 12) and the Frame relay device 30 at the time of data write from the host machine 50.
(1) The storage controller unit 12 receives a write command due to a write request from the host machine 50.
(2) The storage controller unit 12 searches the cache table of the Frame relay device 30 on a route to the host machine 50 in order to recognize the route to the host machine 50.
(3) When old data of the corresponding data exists in the cache memory 35 of the Frame relay device 30, the storage controller unit 12 issues a cached area release command for the corresponding data to the Frame relay device 30 in order to instruct the Frame relay device 30 to destroy (delete) the cached data corresponding to the old data of the updated data.
(4) The Frame relay device 30 deletes the corresponding data in the cache memory 35 and releases the cached area.
(5) The Frame relay device 30 reports a deletion completion status of the corresponding data to the storage controller unit 12. Consequently, the processing of case for destroying data cached in the Frame relay device 30 (at the time of updating data) is completed.
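As a rough illustration of steps (1) to (5) above, the following sketch invalidates the stale copy in the Frame relay device before the update is stored. The names `RelayCache`, `release` and `handle_write` are hypothetical, and the point at which the new data is written to the disk is an assumption, since the steps above do not state it.

```python
class RelayCache:
    """Hypothetical stand-in for the cache memory 35 of a Frame relay device."""

    def __init__(self):
        self.entries = {}  # data identification tag -> cached data

    def release(self, tag):
        # Cached area release command: delete the entry and report a status.
        self.entries.pop(tag, None)
        return "deleted"


def handle_write(tag, new_data, disk, cache_table, relay):
    """Invalidate old cached data in the relay, then store the update."""
    status = None
    if tag in cache_table:
        # Old data exists in the relay: instruct it to release the cached area.
        status = relay.release(tag)
        cache_table.discard(tag)
    disk[tag] = new_data  # assumed ordering; not specified in the steps above
    return status
```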
<4> Processing for Destroying the Data Cached in the Frame Relay Device (at the Time of Storage Apparatus Initialized)
An example of the processing for destroying data cached in the Frame relay device (at the time of storage apparatus initialized) will be described with reference to
This processing is executed as operations of the storage apparatus 10 (storage controller unit 12) and the Frame relay device 30 at the time of initialization of the storage apparatus.
(1) The storage apparatus 10 is initialized (including restart). At this time, the cache table is initialized.
(2) The storage controller unit 12 broadcasts its own ID information and all cached area release command in order to instruct all the Frame relay devices 30 to clear the cached area relating to itself.
(3) The Frame relay device 30 deletes the corresponding data in the cache memory 35 and releases the corresponding cached area. In this case, although it is not necessary to report the termination of the processing, the Frame relay device 30 may report the end of the processing so that the storage controller unit 12 can recognize the Frame relay device 30 on the route. Further, instead of broadcasting, the host machine 50 may transmit separately a command to the Frame relay device 30 on the route when host machine 50 recognizes the storage controller unit 12.
As a result, the processing in the case of destroying the data cached in the Frame relay device 30 (at the time of storage apparatus initialized) is completed.
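The broadcast release in steps (1) to (3) above might be sketched as follows. The function name `broadcast_release_all` and the keying of cached entries by a (storage ID, data identification tag) pair are assumptions made for illustration; the optional end report is modeled as the list of relay names returned.

```python
def broadcast_release_all(storage_id, relays):
    """Broadcast the storage ID and an 'all cached area release' command.

    relays maps a hypothetical relay name to its cache, a dict keyed by
    (storage_id, tag). Each relay deletes only entries of that storage.
    Returns the names of the relays that sent an end report.
    """
    reported = []
    for name, cache in relays.items():
        stale = [key for key in cache if key[0] == storage_id]
        for key in stale:
            del cache[key]
        reported.append(name)  # optional end report, lets the storage learn the route
    return reported
```

Entries cached on behalf of another storage apparatus are left untouched, which is the reason the storage ID accompanies the command.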
<5> Processing in Case of Instructing the Frame Relay Device to Compare Cached Data
An example of the processing in case of instructing the Frame relay device to compare the cached data will be described with reference to
When data read is executed from the host machine 50, if the same data is cached in the first Frame relay device (side near the host machine 50) and the second Frame relay device (side near the storage apparatus 10), this processing is executed as operations of the storage apparatus 10 (storage controller unit 12) and the first and second Frame relay devices 30.
(1) The storage controller unit 12 receives a read command based on a read request from the host machine 50.
(2) The storage controller unit 12 searches the cache table of the Frame relay device 30 on a route to the host machine 50 in order to recognize the route to the host machine 50.
(3) If a corresponding data exists in the cache memory 35 of the Frame relay device 30, the storage controller unit 12 issues a verification command for that corresponding data in order to instruct the first Frame relay device 30 (for check by comparing) about host address, data identification tag, transmission byte count, check destination Frame relay device ID and the like.
(4) If the corresponding data exists in the cache memory 35 of the Frame relay device 30, the storage controller unit 12 issues a cache data transmission command in order to instruct the second Frame relay device 30 (for data transmission) about host address, data identification tag, transmission byte count, check destination Frame relay device ID and the like.
(5) The second Frame relay device 30 that transmits data notifies the first Frame relay device 30 that checks data, prior to the data transmission, that transmission of the corresponding data is started.
(6) The second Frame relay device 30 transmits the corresponding data following the same procedure as subsequent to the above-described <2> (4).
(7) The first Frame relay device 30 relays the data to the host machine 50 and at the same time, compares with data in its own cache memory 35.
(8) Corresponding to a result of the check, the following operations (a), (b), (c) are carried out.
(a) If the check is normal, the first Frame relay device 30 reports normal end status to the storage controller unit 12 and the second Frame relay device 30 from which the data is transmitted. Further, the storage controller unit 12 reports a normal end status to the host machine 50.
(b) If the check is disabled, the first Frame relay device 30 reports abnormal end status due to disabled check to the storage controller unit 12 and the second Frame relay device 30 from which the data is transmitted. In this case, the Frame relay device 30 of the sending destination deletes the corresponding cached data without any data sending. Further, the storage controller unit 12 can instruct the Frame relay device (Frame relay devices) 30 on the route to cache the corresponding data as well as transmit data to the host machine 50.
(c) If the check is abnormal, the first Frame relay device 30 reports the abnormal end status due to check abnormality to the storage controller unit 12 and the second Frame relay device 30 from which the data is transmitted. In this case, because transmission of data to the host machine 50 is completed, the storage controller unit 12 reports the abnormal end status to the host machine 50. Further, the Frame relay devices 30, which are sending destination and sender, delete the corresponding cached data. Corresponding to a re-execution request from the host machine 50, the storage controller unit 12 can instruct the Frame relay devices 30, which are sending destination and sender, to cache the corresponding data as well as transmit data.
In the above-described example, when the Frame relay device 30 reports check abnormality to the storage controller unit 12, the remaining data subsequent to the abnormal detection may be transferred to the storage controller unit 12 by reporting the data position and the like of a final frame which is checked normally to the storage controller unit 12.
Consequently, the processing in case of instructing the Frame relay device 30 to compare the cached data is completed.
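The three outcomes (a), (b) and (c) of the comparison check can be illustrated with a small sketch of the first Frame relay device. The function name `relay_and_check`, the status strings and the byte-offset return value are hypothetical. The offset stands for the position up to which data was checked normally, in the spirit of the variation described above in which the data position of the final normally checked frame is reported.

```python
def relay_and_check(frames, own_copy, host_inbox):
    """Relay frames to the host while comparing them with the relay's own copy.

    Returns (status, offset): status is "normal", "check_disabled" or
    "abnormal"; offset is the byte position of the first mismatch, i.e. the
    end of the data checked normally, and is None unless status is "abnormal".
    """
    if own_copy is None:
        # (b) The check cannot be performed; nothing is sent to the host.
        return "check_disabled", None
    offset = 0
    mismatch_at = None
    for frame in frames:
        host_inbox.append(frame)  # relay and compare at the same time
        if mismatch_at is None and own_copy[offset:offset + len(frame)] != frame:
            mismatch_at = offset
        offset += len(frame)
    if mismatch_at is None and offset != len(own_copy):
        mismatch_at = offset  # lengths differ although every frame matched
    if mismatch_at is None:
        return "normal", None  # (a)
    return "abnormal", mismatch_at  # (c) data has already reached the host
```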
(1) In case of a cache hit, because nothing but a short frame for instruction/response to/from the Frame relay device 30 is transmitted or received between the storage apparatus 10 and the Frame relay device 30, band occupation rate can be reduced with respect to the case of data transmission. Further, because generally the transmission bandwidth of the Frame relay device 30 is wider than the transmission bandwidth of the storage apparatus 10, the same effect as improvement of the transmission performance of the storage apparatus 10 can be obtained. Further, because the Frame relay device 30 executes nothing but a specified response, the hardware (circuit scale) can be constructed relatively simply.
(2) According to this method, the response of the entire system to a read request from the host machine 50 can be improved without consuming data transmission band. Also by assigning the usage rate of the cache memory for read and write largely to write on the side of the storage apparatus 10, the performance of the write processing can be improved. Alternatively, the number of the memories provided in the storage apparatus 10 can be reduced and there is an advantage for contributing to reduction of cost.
(3) In an application having plural ports and for executing a processing from the plural host machines 50, by installing the Frame relay device 30 on the route to each host machine, necessary data is cached for each host machine on each route. Consequently, further improvement of response can be expected.
(4) If plural Frame relay devices 30 on the route are made to cache the same data, the improvement of data reliability can be achieved by instructing the Frame relay device 30 on the side near the host machine 50 to execute comparison check on data so as to check data from a Frame relay device 30 located on the lower level.
(5) By providing the battery 32 in the Frame relay device 30 to incorporate the UPS, even under power failure, a shut-down processing from the host machine 50 connected to the UPS can be relayed appropriately. Further, a destage request to the storage apparatus 10 can be transmitted independently.
This application example refers to an example in which when a route table is created within the storage apparatus according to the table creation method of the above-described embodiment, automatic load balance control means is realized by the procedure described later and the function provided for the Frame relay device.
An example of a model of the storage network system of this application example will be described with reference to
The model of the storage network system of this application example comprises a storage apparatus 10 constituted of a disk array unit and the like, eight Frame relay devices A-H (30) for relaying data write/read request from the host machine to this storage apparatus 10 and six host machines A-F (50) constituted of information processing units connected to this Frame relay device 30 and the like.
An example of the processing in case of realizing automatic load balance control will be described with reference to
(1) Following the table creation method of the above-described embodiment, the storage apparatus 10 recognizes the Frame relay device 30 located on the route of the host machine 50.
(2) If a substitutive port (sub-port: second port) for a first port is set up in the storage apparatus 10, the storage apparatus 10 searches the Frame relay device 30 on the route of the node A using the sub-port and specifies, for example, a Frame relay device B corresponding to a branch between the first port and the sub-port. That is, it inquires a Frame relay device E about the position (port) and distance (number of repeaters of the Frame relay devices) of the Frame relay devices A, B, D in
(3) If the transmission load to the first port is more than a specific capacity, the storage apparatus 10 sets up a permission of assigning data to the sub-port to the Frame relay device B.
For example, if the Frame relay device 30 controls the load balance when the write load from the host machine 50 is excessive, the following occurs. If the data transmission load of a port X leading to the first port of the storage apparatus 10 exceeds, or is estimated to exceed, a value set by the storage apparatus 10, the Frame relay device B outputs part of the data to a port Y leading to the sub-port of the storage apparatus 10 so as to keep the transmission load from the host machine 50 below a specified level.
(4) For example, if the storage apparatus 10 controls the load balance when the read load from the host machine 50 is excessive, the following occurs. If the data transmission load of the first port of the storage apparatus 10 exceeds, or is estimated to exceed, a value specified in the storage apparatus 10, the storage apparatus 10 outputs part of the data to the sub-port of the storage apparatus 10 so as to keep the transmission load of the first port below a specified level. Further, the Frame relay device A and the Frame relay device E relay data of each port to a target node (normal operation).
As an effect of this application example, by recognizing the Frame relay device 30 and the port with bridge ID or the like, automatic load balance control can be achieved without necessity of any support such as a program of the host machine 50.
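The write-side control performed by the Frame relay device B can be sketched as a simple threshold rule. This is only one possible reading: the functions `choose_port` and `distribute`, the measurement of load as accumulated frame sizes within one interval, and the parameter names are all assumptions for illustration.

```python
def choose_port(load_x, frame_size, limit, sub_port_permitted):
    """Pick the output port for one frame.

    Port X leads to the first port of the storage apparatus and port Y to the
    sub-port. Port Y is used only if the storage apparatus has permitted it
    and sending on X is estimated to exceed the limit it set.
    """
    if sub_port_permitted and load_x + frame_size > limit:
        return "Y"
    return "X"


def distribute(frame_sizes, limit, sub_port_permitted=True):
    """Assign frames of the given sizes to ports and return the load per port."""
    loads = {"X": 0, "Y": 0}
    for size in frame_sizes:
        port = choose_port(loads["X"], size, limit, sub_port_permitted)
        loads[port] += size
    return loads
```

With four frames of size 4 and a limit of 10, the first two frames go to port X and the remaining two are diverted to port Y; without the permission set in step (3), all four stay on port X.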
This application example refers to an example of a configuration in which a Frame relay device is disposed at the back end and the memory capacity on the Frame relay device is increased greatly so that the large capacity memory is used as a cache memory of the storage controller unit or as a shared memory. In this configuration, the cache memory and the like on the storage controller unit is not provided.
An example of mounting on the blade type system in the storage network system of this application example will be described with reference to
As for the mounting onto the blade type system, a storage controller unit, a back end type Frame relay device, a storage device unit and a front end type Frame relay device are disposed in order from the top in a chassis. A power supply unit is disposed on the rear face.
This system facilitates installation of additional memory during power ON or can expand a necessary additional cache memory in accordance with an expansion of the chassis. Further, by linkage with the disk control unit, tuning such as optimization of through-put can be carried out easily.
An example of the storage network system of this application example will be described with reference to
The storage network system of this application example comprises a storage apparatus 10a composed of a disk array unit and the like, two front end (F.E) type Frame relay devices A, B (30) disposed on the side of the host machine for relaying data write/read request from the host machine to this storage apparatus 10a, four back end (B.E) type Frame relay devices C-F(40) disposed on the side of the storage apparatus and four host machines A-D (50) each composed of information processing unit connected to the Frame relay device 30 and the like.
The back end type Frame relay device 40 controls two different interfaces for SAS I/F and serial I/F (PCI express) for memory control with independent switch function. This Frame relay device 40 has substantially the same configuration as the Frame relay device 30 described before (
Because the Frame relay device 30 for front end has the same structure and operation as the above-described Frame relay device 30, description thereof is omitted.
As for notification from this system to other system, rewrite of the memory or the like is notified to other system using a message transaction function such as PCI express. In the meantime, conversion between the PCI-express and SAS is not carried out (although the cache memory is shared, it operates as two independent switches). As for write to a duplicated section such as the cache memory, the relay apparatus 40 executes duplicating write with a header attached that carries a device identifier of the relay apparatus 40. That is, the frame is composed of data and header information comprising flag specifier, Frame relay device ID of own system device, address specifier, Frame relay device ID of other system device, address specifier, data length and the like. Consequently, a single data transmission can satisfy the requirement and traffic can be reduced further. Alternatively, it is possible to create a multi-cast group with the own system and other system Frame relay devices 40 to multi-cast data.
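The frame used for the duplicating write can be pictured with the following sketch. The field order follows the description above, but the field widths, the byte order and the names `HEADER`, `build_frame` and `parse_frame` are assumptions chosen only for illustration; the actual layout is not given in this description.

```python
import struct

# Assumed layout (big-endian, no padding): flag specifier (1 byte),
# own system device ID (2), own address specifier (4),
# other system device ID (2), other address specifier (4), data length (4).
HEADER = struct.Struct(">BHIHII")


def build_frame(flag, own_id, own_addr, other_id, other_addr, data):
    """Prefix the data with one header that addresses both systems."""
    return HEADER.pack(flag, own_id, own_addr, other_id, other_addr, len(data)) + data


def parse_frame(frame):
    """Split a frame into its header fields and data."""
    flag, own_id, own_addr, other_id, other_addr, length = HEADER.unpack_from(frame)
    data = frame[HEADER.size:HEADER.size + length]
    return {
        "flag": flag,
        "own": (own_id, own_addr),
        "other": (other_id, other_addr),
        "data": data,
    }
```

Because one header names the write address in both the own system and the other system device, the data portion needs to be transmitted only once.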
At the time of cache read, like the above-described embodiments, other system Frame relay device is instructed to send data and when it passes the own system Frame relay device, the own system Frame relay device verifies the data or the other system Frame relay device is instructed to send only a verification code of that data in order to make the data transferred by the own system Frame relay device verified by the data controller side. Consequently, both reduction of the frequency band and high reliability can be achieved without increasing data transmission volume largely.
An example of the structure of the back end type Frame relay device will be described with reference to
As shown in
In contrast to the single CPU structure, as shown in
In the meantime, if a protocol converting unit (I/F converting function) is mounted on the data controller 44 and the PCI-express is changed to LAN, a Frame relay device described later [application example 3] is produced.
An example of processing at the time of data write into a disk (HDD) of the storage device will be described with reference to
This processing is executed as operations of the storage apparatus 10a and the Frame relay device 40 at the time of data write.
(1) The storage apparatus 10a receives a write command from the host machine 50.
(2) The channel control unit 13 notifies the data controller 15 of receiving the write command and transmission data length.
(3) If the received write data is new data and there is sufficient empty area in the cache memory 45 of the Frame relay device 40, the data controller 15 determines a cache area in the Frame relay device 40 for storing the data.
(4) The data controller 15 issues a data cache command to both the Frame relay devices 40 and notifies other system data controller 15 with message transaction.
(5) The channel control unit 13 transmits write data to the both Frame relay devices 40 through the data controller 15. At this time, if parity data is used, the data controller 15 generates parity and transmits to the Frame relay devices 40 of the both systems in the same way.
(6) The data controller 15 instructs the disk control unit 14 to write into disk and instructs the Frame relay device 40 to send data to the corresponding disk.
(7) The disk control unit 14 issues a disk write command. At this time, the Frame relay device 40 relays that command to that corresponding disk and subsequently, sends write data to the disk unit.
(8) After the data transmission processing is completed, the CPU 43 of the Frame relay device 40 reports a status to the data controller 15.
(9) Then, if it is completed normally, the disk unit records data and reports the status. At this time, the Frame relay device 40 relays the status to the disk control unit 14. Further, the disk control unit 14 reports execution termination to the data controller 15. Then, if it is completed normally, the processing at the time of data write to the disk is completed.
(10) If the received write data is not a new data or there is no sufficient empty area in the cache memory 45 of the Frame relay device 40, the data controller 15 issues cached area release command for the Frame relay device 40 to the unnecessary Frame relay devices 40 of both the systems and notifies the data controller 15 of other system through message transaction. Further, the CPU 43 of the Frame relay device 40 releases the corresponding cached area and reports the status to the data controller 15.
(11) Unless it is completed normally after the CPU 43 of the Frame relay device 40 releases the cached area, the data controller 15 notifies the data controller 15 of other system through message transaction. Further, the CPU 16 of the storage controller unit 12 executes error processing to the Frame relay device 40. Consequently, the processing is completed.
An example of processing at the time of data read from a disk (HDD) of the storage device will be described with reference to
This processing is executed as operations of the storage apparatus 10a and the Frame relay device 40 at the time of data read.
(1) The storage apparatus 10a receives a read command from the host machine 50.
(2) The channel control unit 13 notifies the data controller 15 of reception of read command and transmission data length.
(3) If there exists no corresponding data in the cache memory 45 of the Frame relay device 40 or there exists part of the corresponding data and reading of a portion other than cached data is carried out, the data controller 15 instructs the disk control unit 14 to read out from the disk and issues a cache command of the corresponding data to the Frame relay device 40.
(4) The disk unit transfers the data. Then, the Frame relay device 40 of the first port sends the data to the disk control unit 14 and the Frame relay device 40 of other system and caches the data. Further, the channel control unit 13 sends the data to the host machine 50.
(5) If transmission of all data is completed, the CPU 43 of each Frame relay device 40 reports a status to the data controller 15.
(6) If it is completed normally, the data controller 15 notifies the data controller 15 of other system through message transaction that the cached data is updated. Consequently, the processing at the time of data read from the disk is completed.
(7) In reading of case where complete corresponding data exists in the cache memory 45 of the Frame relay device 40 or reading cached data with part of the corresponding data left, the data controller 15 issues a cache data transmission command to the Frame relay device 40 of other system and a verification command for the corresponding data to the Frame relay device 40 of own system.
(8) The Frame relay device 40 of other system sends data to the Frame relay device 40 of own system and this Frame relay device 40 of own system compares it with the corresponding data in the cache memory 45. If the data match as a result of this comparison, the Frame relay device 40 of own system relays the data to the data controller 15 and this data controller 15 sends that data to the channel control unit 13. Then, the channel control unit 13 sends that data to the host machine 50.
(9) If transmission of all data is completed, the CPU 43 of each Frame relay device 40 reports a status to the data controller 15.
(10) If it is completed normally, the processing is finished. If the data verification is completed abnormally, the data controller 15 notifies the data controller 15 of other system of the abnormality generation through message transaction. Then, the CPU 16 implements error processing to the Frame relay device 40. Consequently, the processing is completed.
As described above, the disk control unit 14 of the storage apparatus 10a is released from data transmission so that multiplexing of command is facilitated thereby realizing effective use of the band.
This application example refers to an example in which the protocol converting unit is mounted on the back end type Frame relay device in order that data can be exchanged with conversion function between two systems interfaces that are provided independently in the above described application example 2.
An example of the storage network system of this application example will be described with reference to
The storage network system of this application example comprises a storage apparatus 10b composed of a disk array unit and the like, three front end (F.E) type Frame relay devices A-C (30) disposed on the side of the host machine for relaying data read/write requests from the host machine to the storage apparatus 10b, four back end (B.E) type Frame relay devices D-G (40a) disposed on the side of the storage apparatus and four host machines A-D (50) composed of information processing unit connected to the Frame relay device 30 and the like.
The back end type Frame relay device 40a has the protocol converting unit 48 as I/F conversion function. Because the Frame relay device 40a has the same structure and operation as the back end type Frame relay device 40 described above except in this protocol converting unit 48, description thereof is omitted. Further, because the front end type Frame relay device 30 has the same structure and operation as the above-described front end type Frame relay device 30, description thereof is omitted.
Generally, if the capacity of the cache memory is the same, there is such a problem that as the data capacity of the storage increases, the cache hit rate drops. However, because in this application example, the cache memory is extended to each expansion chassis of the HDD, the ratio between the cache memory capacity and the data capacity can be made constant, so that the drop of the cache hit rate can be prevented.
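The effect on the ratio between cache memory capacity and data capacity can be checked with a short worked example. The function names and the figures used (8 GB of cache, 1000 GB of disk per chassis) are hypothetical values chosen only to make the arithmetic concrete.

```python
from fractions import Fraction


def ratio_fixed_cache(cache_gb, chassis_count, disk_gb_per_chassis):
    """Cache-to-data ratio when the cache capacity stays fixed as chassis are added."""
    return Fraction(cache_gb, chassis_count * disk_gb_per_chassis)


def ratio_per_chassis_cache(cache_gb_per_chassis, chassis_count, disk_gb_per_chassis):
    """Cache-to-data ratio when every expansion chassis brings its own cache."""
    return Fraction(chassis_count * cache_gb_per_chassis,
                    chassis_count * disk_gb_per_chassis)
```

With a fixed 8 GB cache, going from one chassis to four lowers the ratio from 1/125 to 1/500, whereas with 8 GB of cache per chassis the ratio stays at 1/125 for any number of chassis.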
Although, according to the conventional method employing an FC loop for the back end, the expansion chassis can be connected only in series, according to this method the expansion chassis can be disposed in parallel. Hot swap of each expansion chassis, or maintenance thereof by separation, which used to be impossible, becomes possible. Further, there is such a merit that the expansion chassis other than the last one can be exchanged during power ON. Further, because pass-through traffic decreases as compared to the series connection, the transmission band can be used effectively. Additionally, the freedom in layout of the units is large. In the meantime, near-line storage such as SAS can be connected under the structure of WAN.
In this application example, a measure for carrying out protocol conversion between different type interfaces (LAN-FC) at a low load has been taken and this will be described below.
An example of the mechanism for facilitating the conversion with other serial data transmission protocol will be described with reference to
In the method using 10 b (bit) - 8 b conversion, which is used widely in fiber channel, a serial I/F for storage apparatus, and in SAS and the like, protocol conversion is not necessary; what is necessary is to pass protocols mutually, analyze the identifier of each protocol header and distribute it to an appropriate port. However, because in the case of LAN the above-mentioned 10 b-8 b conversion is often not used except in optical transmission interfaces such as 1000BASE-FX, a large-scale protocol conversion is necessary if no other measure is taken, which leads to an increase of circuit scale, thereby causing complexity and an increase of cost.
Thus, according to this method, as shown in
Data link between each port and device/other Frame relay device in initial condition is established according to the conventional method. For example, idle frame and R_RDY of its response are carried out only by the FC side link and are not outputted to the LAN side.
As a result, for example, when the network transfer speed of the LAN is higher than the transfer speed of SAS or the like, the time loss caused by the data volume increment due to frame divisions and by the overhead associated with data conversion can be cancelled. After the conversion, the data becomes physical layer data of the mating side. Thus, the conversion is enabled without much awareness of the protocol of the mating side. For example, only adjustment of a portion that is important in terms of timing is necessary. Of course, even if the protocol conversion according to an ordinary method is carried out, the above-mentioned configuration can be constructed.
An example of the processing for FC-LAN conversion will be described with reference to
This processing is carried out as an operation of the protocol converting unit 48 in the Frame relay device 40a when an FC frame is received.
(1) The Frame relay device 40a receives an FC frame.
(2) The protocol converting unit 48 analyzes the frame header by extracting destination address. Then, the protocol converting unit 48 searches in the route table of the destination address.
(3) If in case of LAN port output, the order set includes (Kxx.x), the protocol converting unit 48 converts (Kxx.x) to code D(xx.x) and outputs to a specified LAN port as the frame of a second MAC address. Then, if the frame conversion is completed, the processing of FC-LAN conversion is completed.
(4) If the output is not to a LAN port, the protocol converting unit 48 outputs a frame to a specified FC port. Consequently, the processing is completed.
(5) If the order set does not contain (Kxx.x), the protocol converting unit 48 accumulates first MAC address frame.
(6) If the frame conversion is completed, the protocol converting unit 48 outputs to a specified LAN port as first MAC address frame. Consequently, the processing is completed.
(7) If the frame conversion is not completed, the protocol converting unit 48 determines whether or not next order set contains (Kxx.x). If as a result of this determination, (Kxx.x) is contained, the protocol converting unit 48 outputs to a specified LAN port as first MAC address frame. If the frame conversion is not completed or even if next order set does not contain (Kxx.x), the protocol converting unit 48 moves a pointer to next order set and repeats the processing.
Consequently, due to the conversion function of the protocol converting unit 48, the two interface systems allow data to be brought to or received from each other.
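One possible reading of steps (1) to (7) above is sketched below. It is an interpretation rather than the circuit of the embodiment: ordered sets are modeled as lists of code names such as "K28.5" and "D21.4", the labels "MAC1" and "MAC2" stand for the first and second MAC addresses, and the function name `fc_to_lan` is hypothetical. Ordered sets without a K code are accumulated into a first MAC address frame, and an ordered set containing a K code is converted to D codes and output as a second MAC address frame.

```python
def fc_to_lan(ordered_sets, lan_output=True):
    """Convert a received FC frame, given as ordered sets, into output frames.

    Returns a list of (label, codes) pairs in output order.
    """
    if not lan_output:
        # Step (4): output unchanged to the specified FC port.
        return [("FC", [code for oset in ordered_sets for code in oset])]
    frames = []
    pending = []  # accumulated codes of the first MAC address frame
    for oset in ordered_sets:
        if any(code.startswith("K") for code in oset):
            if pending:
                # Step (7): the next ordered set has a K code, so flush first.
                frames.append(("MAC1", pending))
                pending = []
            # Step (3): replace each Kxx.x with Dxx.x, send as second MAC frame.
            converted = ["D" + code[1:] if code.startswith("K") else code
                         for code in oset]
            frames.append(("MAC2", converted))
        else:
            # Step (5): no K code, keep accumulating.
            pending.extend(oset)
    if pending:
        # Step (6): frame conversion completed, output what was accumulated.
        frames.append(("MAC1", pending))
    return frames
```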
The specific embodiments of the present invention which the present inventors have achieved have been described above. The present invention is not restricted to the above-mentioned embodiments but needless to say, may be modified in various ways within the scope not departing from the gist of the invention.
The present invention relates to a storage network system and is effective when it is applied to the storage network system, particularly a network system in which Frame relay devices are disposed between host machines and storage apparatus and can be widely applied to network Frame relay devices such as network switch, router in NAS and fiber channel switch, SAS expander in SAN and the like.
Number | Date | Country | Kind |
---|---|---|---|
2005-065717 | Mar 2005 | JP | national |