The present invention relates in general to a storage device control apparatus, and, more particularly, to a storage device control apparatus for use in a memory system having a memory module connected to a memory controller by a serial interface to a storage device for storing information based upon access to the storage device control apparatus.
Recently, the amount of data typically processed by a computer system has been rapidly increasing. As a storage system for managing such data, a large-scale storage system managed by a RAID (Redundant Arrays of Inexpensive Disks) method for providing a massive storage resource, called a mid-range class or an enterprise class, has been attracting attention. In order to efficiently use and manage such massive data storage, a technique for connecting a storage system, such as a disk array device, to an information-processing device by a SAN (Storage Area Network), to enable high-speed access to large amounts of data in the storage system, has been developed. On the other hand, a NAS (Network Attached Storage) for connecting a storage system to an information-processing device via a network using the TCP/IP protocol and the like to accomplish access at the file level from the information-processing device also has been developed.
As a primary storage device of a storage system in which high reliability is required, a registered DIMM (Dual In-line Memory Module) is used. In the registered DIMM, in order to ensure the reliability of the write data, an ECC memory is known in which the memory controller generates an ECC (Error Correcting Code) and stores the write data with the attached ECC in the registered DIMM. In the ECC memory, the ECC is read together with the write data during a read-access process, and, thus, a data error can be detected and corrected. Conventionally, parallel interface signal lines are connected between the memory controller and the memory module (for example, see Patent Document 1); and, even if one of the parallel interface signal lines is out of order, the error can be detected and corrected by the ECC. However, there is a limit to transmitting data at high speeds in a parallel interface, and it is difficult to mount the circuit components thereof due to a sharp increase in the wiring density. In addition, in a stub-type memory connection, a deterioration of the signals cannot be prevented from being generated due to multiple reflection of the signal from the stub bus, or due to a resistance component, a capacitance component and an inductance component of the stub bus.
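As a minimal illustration of how an ECC allows a single faulty bit (for example, one bit carried by a faulty parallel signal line) to be detected and corrected, the following sketch uses a Hamming(7,4) code. This is an assumption chosen for brevity; the code actually used in a registered DIMM is not specified here and is typically wider. The function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits (list of 0/1) into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (corrected data bits, error position); position 0 means no error."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the flipped bit
    if syndrome:
        c[syndrome - 1] ^= 1  # correct the single-bit error
    return [c[2], c[4], c[5], c[6]], syndrome


stored = hamming74_encode([1, 0, 1, 1])
received = list(stored)
received[4] ^= 1  # simulate one faulty bit on read
data, position = hamming74_decode(received)
```

A code of this kind corrects only one flipped bit per codeword, which is why it can mask one faulty line of a parallel interface but not the loss of a whole serial link.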
In view of this technical background, a connection in which a serial interface is provided between the memory controller and the memory module to improve the data transfer speed is currently being investigated. For example, in an FB-DIMM (Fully-Buffered DIMM), the interface with the memory controller is a serial interface similar to PCI Express, and the memory controller can be connected to the FB-DIMM in a point-to-point manner, whereby the data can be transmitted at high speeds, while signal deterioration due to the effect of the stub bus is avoided and the circuit components can be easily mounted.
Patent Document 1: Japanese Unexamined Patent Publication No. 2001-222472
The FB-DIMM for a server is used as a storage device in the storage device control apparatus for receiving an input/output request from an information-processing device to the storage device and for processing the access thereof, whereby the process can be performed at high speeds and the circuit components can be easily mounted.
In the case of using a parallel transmitting method, an error can be corrected by the ECC even if one of the interface signal lines is out of order; however, in the case of using the serial transmitting method, the data cannot be transmitted at all if an obstacle is generated in the interface signal line. In a storage system in which high reliability is needed, a sufficient fail-safe countermeasure must be prepared in advance against obstacles in the interface signal line.
Accordingly, an object of the present invention is to provide a storage device control apparatus which serves as a storage device for temporarily storing data, while being capable of ensuring a sufficient reliability even when a memory controller and memory modules are connected by a serial interface.
In order to solve the above-mentioned problems, a storage device control apparatus according to the present invention comprises a channel control unit for outputting an I/O request for accessing a storage device in response to a data input/output request in a file unit received from an information-processing device. The channel control unit includes a CPU for receiving the data input/output request in the file unit, an I/O processor for outputting the I/O request corresponding to the data input/output request in the file unit in response to an instruction from the CPU, and a memory system for temporarily storing information required for a file access process of the CPU and having a plurality of memory modules and a memory controller for controlling memory access to the plurality of memory modules. A command, an address and data are serially transmitted from the memory controller to each of the memory modules, each of the memory modules having a plurality of memory elements and a plurality of buffer units. Each of the buffer units receives and interprets the command serially transmitted from the memory controller, controls the memory access to each of the memory elements from the memory controller, and simultaneously converts the serially transmitted data into parallel data and transmits the parallel data to each of the memory elements. The memory modules and the memory controller are connected by duplicated serial interfaces.
Also, the above-mentioned memory system is not limited to a temporary storing device of the channel control unit for processing a file access request from the information-processing device, and various devices (a cache memory, a shared memory, and a temporary storing device of a disk control unit, etc.) can be applied.
The memory system comprises, for example, a plurality of memory modules and a memory controller for controlling memory access to the plurality of memory modules, and a command, an address and data are serially transmitted from the memory controller to each of the memory modules, each of the memory modules having a plurality of memory elements and a plurality of buffer units. Each of the buffer units receives and interprets the command serially transmitted from the memory controller, controls the memory access to each of the memory elements from the memory controller, and simultaneously converts the serially transmitted data into parallel data and transmits the parallel data to each of the memory elements. The memory modules and the memory controller are connected by duplicated serial interfaces.
Another configuration of the memory system comprises, for example, first systematic and second systematic memory-module groups, each composed of a plurality of memory modules, a memory controller for controlling memory access to the memory modules belonging to each of the first and second systematic memory-module groups, and a control circuit for controlling the switching of a memory access path from the memory controller to each memory module. A command, an address and data are serially transmitted from the memory controller to each of the memory modules, each of the memory modules having a plurality of memory elements and a plurality of buffer units. Each of the buffer units receives and interprets the command serially transmitted from the memory controller, controls the memory access to each of the memory elements from the memory controller, and simultaneously converts the serially transmitted data into parallel data and transmits the parallel data to each of the memory elements. The memory controller and each of the memory modules are connected to each other by a serial interface such that a first access path for performing a memory access from the memory controller to the memory module belonging to the first systematic memory module group and a second access path for performing a memory access from the memory controller to the memory module belonging to the second systematic memory module group are different from each other. The control circuit writes the same data as the data stored in the first systematic memory module through the first access path to the second systematic memory module through the second access path, when performing a write access from the memory controller to the first systematic memory module. Thereby, by dual writing of the data, the reliability of the memory system can be increased.
Another configuration of the memory system comprises, for example, a plurality of memory modules and a memory controller for controlling memory access to the plurality of memory modules, in which a command, an address and data are serially transmitted from the memory controller to each of the memory modules, each of the memory modules having a plurality of memory elements and a plurality of buffer units. Each of the buffer units receives and interprets the command serially transmitted from the memory controller, controls the memory access to each of the memory elements from the memory controller, and simultaneously converts the serially transmitted data into parallel data and transmits the parallel data to each of the memory elements. The memory controller comprises first and second memory interface units for loop-connecting with the buffer unit of each of the memory modules through a serial interface signal line. The memory controller can access each of the memory modules by any one of the first and second memory interface units. Thereby, by duplicating the access path such that the memory access can be performed in bi-directions of the loop-shaped serial interface, the reliability of the memory system can be increased.
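The effect of the loop connection can be sketched as follows. This is a simplified model under stated assumptions, not the claimed circuit: the modules are numbered 0 to N-1 along the loop, link i is the i-th hop counted from the first memory interface unit (so link N joins the last module to the second memory interface unit), and at most one link is broken. The function name and the string labels are hypothetical.

```python
def select_interface(num_modules, target, broken_link=None):
    """Pick the memory interface unit whose path to `target` avoids the broken link.

    Returns (interface name, list of link numbers traversed).
    """
    if not 0 <= target < num_modules:
        raise ValueError("target module out of range")
    if broken_link is not None and not 0 <= broken_link <= num_modules:
        raise ValueError("broken link out of range")

    # The path from the first interface unit crosses links 0..target.
    if broken_link is None or broken_link > target:
        return "first", list(range(0, target + 1))
    # Otherwise approach from the other end of the loop: links target+1..N.
    return "second", list(range(target + 1, num_modules + 1))


# With four modules and link 2 broken, modules 0 and 1 are reached from the
# first interface unit and modules 2 and 3 from the second.
choices = [select_interface(4, t, broken_link=2)[0] for t in range(4)]
```

Under the single-failure assumption every module remains reachable, because the two paths to a given module share no link.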
Another configuration of the memory system comprises, for example, a plurality of the memory modules and a memory controller for controlling memory access to the plurality of memory modules, wherein a command, an address and data are serially transmitted from the memory controller to each of the memory modules, each of the memory modules having a plurality of memory elements and a plurality of buffer units. Each of the buffer units receives and interprets the command serially transmitted from the memory controller, controls the memory access to each of the memory elements from the memory controller, and simultaneously converts the serially transmitted data into parallel data and transmits the parallel data to each of the memory elements. The memory controller and the buffer unit of each of the memory modules are connected to each other through duplicated serial interface signal lines. Thereby, by duplicating the serial interface signal line serially connecting a memory controller and a memory module, the reliability of the memory system can be increased.
The information-processing devices 1-3 (200) are connected with the storage system 600 through a Local Area Network (LAN) 400. The LAN 400 is a communication network, such as Ethernet (registered trademark) or FDDI, and the communication between the information-processing devices 1-3 (200) and the storage system 600 is performed by the TCP/IP protocol. The data access request (the data input/output request in a file unit; hereinafter referred to as the file access request) according to the designation of the file name with respect to the storage system 600 is transmitted from the information-processing devices 1-3 (200) to the below-mentioned channel control units CHN1-CHN4 (110).
The LAN 400 is connected to a backup device 910. The backup device 910 is, for example, a disk based device, such as a MO, CD-R or DVD-RAM, or a tape-based device, such as a DAT tape, a cassette tape, an open tape or a cartridge tape. The backup device 910 stores backup data corresponding to the data stored in the storage device 300 by communicating with the storage device control apparatus 100 through the LAN 400. In addition, the backup device 910 is connected with the information-processing device 1 (200), and it acquires the backup data corresponding to the data stored in the storage device 300 through the information-processing device 1 (200).
The storage device control apparatus 100 includes the channel control units CHN1-4 (110). The storage device control apparatus 100 controls the write access or read-access among the information-processing devices 1-3 (200), the backup device 910, and the storage devices 300 through the channel control units CHN1-4 (110) and the LAN 400. The channel control units CHN1-4 (110) receive a file access request from the information-processing devices 1-3 (200), respectively. In other words, the channel control units CHN1-4 (110) are allocated with network addresses (for example, IP addresses) on the LAN 400, respectively, and they are individually operated as a NAS. The NAS can individually provide the NAS service to the information-processing devices 1-3 (200) as if the NAS individually exists. By including the channel control units CHN1-4 (110) for individually providing the NAS service to one storage system 600, the NAS servers which were individually operated on separate computers in prior known systems are integrated in one storage system 600. Also, the storage system can be wholly managed and various settings and controls, or an efficiency of maintenance, such as obstacle management or version management, can be accomplished.
The information-processing devices 3-4 (200) are connected to the storage device control apparatus 100 through a SAN 500. The SAN 500 is a network for transmitting and receiving data to/from the information-processing devices 3-4 (200) using a block, which is a data management unit in the storage region provided by the storage device 300, as a unit. The communication performed between the information-processing devices 3-4 (200) and the storage device control apparatus 100 by the SAN 500 is performed according to a fiber channel protocol. The data access request (hereinafter referred to as a block access request) in a block unit is transmitted from the information-processing devices 3 and 4 (200) to the storage system 600 according to the fiber channel protocol.
The SAN 500 is connected with the backup device 900 corresponding to the SAN. The backup device 900 corresponding to the SAN communicates with the storage device control apparatus 100 through the SAN 500 and stores the backup data corresponding to the data stored in the storage device 300.
The storage device control apparatus 100 further includes channel control units CHF1-2 (110), in addition to the channel control units CHN1-4 (110). The storage device control apparatus 100 communicates with the information-processing devices 3 and 4 (200) and the backup device 900 corresponding to the SAN through the channel control units CHF1-2 (110) and the SAN 500.
Also, an information-processing device 5 (200) is connected directly with the storage device control apparatus 100 without using a network, such as the LAN 400 or the SAN 500. An example of the information-processing device 5 (200) is a mainframe computer. The communication between the information-processing device 5 (200) and the storage device control apparatus 100 is performed by a communication protocol, such as, for example, FICON (Fiber Connection; registered trademark), ESCON (Enterprise System Connection; registered trademark), ACONARC (Advanced Connection Architecture; registered trademark), or FIBARC (Fiber Connection Architecture; registered trademark). A block access request is transmitted from the information-processing device 5 (200) to the storage system 600 according to any one of the above-mentioned communication protocols.
The storage device control apparatus 100 communicates with the information-processing device 5 (200) through channel control units CHA1-2 (110).
The SAN 500 is connected with another storage system 610 provided at a location (a secondary site) remote from the location (a primary site) where the storage system 600 is positioned. The storage system 610 is used as a data replicating device having a replication or remote copy function. Also, the storage system 610 may be connected to the storage system 600 by a communication line, such as an ATM, in addition to the SAN 500. In this case, as the channel control unit 110 which is connected to the SAN 500, the channel control unit comprising an interface (channel extender) for using the communication line is used.
By integrating the channel control units CHN1-4 (110), the channel control units CHF1-2 (110) and the channel control units CHA1-2 (110) in the storage system 600, the storage system that is capable of being connected to dissimilar networks can be realized. That is, the storage system 600 is a SAN-NAS combined storage system, which is connected to the LAN 400 by using the channel control units CHN1-4 (110) and is connected to the SAN 500 by using the channel control units CHF1-2 (110).
A connection unit 150 connects each channel control unit 110, a shared memory 120, a cache memory 130 and each disk control unit 140 to each other. The transmission/reception of the command or the data among the channel control units 110, the shared memory 120, the cache memory 130 and the disk control units 140 is performed through the connection unit 150. The connection unit 150 is composed of a high-speed bus, such as a very high-speed crossbar switch, for transmitting the data by high-speed switching. Thereby, the communication performance among the channel control units 110 is remarkably improved and a high-speed file sharing function or high-speed failover can be accomplished.
The shared memory 120 and the cache memory 130 constitute the memory device shared by the channel control units 110 and the disk control units 140. The shared memory 120 is mainly used to store control information, a command, or the like, and the cache memory 130 is mainly used to store data. For example, in the case where a data input/output command which any channel control unit 110 receives from the information-processing device 200 is a write command, the corresponding channel control unit 110 writes the write command to the shared memory 120 and writes the write data received from the information-processing device 200 to the cache memory 130. On the other hand, the disk control unit 140 monitors the shared memory 120, and, if it is determined that the write command is written to the shared memory 120, the disk control unit 140 reads the write data from the cache memory 130 according to the write command and writes this data to the storage device 300.
On the other hand, in the case where a data input/output command which any channel control unit 110 receives from the information-processing device 200 is a read command, the corresponding channel control unit 110 writes the read command to the shared memory 120 and simultaneously determines whether or not the data to be read exists in the cache memory 130. Here, in the case where the data to be read exists in the cache memory 130, the channel control unit 110 reads the data from the cache memory 130 and transmits it to the information-processing device 200. In the case where the data to be read does not exist in the cache memory 130, the disk control unit 140, which detects that the read command has been written to the shared memory 120, reads the data to be read from the storage device 300, writes it to the cache memory 130, and writes a notification to that effect to the shared memory 120. If the channel control unit 110 detects, while monitoring the shared memory 120, that the data to be read has been written to the cache memory 130, the channel control unit 110 reads the data from the cache memory 130 and transmits it to the information-processing device 200.
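The write and read flows through the shared memory 120 and the cache memory 130 can be sketched with a small software model. This is only an illustration under stated assumptions: the class and method names are hypothetical, commands are keyed by a block address, and the disk control unit is invoked synchronously by a polling call, whereas in the apparatus it monitors the shared memory on its own.

```python
class StorageControlModel:
    """Toy model of the channel control unit / disk control unit hand-off."""

    def __init__(self):
        self.shared_memory = []   # queued commands (control information)
        self.cache_memory = {}    # block address -> data
        self.storage_device = {}  # block address -> data

    def channel_write(self, address, data):
        # The channel control unit posts the command and stages the data.
        self.shared_memory.append(("write", address))
        self.cache_memory[address] = data

    def channel_read(self, address):
        self.shared_memory.append(("read", address))
        if address not in self.cache_memory:
            # Cache miss: wait for the disk control unit to stage the data.
            self.disk_control_poll()
        return self.cache_memory[address]

    def disk_control_poll(self):
        # The disk control unit drains the commands found in shared memory.
        while self.shared_memory:
            op, address = self.shared_memory.pop(0)
            if op == "write":
                self.storage_device[address] = self.cache_memory[address]
            elif op == "read" and address not in self.cache_memory:
                self.cache_memory[address] = self.storage_device[address]


model = StorageControlModel()
model.channel_write(0x10, b"payload")
model.disk_control_poll()   # destage the write to the storage device
model.cache_memory.clear()  # force the next read to miss the cache
read_back = model.channel_read(0x10)
```

The model omits the completion notification that the disk control unit writes back to the shared memory; the synchronous poll stands in for it.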
The disk control unit 140 converts the data access request to the storage device 300, according to a logic address assignment transmitted from the channel control unit 110, into a data access request according to the physical address assignment and performs writing or reading of the data to/from the storage device 300 in response to the I/O request output from the channel control unit 110. In the case where the storage device 300 is composed of a RAID, the disk control unit 140 performs the data access according to the RAID. In addition, the disk control unit 140 performs the replication control or the remote copy control, in order to manage the replication of the data stored in the storage device 300, to control the backup, and to prevent the data from being lost due to fire (disaster recovery).
The storage device 300 comprises a single disk drive or a plurality of disk drives (physical volumes) and provides a storage region that can be accessed from the information-processing device 200. In the storage region provided by the storage device 300, a logic volume combining the storage space of a single physical volume or a plurality of physical volumes is set. As the logic volume set to the storage device 300, there is a user logic volume that can be accessed from the information-processing device 200 or a system logic volume used for control of the channel control unit 110. The system logic volume stores an operating system executed in the channel control unit 110. Also, each of the channel control units 110 is allocated a logic volume, among those provided by the storage device 300, which it can access. Of course, a plurality of the channel control units 110 can share the same logic volume.
In addition, as the storage device 300, for example, a hard disk device or a flexible disk device can be used. As the storage structure of the storage device 300, a RAID type disk array can be constructed by a plurality of the storage devices 300. Further, the storage device 300 and the storage device control apparatus 100 can be directly connected to each other, or they can be connected to each other through the network. Also, the storage device 300 may be formed so as to be integral with the storage device control apparatus 100.
A management terminal 160 is a computer device for maintaining and managing the storage system 600, and it is connected to each of the channel control units 110 and each of the disk control units 140 through an internal LAN 151. By manipulating the management terminal 160, an operator can set up the disk drive of the storage device 300 or the logic volume, and can install the micro program executed in the channel control unit 110 or the disk control unit 140.
Each of the channel control units CHN1 and CHN2 (110) comprises a network interface 111, a CPU (NAS processor) 112, a memory controller 113, a memory (a memory module) 114, an input/output control unit 115, and a converting circuit (a converting LSI) 116, all of which are formed on a single circuit substrate or a plurality of circuit substrates as an integral unit. The network interface 111 is a communication interface for communicating with the information-processing device 200 according to the TCP/IP protocol, and it is composed of, for example, a LAN controller. The CPU 112 is a processor for allowing the channel control unit 110 to function as a NAS server. The memory 114 stores various programs and data, for example, an operating system, a volume manager, a file system program, a RAID manager, an SVP manager, a file system protocol (NFS, Samba, etc.), a backup management program, an obstacle management program, a NAS manager, and a security management program. The memory controller 113 controls access to the memory 114 according to an instruction received from the CPU 112. The memory system 119 is composed of the memory 114 and the memory controller 113. The input/output control unit 115 includes an I/O processor 117 and an NVRAM (Non Volatile RAM) 118, and it transmits or receives the data or a command to or from the disk control unit 140, the cache memory 130, the shared memory 120 and the management terminal 160. The I/O request corresponding to the file access request is output by the I/O processor 117.
The channel control units CHN1 and CHN2 (110) comprising the cluster are constructed so as to mutually communicate the data by a signal line 110a, and thus they can share the data. In the case where the data is communicated between the channel control units CHN1 and CHN2 (110), since the distance between the units is long, a skew of the signal can be generated in the clock distribution structure. Accordingly, in the present embodiment, in consideration of the above-mentioned problem, a clock extracting structure is employed in the communication between the channel control units CHN1 and CHN2 (110). More specifically, since the memory 114 has a clock distribution structure operated by receiving the distributed clock signal from a clock generator 19 (See
At the lower portion of the management terminal 160, a slot is formed for mounting the board of the channel control unit 110. The board of the channel control unit 110 is a unit on which the circuit substrate of the channel control unit 110 is formed, and it is the unit mounted in the slot. In the storage system 600 related to the present embodiment, the number of slots is 8, and the boards of the channel control units 110 are mounted in the 8 slots as seen in
In the channel control unit 110 in each slot, a plurality of the channel control units 110 of the same kind compose a cluster. For example, a pair of the channel control units CHN1 and CHN2 (110) and a pair of the channel control units CHN3 and CHN4 (110) compose clusters. By composing a cluster, in the case where an obstacle is generated at any channel control unit 110 in the cluster, the process which was performed by the channel control unit 110 in which the obstacle is generated at that time can be continuously performed in the other channel control unit 110 in the cluster.
In the storage device control apparatus 100, power is supplied from two power supply systems in order to improve the reliability, and 8 slots in which a board of the channel control unit 110 is mounted are divided by 4 for the power supply system. Accordingly, in the case of composing a cluster, boards of the channel control units 110 each belonging to each of two power supply systems must be included. Thereby, even though an obstacle is generated in one power supply system thereby to stop supplying power, the power is continuously supplied to the board of the channel control unit 110 belonging to the other power supply system composing the same cluster, and, thus, the process at the channel control unit 110 can be continued.
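The constraint that a cluster must span both power supply systems can be checked mechanically. The sketch below is illustrative only and rests on an assumption not stated in the text: slots 0 to 3 are fed by power supply system 0 and slots 4 to 7 by power supply system 1. All names are hypothetical.

```python
NUM_SLOTS = 8


def power_system_of(slot):
    """Return the power supply system (0 or 1) assumed to feed a slot."""
    if not 0 <= slot < NUM_SLOTS:
        raise ValueError("slot out of range")
    return 0 if slot < 4 else 1


def cluster_is_valid(slots):
    """A cluster is valid only if its boards draw from both power systems."""
    return {power_system_of(s) for s in slots} == {0, 1}


def surviving_boards(slots, failed_power_system):
    """Boards of the cluster that keep running when one power system stops."""
    return [s for s in slots if power_system_of(s) != failed_power_system]


good_cluster = [0, 4]  # one board per power supply system
bad_cluster = [0, 1]   # both boards on the same power supply system
```

For a valid cluster, `surviving_boards` is never empty for a single power supply failure, which is what allows the remaining board to take over the processing.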
In addition, the channel control unit 110 is provided as a board which can be mounted in each slot, as mentioned above, and one board can be composed of a plurality of integrally formed circuit substrates. Also, while the other devices composing the storage device control apparatus 100, such as the disk control unit 140 or the shared memory 120, are not shown in
However, as the storage device control apparatus 100 and the storage device 300 accommodated in each chassis, conventional devices corresponding to the SAN can be used. Particularly, the connector of the board of the channel control units CHN1-CHN4 (110) is shaped such that it can be mounted in the slot formed in a conventional chassis, and, thus, the device can be more simply used. In other words, the storage system 600 of the present embodiment can be easily constructed by using an existing product.
The FB-DIMM 10 organizes a plurality of DRAMs, which are the memory elements, into a module. The FB-DIMM 10 includes 8 DRAMs 11-1, 11-2, . . . , 11-8 mounted on the DIMM module substrate and a buffer unit (advanced memory buffer) 12 for buffering the packets of the command, the address and the data that are serially transmitted from the memory controller 113 through the serial interface signal line 20, for interpreting the command, and for controlling the memory access to the DRAMs 11-1, 11-2, . . . , 11-8. An FB-DIMM 10 can be connected to the other FB-DIMMs 10 through the buffer unit 12 in a daisy chain manner. In the case where a plurality of the FB-DIMMs 10 are connected to each other through the buffer unit 12 in the daisy chain manner, the buffer unit 12 references the address transmitted from the memory controller 113, and, in the case that the command is not one transmitted to its own FB-DIMM 10, transmits the command to the next FB-DIMM 10. Also, there are various kinds of DRAMs that can be mounted on the FB-DIMM 10, and the structure mentioned herein is only an example thereof.
The buffer unit 12 comprises a pass-through circuit 13, a clock circuit 14, a deserializer/decode circuit 15, a data bus interface 16, a serializer 17, and a pass-through circuit 18. Whether or not the command, which is supplied from the memory controller 113 to the FB-DIMM 10 through the serial interface signal line 21, is a command transmitted to its own FB-DIMM 10 is determined by the pass-through circuit 13. In the case that it is not a command transmitted to its own FB-DIMM 10, the command passes through and is transmitted to the next FB-DIMM 10. In the case that it is a command transmitted to its own FB-DIMM 10, the command is interpreted in the deserializer/decode circuit 15. In the case that the command is a write command, the write address and the write data are supplied to the deserializer/decode circuit 15, subsequent to the corresponding command. The deserializer/decode circuit 15 outputs a control signal for effecting write access to the DRAMs 11-1, 11-2, . . . , 11-8 and converts the write data into parallel data to output it to the DRAMs 11-1, 11-2, . . . , 11-8 through the data bus interface 16. On the other hand, in the case that the command is a read command, the read address is supplied to the deserializer/decode circuit 15, subsequent to the corresponding command. The deserializer/decode circuit 15 outputs a control signal for effecting read-access to the DRAMs 11-1, 11-2, . . . , 11-8. The read data read from the DRAMs 11-1, 11-2, . . . , 11-8 is transmitted to the serializer 17 through the data bus interface 16 and is converted into serial data. The pass-through circuit 18 transmits the read data which has been read from the DRAMs 11-1, 11-2, . . . , 11-8 or the read data that is transmitted from a subsequent FB-DIMM 10 to the memory controller 113 through the serial interface signal line 22.
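The pass-through and serial-to-parallel behavior of the buffer unit 12 can be sketched in software. The model below is a rough illustration, not the actual packet protocol: the class name, the use of a module identifier to address a module, and the striping of one byte per DRAM are assumptions introduced here.

```python
class FBDimmModel:
    """Toy model of one FB-DIMM in a daisy chain."""

    NUM_DRAMS = 8

    def __init__(self, module_id, next_module=None):
        self.module_id = module_id
        self.next_module = next_module  # next FB-DIMM in the chain, if any
        self.drams = [dict() for _ in range(self.NUM_DRAMS)]

    def handle(self, target_id, command, address, data=None):
        # Pass-through: a command for another module is forwarded downstream.
        if target_id != self.module_id:
            if self.next_module is None:
                raise LookupError("no module with id %r in the chain" % target_id)
            return self.next_module.handle(target_id, command, address, data)

        if command == "write":
            if data is None or len(data) != self.NUM_DRAMS:
                raise ValueError("write data must be exactly 8 bytes")
            # Serial-to-parallel: one byte of the serial stream per DRAM.
            for lane, byte in enumerate(data):
                self.drams[lane][address] = byte
            return None
        if command == "read":
            # Parallel-to-serial: gather one byte from each DRAM in order.
            return bytes(self.drams[lane][address] for lane in range(self.NUM_DRAMS))
        raise ValueError("unknown command %r" % command)


# Three modules chained as 0 -> 1 -> 2; the memory controller talks to module 0.
chain = FBDimmModel(0, FBDimmModel(1, FBDimmModel(2)))
chain.handle(2, "write", 0x40, b"ABCDEFGH")
result = chain.handle(2, "read", 0x40)
```

In this model the read data simply returns up the call stack, which stands in for the upstream path through the pass-through circuit 18 of each preceding module.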
In addition, the clock circuit 14 generates the DRAM clock required for accessing the memory 114 on the basis of the reference clock generated by the clock generator 19. Also, the memory controller 113 performs a peer-to-peer communication with the FB-DIMM 10 through a SM (System Management) bus 22 to perform system management or power management. By employing the FB-DIMM 10 having a serial interface as the storage device in the storage device control apparatus 100, memory access of the storage device control apparatus 100 can be performed at high speeds and the circuit components can be easily mounted.
In the present embodiment, the serial interface signal line 20 is connected between the memory controller 113 and the FB-DIMM 10 and between the FB-DIMMs 10, and a “duplication of the serial interface” is performed. Here, the “duplication of the serial interface” refers to a fail-safe countermeasure by which the memory access can be performed from the memory controller 113 to the FB-DIMM 10 even if an obstacle is generated in the serial interface signal line 20, and it includes, for example, (1) duplication of the write data, (2) duplication of the access path, (3) duplication of the serial interface signal line, and combinations thereof. That is, the “duplication of the serial interface” covers the case in which a single serial interface between the memory controller 113 and the FB-DIMM 10 is duplicated (the above-mentioned case (3)), the case in which the access path formed by the serial interface connection between the memory controller 113 and the FB-DIMM 10 is duplicated (the above-mentioned case (2)), and the case in which the same data is written to and read from each of different FB-DIMMs 10 by the memory controller 113 through two systems of serial interfaces (the above-mentioned case (1)). Hereafter, each pattern will be described.
The process for dually writing the same write data to each of the memory module groups 41, 42 will be explained. For example, the memory controller 113 accesses the control circuit 31 and/or the control circuit 32 through the serial interface signal lines 20-1, 20-2, controls the switching control resource of the control circuit 31 and/or the control circuit 32, for example, the switching LSI, and appropriately selects the access path to the memory module groups 41, 42 so as to dually write the same write data to the memory module groups 41, 42. Alternatively, the memory controller 113 sends the write command, the address, and the write data to one or both of the control circuits 31 and 32, and the control circuit 31 or 32 dually writes the write data received from the memory controller 113 to the memory module groups 41, 42. In the latter case, for example, the memory controller 113 sends the write command, the address, and the write data to the control circuit 31, and the control circuit 31 writes the write data to the first systematic memory module group 41. In addition, the control circuit 31 transmits the write command, the address, and the write data to the control circuit 32 through the signal line 30, and the control circuit 32 writes the write data to the second systematic memory module group 42.
As the process of reading the write data from the memory module groups 41, 42 by the memory controller 113, various procedures can be considered. For example, the data may be read from any one of the systematic memory module groups 41, 42. Alternatively, the data may be read from both systematic memory module groups 41, 42, whereby, if the two copies are identical, it is determined that the data does not have an error, and, if the two copies are not identical, the error is corrected.
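The dual write and the comparing read can be sketched as follows. The text does not state how a mismatch between the two copies is corrected, so this sketch makes an assumption of its own: each copy is stored together with a CRC-32 checksum, and the copy whose checksum still matches is treated as correct and used to repair the other. The names `dual_write` and `dual_read` are hypothetical, and each memory module group is modeled as a dictionary.

```python
import zlib
from typing import Dict, Tuple

# One memory module group, modeled as address -> (data, crc32 of data).
Group = Dict[int, Tuple[bytes, int]]


def dual_write(group1: Group, group2: Group, address: int, data: bytes) -> None:
    """Write the same data, with its checksum, to both memory module groups."""
    record = (data, zlib.crc32(data))
    group1[address] = record
    group2[address] = record


def dual_read(group1: Group, group2: Group, address: int) -> bytes:
    """Read both copies; return the data if they agree, otherwise repair."""
    data1, crc1 = group1[address]
    data2, crc2 = group2[address]
    if data1 == data2:
        return data1                      # identical copies: no error

    # Assumed correction policy (not from the source): trust the copy whose
    # stored checksum still matches its data, and overwrite the other copy.
    ok1 = zlib.crc32(data1) == crc1
    ok2 = zlib.crc32(data2) == crc2
    if ok1 and not ok2:
        group2[address] = (data1, crc1)
        return data1
    if ok2 and not ok1:
        group1[address] = (data2, crc2)
        return data2
    raise IOError("copies differ and neither can be identified as correct")
```

Reading only one group, the first procedure mentioned above, is cheaper but gives up the comparison, which is the trade-off this sketch makes visible.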
As a fail-safe countermeasure when an obstacle is generated in any one of the serial interface signal lines 20-1, 20-2, the below-mentioned process will be considered. For example, when an obstacle is generated in the serial interface signal line 20-1 between the memory controller 113 and the control circuit 31 (the point A in
On the other hand, in the case where an obstacle has been generated at the point A, when the memory controller 113 performs a read-access, the read command and the address are sent to the control circuit 32 to read the data from the second systematic memory module group 42, or the read command and the address are transmitted from the control circuit 32 to the control circuit 31, such that the control circuit 31 reads the data from the first systematic memory module group 41. In the case where an obstacle has been generated at the point B in addition to the point A, the memory controller 113 may access the second systematic memory module group 42 through the control circuit 32.
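The path selection under such obstacles can be sketched as a small route table. The sketch assumes that the point A lies on the serial interface signal line 20-1, as stated above; because the location of the point B is not given in this passage, the second fault is modeled, purely as an assumption, as a failure of the signal line 30 between the control circuits 31 and 32. The names `ROUTES`, `select_route` and `select_read_source` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# Candidate routes from the memory controller 113 to each memory module
# group, listed as the signal lines they traverse, in order of preference.
ROUTES = {
    41: [["20-1"], ["20-2", "30"]],   # direct via circuit 31, or relayed via 32
    42: [["20-2"], ["20-1", "30"]],   # direct via circuit 32, or relayed via 31
}


def select_route(group: int, failed: Set[str]) -> Optional[List[str]]:
    """Return the first route to the group that avoids every failed line."""
    for route in ROUTES[group]:
        if not any(line in failed for line in route):
            return route
    return None


def select_read_source(failed: Set[str]) -> Tuple[int, List[str]]:
    """Choose a group and route for reading dually written data."""
    for group in (41, 42):
        route = select_route(group, failed)
        if route is not None:
            return group, route
    raise IOError("no memory module group is reachable")
```

With only the line 20-1 failed, the first group is still reachable by relaying through the control circuit 32; once the relay line is also lost, the read falls back to the second group, which holds the same data.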
The buffer unit 12 of the FB-DIMM 10 comprises memory interface units 61, 62 and a DRAM access unit 63. The DRAM access unit 63 comprises an ECC generating unit 64 for generating the ECC. The memory interface units 61, 62 are composed of the above-mentioned pass-through circuits 13, 18. The DRAM access unit 63 is composed of the above-mentioned clock circuit 14, the deserializer/decode circuit 15, the data bus interface 16, and the serializer 17. The memory controller 113 comprises the memory interface units 51, 52 which are connected to the serial interface signal lines 20-1, 20-2, respectively.
When a write access to the FB-DIMM 10 is performed, the ECC generating unit 64 calculates the ECC depending on the write data. As the ECC, for example, a Hamming code, with which a 2-bit error can be detected and a 1-bit error can be detected and corrected, is suitable. The ECC calculated by the ECC generating unit 64 is written to the DRAM 11-9 for storing the ECC. By providing the ECC function in the FB-DIMM 10, the ECC generating unit 64 need not be mounted on the memory controller 113, and, thus, the circuit size of the memory controller 113 can be reduced. If the circuit size of the memory controller 113 is reduced, the memory controller 113 can be mounted in the CPU 112, as shown in
The memory controller 113 is constructed such that memory access can be performed in both directions of the loop with respect to the loop-connected FB-DIMM 10. In this way, by duplicating the access path to the FB-DIMM 10, even if an obstacle is generated in one access path (memory controller 113→serial interface signal line 20-1→FB-DIMM 10), a memory access can be performed through the other access path (memory controller 113→serial interface signal line 20-2→FB-DIMM 10). For example, in a case in which an obstacle is generated between the memory controller 113 and the FB-DIMM 10 (the point C in
The interface switching unit 53 of the memory controller 113 selects any one of the memory interface units 51 and 52 to be used for the memory access according to the set content in the initial set register 54, when receiving a memory access request from the CPU 112. The initial set register 54 is set with information for which the memory controller 113 between the duplicated memory controllers 113 has an access right, as well as the information of the memory interface unit 51 or 52, which is generally used after performing a memory access. In the case in which the information for which the memory controller 113 has an access right is not set, the two duplicated memory controllers 113 may communicate with each other and determine which memory controller 113 has the access right. The process of switching the serial interface signal line 20-1, 20-2 used for the memory access is equal to the process of
Here,
When receiving a file access request of “write” from the information-processing device 1 (200), the channel control unit CHN1 (110) writes the file data to the memory 114 of the channel control unit CHN1 (110) and then writes the file data to the memory 114 of the channel control unit CHN2 (110) (S26). When a duplication of the file data is performed, the channel control unit CHN1 (110) writes the corresponding file data to the storage device 300. At this time, the channel control unit CHN1 (110) may write the corresponding file data to the storage device 300 after writing it to the cache memory 130, or it may directly write the corresponding file data to the storage device 300 without writing it to the cache memory 130. Here, in the case of writing the file data to the storage device 300 after writing it to the cache memory 130, the file data written to the cache memory 130 may be invalidated. Here, invalidation means allowing a dual storage of the data.
The channel control unit CHN1 (110) determines whether or not the file data to be read exists in the memory 114 of the channel control unit CHN1 (110), when receiving a file access request of “read” from the information-processing device 1 (200). In a case where the corresponding file data exists in the memory 114, the channel control unit CHN1 (110) transmits the file data read from the memory 114 to the information-processing device 1 (200). In a case where the file data to be read does not exist in the memory 114, the channel control unit CHN1 (110) determines whether the corresponding file data exists in the cache memory 130; and, if the corresponding file data exists in the cache memory 130, the channel control unit CHN1 (110) reads the corresponding file data from the cache memory 130 and transmits it to the information-processing device 1 (200). In a case where the corresponding file data does not exist in the cache memory 130, the channel control unit CHN1 (110) accesses the storage device 300, reads the corresponding file data therefrom, and transmits it to the information-processing device 1 (200). The channel control unit CHN1 (110) writes the file data acquired from the storage device 300 or the cache memory 130 to the memory 114 of the channel control unit CHN2 (110) (S26).
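The read procedure described above amounts to a lookup through three tiers. The following sketch models each tier as a dictionary; the function name `read_file` and its parameter names are hypothetical, and the copy to the peer memory follows only what is stated above (whether the data is also stored in the local memory 114 is not stated, so the sketch does not do so).

```python
from typing import Dict


def read_file(
    name: str,
    local_memory: Dict[str, bytes],   # memory 114 of CHN1
    cache: Dict[str, bytes],          # cache memory 130
    storage: Dict[str, bytes],        # storage device 300
    peer_memory: Dict[str, bytes],    # memory 114 of CHN2
) -> bytes:
    """Serve a "read" request from the nearest tier that holds the file."""
    # Tier 1: the memory 114 of the channel control unit itself.
    if name in local_memory:
        return local_memory[name]

    # Tier 2: the cache memory 130; tier 3: the storage device 300.
    if name in cache:
        data = cache[name]
    else:
        data = storage[name]          # raises KeyError if the file is absent

    # Data acquired from the cache or the storage device is also written
    # to the memory 114 of the other channel control unit.
    peer_memory[name] = data
    return data
```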
As an example of the process of dually writing the file data, there are a method A in which the file data temporarily stored in the memory 114 of the channel control unit CHN1 (110) is dually written to each of the cache memories 130-1, 130-2 and is not stored in the memory 114 of the channel control unit CHN1 (110), and a method B in which the data written to the memory 114 of the channel control unit CHN1 (110) is also written to the memory 114 of the channel control unit CHN2 (110) and is not dually written to the cache memories 130-1, 130-2. Also, in the case of employing the method B as the process of dually writing the file data, the number of wires in the connection unit 150 can be reduced, and, thus, the connection unit 150 can be miniaturized. Further, by dually writing the data to the memories 114 of the channel control units CHN1 and CHN2 (110), the response time for access from the information-processing device 1 (200) can be shortened.
In addition, in a case where the channel control unit CHN1 (110) receives a file access request of “write” from the information-processing device 1 (200), the file data (the write data) written to the storage device 300 may be dually written to the cache memories 130-1, 130-2, similar to the above-mentioned method A, or it may be dually written to the memories 114 of the channel control units CHN1 and CHN2 (110), similar to the above-mentioned method B. Also, in a case where the channel control unit CHN1 (110) receives a file access request of “read” from the information-processing device 1 (200), the file data (the read data) read from the cache memory 130 or the storage device 300 may be dually written to the cache memories 130-1, 130-2, similar to the above-mentioned method A, or it may be dually written to the memories 114 of the channel control units CHN1 and CHN2 (110), similar to the above-mentioned method B.
Also, when writing the data to the storage device 300, the data may be written to the cache memory 130 and then to the storage device 300, or the data may be directly written to the storage device 300 without being written to the cache memory 130. By storing data whose access frequency is high in the cache memory 130, the processing speed of the channel control unit 110 can be improved. In a case where the data is not written to the cache memory 130, a small capacity suffices for the storage resource of the storage device control apparatus 100.
The memory 114, in which the memory controller 113 and the serial interface are duplicated, is not limited to a temporary storage device of the CPU 112; the same configuration can be applied to any of the temporary storage devices of the storage device control apparatus 100 (for example, the shared memory 120, the cache memory 130, and the memories 143 of the disk control units 140).
The cache memories 130-1, 130-2 which make up the cluster can communicate with each other through the signal line 130a and can share data. In the case of sending the data between the cache memories 130-1, 130-2, since the distance therebetween is long, a skew may be generated in the clock distribution type. Accordingly, in the present embodiment, the clock extracting type is employed in the communication between the cache memories 130-1, 130-2. More specifically, since the memory 132 employs the clock distribution type, the clock distribution type is switched to the clock extracting type in the interface between the cache memories 130-1, 130-2. The data signal transmitted from the memory controller 131 to the memory 132 is 8B/10B-encoded and is embedded with a clock signal. The converting circuit 133 extracts the embedded clock signal by 10B/8B-converting (decoding) the data signal. The converting circuits 133 included in the cache memories 130-1, 130-2 are connected to each other through the signal line 130a. The cache memories 130-1, 130-2 can communicate data with each other through the signal line 130a. For example, the memory controller 131 of the cache memory 130-1 can perform a memory access to the memory 132 of the cache memory 130-2. In addition, the cache memories 130-1, 130-2 perform a heartbeat communication through the signal line 130a, and, thus, each can detect an obstacle in the other.
In a pair of cache memories 130-1, 130-2, a duplication of data is performed such that the data written to one cache memory 130-1 is also written to the other cache memory 130-2. By this structure, although an obstacle may be generated in one cache memory 130-1, the storage system 600 can continuously perform the process, which has been performed until that time, by using the data stored in the cache memory 130-2. The data duplicated in the cache memories 130-1, 130-2 is, for example, block data written to the storage device 300.
As the process of dually writing the block data, there is a method in which the channel control unit CHN1 (110) writes the block data to both cache memories 130-1, 130-2. Alternatively, there is a method in which the channel control unit CHN1 (110) writes a write command of the block data to the shared memory 120 after writing the block data to only the cache memory 130-1, and then the memory controller 131 or the disk control unit 140, reading the command, writes the block data to the cache memory 130-2. In addition, there is a method in which the CPU for writing the block data is mounted on the cache memories 130-1, 130-2, the CPU reads the write command which the channel control unit CHN1 (110) writes to the shared memory 120, and the CPU writes the corresponding block data to the cache memory 130-2. Also, there is a method in which the disk control unit 140 writes a write command for the cache memory 130-2 to the shared memory 120, and the memory controller 131 (or the CPU for writing the block data) or the channel control unit CHN1 (110), reading the write command, writes the corresponding block data to the cache memory 130-2. By this structure, the number of wires in the connection unit 150 can be reduced, and the connection unit 150 can be miniaturized. The more the number of channel control units CHN, CHF, CHA (110) is increased, the more the effect is increased.
Also, in accessing any one of the cache memories 130-1, 130-2 by the channel control unit CHN1-CHN4 (110), the channel control unit CHN1-CHN4 (110) may be dynamically switched according to the access frequency for the cache memories 130-1, 130-2, or, alternatively, it may be accessed for the cache memories. Alternatively, one cache memory is set as an operating system and the other cache memory is set as a waiting system, and the cache memory which is the operating system is generally accessed. Thereby, in the case in which an obstacle is generated in one cache memory, the memory access is performed on the other cache memory.
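The arrangement in which one cache memory is generally accessed and the other waits to take over can be sketched as follows, combined with the duplicated write described earlier for the pair of cache memories. The class name `CachePair` and its methods are hypothetical, each cache memory is modeled as a dictionary, and obstacle detection itself is outside the sketch: a failure is simply reported through `mark_failed`.

```python
from typing import Dict, List


class CachePair:
    """Two cache memories: index 0 models 130-1, index 1 models 130-2."""

    def __init__(self) -> None:
        self.caches: List[Dict[str, bytes]] = [{}, {}]
        self.healthy = [True, True]
        self.active = 0               # the cache memory that is generally accessed

    def write(self, key: str, value: bytes) -> None:
        # Duplicated write: the data goes to every cache that is still healthy.
        for index, cache in enumerate(self.caches):
            if self.healthy[index]:
                cache[key] = value

    def read(self, key: str) -> bytes:
        # Reads are served by the cache memory that is currently active.
        return self.caches[self.active][key]

    def mark_failed(self, index: int) -> None:
        # On an obstacle in the active cache, access moves to the other one.
        self.healthy[index] = False
        if index == self.active:
            other = 1 - index
            if not self.healthy[other]:
                raise RuntimeError("both cache memories have failed")
            self.active = other
```

Because every write reaches both cache memories while both are healthy, the switch-over needs no copying at the moment of failure; the waiting cache already holds the data.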
In the case in which an obstacle is generated in the cache memory 130-1, the data of the cache memory 130-2 is preferably written to the storage device 300 by the disk control unit 140. By storing the same data as that stored in the cache memory 130-2 to the storage device 300, a duplication of the data can be performed, and, thus, the reliability of the storage system 600 can be ensured.
According to the structure shown in
Although various embodiments of the present invention have been described, the present invention is not limited to these embodiments. The present invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the invention as hereafter claimed. For example, the memory system 119 mentioned in the present embodiment is not limited to use in the storage device control apparatus 100, and can also be applied to a computer apparatus requiring high reliability. Also, the memory system 119 comprises the above-mentioned serial interface, whereby high reliability can be ensured and the circuit components can be easily mounted.
According to the present invention, the serial interface between the memory controller and the memory modules can be duplicated, thereby increasing the reliability of the storage device control apparatus.
Number | Date | Country | Kind |
---|---|---|---|
2004-249279 | Aug 2004 | JP | national |
This is a continuation of U.S. application Ser. No. 10/965,820, filed Oct. 18, 2004. This application relates to and claims priority from Japanese Patent Application No. 2004-249279, filed on Aug. 27, 2004. The entirety of the contents and subject matter of all of the above is incorporated herein by reference.
Relation | Number | Date | Country |
---|---|---|---|
Parent | 10965820 | Oct 2004 | US |
Child | 12132243 | | US |