The present disclosure relates generally to computer memory architecture, and in particular, to an environment for managing mirrored memory.
Memory mirroring is the practice of creating and maintaining a copy of original data in system memory. Within a mirrored environment, significant portions of the physical memory may be designated as mirrored memory. The allocation of physical memory to mirrored memory may represent a significant manufacturing cost and may limit overall memory storage. Although memory mirroring offers increased system reliability, the reduction in memory capacity may result in reduced system performance.
In a particular embodiment, a method to manage memory includes storing data in a primary memory in communication with a processor. Mirrored data that includes a copy of the data stored in the primary memory may be stored in a mirrored memory. The mirrored data may be compressed, and the mirrored memory may be in communication with the processor. A failure condition associated with the data of the primary memory may be detected. In response to the detected failure condition, the mirrored data in the mirrored memory may be accessed.
In another embodiment, an apparatus is disclosed that includes a primary memory storing data and a mirrored memory storing mirrored data that includes a copy of the data. The mirrored data may be compressed. A processor may be in communication with both the primary memory and the mirrored memory, and program code may be configured to be executed by the processor to detect a failure condition associated with the data of the primary memory, and in response to the detected failure condition, to access the mirrored data in the mirrored memory.
In another embodiment, a program product includes program code to be executed by a processor in communication with both a primary memory storing data and a mirrored memory storing mirrored data including a copy of the data. The program code may be executed to detect a failure condition associated with the data of the primary memory, and in response to the detected failure condition, to access the mirrored data in the mirrored memory. A computer readable storage medium may bear the program code.
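By way of illustration only, the following C sketch models the read path just described. The mirrored_mem_t structure and the primary_read_ok and decompress_block helpers are hypothetical placeholders standing in for real error-correction hardware and compression logic; they are not part of the disclosure, and the stubs merely copy data so the example compiles and runs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder: a real check would consult error-correction codes. */
static bool primary_read_ok(const uint8_t *src, size_t len)
{
    (void)src; (void)len;
    return true;
}

/* Placeholder: a real mirror stores compressed data and would
 * decompress it here, at some latency cost.                     */
static void decompress_block(const uint8_t *mirror, size_t offset,
                             uint8_t *out, size_t len)
{
    memcpy(out, mirror + offset, len);
}

typedef struct {
    uint8_t *primary;        /* uncompressed primary memory          */
    uint8_t *mirror;         /* compressed copy of the primary data  */
    bool     primary_failed; /* latched once a failure is detected   */
} mirrored_mem_t;

/* Serve reads from primary memory under normal operation; fall back
 * to the compressed mirror once a failure condition is detected.    */
void mirrored_read(mirrored_mem_t *m, size_t offset,
                   uint8_t *out, size_t len)
{
    if (!m->primary_failed && primary_read_ok(m->primary + offset, len)) {
        memcpy(out, m->primary + offset, len);     /* no decompression cost */
        return;
    }
    m->primary_failed = true;                      /* failure condition     */
    decompress_block(m->mirror, offset, out, len); /* degraded but correct  */
}
```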
An embodiment may use memory compression techniques with mirrored memory to increase the amount of mirrored and primary memory available in a system. Because data in the primary memory is uncompressed, there may be no memory access latency penalty due to decompression under normal operating conditions. Memory space and capacity made available by the memory compression may be used for additional primary memory. The reduced need for additional physical memory may lower the cost overhead of memory mirroring. The decreased costs may encourage use of memory mirroring in a broad range of computing applications.
Features that characterize embodiments are set forth in the claims annexed hereto and forming a further part hereof. However, for a better understanding of embodiments, and of the advantages and objectives attained through their use, reference should be made to the Drawings and to the accompanying descriptive matter.
A particular embodiment uses memory compression to reduce the size and cost associated with memory mirroring by compressing the mirrored memory and not the primary memory. The mirrored data may be accessed only when the primary memory fails. For normal operations (e.g., operations having no errors, or an acceptable number of errors), there may be no decompression latency or performance overhead when the uncompressed primary data is accessed.
Errors may be detected in the primary memory. In response to a detected failure, a processor may seamlessly transition from executing out of the uncompressed primary memory to executing out of the compressed mirrored memory. The processor may continue to operate without crashing when the primary memory fails, though at a degraded level of performance due to the decompression latency overhead. The defective primary memory may be repaired or replaced at a convenient time to return the computer to full performance.
Switching from the primary memory to the mirrored memory when a failure is detected may be accomplished by hardware memory control logic, by a hypervisor, or by an operating system. Virtual addresses may be changed to point from the primary physical address to the physical address in the mirrored memory that contains the copy of the data. When achievable compression ratios do not allow the full primary memory contents to fit within the compressed mirrored memory, the primary memory size may be reduced accordingly.
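As a rough sketch of the remapping step, the fragment below redirects entries of a hypothetical flat page table away from a failed primary physical range to the corresponding frames in the mirrored range. The table layout, the function name, and the frame arithmetic are illustrative assumptions, not the disclosed hardware or hypervisor mechanism.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_PAGES 1024u

/* Hypothetical flat page table: virtual page -> physical frame number. */
static uint64_t page_table[NUM_PAGES];

/* Redirect every virtual page that maps into the failed primary range
 * [primary_base, primary_base + npages) to the corresponding frame in
 * the mirrored range starting at mirror_base.                          */
void remap_to_mirror(uint64_t primary_base, uint64_t mirror_base,
                     uint64_t npages)
{
    for (size_t vpage = 0; vpage < NUM_PAGES; vpage++) {
        uint64_t frame = page_table[vpage];
        if (frame >= primary_base && frame < primary_base + npages) {
            /* Same offset within the region, different physical region. */
            page_table[vpage] = mirror_base + (frame - primary_base);
        }
    }
    /* A real implementation would also invalidate stale TLB entries. */
}
```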
In a particular embodiment, switching from primary memory to mirrored memory may be conditional on the type of memory failure and/or the frequency of memory failures. For example, an uncorrectable error may always prompt a switch from primary memory to mirrored memory. In contrast, a single bit correctable error may not automatically cause a switch. However, a high number of single bit errors may prompt a switch from primary to mirrored memory.
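A minimal sketch of such a policy follows. The error classification, the threshold value, and the should_switch_to_mirror helper are hypothetical design choices used only to illustrate the conditional switching described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy threshold; the actual value is a design choice. */
#define CORRECTABLE_ERROR_LIMIT 100u

typedef enum { ERR_NONE, ERR_CORRECTABLE, ERR_UNCORRECTABLE } mem_error_t;

/* Uncorrectable errors trigger an immediate switch to mirrored memory;
 * single-bit correctable errors switch only once they become frequent. */
bool should_switch_to_mirror(mem_error_t err, uint32_t *correctable_count)
{
    switch (err) {
    case ERR_UNCORRECTABLE:
        return true;                        /* always switch             */
    case ERR_CORRECTABLE:
        (*correctable_count)++;             /* track error frequency     */
        return *correctable_count > CORRECTABLE_ERROR_LIMIT;
    default:
        return false;                       /* no error, stay on primary */
    }
}
```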
Memory space and capacity made available by the memory compression may be used for additional primary memory. The compressed mirrored memory may be configured to be about a quarter or less of the physical size of the primary memory. The reduced need for additional physical memory lowers the cost overhead of memory mirroring. The decreased costs may encourage use of memory mirroring in a broad range of computing applications. Memory compression techniques may thus be used with mirrored memory to reduce the cost of mirroring and to increase the amount of mirrored and primary memory available in a system.
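To make the arithmetic concrete: assuming the roughly four-to-one ratio above, a hypothetical 64 GB of physical memory could be split into about 51.2 GB of primary memory and a 12.8 GB compressed mirror, rather than an even 32 GB/32 GB split without compression. A short worked calculation:

```c
#include <stdio.h>

/* Worked sizing example: with an assumed compression ratio R, a total
 * physical capacity T splits as primary P = T / (1 + 1/R), with the
 * remainder holding the compressed mirror of P.                       */
int main(void)
{
    const double total_gb = 64.0; /* hypothetical physical capacity */
    const double ratio    = 4.0;  /* assumed compression ratio      */

    double primary_gb = total_gb / (1.0 + 1.0 / ratio);
    double mirror_gb  = total_gb - primary_gb;

    printf("primary: %.1f GB, compressed mirror: %.1f GB\n",
           primary_gb, mirror_gb);  /* 51.2 GB and 12.8 GB */
    return 0;
}
```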
Turning more particularly to the drawings, an apparatus 100 generally includes one or more physical processors 102-104 coupled to an input/output (I/O) hub 116. Each processor 102-104 may directly and respectively attach to a memory 117-119, e.g., an array of dual in-line memory modules (DIMMs). The physical processors 102-104 may be multithreaded; multithreading enables the physical processors 102-104 to concurrently execute different portions of program code. The processors 102, 103 may be in communication with a memory controller 113 that is coupled to an additional memory 114. A buffer 115 and an additional memory 121 may be coupled to the processor 104.
The I/O hub 116 may further couple to a number of types of external I/O devices via a system bus 118 and a plurality of interface devices. Illustrative I/O devices include a bus attachment interface 120, a workstation controller 122, and a storage controller 124. Such I/O devices may respectively provide external access to one or more external networks 126, one or more workstations 128, and/or one or more storage devices, such as a direct access storage device (DASD) 129.
In another embodiment, a logically partitioned system 200 may include partitions 201-203 that logically comprise a portion of the system's physical CPUs 205, 206, DASD 268, and other resources, as assigned by an administrator. Each partition 201-203 typically hosts an operating system 215-217 that includes the virtual processors 207-212. Each partition 201-203 may operate as if it were a separate computer.
An underlying program called a hypervisor 246, or partition manager, may assign physical resources to each partition 201-203. In virtualization technology, the hypervisor 246 may manage the operating systems 215-217 (or multiple instances of the same operating system) on a single computer system. The hypervisor 246 may manage the system's processor, memory, and other resources to allocate resources to each operating system 215-217. For instance, the hypervisor 246 may intercept requests for resources from the operating systems 215-217 to globally share and allocate resources. If the partitions 201-203 are sharing processors, the hypervisor 246 may allocate physical processor cycles between the virtual processors 207-212 of the partitions 201-203 sharing one or more of the CPUs 205, 206.
The hypervisor 246 may include a memory mirroring program 253 configured to transition from uncompressed, primary memory, (such as may be stored at DIMM 248) to compressed, mirrored memory (e.g., at DIMM 249). The memory mirroring program 253 may transition CPU accesses to the mirrored memory when a failure is detected in the primary memory. In another embodiment, a memory mirroring program may be included within an operating system.
Each operating system 215-217 may control the primary operations of its respective logical partition 201-203 in the same manner as the operating system of a non-partitioned computer. Each logical partition 201-203 may execute in a separate memory space, represented by virtual memory 250-252. Moreover, each logical partition 201-203 may be statically and/or dynamically allocated a portion of the available resources in the system 200. For example, each logical partition 201-203 may share one or more of the CPUs 205, 206, as well as a portion of the available memory space for use in virtual memory 250-252. In this manner, a given CPU 205, 206 may be utilized by more than one of the logical partitions 201-203.
The hypervisor 246 may include a dispatcher 255 that manages the dispatching of virtual processors 207-212 to the CPUs 205, 206 using a dispatch list, or ready queue 247. The ready queue 247 comprises memory that includes a list of the virtual processors 207-212 having work that is waiting to be dispatched on a CPU 205, 206.
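The following is a minimal sketch of such a dispatch list, assuming a simple circular buffer of virtual-processor identifiers; the ready_queue_t layout and fixed capacity are illustrative assumptions rather than the actual structure of the ready queue 247.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_VPROCS 8u

/* Hypothetical ready queue: a circular buffer of virtual-processor IDs
 * with work waiting to be dispatched on a physical CPU.                */
typedef struct {
    int    vproc[MAX_VPROCS];
    size_t head, count;
} ready_queue_t;

/* Add a virtual processor with waiting work; fails if the queue is full. */
bool rq_enqueue(ready_queue_t *q, int vproc_id)
{
    if (q->count == MAX_VPROCS)
        return false;
    q->vproc[(q->head + q->count) % MAX_VPROCS] = vproc_id;
    q->count++;
    return true;
}

/* Dispatch the next waiting virtual processor, or -1 if none is ready. */
int rq_dispatch_next(ready_queue_t *q)
{
    if (q->count == 0)
        return -1;
    int id = q->vproc[q->head];
    q->head = (q->head + 1) % MAX_VPROCS;
    q->count--;
    return id;
}
```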
Additional resources, e.g., mass storage, backup storage, user input, network connections, and the like, are typically allocated to one or more logical partitions in a manner well known in the art. Resources may be allocated in a number of manners, e.g., on a bus-by-bus basis, or on a resource-by-resource basis, with multiple logical partitions sharing resources on the same bus. Some resources may be allocated to multiple logical partitions 201-203 at a time.
The bus 264 may have resources allocated on a resource-by-resource basis, e.g., with a local area network (LAN) adaptor 276, an optical disk drive 278, and a DASD 280 allocated to logical partition 202, and LAN adaptors 282, 284 allocated to logical partition 203. The bus 266 may represent, for example, a bus allocated specifically to the logical partition 203, such that all resources on the bus 266 (e.g., the DASDs 286, 288) are allocated to the same logical partition 203.
Turning to another embodiment, a first processor 302 may be coupled to a first array of DIMM primary memory 306, as well as to a second memory 308 that includes a first DIMM mirrored memory 310. A second processor 312 may be coupled to a third memory 314 that includes a second array of DIMM primary memory 316, and to a fourth memory 318 that includes an array of DIMM mirrored memory 320 and a third array of DIMM primary memory 322. In the embodiment shown, the first DIMM mirrored memory 310 may store compressed, mirrored data.
The array of DIMM mirrored memory 320 may mirror the first array of DIMM primary memory 306, as indicated by arrow 326. The array of DIMM mirrored memory 320 may include compressed data. For example, the array of DIMM mirrored memory 320 may include a compressed version of the data in the first array of DIMM primary memory 306. The compression ratio of the compressed data in the array of DIMM mirrored memory 320 (e.g., as compared to the uncompressed data in the first array of DIMM primary memory 306) may be about four to one.
The second memory 308 may occupy less space than the third memory 314 by virtue of having the compressed, mirrored data in the first DIMM mirrored memory 310. Alternatively, the space savings attributable to the compressed, mirrored data in the array of DIMM mirrored memory 320 in the fourth memory 318 may provide space for additional primary memory (e.g., the third array of DIMM primary memory 322).
As indicated by arrow 328, the array of DIMM mirrored memory 320 may mirror the second array of DIMM primary memory 316. For example, the second processor 312 may replicate and compress data of the second array of DIMM primary memory 316 for storage in the array of DIMM mirrored memory 320. The second processor 312 may operate out of, or access, the second array of DIMM primary memory 316 during normal (e.g., expected) runtime operations. The detection of an error or other fault condition may cause the second processor 312 to operate out of the array of DIMM mirrored memory 320.
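A write path consistent with this replicate-and-compress behavior might look like the sketch below; compress_block is a placeholder for the real compression logic (here it merely copies so the example compiles), and the function names are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder standing in for compression logic in a processor or
 * memory controller; returns the compressed length.                */
static size_t compress_block(const uint8_t *src, size_t len, uint8_t *dst)
{
    memcpy(dst, src, len);  /* a real implementation would compress */
    return len;
}

/* Mirrored write: the store lands uncompressed in primary memory
 * while a compressed replica is kept current in the mirror.        */
void mirrored_write(uint8_t *primary, uint8_t *mirror,
                    const uint8_t *data, size_t len)
{
    memcpy(primary, data, len);              /* uncompressed store  */
    (void)compress_block(data, len, mirror); /* compressed replica  */
}
```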
One or more of the first processor 302 and the second processor 312 may include compression logic to compress data when stored in the first DIMM mirrored memory 310 and the array of DIMM mirrored memory 320. When being compressed, data may be encoded (i.e., represented as symbols) to take up less space. In a particular embodiment, memory compression may expand effective memory capacity about four times without increasing the actual physical memory. Memory compression may be measured in terms of its associated compression ratio. The compression ratio is the ratio of the memory space required by the uncompressed data to the smaller amount of memory space required by the compressed data.
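To make the encoding idea concrete, the toy run-length encoder below represents runs of repeated bytes as (count, value) symbols. Real memory compression schemes are far more sophisticated; this sketch only illustrates the principle and how a compression ratio arises.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy run-length encoder: emits (count, value) symbol pairs.  The
 * destination must hold up to 2 * len bytes in the worst case; the
 * returned encoded length yields compression ratio = len / result. */
size_t rle_encode(const uint8_t *src, size_t len, uint8_t *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < len; ) {
        uint8_t value = src[i];
        uint8_t run   = 1;
        while (i + run < len && src[i + run] == value && run < 255)
            run++;
        dst[out++] = run;    /* symbol: repeat count */
        dst[out++] = value;  /* symbol: byte value   */
        i += run;
    }
    return out;
}
```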
At least one of a hypervisor and an operating system, such as the hypervisor 246 and the operating system 217 described above, may manage the switch from the primary memory to the mirrored memory.
One or more of the first processor 302 and the second processor 312 may be configured to coordinate memory ballooning processes. For instance, the first processor 302 may limit storage space in the first array of DIMM primary memory 306 to reduce the amount of storage used in the array of DIMM mirrored memory 320. Such an action may be initiated in response to a determination that the array of DIMM mirrored memory 320 does not have adequate storage capacity for the data in the first array of DIMM primary memory 306. As such, the storage capacity of the first array of DIMM primary memory 306 may be limited based on the storage capacity of the array of DIMM mirrored memory 320.
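A ballooning calculation of this kind might be sketched as follows, where the worst compression ratio recently achieved by the mirror bounds how much primary memory can safely be exposed. The function and its parameters are hypothetical illustrations, not the disclosed mechanism.

```c
#include <stdint.h>

/* Cap usable primary memory so that, at the worst-case compression
 * ratio observed, its compressed copy still fits in the mirror.
 * Example: a 16 GB mirror at a 3.2:1 worst-case ratio supports
 * about 51.2 GB of primary data.                                   */
uint64_t usable_primary_bytes(uint64_t mirror_capacity_bytes,
                              double worst_case_ratio)
{
    return (uint64_t)(mirror_capacity_bytes * worst_case_ratio);
}
```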
In another embodiment, a first processor 402 may be in communication with a memory controller 403 and a buffer 405, as well as with a first array of DIMM primary memory 406 and a second memory 408 that includes a first DIMM mirrored memory 410. A buffer 413 may be coupled in between a second processor 412 and a third memory 414 that includes a second array of DIMM primary memory 416. A third buffer 417 may be coupled in between the second processor 412 and a fourth memory 418 that includes an array of DIMM mirrored memory 420 and a third array of DIMM primary memory 422. A processor bus 407 may be coupled to both the first processor 402 and the second processor 412.
In the embodiment shown, the first DIMM mirrored memory 410 may likewise store compressed, mirrored data.
The array of DIMM mirrored memory 420 may mirror the first array of DIMM primary memory 406, as indicated by arrow 426. The array of DIMM mirrored memory 420 may include compressed data. For example, the array of DIMM mirrored memory 420 may include a compressed version of the data in the first array of DIMM primary memory 406. The compression ratio of the compressed data in the array of DIMM mirrored memory 420 as compared to the uncompressed data in the first array of DIMM primary memory 406 may be about four to one, as with the first DIMM mirrored memory 410.
The second memory 408 may occupy less space by virtue of the compressed, mirrored data in the first DIMM mirrored memory 410. Alternatively, the space savings attributable to the compressed, mirrored data in the array of DIMM mirrored memory 420 in the fourth memory 418 may provide space for additional primary memory, such as the third array of DIMM primary memory 422.
As indicated by arrow 428, the array of DIMM mirrored memory 420 may mirror the second array of DIMM primary memory 416. For example, one or more of the processors 402, 412, the memory controller 403, and the buffers 405, 413, 417 may replicate and compress data of the second array of DIMM primary memory 416 to be stored in the array of DIMM mirrored memory 420. The second processor 412 may operate out of, or access, the second array of DIMM primary memory 416 during normal runtime operations. The detection of an error or other fault condition may cause the second processor 412 to operate out of the array of DIMM mirrored memory 420.
One or more of the first processor 402, the second processor 412, the memory controller 403, the buffer 405, the buffer 413, and the buffer 417 may include compression logic to compress data when stored in the first DIMM mirrored memory 410 and the array of DIMM mirrored memory 420. At least one of a hypervisor and an operating system, such as the hypervisor 246 and the operating system 215 described above, may manage the switch from the primary memory to the mirrored memory.
One or more of the first processor 402, the second processor 412, the memory controller 403, the buffer 405, the buffer 413, and the buffer 417 may further be configured to coordinate memory ballooning processes. For instance, storage space in the first array of DIMM primary memory 406 may be limited in order to reduce the amount of storage used in the array of DIMM mirrored memory 420. Such an action may be initiated in response to a determination that the array of DIMM mirrored memory 420 does not have adequate storage capacity for the data in the first array of DIMM primary memory 406. As such, the storage capacity of the first array of DIMM primary memory 406 may be limited based on the storage capacity of the array of DIMM mirrored memory 420.
Turning to a method 500 of managing memory, an error in the primary memory may be detected, at 504. For instance, error correction codes may be evaluated to determine if the data read from the primary memory includes uncorrectable errors or a high number of correctable errors.
At 506, the processor may operate out of the mirrored compressed memory. In response to a detected failure in the primary memory, the processor may access the mirrored compressed memory. For example, the first processor 302 described above may access compressed, mirrored data in response to a failure associated with the first array of DIMM primary memory 306.
At 508, the method 500 may include determining if the primary memory has been repaired or replaced. The processor may access the mirrored memory until the primary memory can be repaired or replaced. For instance, the first processor 302 described above may resume operating out of the first array of DIMM primary memory 306 once the defective memory has been repaired or replaced, returning the system to full performance.
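The flow of blocks 504-508 can be summarized as a small state machine, sketched below; the state names and boolean inputs are assumptions introduced for clarity rather than drawn from the disclosure.

```c
#include <stdbool.h>

typedef enum { USING_PRIMARY, USING_MIRROR } mem_state_t;

/* One step of the failure-handling flow: switch to the compressed
 * mirror when an error is detected (504, 506) and back to primary
 * once it has been repaired or replaced (508).                     */
mem_state_t mirroring_step(mem_state_t state,
                           bool error_detected,        /* from 504 */
                           bool repaired_or_replaced)  /* from 508 */
{
    if (state == USING_PRIMARY && error_detected)
        return USING_MIRROR;          /* degraded but operational (506) */
    if (state == USING_MIRROR && repaired_or_replaced)
        return USING_PRIMARY;         /* back to full performance (508) */
    return state;
}
```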
Particular embodiments described herein may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements. In a particular embodiment, the disclosed methods are implemented in software that is embedded in a processor-readable storage medium and executed by a processor, which includes but is not limited to firmware, resident software, microcode, etc. For example, switching between primary memory and mirrored memory may be implemented using hardware logic on memory interface silicon in a manner that is transparent to both the hypervisor and operating system software. Embodiments may thus include features in which hardware controls which physical memory (e.g., either primary or mirrored memory) the processor operates out of.
Further, embodiments of the present disclosure may take the form of a computer program product accessible from a computer-usable or computer-readable storage medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable storage medium can be any apparatus that can tangibly embody a computer program and that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
In various embodiments, the medium can include an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable storage medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W) and digital versatile disk (DVD).
A data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the data processing system either directly or through intervening I/O controllers. Network adapters may also be coupled to the data processing system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the disclosed embodiments. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the scope of the disclosure. For example, an embodiment may include multiple processors connected to a single memory controller, either using separate processor busses from each processor to the memory controller, or using a single shared system bus that is connected to all processors and the memory controller. In another example, embodiments may facilitate partial mirroring applications, in which only a portion of the memory (e.g., a specified memory address range) is to be mirrored. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope possible consistent with the principles and features as defined by the following claims.