The invention relates to computer system memory. More specifically, the invention relates to translating virtual addresses to physical memory addresses based on memory bank designation.
Present computer systems have increasingly complex configurations. Not only is there a central processor executing software application code, but it is becoming more common to have two or more processors in a computer system. A second processor may be a fully independent central processor, or it may be another agent within the system that performs more specialized functions, such as a graphics processor, network processor, or system management processor. Depending on the system configuration, a system with two or more processors may share the system memory. Sharing can create efficiency problems: two or more processors arbitrating under an equal or near-equal policy can trigger a phenomenon called memory page thrashing.
In one possible system configuration, there are two processors. Both processors share a single channel double data rate (DDR) dual in-line memory module (DIMM). A single channel DDR DIMM is limited to eight memory banks, and only four of the eight banks can be open at any given time. A bank is open when a particular page within the bank is open and accessible by one of the two processors. When two processing agents access a single channel of memory, they end up competing for the same set of banks. This results in frequent page open and close operations, which degrades pipelined memory throughput.
For example, consider two processors, processor 1 and processor 2, each doing a burst read to the same bank. Processor 1 opens a page, Page 0, in Bank 0 and reads a cache line. Under a 50% arbitration policy between the two processors, in the next cycle processor 2 closes Page 0 and opens another page, Page 1, to read a cache line. Processor 1 then closes Page 1 and reopens Page 0 to continue its burst. Even though the two processors rarely interact with each other directly, they hurt each other's performance and reduce overall memory system efficiency. This page thrashing arises because only one page per bank can be open at any given time.
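The cost of this interleaving can be illustrated with a short simulation. This is a hedged sketch, not part of the disclosed system: the function name, the access lists, and the burst length of eight are all illustrative assumptions; only the one-open-page-per-bank rule comes from the text above.

```python
def count_page_opens(accesses):
    """accesses: (bank, page) requests in arbitration order.
    A bank holds one open page at a time; a request to a different
    page in the same bank forces a close followed by an open."""
    open_page = {}   # bank -> currently open page
    opens = 0
    for bank, page in accesses:
        if open_page.get(bank) != page:
            open_page[bank] = page   # close the old page, open the new one
            opens += 1
    return opens

burst = 8
# Both processors thrash Bank 0: processor 1 reads Page 0, processor 2
# reads Page 1, interleaved under a 50% arbitration policy.
shared = [(0, 0) if i % 2 == 0 else (0, 1) for i in range(2 * burst)]
# Each processor targets its own bank, so both pages stay open.
split = [(0, 0) if i % 2 == 0 else (1, 1) for i in range(2 * burst)]

print(count_page_opens(shared))  # every access misses: 16 page opens
print(count_page_opens(split))   # one open per bank: 2 page opens
```

With shared banks every access is a page miss; with mutually exclusive banks, each burst pays the open cost exactly once.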
The present invention is illustrated by way of example and is not limited by the figures of the accompanying drawings, in which like references indicate similar elements, and in which:
Embodiments of a method, device, and system for an address translation scheme based on bank address bits for a multi-processor, single channel memory system are disclosed. In the following description, numerous specific details are set forth. However, it is understood that embodiments may be practiced without these specific details. In other instances, well-known elements, specifications, and protocols have not been discussed in detail in order to avoid obscuring the present invention.
Processor-memory interconnect 100 provides processor 1 (102), processor 2 (104), and other devices access to the memory subsystem. In one embodiment, a memory controller and address translator unit 108, which controls access and translates addresses to system memory 110, is located on the same chip as processor-memory bridge 106. In another embodiment, there are two memory controllers, each of which is located on the same chip as processor 1 (102) and processor 2 (104) respectively (multiple memory controllers are not shown in this figure). Information, instructions, and other data may be stored in system memory 110 for use by processor 1 (102), processor 2 (104), and many other potential devices. In one embodiment, a graphics processor 112 is coupled to processor-memory bridge 106 through a graphics interconnect 114.
I/O devices, such as I/O device 120, are coupled to system I/O interconnect 118 and to processor-memory interconnect 100 through I/O bridge 116 and processor-memory bridge 106. In different embodiments, I/O device 120 could be a network interface card, an audio device, or one of many other I/O devices. I/O Bridge 116 is coupled to processor-memory interconnect 100 (through processor-memory bridge 106) and system I/O interconnect 118 to provide an interface for a device on one interconnect to communicate with a device on the other interconnect.
In one embodiment, system memory 110 is a dual in-line memory module (DIMM). In different embodiments, the DIMM could be a double data rate (DDR) DIMM, a DDR2 DIMM, or any of a number of other types of memory that implement a memory bank scheme. In one embodiment, only one DIMM resides in the system. In another embodiment, multiple DIMMs reside in the system. In different embodiments, a DDR DIMM can be a single channel DDR DIMM or a multi-channel DDR DIMM.
Now turning to the next figure,
Now turning to the next figure,
Thus, in this embodiment the mutually exclusive banks of memory, accessible to either processor 1 (300) or processor 2 (302) but not to both, eliminate the page thrashing issue described above in reference to
In
In this embodiment, the memory controller and address translator 404 allocates memory Banks 0-5 (406) to be accessible only by processor 1 (400) and allocates memory Banks 6-7 (408) to be accessible only by processor 2 (402). Thus, in this embodiment, out of the entire amount of physical memory present in the computer system, processor 1 (400) is allocated 75% of the memory banks and processor 2 (402) is allocated 25% of the memory banks.
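The 75%/25% allocation in this embodiment can be sketched as a simple ownership table consulted on each request. This is an illustrative sketch only; the dictionary, device names, and `may_access` helper are assumptions, not the disclosed memory controller's actual mechanism.

```python
# Bank ownership from the embodiment above: Banks 0-5 are exclusive to
# processor 1, Banks 6-7 are exclusive to processor 2.
ALLOWED_BANKS = {
    "processor1": set(range(0, 6)),   # Banks 0-5: 75% of the eight banks
    "processor2": set(range(6, 8)),   # Banks 6-7: 25% of the eight banks
}

def may_access(device, bank):
    """Reject any request that targets a bank owned by the other device."""
    return bank in ALLOWED_BANKS[device]

print(may_access("processor1", 5))   # True: Bank 5 belongs to processor 1
print(may_access("processor1", 6))   # False: Bank 6 is processor 2's
```

Because the two sets are disjoint, neither device can ever force a page close in a bank the other device has open.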
It may be beneficial to additionally have one or more memory banks in a DIMM accessible by both processors in the computer system. Thus, in yet another embodiment, one or more banks are allocated to be accessible by both processors. Now turning to the next figure,
In this embodiment, the memory controller and address translator 504 allocates memory Banks 0-2 (506) to be accessible only by processor 1 (500), allocates memory Banks 5-7 (508) to be accessible only by processor 2 (502), and allocates memory Banks 3-4 to be accessible by both processor 1 (500) and processor 2 (502). Thus, in this embodiment, out of the entire amount of physical memory present in the computer system, processor 1 (500) is allocated exclusive use of 37.5% of the memory banks, processor 2 (502) is allocated exclusive use of 37.5% of the memory banks, and processors 1 (500) and 2 (502) are allocated shared use of the remaining 25% of the memory banks. In another embodiment, there is a single central processor and a second device that is not a central processor accessing the memory. In yet another embodiment, there are two devices accessing the memory, neither of which is a central processor. In yet another embodiment, there are more than two devices or processors accessing the memory. The descriptions of the embodiments in reference to
Turning now to the next figure,
When the OS boots up on the bootstrap processor (processor 1), it assigns virtual memory space of up to 4 GB to each process that runs in processor 1 (600). The Virtual to Physical (V2P) mapping table maps this virtual memory address space to the physical memory address space (602). In this embodiment, the OS sees processor 2 as a PCI device and assigns virtual memory space 604 (up to 4 GB) to the device driver for processor 2. In this embodiment, the processor 2 driver is a special kernel process that is allowed to lock a window, Window 1, in the physical memory address space 602, which corresponds to a portion of physical memory 606 (see Window 1 at the bottom of physical memory 606). Window 1 is never swapped out of physical memory by other processes. The physical addresses for the memory accesses of processor 2 are contained within Window 1.
In one embodiment, the address for the beginning of Window 1, as well as its length, can be written and stored as a register file in the memory controller and address translator unit. In other embodiments, these values can be stored within or external to the memory controller and address translator unit. In addition, the values can be stored in any medium capable of storing information for a computer system.
Returning to
The embodiment in
Now turning to the next figure,
The process continues by processing logic mapping at least one bank in the memory for exclusive use by a second device (processing step 702). The second device can also be a central processor, a network processor, a graphics processor, a system management processor, or any other type of relevant processor or device that can be a bus master. In different embodiments, either processor 1 or processor 2 would likely be the bootstrap processor that loads the OS, though this is not necessary if an additional processor, apart from processors 1 and 2, accomplishes this task. The process is finished at this point. In one embodiment, this process is implemented during system boot up. In another embodiment, in a system with more than two processors, this process could continue by designating more banks of memory to be exclusively used by additional processors.
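The two processing steps above amount to partitioning the DIMM's banks between two devices at boot time. The sketch below is an assumed interface, not the disclosed processing logic: the function name and the ratio parameter are illustrative, and the 75%/25% split mirrors the earlier embodiment.

```python
def partition_banks(num_banks, first_device_share):
    """Split num_banks into two mutually exclusive bank sets, giving the
    first device first_device_share of the banks (steps 700 and 702)."""
    cut = int(num_banks * first_device_share)
    banks_dev1 = set(range(cut))            # exclusive to the first device
    banks_dev2 = set(range(cut, num_banks))  # exclusive to the second device
    return banks_dev1, banks_dev2

# The 75%/25% split of the earlier embodiment: Banks 0-5 vs. Banks 6-7.
d1, d2 = partition_banks(8, 0.75)
print(sorted(d1), sorted(d2))
```

Extending the process to more than two devices would repeat step 702 with a further slice of the remaining banks.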
Now turning to the next figure,
Finally, the process concludes by processing logic translating the second target physical address to a bank-specific physical address in a second memory bank in a memory device, wherein the second device has exclusive access to the second memory bank (processing step 806). In many embodiments, this process may take place multiple times. In different embodiments, processing steps 800 and 802 may repeat multiple times prior to processing steps 804 and 806 taking place (or vice versa) if the first device and second device are not set up with a 50% arbitration policy. In different embodiments, devices 1 and 2 can be any of the processor devices described above in reference to
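The translation steps above can be sketched as a rewrite of the bank-address bits of each target physical address so that the access lands in a bank owned by the requesting device. Everything concrete here is an assumption chosen for illustration: three bank bits at bit positions 13-15 (implying 8 KB pages), the fold-by-modulo policy, and the function name; the disclosed translator is not specified at this level of detail.

```python
# Assumed address layout: 3 bank-address bits at bits 13-15.
BANK_SHIFT = 13
BANK_MASK = 0b111 << BANK_SHIFT

def translate(device_banks, target_addr):
    """Fold target_addr's bank index into the device's exclusive bank set,
    preserving the page offset and the bits above the bank field."""
    bank = (target_addr & BANK_MASK) >> BANK_SHIFT
    owned = sorted(device_banks)
    new_bank = owned[bank % len(owned)]   # map into an owned bank
    return (target_addr & ~BANK_MASK) | (new_bank << BANK_SHIFT)

# Device 2 owns Banks 6-7 (the 75%/25% embodiment): an address whose
# bank bits decode to Bank 2 is redirected into Bank 6.
addr = translate({6, 7}, 0x5A30)
print(hex(addr), (addr >> BANK_SHIFT) & 0b111)
```

Because each device's translations only ever produce bank indices from its own set, the two devices can never evict each other's open pages.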
Thus, embodiments of a method, device, and system for an address translation scheme based on bank address bits for a multi-processor, single channel memory system are disclosed. These embodiments have been described with reference to specific exemplary embodiments thereof. However, the device, method, and system may be implemented with any given protocol having any number of layers. It will be evident to persons having the benefit of this disclosure that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the embodiments described herein. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.