1. Technical Field
This invention generally relates to configuration of a memory controller in a computing system, and more specifically relates to configuration of a memory controller in a massively parallel super computer.
2. Background Art
Computer systems store information on many different types of memory and mass storage systems that have various tradeoffs between cost and speed. One common type of data storage on modern computer systems is dynamic random access memory (DRAM). Banks of DRAM require a memory controller between the memory and a computer processor that accesses the memory. The controller must be configured with specific parameters to control the access to the DRAM. One common type of DRAM is double data rate synchronous DRAM (DDR SDRAM). The memory controller for the DDR SDRAM is referred to as a DDR controller.
Massively parallel computer systems are one type of computer system that uses DDR SDRAM memory and a DDR memory controller. A family of massively parallel computers is being developed by International Business Machines Corporation (IBM) under the name Blue Gene. The Blue Gene/L system is a scalable system in which the current maximum number of compute nodes is 65,536. The Blue Gene/P system is a similar scalable system under development. The Blue Gene/L node consists of a single ASIC (application specific integrated circuit) with 2 CPUs and memory. The full computer would be housed in 64 racks or cabinets with 32 node boards in each rack.
On a massively parallel super computer system like Blue Gene, the DDR controller must be properly configured to communicate with and control the SDRAM chips in the DDR memory. The configuration parameters for the DDR controller often differ depending on the type and manufacturer of the SDRAM. In the prior art, the DDR controller was configured with low-level code loaded with a boot loader into the nodes of the massively parallel super computer. This required a different boot loader to be prepared and compiled depending on the type and manufacturer of the memory in the node boards, or for other memory controller parameters. Thus, for each system provided to a customer, or for a new replacement of node cards, a new boot loader needed to be prepared and compiled with the correct DDR controller parameters.
Without a way to more effectively configure the DDR controllers, super computers will require manual effort to reconfigure systems with different memory on the compute nodes, thereby wasting potential computer processing time and increasing maintenance costs.
According to the preferred embodiments, a method and apparatus are described for configuration of a memory controller in a parallel computer system using an extensible markup language (XML) configuration file. In preferred embodiments, an XML file with the operation parameters for a memory controller is stored in a bulk storage and used by the computer's service node to create a personality. The personality has binary register data that is transferred to static memory in the compute nodes by the service node of the system. The binary register data is then used during the boot process of the compute nodes to configure the memory controller.
The disclosed embodiments are directed to the Blue Gene architecture but can be implemented on any parallel computer system with multiple processors. The preferred embodiments are particularly advantageous for massively parallel computer systems.
The foregoing and other features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings.
The preferred embodiments of the present invention will hereinafter be described in conjunction with the appended drawings, where like designations denote like elements, and:
The present invention relates to an apparatus and method for configuration of a DDR controller in a massively parallel super computer system using an XML configuration file. In preferred embodiments, an XML file with the DDR settings is stored in a bulk storage and used by the computer's service node to create DDR controller parameters in a personality file that is transferred to the compute nodes during the boot process. The preferred embodiments will be described with respect to the Blue Gene/L massively parallel computer being developed by International Business Machines Corporation (IBM).
The Blue Gene/L computer system structure can be described as a compute node core with an I/O node surface, where communication to 1024 compute nodes 110 is handled by each I/O node that has an I/O processor 170 connected to the service node 140. The I/O nodes have no local storage. The I/O nodes are connected to the compute nodes through the collective network and also have functional wide area network capabilities through a Gigabit Ethernet network. The connections to the I/O nodes are similar to the connections to the compute nodes, except that the I/O nodes are not connected to the torus network.
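By way of illustration only, the conversion of an XML file of DDR settings into binary register data might be sketched as follows. The XML element and attribute names, the register names, addresses and values, and the packed (address, value) layout are all assumptions made for this example and are not taken from the Blue Gene/L implementation.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical XML layout; element names, register names, addresses,
# and values are illustrative assumptions only.
DDR_XML = """
<ddr_config memory_vendor="ExampleVendor">
  <register name="cas2cas_delay" address="0x10" value="0x3"/>
  <register name="refresh_interval" address="0x14" value="0x618"/>
  <register name="diag_mode" address="0x18" value="0x0"/>
</ddr_config>
"""


def xml_to_register_data(xml_text):
    """Pack each <register> element as a big-endian (address, value) pair."""
    root = ET.fromstring(xml_text)
    blob = bytearray()
    for reg in root.findall("register"):
        address = int(reg.get("address"), 16)
        value = int(reg.get("value"), 16)
        # Two unsigned 32-bit words per register: address, then value.
        blob += struct.pack(">II", address, value)
    return bytes(blob)


if __name__ == "__main__":
    register_data = xml_to_register_data(DDR_XML)
    print(len(register_data), "bytes of register data")
```

In a sketch of this kind, supporting memory from a different manufacturer would only require editing the XML text; the code that produces the binary data would not change.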
Again referring to
The service node manages another private 100-Mb/s Ethernet network dedicated to system management through an Ido chip 180. The service node is thus able to control the system, including the individual I/O processors and compute nodes. This network is sometimes referred to as the JTAG network since it communicates using the JTAG protocol. Thus, from the viewpoint of each I/O processor or compute node, all control, test, and bring-up is governed through its JTAG port communicating with the service node. This network is described further below with reference to
Again referring to
The Blue Gene/L supercomputer communicates over several additional communication networks. The 65,536 computational nodes are arranged into both a logical tree network and a logical 3-dimensional torus network. The logical tree network connects the computational nodes in a binary tree structure so that each node communicates with a parent and two children. The torus network logically connects the compute nodes in a three-dimensional lattice-like structure that allows each compute node to communicate with its closest 6 neighbors in a section of the computer. Other communication networks connected to the node include a barrier network. The barrier network uses the barrier communication system to implement software barriers that synchronize similar processes on the compute nodes so that they move to a different phase of processing upon completion of some task. There is also a global interrupt connection to each of the nodes.
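By way of illustration only, the two logical topologies described above can be sketched as follows. The torus dimensions of 64 x 32 x 32 (which multiply to 65,536) and the heap-style numbering of the binary tree are assumptions chosen for this example; the actual wiring and addressing of the networks may differ.

```python
def torus_neighbors(coord, dims):
    """Return the 6 nearest neighbors of a node in a 3-D torus.

    Coordinates wrap around in each dimension, so every node has
    one neighbor in each of the +x, -x, +y, -y, +z, -z directions.
    """
    x, y, z = coord
    dx, dy, dz = dims
    return [
        ((x + 1) % dx, y, z), ((x - 1) % dx, y, z),
        (x, (y + 1) % dy, z), (x, (y - 1) % dy, z),
        (x, y, (z + 1) % dz), (x, y, (z - 1) % dz),
    ]


def tree_links(index, node_count):
    """Return (parent, children) for a node in a heap-numbered binary tree.

    The heap numbering (root = 0) is an assumption for illustration.
    """
    parent = (index - 1) // 2 if index > 0 else None
    children = [c for c in (2 * index + 1, 2 * index + 2) if c < node_count]
    return parent, children


if __name__ == "__main__":
    dims = (64, 32, 32)  # assumed dimensions; 64 * 32 * 32 == 65536
    print(torus_neighbors((0, 0, 0), dims))
    print(tree_links(0, 65536))
```

The wraparound in each dimension is what distinguishes a torus from a plain mesh: a node on the edge of the lattice still has six neighbors.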
Additional information about the Blue Gene/L system, its architecture, and its software can be found in the IBM Journal of Research and Development, vol. 49, No. 2/3 (2005), which is herein incorporated by reference in its entirety.
Again referring to
The boot process for a node consists of the following steps: first, a small boot loader is directly written into the compute node static memory 230 by the service node using the JTAG control network. The boot loader then loads a much larger boot image into the memory of the node through a custom JTAG mailbox protocol. One boot image is used for all the compute nodes and another boot image is used for all the I/O nodes. The boot image for the compute nodes contains the code for the compute node kernel, and is approximately 128 kB in size. The boot image for the I/O nodes contains the code for the Linux operating system (approximately 2 MB in size) and the image of a ramdisk that contains the root file system for the I/O node. After an I/O node boots, it can mount additional file systems from external file servers. Since the same boot image is used for each node, additional node-specific configuration information (such as torus coordinates, tree addresses, MAC or IP addresses) must be loaded separately. This node-specific information is stored in the personality for the node. In preferred embodiments, the personality includes data for configuring the DDR controllers derived from an XML file as described herein. In contrast, in the prior art, the parameter settings for the controller parameter registers 255 were hardcoded into the boot loader. Thus, in the prior art, changing the parameter settings would require recoding and recompiling the boot loader code.
The DDR controller parameters include a variety of settings for the operation of the DDR controller. These settings include DDR memory timing parameters for memory chips from different manufacturers (e.g., CAS2CAS delays . . . and other memory settings), defective part workarounds such as steering data around a bad DDR chip, and enabling special features of the DDR controller such as special modes for diagnostics. The parameters may further include memory interface tuning, such as optimizing the DDR controller to favor write operations over read operations, which might benefit certain types of users or applications. In addition, other parameters that may be used in current or future memory controllers are expressly included in the scope of the preferred embodiments.
The method 600 next looks to the steps that are performed in the compute nodes. The nodes start to boot when released from reset by the control system (step 635). The personality for the node is read from the SRAM (step 640). The DDR controller is configured using the personality settings (step 645). The initialization of the compute node is then continued by launching the kernel as is known in the prior art (step 650). The method 600 is then complete.
As described above, embodiments provide a method and apparatus for configuration of a memory controller in a parallel super computer system. Embodiments herein allow the memory controller settings to be reconfigured easily without recompiling the boot loader, which reduces costs and increases the efficiency of the computer system.
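By way of illustration only, a personality that combines node-specific information with DDR register data might be assembled as sketched below. The field order, field widths, and function name are hypothetical and chosen for this example; the actual personality format is not specified here.

```python
import ipaddress
import struct

# Hypothetical header layout (big-endian, no alignment padding added by struct):
#   3 x uint8  torus coordinates (x, y, z)
#   1 pad byte
#   uint32     tree address
#   4 bytes    IPv4 address
#   uint16     length in bytes of the DDR register data that follows
HEADER_FORMAT = ">BBBxI4sH"


def build_personality(torus_xyz, tree_address, ip, ddr_register_data):
    """Pack node-specific fields and append the shared DDR register data."""
    x, y, z = torus_xyz
    header = struct.pack(
        HEADER_FORMAT,
        x, y, z,
        tree_address,
        ipaddress.IPv4Address(ip).packed,
        len(ddr_register_data),
    )
    return header + ddr_register_data


if __name__ == "__main__":
    ddr_data = struct.pack(">II", 0x10, 0x3)  # one assumed (address, value) pair
    personality = build_personality((1, 2, 3), 42, "10.0.0.5", ddr_data)
    print(personality.hex())
```

Under this assumed layout, the boot image stays identical across nodes while only the small personality differs from node to node.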
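By way of illustration only, steps 640 through 650 can be sketched as a simulation. The controller class, the encoding of the register data as big-endian (address, value) pairs, and the function names are assumptions for this example; real hardware would be programmed by low-level code writing to the controller parameter registers.

```python
import struct


class SimulatedDdrController:
    """Stand-in for the DDR controller; records register writes in a dict."""

    def __init__(self):
        self.registers = {}

    def write_register(self, address, value):
        self.registers[address] = value


def configure_from_personality(register_data, controller):
    """Apply packed (address, value) pairs to the controller (step 645)."""
    if len(register_data) % 8 != 0:
        raise ValueError("register data must be a whole number of 8-byte pairs")
    for address, value in struct.iter_unpack(">II", register_data):
        controller.write_register(address, value)
    return len(register_data) // 8


def boot_node(sram_personality, controller, launch_kernel):
    """Read the personality (step 640), configure DDR (645), launch kernel (650)."""
    count = configure_from_personality(sram_personality, controller)
    launch_kernel()
    return count


if __name__ == "__main__":
    sram = struct.pack(">IIII", 0x10, 0x3, 0x14, 0x618)  # assumed contents
    ctrl = SimulatedDdrController()
    boot_node(sram, ctrl, lambda: print("kernel launched"))
    print(ctrl.registers)
```

Because the node-side code only interprets data, different register values can be applied without changing or recompiling that code.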
One skilled in the art will appreciate that many variations are possible within the scope of the present invention. Thus, while the invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that these and other changes in form and details may be made therein without departing from the spirit and scope of the invention.