The detailed description is provided with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of various embodiments. However, various embodiments of the invention may be practiced without the specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to obscure the particular embodiments of the invention.
The various components of processors 110 may be coupled by one or more communication busses 120A, 120B, 120C, 120D, 120E, which will be referred to collectively herein by reference numeral 120. The various components of processors 130 may be coupled by one or more communication busses 140A, 140B, 140C, 140D, 140E, which will be referred to collectively herein by reference numeral 140. Further, processors 110, 130 may be coupled by a communication bus 150. Electronic apparatus 100 further comprises a memory module 160 coupled to processors 110, 130 by communication busses 120E, 140E. In one embodiment, the communication busses 120, 140, and 150 may be implemented as point-to-point busses.
The processors 110, 130 may be any processor such as a general purpose processor, a network processor that processes data communicated over a computer network, or another type of processor, including a reduced instruction set computer (RISC) processor or a complex instruction set computer (CISC) processor. The processing units 112, 132 may be implemented as any type of central processing unit (CPU) such as, e.g., an arithmetic logic unit (ALU).
The memory module 160 may be any memory such as, e.g., Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Read-Only Memory (ROM), or combinations thereof. The I/O modules 116, 136 may include logic to manage one or more input/output ports on the respective communication busses 120, 140, 150 and the memory module 160.
In one embodiment, cache memory units 114, 134 may be embodied as write-back cache modules. The cache modules 114, 134 temporarily store data values modified by the respective processors 110, 130, thereby reducing the number of bus transactions required to write data values back to memory module 160. In the embodiment depicted in
The coherence controllers 118, 138 manage operations to maintain cache coherency in cache modules 114, 134. For example, when processing unit 112 modifies a data value, the modified data value exists in its cache module 114 before it is written back to memory 160. Thus, until the data value in cache module 114 is written back to the memory module 160, the memory module 160 and other cache units (such as cache 134) may contain a stale data value.
Coherence controllers 118, 138 may implement one or more techniques to maintain cache coherency between cache modules 114, 134 and memory module 160. Cache coherency techniques typically utilize coherency status information which indicates whether a particular data value in a cache unit is invalid, modified, shared, exclusively owned, etc. While many cache coherency techniques exist, two popular versions include the MESI cache coherency protocol and the MOESI cache coherency protocol. The MESI acronym stands for the Modified, Exclusive, Shared and Invalid states, while the MOESI acronym stands for the Modified, Owned, Exclusive, Shared and Invalid states.
The meanings of the states vary from one implementation to another. Broadly speaking, the modified state usually means that a particular cache unit has modified a particular data value. The exclusive state and owned state usually mean that a particular cache unit may modify a copy of the data value. The shared state usually means that copies of a data value may exist in different cache units, while the invalid state means that the data value in a cache unit is invalid.
In one embodiment, coherence controllers 118, 138 snoop bus operations and use the coherency status information to ensure cache coherency. For example, assume that a first processor having a first cache unit desires to obtain a particular data value. Furthermore, assume that a second processor having a second cache unit contains a modified version of the data value (the coherency status information indicates that the data value in the second cache unit is in the modified state).
In this example, the first processor initiates a read bus request to obtain the data value. The second cache unit snoops the read bus request and determines that it contains the modified version of the data value. The second cache unit then intervenes and delivers the modified data value to the first processor via the common bus. Depending on the system, the modified data value may or may not be simultaneously written to the main memory.
In another example, assume that the first processor desires to exclusively own a particular data value. Furthermore, assume that a second cache unit contains an unmodified, shared copy of the data value (the coherency status information indicates that the data value in the second cache unit is in the shared state). In this example, the first processor initiates a read bus request which requests data for exclusive use.
The second cache unit snoops the read bus request and determines that it contains a shared copy of the data value. The second cache unit then invalidates its shared data value by changing the data value's coherency status information to the invalid state. Changing the data value's coherency status to the invalid state invalidates the data value within the second cache unit. The first processor then completes the read bus request and obtains a copy of the data value from main memory for exclusive use.
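The two snooping examples above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names `State`, `CacheUnit`, and `Bus` are hypothetical, only the four MESI states are modeled, data values are assumed to be non-`None`, and two policy choices (a snooped plain read leaves the supplying copy in the shared state, and intervention data is written to main memory at the same time) are assumptions, since the description notes that such details vary by system.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class CacheUnit:
    """Toy cache unit: one coherency state and one value per address."""

    def __init__(self):
        self.lines = {}  # address -> (State, value)

    def state_of(self, addr):
        return self.lines.get(addr, (State.INVALID, None))[0]

    def snoop(self, addr, exclusive):
        """React to another processor's read; return data only when intervening."""
        state, value = self.lines.get(addr, (State.INVALID, None))
        if state is State.INVALID:
            return None
        if exclusive:
            # The requester wants sole ownership: invalidate the local copy.
            self.lines[addr] = (State.INVALID, None)
        else:
            # Assumption: a snooped plain read leaves the local copy shared.
            self.lines[addr] = (State.SHARED, value)
        # Only a modified copy is newer than memory, so only it is supplied.
        return value if state is State.MODIFIED else None


class Bus:
    """Toy common bus connecting main memory and the snooping cache units."""

    def __init__(self, memory, caches):
        self.memory = memory  # address -> value held in main memory
        self.caches = caches

    def read(self, requester, addr, exclusive=False):
        supplied = None
        for cache in self.caches:
            if cache is not requester:
                data = cache.snoop(addr, exclusive)
                if data is not None:
                    supplied = data
        if supplied is not None:
            # Assumption: intervention data is also written back to memory.
            self.memory[addr] = supplied
        value = self.memory[addr]
        new_state = State.EXCLUSIVE if exclusive else State.SHARED
        requester.lines[addr] = (new_state, value)
        return value
```

In this sketch, a plain `read` against a second cache unit holding a modified copy returns the modified value, while a `read` with `exclusive=True` leaves every other copy in the invalid state and fills the requester from main memory.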
In an alternate embodiment, coherence controllers 118, 138 may implement a bus broadcasting technique to maintain cache coherency. For example, in multiple-bus systems, bus transactions initiated on each bus may be broadcast to the other buses in the system.
In an alternate embodiment, coherence controllers 118, 138 may implement directory-based cache coherency methods. In directory techniques, the main memory subsystem maintains memory coherency by storing extra information with the data. The extra information in the main memory subsystem may indicate 1) which processor or processors have obtained a copy of a data value and 2) the coherency status of the data values. For example, the extra information may indicate that more than one processor shares the same data value. In yet another example, the extra information may indicate that only a single processor has the right to modify a particular data value.
When a processor requests a data value, the main memory subsystem determines whether it has an up-to-date version of the data value. If not, the main memory subsystem transfers the up-to-date data value from the processor with the up-to-date data value to the requesting processor. Alternatively, the main memory can indicate to the requesting processor which other processor has the up-to-date data value.
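A directory lookup of this kind might be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the disclosed design: the `Directory` class, its `record_write` and `read` methods, and the representation of each processor's cache as a plain dictionary are all hypothetical, and the sketch assumes main memory is refreshed whenever it forwards a value from the processor holding the up-to-date copy.

```python
class Directory:
    """Toy main-memory directory tracking sharers and an up-to-date owner."""

    def __init__(self, memory):
        self.memory = memory   # address -> value held in main memory
        self.sharers = {}      # address -> set of processor ids with a copy
        self.owner = {}        # address -> processor id holding a newer value

    def record_write(self, proc, addr):
        """Note that `proc` now holds the only up-to-date copy of `addr`."""
        self.owner[addr] = proc
        self.sharers[addr] = {proc}

    def read(self, requester, addr, caches):
        """Return (value, source) for a read request from `requester`."""
        owner = self.owner.get(addr)
        if owner is None:
            # Main memory already holds the up-to-date value.
            value, source = self.memory[addr], "memory"
        else:
            # Main memory is stale: take the value from the owning processor.
            value, source = caches[owner][addr], owner
            if owner != requester:
                # Assumption: memory is refreshed when it forwards the value.
                self.memory[addr] = value
                del self.owner[addr]
        caches[requester][addr] = value
        self.sharers.setdefault(addr, set()).add(requester)
        return value, source
```

The returned `source` corresponds to the alternative mentioned above, in which main memory tells the requesting processor which other processor holds the up-to-date value.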
In an alternate embodiment, coherence controllers 118, 138 may implement a bus interconnect cache coherency technique in which coherency status information is associated with the data values stored in the respective cache units 114, 134. The particular cache coherency technique(s) implemented by the coherence controllers 118, 138 are beyond the scope of this disclosure.
In one embodiment, coherence controllers 118, 138 may be implemented as logical units such as, e.g., software or firmware executable on processors 110, 130. In alternate embodiments, coherence controllers may be implemented as logic circuitry on processors 110, 130.
Referring to
Input message queues may be directed to arbitration logic module 220, which arbitrates for access to processing pipeline 224. In one embodiment, pipeline 224 may be implemented as a multi-stage processing pipeline. Pipeline 224 may generate messages of variable bandwidth requirements, which may be sent to various destinations through an output port 244. The bandwidth requirement of these outgoing messages generally is not known when arbitration logic module 220 issues a message from the input-side queues into pipeline 224.
In one embodiment, processing pipeline 224 maintains a request status file 228 that tracks the status of requests to coherence controller 200. Request status file 228 may track various stages of a request such as, for example, whether requests, responses, memory acknowledgments and the like are received, internal states etc. Request status file 228 may record a request identifier associated with each request and store a status identifier associated with the request identifier. In one embodiment, the status identifier may identify the request as pending, in-process, in the output queue, or transmitted to an output port.
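As a rough illustration of such a structure, the Python sketch below maps each request identifier to one of the four status identifiers named above. The class and method names (`RequestStatusFile`, `record`, `update`, `status`) are hypothetical, and the sketch tracks only the status identifier, not the other per-request stages (responses, memory acknowledgments, internal states) that the request status file may also hold.

```python
from enum import Enum


class RequestStatus(Enum):
    PENDING = "pending"
    IN_PROCESS = "in-process"
    IN_OUTPUT_QUEUE = "in the output queue"
    TRANSMITTED = "transmitted to an output port"


class RequestStatusFile:
    """Toy request status file: request identifier -> status identifier."""

    def __init__(self):
        self._entries = {}

    def record(self, request_id):
        """Record a newly received request as pending."""
        self._entries[request_id] = RequestStatus.PENDING

    def update(self, request_id, status):
        """Store a new status identifier for an already recorded request."""
        if request_id not in self._entries:
            raise KeyError(request_id)
        self._entries[request_id] = status

    def status(self, request_id):
        return self._entries[request_id]
```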
Coherence controller 200 further includes an output issue logic module 236, which operates on the messages in the pipeline 224 to manage the release of messages from processing pipeline 224. Messages released from the output queue 232 are directed to a packet builder and output logic module 240, which places the output message into one or more data packets and outputs the data packets to an output port 244 for transmission across a data bus. In some embodiments, packet builder and output logic module 240 and output port 244 may be components of an I/O module such as I/O modules 116, 136, rather than components of a coherence controller 200.
Coherence controller 200 may further include one or more data buffers 248 that hold data exchanged with memory such as, e.g., memory module 160. For example, data buffer 248 may receive data from memory module 160 resulting from a read operation, or may hold write data to be written to memory module 160.
Referring to
At operation 325 the type of output message to be generated is determined. For example, based on the type of request or snoop response received in the input message queue 210 and memory ack queue 214, a message may be sent to a caching agent. When a message is issued from arbitration logic 220, the arbitration logic is not aware of the type of the original request or its current status. For example, when only a read request has been received, no output message is generated. Similarly, when a snoop response is received and it is not the last one, no output message is generated. Likewise, if all responses are received but no memory ack has been received, no output message is generated. By contrast, if all of these operations are complete, a data message is sent, which may keep the output port busy for some time.
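The read-request case described for operation 325 can be sketched as a small decision function. This is a hedged Python illustration only: the `RequestEntry` fields and the `output_message_type` function are hypothetical names, and the sketch covers just the conditions listed above (outstanding snoop responses, missing memory ack, everything complete).

```python
from dataclasses import dataclass


@dataclass
class RequestEntry:
    """Hypothetical per-request bookkeeping for a read request."""
    expected_snoop_responses: int
    snoop_responses_received: int = 0
    memory_ack_received: bool = False


def output_message_type(entry):
    """Return "data" when a data message should be sent, else None."""
    if entry.snoop_responses_received < entry.expected_snoop_responses:
        return None  # the last snoop response has not arrived yet
    if not entry.memory_ack_received:
        return None  # all responses are in, but the memory ack is missing
    return "data"    # everything is complete: send the data message
```

Because the outcome depends on state accumulated across several input messages, it can only be evaluated inside the pipeline, which is consistent with the arbitration logic not knowing the output bandwidth in advance.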
If, at operation 330, an output port is available or if a bypass is possible (for example, if the output message queue 232 is empty or if the output message is given priority over messages in the output message queue 232), then control passes to operation 350 and the packet builder and output logic module 240 constructs a packet and sends the packet to an output port 244.
By contrast, if at operation 330 a port is not available (and the output port cannot be bypassed) then control passes to operation 335 and the message is queued in the output message queue 232. At operation 340 the output issue logic module 236 waits for an output port to become available, whereupon control passes to operation 350 and the packet builder and output logic module 240 constructs a packet and sends the packet to an output port 244. In some embodiments the output issue logic module 236 may compare a bandwidth requirement associated with the message to an amount of bandwidth available on the output port to determine whether an output port is available.
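Operations 330 through 350 might be sketched as below. This Python sketch is an assumption-laden illustration, not the disclosed logic: `OutputStage`, `submit`, and `port_freed` are hypothetical names, a message is modeled as a dictionary with `id` and `bandwidth` keys, the bypass decision is passed in by the caller rather than derived from the queue, and consumption of bandwidth by sent messages is not modeled.

```python
from collections import deque


class OutputStage:
    """Toy model of output issue logic, output queue, and packet builder."""

    def __init__(self, available_bandwidth):
        self.available_bandwidth = available_bandwidth
        self.output_queue = deque()
        self.sent = []

    def port_available(self, message):
        # Compare the message's bandwidth requirement with what the port offers.
        return message["bandwidth"] <= self.available_bandwidth

    def submit(self, message, bypass_possible=False):
        """Operation 330: send now (350) or queue the message (335)."""
        if self.port_available(message) or bypass_possible:
            self._send(message)
            return "sent"
        self.output_queue.append(message)
        return "queued"

    def port_freed(self, available_bandwidth):
        """Operation 340: the port frees up, so drain queued messages in order."""
        self.available_bandwidth = available_bandwidth
        while self.output_queue and self.port_available(self.output_queue[0]):
            self._send(self.output_queue.popleft())

    def _send(self, message):
        # Stand-in for the packet builder and output logic (operation 350).
        self.sent.append(("packet", message["id"]))
```

Draining stops at the first queued message that does not fit, which preserves queue order; whether a real design would let smaller messages overtake is left open here.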
In embodiments, the system of
A chipset 406 may also be in communication with the interconnection network 404. The chipset 406 may include a memory control hub (MCH) 408. The MCH 408 may include a memory controller 410 that communicates with a memory 412. The memory 412 may store data and sequences of instructions that are executed by the CPU 402, or any other device included in the computing system 400. In one embodiment of the invention, the memory 412 may include one or more volatile storage (or memory) devices such as random access memory (RAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), static RAM (SRAM), or other types of memory. Nonvolatile memory, such as a hard disk, may also be utilized. Additional devices may communicate through the interconnection network 404, such as multiple CPUs and/or multiple system memories.
The MCH 408 may also include a graphics interface 414 that communicates with a graphics accelerator 416. In one embodiment of the invention, the graphics interface 414 may be in communication with the graphics accelerator 416 via an accelerated graphics port (AGP). In an embodiment of the invention, a display (such as a flat panel display) may communicate with the graphics interface 414 through, for example, a signal converter that translates a digital representation of an image stored in a storage device such as video memory or system memory into display signals that are interpreted and displayed by the display. The display signals produced by the display device may pass through various control devices before being interpreted by and subsequently displayed on the display.
A hub interface 418 may allow the MCH 408 to communicate with an input/output control hub (ICH) 420. The ICH 420 may provide an interface to I/O devices that communicate with the computing system 400. The ICH 420 may communicate with a bus 422 through a peripheral bridge (or controller) 424, such as a peripheral component interconnect (PCI) bridge, a universal serial bus (USB) controller, or another type of bridge or controller. The bridge 424 may provide a data path between the CPU 402 and peripheral devices. Other types of topologies may be utilized. Also, multiple buses may communicate with the ICH 420, e.g., through multiple bridges or controllers. Moreover, other peripherals in communication with the ICH 420 may include, in various embodiments of the invention, integrated drive electronics (IDE) or small computer system interface (SCSI) hard drive(s), USB port(s), a keyboard, a mouse, parallel port(s), serial port(s), floppy disk drive(s), digital output support (e.g., digital video interface (DVI)), or other types of peripherals.
The bus 422 may communicate with an audio device 426, one or more disk drive(s) 428, and a network interface device 430 (which may be in communication with the computer network 403). Other devices may communicate through the bus 422. Also, various components (such as the network interface device 430) may be in communication with the MCH 408 in some embodiments of the invention. In addition, the processor 402 and the MCH 408 may be combined to form a single chip. Furthermore, the graphics accelerator 416 may be included within the MCH 408 in other embodiments of the invention.
Furthermore, the computing system 400 may include volatile and/or nonvolatile memory (or storage). For example, nonvolatile memory may include one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), a disk drive (e.g., 428), a floppy disk, a compact disk ROM (CD-ROM), a digital versatile disk (DVD), flash memory, a magneto-optical disk, or other types of nonvolatile machine-readable media capable of storing electronic instructions and/or data.
As illustrated in
The processors 502 and 504 may be any type of a processor such as those discussed with reference to the processors 402 of
At least one embodiment of the invention may be provided within the processors 502 and 504. Other embodiments of the invention, however, may exist in other circuits, logic units, or devices within the system 500 of
The chipset 520 may be in communication with a bus 540 using a PtP interface circuit 541. The bus 540 may have one or more devices that communicate with it, such as a bus bridge 542 and I/O devices 543. Via a bus 544, the bus bridge 542 may be in communication with other devices such as a keyboard/mouse 545, communication devices 546 (such as modems, network interface devices, or other types of communication devices that may communicate through the computer network 603), audio I/O device, and/or a data storage device 548. The data storage device 548 may store code 549 that may be executed by the processors 502 and/or 504.
In various embodiments of the invention, the operations discussed herein, e.g., with reference to
Additionally, such computer-readable media may be downloaded as a computer program product, wherein the program may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection). Accordingly, herein, a carrier wave shall be regarded as comprising a machine-readable medium.
Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one implementation. The appearances of the phrase “in one embodiment” in various places in the specification may or may not all refer to the same embodiment.
Also, in the description and claims, the terms “coupled” and “connected,” along with their derivatives, may be used. In some embodiments of the invention, “connected” may be used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” may mean that two or more elements are in direct physical or electrical contact. However, “coupled” may also mean that two or more elements may not be in direct contact with each other, but may still cooperate or interact with each other.
Thus, although embodiments of the invention have been described in language specific to structural features and/or methodological acts, it is to be understood that claimed subject matter may not be limited to the specific features or acts described. Rather, the specific features and acts are disclosed as sample forms of implementing the claimed subject matter.
Number | Date | Country | Kind |
---|---|---|---|
1547/DEL/2006 | Jun 2006 | IN | national |