The need for improved “system processing speed” in electronic devices (e.g., computers) is ongoing. System processing speed is affected by various factors such as the number of processors, clock speeds, and bus bandwidth. Furthermore, management of interconnections/requests between processors, and between processors and external system components, affects system processing speed.
For a detailed description of exemplary embodiments of the invention, reference will now be made to the accompanying drawings in which:
Certain terms are used throughout the following description and claims to refer to particular system components. As one skilled in the art will appreciate, computer companies may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following discussion and in the claims, the terms “including” and “comprising” are used in an open-ended fashion, and thus should be interpreted to mean “including, but not limited to . . . .” Also, the term “couple” or “couples” is intended to mean either an indirect, direct, optical or wireless electrical connection. Thus, if a first device couples to a second device, that connection may be through a direct electrical connection, through an indirect electrical connection via other devices and connections, through an optical electrical connection, or through a wireless electrical connection.
The following discussion is directed to various embodiments of the invention. Although one or more of these embodiments may be preferred, the embodiments disclosed should not be interpreted, or otherwise used, as limiting the scope of the disclosure, including the claims. In addition, one skilled in the art will understand that the following description has broad application, and the discussion of any embodiment is meant only to be exemplary of that embodiment, and not intended to intimate that the scope of the disclosure, including the claims, is limited to that embodiment.
Embodiments of the disclosure implement a paired node controller scheme to improve system processing speed. The paired node controller scheme can be used with any processor or computer system such as the Intel-manufactured Nehalem-EX processors. The paired node controller scheme may be implemented, for example, in a chipset that provides an interface between a processor and other system components of a computer system.
In at least some embodiments, the pair of node controllers 112A and 112B exist within a single physical device (e.g., a semiconductor chip), which may comprise additional pairs of node controllers as well. In accordance with at least some embodiments, each pair of node controllers in the chipset services a pair of processors, where each processor is directly linked to one of the node controllers, and to the other processor as well. Each paired processor may have local memory attached to it, and may be linked to additional devices as well (e.g., an input/output agent). The pair of processors, the pair of node controllers, and other locally linked devices and memory may be referred to as a “node.”
In at least some embodiments, requests to a remote memory (i.e., memory outside of the node where the request originates) are transmitted on a direct link from a processor to its physically linked node controller. However, upon reaching the node controller, a request may be switched over to the other node controller. More specifically, the node controller 112A comprises a link controller 122A that is directly linked to processor 104A. Requests by the processor 104A are received by the link controller 122A and then are forwarded to an inbound link router (PREC) 124A of the node controller 112A or, alternatively, to an inbound link router (PREC) 124B of the node controller 112B. The inbound link router 124A likewise may receive requests from the link controller 122B of the node controller 112B. The inbound link router 124A operates to route requests received from link controllers 122A and 122B to a plurality of protocol blocks 126A-126N for transmission via a system fabric 130 to other system components 140 (e.g., additional node controllers, processors, etc.). As shown, the node controller 112B comprises its own inbound link router 124B, which operates to route requests received from link controllers 122A and 122B to a plurality of protocol blocks 128A-128N for transmission via the system fabric 130 to the other system components 140.
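The request path described above may be sketched in software as follows. This is only an illustrative model, not the disclosed hardware: the `Request` type, the lookup tables, and the `route_request` function are hypothetical names, and the sketch assumes that the inbound link router is chosen by the node identifier that the request addresses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    source_processor: str  # "104A" or "104B"
    target_node_id: str    # node controller addressed by the request, e.g. "112B"


# Assumed wiring, taken from the reference numerals in the description.
LINK_CONTROLLER_FOR = {"104A": "122A", "104B": "122B"}   # processor -> link controller
NODE_CONTROLLER_OF = {"122A": "112A", "122B": "112B"}    # link controller -> owner
INBOUND_ROUTER_FOR = {"112A": "124A", "112B": "124B"}    # node controller -> PREC


def route_request(req: Request) -> tuple[str, str, bool]:
    """Return (link controller, inbound link router, crossed_over).

    The request always enters through the link controller that is physically
    linked to the issuing processor. It is then forwarded to the inbound link
    router of the addressed node controller, which may be the other one.
    """
    link_controller = LINK_CONTROLLER_FOR[req.source_processor]
    router = INBOUND_ROUTER_FOR[req.target_node_id]
    crossed_over = NODE_CONTROLLER_OF[link_controller] != req.target_node_id
    return link_controller, router, crossed_over


if __name__ == "__main__":
    print(route_request(Request("104A", "112A")))
    print(route_request(Request("104A", "112B")))
```

In this model, a request from processor 104A that addresses node controller 112B still enters through link controller 122A, but is handed to inbound link router 124B.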
Responses en route to the processors 104A and 104B are received from the other system components 140 by the system fabric 130 and are organized into the protocol blocks 126A-126N or 128A-128N for handling by the PI2P logic 114A or 114B. As needed, incoming responses may be switched over to the other node controller in transit to the processor that issued the original request.
In accordance with at least some embodiments, the disclosed invention comprises a pair of node controllers (e.g., node controllers 112A and 112B), where each node controller has a unique node identifier for visibility by each of the paired processors 104A and 104B. Further, each node controller 112A and 112B selectively switches packets en route to the processors 104A and 104B from one node controller to the other (e.g., from one PI2P to the other). The packets may be switched, for example, via a pair of cross-connected buses 132 whose bandwidth is matched to outbound links 134A or 134B from the node controllers 112A and 112B.
In accordance with at least some embodiments, queues, arbitration logic, and switch logic within each PI2P of a node controller operate to handle packets en route to a processor. More specifically, each of the node controllers 112A and 112B may contain a PI2P which itself contains a plurality of queues for storing packets from within the node controller, or received from the other node controller. Further, each PI2P may contain a plurality of queues for storing packets to be forwarded to the other node controller or to the linked processor. Such queues for each PI2P are shown in
For arbitration, each PI2P 114A and 114B may comprise a first arbiter and a second arbiter. The first arbiter is embodied by the switch logic 118A or 118B and is for queued packets to be forwarded to the other node controller. In at least some embodiments, the first arbiter is configured to arbitrate among queued packets based on available buffer space in the other node controller. Meanwhile, the second arbiter is embodied by the outbound link arbiter 116A or 116B and is for queued local packets and queued packets received from the other node controller. In at least some embodiments, the second arbiter is configured to arbitrate among queued packets based on QPI link credit.
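The two arbiters may be modeled, purely for illustration, as shown below. The class name `PI2P`, its attribute names, and the alternating (round-robin) selection between local and switched-in packets are assumptions made for this sketch; the description specifies only that the first arbiter considers available buffer space in the other node controller and that the second considers QPI link credit.

```python
from collections import deque


class PI2P:
    """Toy model of one PI2P with its queues and two arbiters (hypothetical)."""

    def __init__(self, remote_buffer_slots: int, link_credits: int):
        self.to_other = deque()    # packets waiting to be switched to the other node controller
        self.local = deque()       # packets from within this node controller
        self.from_other = deque()  # packets received from the other node controller
        self.remote_buffer_slots = remote_buffer_slots  # free buffers in the other node controller
        self.link_credits = link_credits                # credits on the outbound link
        self._prefer_local = True  # assumed round-robin state, not from the description

    def switch_arbiter(self):
        """First arbiter: forward a packet only if the other side has buffer space."""
        if self.to_other and self.remote_buffer_slots > 0:
            self.remote_buffer_slots -= 1
            return self.to_other.popleft()
        return None

    def outbound_link_arbiter(self):
        """Second arbiter: send a local or switched-in packet only if link credit remains."""
        if self.link_credits == 0:
            return None
        if self._prefer_local:
            order = (self.local, self.from_other)
        else:
            order = (self.from_other, self.local)
        for queue in order:
            if queue:
                self.link_credits -= 1
                # Alternate the preferred queue after each grant.
                self._prefer_local = queue is not self.local
                return queue.popleft()
        return None


if __name__ == "__main__":
    pi2p = PI2P(remote_buffer_slots=1, link_credits=2)
    pi2p.local.extend(["local-0", "local-1"])
    pi2p.from_other.append("remote-0")
    print(pi2p.outbound_link_arbiter(), pi2p.outbound_link_arbiter(), pi2p.outbound_link_arbiter())
```

A real implementation would also replenish buffer slots and link credits as the receiver drains packets; that bookkeeping is omitted here to keep the sketch short.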
In QPI-link-based embodiments (e.g., with Nehalem-EX processors), switching of packets between node controllers 112A and 112B improves performance. This is because Nehalem-EX processors use a fixed number of QPI transaction identifiers (IDs) to issue requests to each node controller in the QPI domain. Making two node controllers visible to each processor doubles the number of requests each processor can have outstanding in the system. However, it necessitates switching of packets from one node controller to the other when a node controller receives a request from its connected processor that is directed to the other node controller, or when a node controller receives a response directed to the processor that is connected to the other node controller.
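The reasoning in the preceding paragraph may be expressed as a short sketch. The function names and the value of 32 transaction IDs are illustrative assumptions only; the description does not state the actual number of IDs available on Nehalem-EX processors.

```python
# Assumed physical links between processors and node controllers.
LINKED_CONTROLLER = {"104A": "112A", "104B": "112B"}


def max_outstanding_requests(ids_per_controller: int, visible_controllers: int) -> int:
    """Each visible node controller contributes its own pool of transaction IDs."""
    return ids_per_controller * visible_controllers


def request_needs_switch(source_processor: str, target_controller: str) -> bool:
    """A request is switched when it addresses the controller the processor is not linked to."""
    return LINKED_CONTROLLER[source_processor] != target_controller


def response_needs_switch(receiving_controller: str, destination_processor: str) -> bool:
    """A response is switched when its processor is linked to the other controller."""
    return LINKED_CONTROLLER[destination_processor] != receiving_controller


if __name__ == "__main__":
    ids = 32  # arbitrary illustrative value, not a Nehalem-EX specification
    print(max_outstanding_requests(ids, 1), max_outstanding_requests(ids, 2))
    print(request_needs_switch("104A", "112B"), response_needs_switch("112B", "104A"))
```

With two node controllers visible instead of one, the limit on outstanding requests per processor doubles, at the cost of the switching cases identified by the two predicate functions.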
The above discussion is meant to be illustrative of the principles and various embodiments of the present invention. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.