The invention relates to a data processing system on at least one integrated circuit, the data processing system comprising at least two modules and a network arranged to transmit data between the modules, the data processing system being arranged to apply a flow control scheme for synchronizing data traffic between the modules, wherein the network comprises a first sub-network and a second sub-network, the first sub-network and the second sub-network having different operating conditions.
The invention also relates to a method for synchronizing data traffic in a data processing system on at least one integrated circuit, the data processing system comprising at least two modules and a network which transmits data between the modules, wherein the data processing system applies a flow control scheme for synchronizing data traffic between the modules, wherein the network comprises a first sub-network and a second sub-network, the first sub-network and the second sub-network having different operating conditions.
Networks-on-Chip (NoCs) have been proposed and widely accepted as an adequate solution for the problems relating to the interconnection of modules on highly complex chips. Compared to conventional interconnect structures such as single busses or hierarchies of busses, the network concept offers a number of important advantages. For example, (i) networks are able to structure and manage wires in deep sub-micron technologies satisfactorily, (ii) they allow good wire utilization through sharing, (iii) they scale better than busses, (iv) they can be energy-efficient and reliable, and (v) they decouple computation from communication through well-defined interfaces, which allows the modules and the interconnect structure to be designed in isolation and integrated relatively easily.
A Network-on-Chip typically comprises a plurality of routers, which form the nodes of the network and which are arranged to transport and route the data through the network. Furthermore, the network is usually equipped with so-called network interfaces, which implement the interface between the modules connected to the network and the network itself. The modules are usually categorized into master modules and slave modules. The master modules send request messages to the slave modules, for example a request message comprising a write command accompanied by data which should be written in a memory (slave) module. The slave module may send back a response message including an acknowledgement of the receipt of the request message, or an indication of the success of the write operation requested by the master module. The request-response mechanism is often referred to as the transaction model. The combination of a request and a corresponding response is often referred to as a transaction.
Networks-on-Chip constitute a rapidly evolving area of research and development. In recent years many publications have been made, for example about network topologies or the design of components such as network interfaces, routers and switches. An important recent development is the concept of multi-chip networks. Multi-chip networks are divided into sub-networks which are dedicated to the communication between modules forming part of a sub-system and performing specific functions in a larger data processing system. The sub-networks reside on different integrated circuits (dies, boards or chips). Alternatively, sub-networks may reside on a single chip. In the latter case they may have different power or voltage domains.
In the context of the present invention U.S. Pat. No. 6,018,782 is particularly relevant. U.S. Pat. No. 6,018,782 discloses a single-chip integrated circuit which comprises a plurality of modules interconnected in an on-chip network. The modules are processors or memory devices or hybrids. An inter-module link provides an electrical path for data communication among the modules. The modules are connected to the inter-module link by inter-module ports, with at least one inter-module port coupled between an associated module and the inter-module link. The inter-module link electrically couples the inter-module ports and provides a communications pathway between the modules. The on-chip network may also include an inter-module network switch for joining circuits of the inter-module link and routing data packets from one inter-module link to another, or an inter-chip network bridge to join two single-chip integrated circuits into a single communications network and route data packets from modules on one computer chip to modules on another computer chip.
The inter-chip network bridge is capable of joining two computer chips to extend the on-chip network through a number of connectors, as can be seen in FIGS. 2 and 5 of U.S. Pat. No. 6,018,782. The inter-chip network bridge preferably includes one or more output buffers which operate to accept outgoing data destined for an address on a second computer chip, and one or more input buffers operable to receive incoming data destined for an associated address on the associated computer chip. The inter-chip network bridge accepts data to be transferred to the second computer chip into an output buffer when space in the output buffer is available. The data in the output buffer is transferred to a corresponding inter-chip network bridge on the second computer chip through the connectors, if the latter inter-chip network bridge signals availability to accept additional data.
It is apparent from the description of U.S. Pat. No. 6,018,782 that the network bridge only applies to communication between networks residing on different integrated circuits, and that it only comprises buffer means for temporarily storing data which should be transmitted from one network to another. There is no mechanism for synchronization of data transfer from one network to another. The facilities offered by the network bridge are very limited in the sense that it only offers a possibility to couple the network to another chip and thereby extend the network. It further provides relatively simple buffer means to queue data when a corresponding network bridge (comprised in the network on the other computer chip) indicates that it cannot accept additional data. Hence, a major disadvantage of this network bridge is that it cannot adequately synchronize the data traffic from one network to another.
It is also apparent that two components are needed, in particular a network bridge on a first computer chip and a cooperating network bridge on a second computer chip, the combination of which negatively affects the performance of the network as a whole due to an increased latency. The negative effect on the performance is another disadvantage of the known network bridge.
Another relevant document is the article “Implementation of interface router IP for Proteo Network-on-Chip”, by Mikko Alho and Jari Nurmi, Institute of Digital Computer Systems, Tampere University of Technology, Finland. In this article an interface router IP for the Proteo NoC (developed at the Tampere University of Technology) is introduced and implemented. Besides the implementation of this interface router IP, the concept of multiple sub-networks is briefly illustrated, as well as the use of bridge components to interconnect the sub-networks into a larger network. However, a specification of these bridge components is absent. The above-mentioned lack of data traffic synchronization is not dealt with, nor mentioned as a technical problem caused by the possibly different characteristics of the various sub-networks.
It is an object of the invention to provide a means and a method for interconnecting sub-networks of the kind set forth, which means and method are able to adequately synchronize the data traffic between the sub-networks. This object is achieved by the data processing system as claimed in claim 1 and by the method as claimed in claim 7.
The data processing system according to the invention comprises a conversion unit, which conversion unit is arranged to convert a first flow control scheme applied in a first sub-network into a second flow control scheme applied in a second sub-network. The conversion unit may cooperate with or be integrated with another component, for example a component which performs conversion of operating frequency between sub-networks (clock-domain crossing). For the correct functioning of flow control it is necessary that separate flow control schemes are used for the first sub-network and the second sub-network, respectively. The conversion unit performs a conversion between these schemes. For example, if the flow control schemes are credit-based, the conversion unit computes the correct number of credits for the first flow control scheme, based on the number of credits available in the second flow control scheme. If necessary, credit conversion is performed, for example when the flit sizes are different in the first and second sub-network. The conversion unit translates the credits from the second sub-network (which credits represent a certain amount of data elements) into credits for the first sub-network. For the same amount of data elements, the number of credits may be different in the first and the second sub-network.
In an aspect of the invention, which is defined in claim 2, the data processing system deploys a flow control scheme for synchronizing data traffic between the modules, wherein the flow control scheme is based on credits stored in a first module, which credits represent the amount of data which can be received by a second module. This is often referred to as a credit-based flow control scheme.
In another aspect of the invention, which is defined in claim 3, the first sub-network comprises a first router and the second sub-network comprises a second router, an output of the first router being coupled to an input of the conversion unit, and an output of the conversion unit being coupled to an input of the second router, wherein the first router comprises a first buffer unit, and wherein the second router comprises a second buffer unit, wherein the conversion unit is arranged to receive data from the first buffer unit, and wherein the conversion unit is further arranged to store data for transmission to the second buffer unit, the conversion unit comprising an intermediate buffer unit for storing the data, characterized in that the communication between the first buffer unit and the intermediate buffer unit is controlled by the first flow control scheme, and in that the communication between the intermediate buffer unit and the second buffer unit is controlled by the second flow control scheme. The separate flow control schemes control separate pairs of buffers and the conversion unit converts between the flow control schemes.
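By way of illustration only, the arrangement of two separately controlled buffer pairs may be sketched as follows. The sketch is not taken from the embodiments: the names Hop and step_bridge are hypothetical, each flow control scheme is assumed to be credit-based, and a credit is assumed to return to the sender as soon as a flit leaves the receiving buffer, which omits the separate reporting step.

```python
from collections import deque


class Hop:
    """One credit-controlled hop: a sender-side credit counter guarding a
    receiving buffer. Hypothetical helper, not part of the specification."""

    def __init__(self, capacity):
        self.fifo = deque()       # the receiving buffer unit
        self.credits = capacity   # sender's view of free space in that buffer

    def push(self, flit):
        """Store a flit if a credit is available; otherwise refuse it."""
        if self.credits == 0:
            return False
        self.credits -= 1
        self.fifo.append(flit)
        return True

    def pop(self):
        """Remove the oldest flit; the credit returns at once (assumption)."""
        flit = self.fifo.popleft()
        self.credits += 1
        return flit


def step_bridge(first_hop, second_hop):
    """Move one flit from the intermediate buffer (receiver of first_hop)
    to the second buffer unit, but only if the second scheme has a credit."""
    if first_hop.fifo and second_hop.credits > 0:
        second_hop.push(first_hop.pop())
        return True
    return False
```

In this sketch the first router can never overfill the intermediate buffer and the conversion unit can never overfill the second buffer unit, because each pair of buffers is guarded by its own credit counter.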
In a further aspect of the invention, which is claimed in claim 4, the first sub-network and the second sub-network use flow control units having different sizes, and the conversion unit is arranged to convert credits used by the second flow control scheme into credits used by the first flow control scheme. This is referred to as credit conversion; the credits used in the second flow control scheme are translated into credits for the first flow control scheme.
In a further aspect of the invention, which is claimed in claim 5, the first sub-network and the second sub-network reside on different chips, the data processing system being provided with a further conversion unit, wherein an off-chip link is provided between the conversion unit and the further conversion unit. The conversion means is extended with a further conversion unit which cooperates with the first conversion unit. This is advantageous when an off-chip link is provided between the conversion units.
In a further aspect of the invention, which is claimed in claim 6, the first sub-network and the second sub-network reside on a single chip, the first sub-network and the second sub-network having different clock domains, characterized in that the conversion unit is also arranged to provide clock-domain crossing. In this embodiment the conversion unit is integrated with means to perform the clock-domain crossing.
The present invention is described in more detail with reference to the drawings, in which:
medium (on-chip copper versus off-chip fiber, etc.);
clock or operating frequency (i.e. speed);
clock phases;
link width;
link-level flow control schemes;
operating modes (e.g. burst mode versus constant transmission mode).
The link-level bridge LLB1 is arranged to translate between physical and link-level protocols, from link L1 to link L3. Communication between router R1 and router R2 takes place in the form of packets. Typically, packets comprise at least one of a header, a payload and a tail. The packets are further split and allocated to so-called flow control units. A flow control unit is commonly referred to as a ‘flit’.
First, the principle of link-level flow control will be discussed.
Initially, if the buffer unit fifo2 of the second router R2 is still empty, the value of the credits (remote space) counter is equal to the size of this buffer unit. Router R1 can send as many flits as it has credits, i.e. a number of flits which is equal to the value of the credits (remote space) counter. When the first router R1 transmits data to the second router R2, it decrements the credits (remote space) counter by the number of flits which it transmits. When data leaves the buffer unit fifo2 of the second router R2, the value of the credits to report counter is incremented by the number of flits which have left this buffer unit. If the value of the credits to report counter is larger than zero, this value is reported to the first router R1, where it is added to the value of the credits (remote space) counter. In this manner, no data will be sent to the second router R2 if there is no buffer space for storing the data, and therefore no data will be lost (i.e. the communication is lossless).
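The counter behaviour described above may be sketched as follows, purely as an illustration. The class name CreditLink and its method names are hypothetical; the attributes remote_space and credits_to_report stand for the credits (remote space) counter in router R1 and the credits to report counter in router R2, and the sketch assumes the reported value is cleared once it has been reported.

```python
class CreditLink:
    """Illustrative model of link-level credit flow control between
    router R1 (sender) and buffer unit fifo2 of router R2 (receiver)."""

    def __init__(self, fifo2_size):
        self.fifo2 = []                  # buffer unit of router R2
        self.remote_space = fifo2_size   # credits counter held by R1
        self.credits_to_report = 0       # counter held by R2

    def send(self, flits):
        """R1 sends at most as many flits as it has credits.
        Returns the number of flits actually sent."""
        n = min(len(flits), self.remote_space)
        self.fifo2.extend(flits[:n])
        self.remote_space -= n
        return n

    def consume(self, count):
        """Flits leave fifo2; R2 records the space that was freed."""
        n = min(count, len(self.fifo2))
        del self.fifo2[:n]
        self.credits_to_report += n
        return n

    def report_credits(self):
        """R2 reports freed space to R1 if there is any to report.
        Clearing the counter afterwards is an assumption of this sketch."""
        if self.credits_to_report > 0:
            self.remote_space += self.credits_to_report
            self.credits_to_report = 0
```

Because send never transmits more flits than remote_space allows, fifo2 cannot overflow in this model, which corresponds to the lossless behaviour noted above.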
It is noted that the link-level bridge LLB1 may contain more than one buffer unit, for example a series of first-in first-out buffer units. The use of a single bridge buffer unit fifoB can be seen as an abstraction. A person skilled in the art can select the actual implementation and location of the buffer. Examples of buffer implementations are a double latch for frequency conversion and a sequentializer for link width conversion.
As mentioned before, because the flow control mechanism has been divided into separate flow control mechanisms for the pairs of buffers fifo1-fifoB and fifoB-fifo2, buffer overflow in the link-level bridge LLB1 can be avoided. However, if the flit sizes are different in the sub-networks, credit conversion is additionally required. For example, if the flit size in the sub-network of router R1 is 2 words and the flit size in the sub-network of router R2 is 4 words, then three 4-word flits (12 words in total) which leave the link-level bridge LLB1 must be translated into six credits to report to router R1. Or, if the flit size in the sub-network of router R1 is 3 words and the flit size in the sub-network of router R2 is 4 words, then three 4-word flits which leave the link-level bridge LLB1 must be translated into four credits to report to router R1.
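The arithmetic of the two examples above may be sketched as follows. The class name CreditConverter and its method are hypothetical. The handling of a remainder is an assumption of this sketch and is not described in the embodiments: when the freed words do not amount to a whole number of flits of router R1, the leftover words are carried over to the next report.

```python
class CreditConverter:
    """Illustrative credit conversion in the link-level bridge: flits of
    the R2 sub-network that leave the bridge are translated into credits
    expressed in flits of the R1 sub-network."""

    def __init__(self, r1_flit_words, r2_flit_words):
        self.r1_flit_words = r1_flit_words
        self.r2_flit_words = r2_flit_words
        self.freed_words = 0   # words freed but not yet reported (assumption)

    def flits_left(self, r2_flits):
        """Return the number of credits to report to router R1 after
        r2_flits flits of the R2 sub-network have left the bridge."""
        self.freed_words += r2_flits * self.r2_flit_words
        credits, self.freed_words = divmod(self.freed_words,
                                           self.r1_flit_words)
        return credits


# The two cases from the description:
print(CreditConverter(2, 4).flits_left(3))  # 12 words / 2 words per flit
print(CreditConverter(3, 4).flits_left(3))  # 12 words / 3 words per flit
```

Rounding down and carrying the remainder keeps the reported credits conservative, so router R1 is never told about more free space than the bridge actually has.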
It is remarked that the scope of protection of the invention is not restricted to the embodiments described herein. Neither is the scope of protection of the invention restricted by the reference symbols in the claims. The word ‘comprising’ does not exclude other parts than those mentioned in a claim. The word ‘a(n)’ preceding an element does not exclude a plurality of those elements. Means forming part of the invention may be implemented either in the form of dedicated hardware or in the form of a programmed general-purpose processor. The invention resides in each new feature or combination of features.
Number | Date | Country | Kind |
---|---|---|---|
04106213.4 | Dec 2004 | EP | regional |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB2005/053954 | 11/29/2005 | WO | 00 | 5/25/2007 |