1. Technical Field
Embodiments relate to an apparatus and in particular but not exclusively to an apparatus for communicating with a target via an interconnect.
2. Description of the Related Art
Ever increasing demands are being placed on the performance of electronic circuitry. For example, consumers expect multimedia functionality on more and more consumer electronic devices. By way of example only, advanced graphical user interfaces drive the demand for graphics processing units (GPUs). The demand of HD (high definition) video for video acceleration also places increased performance demands on consumer electronic devices. There is, for example, a trend to provide inexpensive 2D and 3D TV or video on an ever increasing number of consumer electronic devices.
In electronic devices, there may be two or more initiators which need to access one or more targets via a shared interconnect. Access to the interconnect needs to be managed in order to provide a desired level of quality of service for each of the initiators. Broadly, there are two types of quality of service management: static and dynamic. Quality of service management attempts to regulate the bandwidth or latency of the initiators in order to meet the overall quality of service required by the system.
According to an aspect, there is provided an apparatus comprising: an output configured to output data to a communication path of an interconnect for routing to a target; and
a rate controller configured to control a rate of said output data, said rate controller configured to control said rate in response to feedback information from said target.
The rate may comprise at least one of bandwidth and frequency of said output data.
The controller may be configured to output a request to a communication path of said interconnect for routing to said target.
The request may be output onto one of: a communication path different from that of said output data; and the same communication path as said output data.
The rate controller may be configured to control a rate at which a plurality of requests are output in response to said feedback information.
The feedback information may comprise information about a time taken for said request to reach said target and a response to said request to be received from said target.
The feedback information may comprise information about said communication path on which said data is output.
The feedback information may comprise information about a quantity of data stored in said target.
The feedback information may comprise information on a quantity of information stored in a buffer.
The feedback information may comprise information indicating that a quantity of data stored in said target is such that the store has at least a given amount of data.
The controller may be configured to determine that if said store has at least a given amount of data, said rate is to be reduced.
The controller may be configured to estimate a current status of said target based on previous feedback information.
The controller may be configured to receive feedback information associated with a different apparatus, said different apparatus outputting data on the communication path on which said apparatus is configured to output data.
The interconnect may be provided by a network on chip.
According to another aspect, there is provided a target comprising: an input configured to receive data from an apparatus via a communication path of an interconnect; and a feedback provider configured to provide feedback information to said apparatus, said feedback information being usable by said apparatus to control the rate at which said data is output to said communication path.
The input may be configured to receive a request from said apparatus via a communication path of said interconnect.
The feedback information may comprise information about a time taken for said request to reach said target.
The feedback information may comprise information about said communication path on which said data is received.
The feedback information may comprise information about a quantity of data stored in said target.
The feedback information may comprise information on a quantity of information stored in a buffer of said target.
The feedback information may comprise information indicating that a quantity of data stored in said target is such that the stored data is at least a given amount of data.
The feedback provider may be configured to provide feedback information associated with a different apparatus to said apparatus, said different apparatus outputting data on the communication path on which said apparatus is configured to output data.
According to another aspect, there is provided a system comprising: an apparatus as discussed above, a target as discussed above and said interconnect.
According to another aspect, there is provided an integrated circuit or die comprising: an apparatus as discussed above, a target as discussed above or said system discussed above.
According to another aspect, there is provided a method comprising: outputting data to a communication path of an interconnect for routing to a target; and controlling a rate of said output data, said rate being controlled in response to feedback information from said target.
According to another aspect, there is provided a method comprising: receiving data from an apparatus via a communication path of an interconnect; and providing feedback information to said apparatus, said feedback information being usable by said apparatus to control the rate at which said data is output to said communication path.
For a better understanding of some embodiments, reference will now be made by way of example only to the accompanying Figures in which:
Reference is made to
It should be appreciated that these units are by way of example only. In alternative embodiments, any one or more of these units may be replaced by any other suitable unit. In some embodiments, more or fewer initiators than illustrated may be used.
By way of example only, the targets comprise a flash memory 24, a PCI (Peripheral Component Interconnect) 26, a DDR (Double Data Rate) memory scheduler 28, registers 30 and an eRAM 32 (embedded random access memory). It should be appreciated that these targets are by way of example only and any other suitable target may alternatively or additionally be used. More or fewer targets than shown may be provided in other embodiments.
The NoC 4 has a respective interface 11 for each of the respective initiators. In some embodiments, two or more initiators may share an interface. In some embodiments, more than one interface may be provided for a respective initiator. Likewise an interface 13 is provided for each of the respective targets. In some embodiments, two or more targets may share an interface. In some embodiments, more than one interface may be provided for a respective target.
Some embodiments will now be described in the context of consumer electronic devices and in particular consumer electronic devices which are able to provide multimedia functions. However, it should be appreciated that other embodiments can be applied to any other suitable electronic device. That electronic device may or may not provide a multimedia function. It should be appreciated that some embodiments may be used in specialized applications other than consumer applications, or in any other application. By way of example only, the electronic device may be a phone, an audio/video player, a set top box, a television or the like.
Some embodiments may be for extended multimedia applications (audio, video, etc.). In general, some embodiments may be used in any application where multiple different blocks providing traffic have to be supported by a common interconnect and have to be arbitrated in order to satisfy a desired quality of service.
Quality of service management is used to manage the communications between the initiators and targets via the NoC 4. The QoS management may be static or dynamic.
Techniques for quality of service management have been proposed to regulate the bandwidth or latency of the various system masters or initiators in order to meet the overall system quality of service. These schemes generally do not provide a fine link with real traffic behavior. Initiators normally do not consume their target bandwidth at a regular rate. For example, a real-time video display unit does not issue traffic for most of the VBI (vertical blanking interval) period, and the traffic may vary from one line to another due to chroma subsampling.
Another issue to be considered relates to the effective bandwidth of the DDR which depends on the traffic issued by the initiator. This may lead to an increase in system latency and network on chip congestion.
Reference is made to
If the target bandwidth has not been achieved, the multiplexer 48 is configured to select a relatively high priority for the data 50. On the other hand, if the target bandwidth has been achieved, the multiplexer 48 is configured to select a relatively low priority for the data. The multiplexer provides a priority output in the form of priority information. This priority information will be associated with the data output by the initiator. The priority information output by the multiplexer 48 is used by an arbiter (not shown) on the network on chip when arbitrating between requests from a number of initiators.
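By way of illustration only, the priority selection described above may be sketched as follows. This is a minimal model of the behavior of the multiplexer 48, assuming a simple per-window byte count; the names and the two-level priority encoding are illustrative assumptions and not part of the described arrangement.

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative priority levels used by the network on chip arbiter. */
    typedef enum { PRIO_LOW = 0, PRIO_HIGH = 1 } prio_t;

    /*
     * Hypothetical model of multiplexer 48: if the initiator has not yet
     * achieved its target bandwidth in the current accounting window, its
     * data is tagged with a relatively high priority; once the target has
     * been achieved, a relatively low priority is selected instead.
     */
    static prio_t select_priority(uint64_t bytes_sent_in_window,
                                  uint64_t target_bytes_per_window)
    {
        bool target_achieved = bytes_sent_in_window >= target_bytes_per_window;
        return target_achieved ? PRIO_LOW : PRIO_HIGH;
    }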
The network on chip technology such as shown in
Undesirable network behavior with a consequent low quality of service may occur if there is an unexpected bandwidth or latency bottleneck in the network on chip. This may result in the initiators raising their quality of service requirements resulting in a further degradation of quality of service. A bottleneck may occur for one or more different reasons such as due to effective DDR bandwidth variation or efficiency or the peak behavior of conflicting initiators.
Reference is now made to
The first initiator is arranged to output traffic having the first quality of service, A, as is the fourth initiator. This traffic will be provided via the first virtual channel. The second initiator provides traffic with the second quality of service, B. The third initiator provides traffic having the third quality of service, C, and the fifth initiator provides data traffic with the fourth quality of service, D. The initiators 6 are, as in the arrangement shown in
Reference is now made to
Dealing with the effective DDR bandwidth results in the bandwidth of some of the traffic classes being dynamically turned off. Usually, this would be for the poorest traffic classes (e.g., class 84). However, other traffic classes may also be involved depending on their quality of service constraints. Shown on the graph and referenced 86 is the effective DDR efficiency. As can be seen, the effective DDR efficiency varies between a maximum value of 100% and a minimum value of 40%. The average value of around 70% is also shown. It should be noted that these percentage values are by way of example only. The DDR efficiency is an indication of how effectively the DDR is being used, taking into account, for example, the number of cycles needed to perform a data operation which requires access to the DDR and/or the scheduling of different operations competing for access to the DDR.
The DDR scheduler may be aware of pending requests at its level. However, the scheduler may not necessarily know the exact number of pending requests in the other parts of the network on chip infrastructure. In some systems for implementing in practice an arrangement such as shown in
In some embodiments, congestion may be avoided in the network on chip infrastructure by dynamically changing the bandwidth of some of the communication paths while maintaining the bandwidth of others. This may be based on the effective bandwidth available at the DDR scheduler level. Dynamic tuning of bandwidth in a communication path may be performed in a number of different scenarios where the bandwidth offered by the infrastructure is not easily predictable. This may be for example from network-on-chip-island to network-on-chip-island, from initiator to DDR or the like.
Reference will now be made to
In some embodiments, the quantity of pending requests for a communication path may be indirectly monitored at the scheduler level. The rate of data output by the initiator may be controlled so that the communication path does not become full and congestion may not occur. A DDR scheduling algorithm may regulate the initiator data rate depending on the DDR scheduler monitoring. The DDR scheduler may have buffering capabilities (buffer margin) to fully or partially cover an unknown number of hidden requests. These requests would be requests which are in transit in the network on chip. In some embodiments, the existing communication resources for end-to-end information transfer may be used.
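By way of illustration only, the buffer margin idea may be sketched as follows. The threshold below which the scheduler considers itself able to accept more traffic is sized so that requests already in transit in the network on chip (the hidden requests) can still be absorbed; the names and the sizing rule are illustrative assumptions.

    #include <stdint.h>

    /*
     * Illustrative sizing of the DDR scheduler buffer threshold. The
     * threshold is set below the physical capacity by a margin large
     * enough to absorb the worst-case number of requests that may still
     * be in flight in the network on chip when the initiator is told to
     * stop (the "hidden" requests mentioned above).
     */
    static uint32_t stop_threshold(uint32_t buffer_capacity,
                                   uint32_t max_in_flight_requests)
    {
        /* Guard against an impossible configuration. */
        if (max_in_flight_requests >= buffer_capacity)
            return 0;
        return buffer_capacity - max_in_flight_requests;
    }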
The service packet may simply be a data packet or may be a specific packet. Alternatively or additionally a data packet may be modified to include information or an instruction to trigger a response. The service or data packet is sent to trigger a response from the DDR scheduler. The service packet may be used to feedback information to the scheduler, for example on round trip latency, as will be described later. In some embodiments, the service packet request may be used as a measure of the latency of the communication path. Information on the latency of the path and on a buffer may be provided back to the initiator in order to provide information which can be used for End-to-End quality of service.
In some embodiments, the service or data packet may be omitted and a different mechanism may be used to trigger the sending of information from the DDR scheduler back to the initiator. This may be used to provide information on the status of the buffer.
In one embodiment, separate service packets and user data packets are provided. The user data packet comprises a header and a payload. The payload of a user data packet comprises user data. The header comprises a packet descriptor. This packet descriptor will include a type identifier. This type identifier will indicate that the packet contains user data. The packet descriptor may additionally include further information such as size or the like. The header also includes a network on chip descriptor. This may include information such as a routing address or the like.
The service packet also has a header and a payload. The payload of a service packet comprises a service descriptor with information such as the channel state for end-to-end quality of service or the like. The header comprises a packet descriptor. The packet descriptor will include a type identifier which will indicate that the packet is a service packet. The packet descriptor may include additional information such as size or the like. As with the user data packet, the header will include a network on chip descriptor which will include information such as, for example, a routing address or the like.
The type ID fields of the service packets and user data packets are analyzed in order to properly manage each packet.
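By way of illustration only, the packet layout described above may be modelled as follows. The field widths, structure names and enumeration values are assumptions; the description above only specifies that the header carries a packet descriptor (with a type identifier and optionally a size) and a network on chip descriptor (with, for example, a routing address), and that the service packet payload carries a service descriptor.

    #include <stdint.h>

    /* Assumed type identifiers distinguishing user data and service packets. */
    enum packet_type { PKT_USER_DATA = 0, PKT_SERVICE = 1 };

    /* Packet descriptor: type identifier plus optional size information. */
    struct packet_descriptor {
        uint8_t  type;      /* enum packet_type */
        uint16_t size;      /* payload size, by way of example only */
    };

    /* Network on chip descriptor: routing information such as an address. */
    struct noc_descriptor {
        uint16_t routing_address;
    };

    /* Common header carried by both user data and service packets. */
    struct packet_header {
        struct packet_descriptor pkt;
        struct noc_descriptor    noc;
    };

    /* Service descriptor: channel state for end-to-end quality of service. */
    struct service_descriptor {
        uint8_t channel_state;   /* for example ready or full */
    };

    /* Dispatch on the type ID so that each packet is properly managed. */
    static int is_service_packet(const struct packet_header *h)
    {
        return h->pkt.type == PKT_SERVICE;
    }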
The DDR scheduler has a buffer 96 which is arranged to store the DDR scheduler pending requests. This buffer has a threshold 98. When the quantity of data in this buffer 96 exceeds this threshold 98, the response to the service packet will include this information. Where provided, communication path 94 may be used for end-to-end quality of service; this path is separate from communication path 92, which is used for the service request packet. A dedicated feedback path 94 may be such that the delays on this path are minimized. Alternatively, the response may use the same communication path 92 as used for the service request packet. This information is fed back to the data processor 90 which controls the rate at which data is put onto the communication path 92 in response to that feedback.
Alternatively or additionally the exceeding of the threshold may itself trigger the sending of a response or a message to the initiator via communication path 92 or 94.
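By way of illustration only, the scheduler-side behavior may be sketched as follows, assuming a simple ready/full channel state: when the pending request buffer 96 is above threshold 98, a response to a service packet carries a full indication, otherwise a ready indication. The helper names, and the notification returned on crossing the threshold, are illustrative assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    enum channel_state { CHANNEL_READY = 0, CHANNEL_FULL = 1 };

    struct pending_buffer {
        uint32_t occupancy;   /* requests currently stored in buffer 96 */
        uint32_t threshold;   /* threshold 98 */
    };

    /* State reported in the service packet response (or in an unsolicited
     * message when the threshold is crossed, as in the alternative above). */
    static enum channel_state buffer_state(const struct pending_buffer *b)
    {
        return (b->occupancy > b->threshold) ? CHANNEL_FULL : CHANNEL_READY;
    }

    /* Called when a request is accepted into the buffer; returns true when
     * the threshold has just been crossed so a notification may be sent. */
    static bool buffer_push(struct pending_buffer *b)
    {
        b->occupancy++;
        return b->occupancy == b->threshold + 1;
    }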
To summarize, the service packet request may be provided on the same communication path as the data or a different communication path to the data. The service packet response may be provided on the same communication path as the service packet request, the same communication path as the data (where different to that used for the service packet request) or a communication path different to that used for the service packet request and/or data.
Some embodiments may have a basic locked loop where the data traffic from an initiator is tuned based on information at the DDR scheduler level and a go/no-go scheme. The service packet response is thus returned by the DDR scheduler with the current state of the related communication path 92. This information is determined from the status of the buffer.
If the service packet is sent via the communication path 92 which is used for data, the service packet response will be removed from the data traffic at the initiator level, in some embodiments. In some embodiments, the service packet will enter a dedicated communication path resource in the DDR scheduler, so that the service packet latency may not depend on the latency of the related or other data communication paths associated with the DDR. In other words, the data which is received by the scheduler may then need to wait a further length of time before it is scheduled for the DDR. The service packet is removed from the data communication path such that the service packet does not incur this further delay.
The initiator may be controlled in any suitable way in response to the feedback from the DDR scheduler. For example, the traffic may be enabled by default until a communication path full state (determined by the status of the buffer) is returned by the DDR scheduler. The traffic will be resumed for example after a predetermined period or time out. Alternatively or additionally, the data traffic may be suspended by default. A communication path ready state will allow traffic for a given amount of time, for example, until a time out. Alternatively or additionally, the traffic may be enabled on reception of the communication path ready state and suspended upon a communication path full state.
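By way of illustration only, the enable/suspend policies above may be summarized as a small state machine at the initiator. The sketch below follows the last variant (enable on a ready indication, suspend on a full indication) together with an optional time out based resume; the tick based timing and the names are assumptions made for illustration.

    #include <stdbool.h>
    #include <stdint.h>

    struct rate_gate {
        bool     traffic_enabled;
        uint32_t resume_timeout;   /* ticks to wait before resuming, 0 = none */
        uint32_t ticks_suspended;
    };

    /* React to the channel state fed back by the DDR scheduler. */
    static void on_feedback(struct rate_gate *g, bool channel_full)
    {
        if (channel_full) {
            g->traffic_enabled = false;   /* suspend on a "full" indication */
            g->ticks_suspended = 0;
        } else {
            g->traffic_enabled = true;    /* enable on a "ready" indication */
        }
    }

    /* Called every tick: optionally resume after a predetermined time out,
     * as in the first policy described above. */
    static void on_tick(struct rate_gate *g)
    {
        if (!g->traffic_enabled && g->resume_timeout != 0 &&
            ++g->ticks_suspended >= g->resume_timeout)
            g->traffic_enabled = true;
    }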
The message or response which is sent from the DDR scheduler back to the initiator is determined by the state of the buffer. In some embodiments, the threshold is set such that data which has been sent from the initiator but not yet received can be accommodated. Thus, a margin may be provided in some embodiments. In some embodiments, more than one threshold may be provided. In some embodiments, the falling below a threshold may determine the nature of the response. In other embodiments, a different measure related to the buffer may be used instead of or in addition to a threshold.
Reference is now made to
Information on the status of the buffer is provided to a processor 112. The processor is configured to provide the response to the service packet from the packet generator 106, as soon as possible in some embodiments. The response which is received by the packet generator 106 is used to control the bandwidth tuner 104. This may increase the rate at which packets are put onto the communication path, slow the rate at which packets are put onto the communication path, stop the putting of packets onto the communication path and/or start the putting of packets onto the communication path.
It should be appreciated that there may be more than one service packet for which a response is outstanding. In other words a response to a service packet does not need to be received in some embodiments in order for the next service packet to be put onto the communication path (although this may be the case in some embodiments).
The rate at which service packets are put onto the communication path may be controlled in some embodiments.
In some embodiments, the service packet traffic is configured to have a higher priority over the data traffic. In some embodiments, a minimum bandwidth budget ensures that the service packet may always be transferred between the initiator and the scheduler. Where the service packet is sharing a communication path with other packets, the service packets may be given priority over that minimum bandwidth.
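By way of illustration only, the priority and minimum bandwidth rule above may be realized as sketched below: a pending service packet is sent first, provided a small reserved budget for service traffic has not been exhausted in the current accounting window. The window and budget bookkeeping are illustrative assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    struct service_budget {
        uint32_t window_credits;     /* reserved service-packet slots per window */
        uint32_t used_this_window;
    };

    /* Returns true if the next slot on the shared communication path should
     * carry the pending service packet rather than a data packet. */
    static bool service_packet_wins(struct service_budget *b,
                                    bool service_packet_pending)
    {
        if (!service_packet_pending)
            return false;
        if (b->used_this_window >= b->window_credits)
            return false;            /* budget exhausted: data goes first */
        b->used_this_window++;
        return true;                 /* service traffic has priority */
    }

    /* Call at the start of each accounting window. */
    static void new_window(struct service_budget *b)
    {
        b->used_this_window = 0;
    }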
In one alternative embodiment, two separate communication paths may be provided. The first communication path is for the data from the initiator. The second communication path will be for the service packet communication between the initiator and the scheduler.
The one or more communication paths may be bidirectional or may be replaced by two separate communication paths, one for each direction.
Some embodiments may improve the locked-loop accuracy and speed. Some embodiments may have a more sustainable bandwidth estimation. Some embodiments may limit the bandwidth overhead caused by the service packet usage. In some embodiments, there may be optimization of the buffering capabilities of the scheduler.
The accuracy of the loop, in particular the error due to the service packet response time, can be improved by control carried out in the initiator. That control may be performed by the packet generator and/or any other suitable controller. The packet generator and/or other controller may use a suitable algorithm. The latency of the service packet response has an impact on how quickly the initiator is able to react to changes in congestion in the communication path. The algorithm may, for example, make predictions on the current buffer status before the corresponding response packet has been received. These predictions may be made on the basis of the previous responses and/or the absence of a response to one or more outstanding service packets and/or any other information. These predictions may cancel or at least partially mask the effects of the service packet response latency. In some embodiments, if the algorithm is able to at least partially mitigate the effects of the service packet response latency, the buffer margin may be smaller.
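By way of illustration only, a very simple form of the prediction mentioned above is sketched below: the initiator keeps the last reported buffer occupancy, adds the requests it has issued since that report, and subtracts an assumed drain. The linear drain model and the names are assumptions; the description above only states that predictions may be based on previous responses and on outstanding service packets.

    #include <stdint.h>

    struct buffer_estimate {
        uint32_t last_reported_occupancy;  /* from the latest service response */
        uint32_t issued_since_report;      /* requests sent since that report  */
        uint32_t drain_per_tick;           /* assumed scheduler service rate   */
        uint32_t ticks_since_report;
    };

    /* Estimate the current scheduler buffer occupancy before the next
     * service packet response arrives, masking the response latency. */
    static uint32_t estimate_occupancy(const struct buffer_estimate *e)
    {
        uint32_t drained = e->drain_per_tick * e->ticks_since_report;
        uint32_t inflow  = e->last_reported_occupancy + e->issued_since_report;
        return (inflow > drained) ? inflow - drained : 0;
    }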
Additionally or alternatively the rate of issuance of the service packet response may be controlled.
Some embodiments may provide more service packet information from the scheduler and linear algorithms at the initiator level. This may be for one or more of the following reasons. Firstly, this may be used in relation to the filling level of the related data communication path. The buffer provides the filling information as a measure of the filling level of the communication path; in other words, the number of outstanding requests that can be handled. This information may be used for derivation; in other words, for determining whether the situation in the communication path is becoming better or worse. In some embodiments, this information can be used for self-regulation of the service packet issuing rate. In some embodiments, further information can be used for integration and recursive analysis of service packets, as discussed previously.
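By way of illustration only, the filling level and derivation information may be used by a small proportional and derivative style regulator at the initiator, sketched below under the assumption that the service packet response carries the current filling level of the related communication path. The gains and the issue interval bounds are purely illustrative.

    #include <stdint.h>

    struct issue_regulator {
        int32_t previous_fill;     /* fill level from the previous response    */
        int32_t target_fill;       /* desired fill level of the path           */
        int32_t interval;          /* current gap between data packets (ticks) */
        int32_t min_interval;
        int32_t max_interval;
    };

    /* Adjust the data issue interval from the reported fill level (the
     * proportional term) and its trend (the derivative term). */
    static void regulate(struct issue_regulator *r, int32_t reported_fill)
    {
        int32_t error = reported_fill - r->target_fill;   /* too full if > 0 */
        int32_t trend = reported_fill - r->previous_fill; /* getting worse?  */

        r->interval += error / 4 + trend / 2;             /* example gains   */
        if (r->interval < r->min_interval) r->interval = r->min_interval;
        if (r->interval > r->max_interval) r->interval = r->max_interval;

        r->previous_fill = reported_fill;
    }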
Reference is made to
The network on chip has an arbiter 122 which is configured to arbitrate requests between the network on chip and the DDR scheduler 28. In the arrangement shown in
As schematically shown, the second initiator has a multiplexer 124. The multiplexer 124 selectively outputs a service packet from a service packet issuer 123 or a data traffic packet from a data traffic issuer onto the communication path. Although this is not specifically shown in the previous Figures, it should be appreciated that such an arrangement may be included in any of the previously described arrangements.
The second initiator has a measurer 125 which is configured to measure the service packet round trip. This is the time taken for a service packet issued from the second initiator to be received by the DDR scheduler, and a response to be issued from the DDR scheduler to that packet and received back at the second initiator. This provides a measure of the latency in the system and a measure of congestion. It should be appreciated that the first initiator may have a similar service packet round-trip latency measurer. The DDR scheduler 28 is configured to have a first service communication path processor 112a for the first communication path CP0. The scheduler also has a second service communication path processor 112b associated with the second communication path CP1. The data which is received from the network on chip is provided to a data multiplexer 126 which is able to output the data from the first and second communication paths to the DDR. The respective service packets are provided to the respective service communication path processor. Thus service packets on the first communication path are provided to the first service communication path processor 112a. Likewise, service packets on the second communication path are provided to the second service communication path processor 112b.
The arrangement of
Thus, as described, a round-trip latency measurement of the service packet trip is made at the initiator. This may be combined with any issuing rate method. The round-trip latency information will be transferred to the DDR scheduler in a subsequent service packet request. In other words, the latency associated with an earlier service packet request and the associated response will be provided to the DDR scheduler in a later service packet request.
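By way of illustration only, the round-trip measurement and its transfer in a subsequent request may be sketched as follows; the timestamp source and the reporting field are assumptions made for illustration.

    #include <stdint.h>

    struct rtt_measurer {
        uint64_t sent_at;          /* timestamp of the last service request */
        uint64_t last_round_trip;  /* measured latency of that request      */
    };

    /* Called, with the current time, when a service packet request is put
     * onto the communication path. */
    static void on_service_request_sent(struct rtt_measurer *m, uint64_t now)
    {
        m->sent_at = now;
    }

    /* Called when the matching response arrives back at the initiator. */
    static void on_service_response(struct rtt_measurer *m, uint64_t now)
    {
        m->last_round_trip = now - m->sent_at;
    }

    /* The latency of the earlier request/response pair is then carried in
     * the payload of the next service packet request to the DDR scheduler. */
    static uint64_t latency_to_report(const struct rtt_measurer *m)
    {
        return m->last_round_trip;
    }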
At the DDR scheduler level, the DDR scheduler is able to analyze the round-trip latency variation. End-to-end quality of service control can be performed on the communication paths involved in congestion and associated with the lowest traffic class, in some embodiments. Depending on this analysis, the response will be used to control for example a bandwidth tuner.
In some embodiments, a calibration is performed. This is to estimate the nominal communication path latency. This may be done in a test phase where there is no data on the network on chip and instead one or more service packets are issued and responded to in order to determine the latency in the absence of congestion. This latency may be the static latency.
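By way of illustration only, one possible use of the calibrated (static) latency is sketched below: the DDR scheduler compares the round-trip latency reported by an initiator against the nominal value measured during the idle test phase, and treats an excess above a margin as an indication of congestion on that communication path. The margin and the comparison are illustrative assumptions.

    #include <stdbool.h>
    #include <stdint.h>

    struct path_calibration {
        uint64_t nominal_round_trip;   /* measured with no data traffic     */
        uint64_t congestion_margin;    /* tolerated excess over the nominal */
    };

    /* Record the latency observed during the calibration/test phase. */
    static void calibrate(struct path_calibration *c, uint64_t idle_round_trip,
                          uint64_t margin)
    {
        c->nominal_round_trip = idle_round_trip;
        c->congestion_margin  = margin;
    }

    /* Analyze the round-trip latency variation reported in a later service
     * packet request: a large excess over the nominal value suggests the
     * communication path is involved in congestion. */
    static bool path_congested(const struct path_calibration *c,
                               uint64_t reported_round_trip)
    {
        return reported_round_trip > c->nominal_round_trip + c->congestion_margin;
    }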
It should be appreciated that in some embodiments, control across a single communication path may be exerted as well as control over two or more communication paths. In other words, the embodiments described previously in relation to for example
Reference is made to
By way of comparison, two traffic classes are shown in Graph 2 where network on chip arbitration drives the bandwidth allocation among the traffic classes. Graph 2 may be the result of using a system such as shown in
In the third Graph 3 of
It should be appreciated that the communication path may be any suitable communication resource and may for example be a channel. In some embodiments, the communication path can be considered to be a virtual channel.
It should be appreciated that one or more of the functions discussed in relation to one or more sources and/or one or more targets may be provided by one or more processors. The one or more processors may operate in conjunction with one or more memories. Some of the control may be provided by hardware implementations while other embodiments may be implemented in software which may be executed by a controller, microprocessor or the like. Some embodiments may be implemented by a mixture of hardware and software.
While this detailed description has set forth some embodiments of the present invention, the appended claims cover other embodiments of the present invention which differ from the described embodiments according to various modifications and improvements. Other applications and configurations may be apparent to the person skilled in the art. Some of the embodiments have been described in relation to an initiator and a DDR scheduler. It should be appreciated that this is by way of example only; the initiator may be any suitable initiator and the target may be any suitable apparatus. Alternative embodiments may use any suitable interconnect instead of the example network on chip.
The various embodiments described above can be combined to provide further embodiments. The embodiments may include structures that are directly coupled and structures that are indirectly coupled via electrical connections through other intervening structures not shown in the figures and not described for simplicity. These and other changes can be made to the embodiments in light of the above-detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims, but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled. Accordingly, the claims are not limited by the disclosure.
Number | Date | Country | Kind
--- | --- | --- | ---
1218933.8 | Oct 2012 | GB | national