This application claims priority of European Patent application No. 10153509.4 filed on Feb. 12, 2010, the entire contents of which are hereby incorporated by reference herein.
The present invention relates to a method and a device for controlling clock gating of a data processing block, in particular a data processing block of a plurality of data processing blocks of a circuitry system which are interconnected by a streaming data bus.
Large Systems-on-a-Chip (SoCs) usually consist of several components that contain data processing modules, potentially together with a local controller, that perform some sort of defined (sub-)task. In the case of an SoC for wireless communication applications, for example, such components of the system could be the building blocks of a modem circuitry such as digital front end (DFE), Tx unit, shared RAM, forward error correction (FEC) data unit, fast Fourier transform (FFT) unit, parameter estimation unit, equalizer unit, searcher unit, FEC control unit and the like. Often, these components do not need to be active all the time; rather, periods of data processing are followed by idle periods. It is then desirable to switch off the local clock(s) during these idle periods, generally to save power, and more specifically to increase the battery life of mobile devices.
Today, virtually all of these components are described in some hardware description language (HDL) like VHDL or Verilog, which is then translated into gate-level netlists by synthesis tools. A simple coding style for automatic clock-gating is as follows:
When using such an HDL representation, which can be achieved for most pure data path components, these software tools can automatically infer a structure as shown in
Although this form of automatic translation can efficiently insert clock gating for large portions of the design, it is limited to the coding style mentioned above.
An example of a more complicated description is as follows:
This kind of coding style cannot be handled by the synthesis tools in terms of automatic translation of clock gating mentioned in conjunction with
However, as sub-micron technologies shrink structure sizes more and more, the dynamic switching power of the clock tree or clock mesh becomes an ever-growing fraction of the total dynamic power consumption of an electronic device.
As this switching power of the clock distribution network is not covered by the automatic clock gate inference described before, some form of higher-level control is required, in order to gate-off a local clock distribution network at its root. Some control processor must query the state of the data processing modules, as well as the state of incoming, internal and outgoing busses, as illustrated in
A general object of the present invention is to reduce the power consumption of electronic devices. A more specific object of the invention is to provide a way for automatic clock gating in SoC designs.
The present invention provides a device for controlling clock gating of one of a plurality of data processing blocks of a circuitry system which are interconnected by a streaming data bus structure. The device has an input connected to each of the data processing block's data processing units and bus segments for receiving a busy indication therefrom to keep track of the data transfer and processing activity therein, and has an output connected to a clock gate at the root of the local clock distribution network of the data processing block to gate off the clock of the data processing block when an idle condition is detected, and recover the clock when a wake-up condition is detected.
In a presently preferred embodiment, the streaming data bus uses a handshake-type transfer protocol which comprises a one-bit indication signaling the beginning and the end of a sequence of data, and said activity tracking device comprises a logical gate adapted to combine the one-bit busy indications from each of the data processing units and busses of the data processing block and to output a clock disabling signal when all one-bit busy indications signal absence of any data to be transferred or processed, and to output a clock recovering signal when at least one of the one-bit busy indications signals presence of any data.
In this way, the activity tracker can determine an idle condition as well as a wake-up condition, and can gate the clock at the root of the local distribution network independently from local controller and without requiring any software activity.
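The combining logic described above can be illustrated by a minimal behavioral sketch. It is written in Python purely for illustration and is not the HDL of the embodiment; the function name `clock_enable` and the representation of the busy indications as a sequence of booleans are assumptions made for this example.

```python
from typing import Iterable


def clock_enable(busy_bits: Iterable[bool]) -> bool:
    """Combine one-bit busy indications with a logical OR.

    Hypothetical model: each element is the busy indication of one data
    processing unit or bus segment of the block.
    Returns True  -> keep (or recover) the local clock,
            False -> all inputs idle, the clock may be gated off at the root.
    """
    return any(busy_bits)


# Usage: two processing units and one bus segment (illustrative values).
idle_state = [False, False, False]
wake_up_state = [False, True, False]

print(clock_enable(idle_state))     # all idle: clock can be disabled
print(clock_enable(wake_up_state))  # one input busy: clock is recovered
```

In this sketch a single OR over all busy bits yields both conditions: the idle condition is the output going low, and the wake-up condition is the output going high again.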
The invention also provides a method for controlling clock gating of one of a plurality of data processing blocks of a circuitry system which are interconnected by a streaming data bus.
The novel approach for automatic clock gate control by activity tracking provides several advantages over the prior art. Clock gate insertion can be implemented and verified already at RTL level. As the clock is gated at the root of the local clock distribution network, high coverage of the data processing block is ensured, i.e. a whole module or component can be switched off to reduce activity of large sections of the clock tree, which translates into a significant reduction of power consumption of a circuitry system which comprises the data processing block. Clock gating may cover 100% of the flip-flops of a data processing block, independently of any side conditions like minimum width of register bank or enable conditions.
The inventive method is implemented without requiring any higher-level software control. Rather, an extra advantage of the new approach is that even a local controller which may be a part of the data processing block can be clock gated.
Application examples for the method and activity tracking device for clock gating according to the invention are in data processing blocks such as digital front end (DFE) unit, LTE Tx unit, shared RAM unit, forward error correction (FEC) data unit, fast Fourier transform (FFT) unit, parameter estimation unit, searcher unit, and FEC control unit of a wireless telecommunication modem device, without being limited thereto.
The novel method and device for clock gating can be implemented hierarchically. That is, clock gating may be implemented in a data processing block which comprises several data processing modules to switch on and off the entire data processing block's clock, and may simultaneously or alternatively be implemented in any or all of the data processing modules comprised in a data processing block to support clock gating at a lower hierarchical level.
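The hierarchical arrangement can be sketched behaviorally as follows. This is an illustrative Python model under stated assumptions, not the circuit of the embodiment: the function name `hierarchical_enables`, the module names, and the rule that a module-level clock runs only while the block-level clock is enabled are all hypothetical choices made for the example.

```python
from typing import Dict, Tuple


def hierarchical_enables(
    module_busy: Dict[str, bool], bus_busy: bool
) -> Tuple[bool, Dict[str, bool]]:
    """Return (block_clock_enable, per-module clock enables).

    Assumptions (hypothetical): the block-level tracker ORs the busy bits of
    all modules and of the block's bus segments; each module-level tracker
    enables its own clock only when the module itself is busy and the
    block-level clock is running.
    """
    block_enable = bus_busy or any(module_busy.values())
    module_enables = {
        name: block_enable and busy for name, busy in module_busy.items()
    }
    return block_enable, module_enables


# Usage: one module busy, the other idle, no bus traffic (illustrative values).
block_en, module_en = hierarchical_enables(
    {"module_a": True, "module_b": False}, bus_busy=False
)
print(block_en)   # block clock stays on because module_a is busy
print(module_en)  # only module_a receives a clock at the lower level
```

The sketch shows why the two levels complement each other: the block-level gate removes the whole local clock network when everything is idle, while the module-level gates save additional power when only part of the block is active.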
Additional features and advantages of the present invention will be apparent from the following detailed description of a specific embodiment which is given by way of example only and in which reference will be made to the accompanying drawings, wherein:
The streaming data bus uses a handshake-type transfer protocol which comprises a one-bit indication signaling the beginning and the end of a sequence of data.
SSL comprises four signals: data, valid, accept, frame. A data signal can have a width of multiple bits, e.g. 7, 16, 24, 32 bits. The valid/accept signals are similar to those of other handshake-type transfer protocols and are used to drive and stall the communication from source to sink. Source and sink can set or reset these signals at any time. Data is taken over if both are “high” on the rising edge of the clock. Data source and data sink must have the same understanding of what the frame signal means. Data source and data sink, herein, can be any of the functional units of an SoC such as, to give a non-limiting example, a wireless telecommunication modem device which includes functional units like digital front end (DFE) unit, LTE Tx unit, shared RAM unit, forward error correction (FEC) data unit, fast Fourier transform (FFT) unit, parameter estimation unit, searcher unit, and FEC control unit.
A “frame” in the sense of SSL transfer protocol is a logical group or sequence of data, such as e.g. an OFDM symbol, a block of control data, a block of information data, etc. Data transfer only occurs if the accept, valid and frame signals are high. In this case the frame signal marks the beginning and end of a data block transfer. The source can set the valid and frame signals in advance. The sink can set the accept signal in advance. In case the frame signal is not used by a source, it can clamp the output to “high”. In case a sink does not know how to interpret an incoming frame signal, it can be ignored.
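The transfer condition described above can be illustrated with a small cycle-based model. It is a Python sketch for illustration only; the names `SslSample`, `is_transfer` and `transferred_data`, and the idea of representing each rising clock edge by one sampled record, are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SslSample:
    """Signal values sampled at one rising clock edge (hypothetical model)."""
    data: int
    valid: bool   # driven by the source
    accept: bool  # driven by the sink
    frame: bool   # high while a logical group of data is being sent


def is_transfer(sample: SslSample) -> bool:
    """A data word is taken over only if valid, accept and frame are all high."""
    return sample.valid and sample.accept and sample.frame


def transferred_data(samples: List[SslSample]) -> List[int]:
    """Collect the data words that are actually transferred."""
    return [s.data for s in samples if is_transfer(s)]


# Usage: the sink stalls in cycle 2, the source has no data in cycle 3,
# and the frame signal is low in cycle 4 (illustrative values).
trace = [
    SslSample(data=1, valid=True, accept=True, frame=True),
    SslSample(data=2, valid=True, accept=False, frame=True),
    SslSample(data=3, valid=False, accept=True, frame=True),
    SslSample(data=4, valid=True, accept=True, frame=False),
    SslSample(data=5, valid=True, accept=True, frame=True),
]
print(transferred_data(trace))  # [1, 5]
```

A source that does not use the frame signal would, in this model, simply hold `frame` at `True` in every sample, which reduces the condition to the plain valid/accept handshake.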
With this SSL transfer protocol the activity detection in activity tracker 40 of
Hence, the invention provides a very low-complexity way of automatic clock gating. Optionally, activity tracker 40 can additionally take into account an internal state of data processing modules 32, 34 and/or of the optionally embedded controller 36 when forming the clock enable condition; this internal state may likewise be expressed by a binary signal.
While the activity tracker of the invention has been explained in conjunction with the data streaming protocol illustrated in
The method and device for controlling clock gating according to the invention can also be scaled down to the level of data processing modules 32 and 34 themselves, which are exemplified in
The control of clock gating according to the invention can be easily combined with a circuitry system described in the applicant's co-pending EP application entitled “Circuitry System and Method for Connecting Synchronous Clock Domains of the Circuitry System”.
A data stream received from clock domain A is passed through auxiliary input buffer 71, multiplexer 72, and data output buffer 73 to clock domain B. Control logic 74 receives control signals from the sink and source interfaces (snk.valid, snk.clk_en, src.accept, src.clk_en) of clock domain separation module 60. Data elements of the data stream are selectively buffered in auxiliary input buffer 71 for at least one clock cycle, as a function of the received control signals, and control signals (snk.accept, src.valid) are emitted by control logic 74 to the sink and source interface, respectively, of the clock domain separation device. Auxiliary input buffer 71 is operable to buffer data elements of a data stream that have been accepted during a clock cycle in which a non-accept condition of the data sink has been transferred from the source interface to the sink interface of the device. It is also operable to buffer data elements of a data stream when the source interface side's clock is gated off during transfer while the sink interface side's clock remains active or is gated off after the source interface clock has been shut off. In this way, module 60 enables the clock in synchronous clock domains A, B to be switched on and off independently of each other while maintaining data integrity of the streaming data.
Number | Date | Country | Kind |
---|---|---|---|
10153509.4 | Feb 2010 | EP | regional |