This application is based upon and claims the benefits of priority from the prior Japanese Patent Application No. 2008-084468 filed on Mar. 27, 2008, the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to a serial data processing circuit and, more particularly, to a serial data processing circuit for processing N serial signals during each clock cycle.
2. Description of the Related Art
Super pipeline technology is used to improve the performance of LSI (Large Scale Integration) circuits. Specifically, a combinational circuit between FF (Flip-Flop) circuits is divided into a plurality of combinational circuits, and one or more FF circuits are then inserted between the divided combinational circuits to serially connect them, thereby realizing serial data processing. This technology can increase the operating frequency of the entire combinational circuit, thereby improving the throughput performance.
Conventionally known is a pipelined RISC (Reduced Instruction Set Computer) type processor to be driven by a parallel mode (see, for example, Japanese Unexamined Patent Publication No. Hei 5-224929).
The super pipeline technology, however, has a problem of causing increase in power consumption.
In view of the foregoing, it is an object of the present invention to provide a serial data processing circuit that processes serial data with low power consumption.
To accomplish the above-described object, there is provided a serial data processing circuit. This serial data processing circuit comprises: a latch unit including N latches connected to output signal lines from a logic circuit to sequentially latch output data sets from the logic circuit and to output N data sets in parallel; and a selector for sequentially selecting the data sets supplied from the latch unit and converting the sequentially selected data sets into serial data for one signal line to supply the serial data to the next logic circuit.
The above and other objects, features and advantages of the present invention will become apparent from the following description when taken in conjunction with the accompanying drawings which illustrate preferred embodiments of the present invention by way of example.
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings, wherein like reference numerals refer to like elements throughout.
The respective latch units 1a to 1d receive, in parallel, data sets supplied to the logic circuit 3 or data sets produced by a combinational circuit within the logic circuit 3.
The latch units 1a to 1d sequentially latch the data sets D1 sequentially supplied to the logic circuit and output N data sets in parallel. Suppose, for example, that the logic circuit sequentially receives the data sets D1 including data (a), data (b), data (c), data (d), data (e), . . . . In this case, outputs from the latch units 1a to 1d are as shown in
The selector 2 sequentially selects the data sets D1 supplied from the latch units 1a to 1d and supplies the selected data to the logic circuit 3. For example, when the latch unit 1a latches data (a), the selector 2 selects the data (a) and supplies it to the logic circuit 3. When the latch unit 1b latches data (b), the selector 2 selects the data (b) and supplies it to the logic circuit 3. As a result, the logic circuit 3 can process N serial data sets.
As described above, the serial data processing circuit sequentially latches the data sets D1 supplied to the logic circuit 3 or the data sets produced by the logic circuit 3, and outputs N data sets in parallel. Then, the serial data processing circuit sequentially selects the latched data sets D1 and supplies the selected data to the logic circuit. As a result, the present embodiment realizes the same performance as that of pipeline processing with low power consumption.
Next, an embodiment of the present invention will be described in detail with reference to the accompanying drawings.
The serial data processing circuit 11 receives data Din from the peripheral circuit 12. The serial data processing circuit 11 performs predetermined processing on the incoming data Din and supplies data Dout to the peripheral circuit 12. The serial data processing circuit 11 operates in synchronization with a clock CLK1 having a frequency f. The peripheral circuit 12 operates in synchronization with a clock CLK2 having a frequency f×N, where the symbol N is a positive integer indicating the number of data sets to be processed during each clock cycle.
The delay devices 21 to 23 delay phases of the incoming clocks CLK1 by 2π/N and produce the delayed clocks CLK1. The symbol N indicates the number of serial data sets.
The FF circuits 31 to 34 are inserted in parallel in the upstream of the logic circuit 81 according to the number of data sets to be processed.
The FF circuits 31 to 34 receive a clock CLK1 and clocks CLK1 with phases delayed by the delay devices 21 to 23, respectively. The FF circuits 31 to 34 sequentially latch the data sets Din in synchronization with the incoming clocks CLK1. In the case of the example of
The FF circuits 31 to 34 receive in parallel the data sets Din supplied from the peripheral circuit 12. The data sets Din are those processed during each clock cycle. The data rate of the data sets Din is f×N.
Suppose, for example, that the FF circuits 31 to 34 receive the data sets Din including data (a), data (b), data (c), data (d), data (e), . . . having a frequency f×N from the peripheral circuit 12. In this case, the FF circuits 31 to 34 perform a latch operation as follows. First, the FF circuit 31 latches the data (a) in synchronization with a clock CLK1 having a frequency f. Next, the FF circuit 32 latches the data (b) in synchronization with a clock CLK1 phase-shifted by ¼ cycle from the clock CLK1 supplied to the FF circuit 31. Next, the FF circuit 33 latches the data (c) in synchronization with a clock CLK1 phase-shifted by ¼ cycle from the clock CLK1 supplied to the FF circuit 32. Next, the FF circuit 34 latches the data (d) in synchronization with a clock CLK1 phase-shifted by ¼ cycle from the clock CLK1 supplied to the FF circuit 33. Next, the FF circuit 31 latches the data (e) in synchronization with a clock CLK1 having a frequency f. Hereinafter, the same operation is repeated.
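The latch sequence just described can be sketched as a small schedule calculation. This is only an illustrative model, assuming ideal clocks and measuring time in units of the clock cycle T; the function name `latch_schedule` and the zero-based FF index (0 standing for the FF circuit 31, 1 for the FF circuit 32, and so on) are hypothetical and do not appear in the specification.

```python
from fractions import Fraction


def latch_schedule(samples, n):
    """Return one (time, ff_index, sample) tuple per latch event.

    Assumptions (not from the specification): ideal clocks, time in
    units of the clock cycle T, and ff_index 0..n-1 standing for the
    FF circuits 31, 32, ... in order.
    """
    events = []
    for i, sample in enumerate(samples):
        ff_index = i % n                      # round-robin over the n FF circuits
        cycle = i // n                        # whole CLK1 cycles elapsed
        time = cycle + Fraction(ff_index, n)  # phase shift of ff_index/n cycle
        events.append((time, ff_index, sample))
    return events


# Data (a) to (e) with N = 4, as in the example above.
events = latch_schedule(list("abcde"), 4)
for time, ff_index, sample in events:
    print(f"t = {time} T: FF index {ff_index} latches data ({sample})")
```

In this model data (e) is latched by FF index 0 exactly one cycle after data (a), which corresponds to the FF circuit 31 latching again on the next edge of the undelayed clock CLK1.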
The selectors 41 to 44 receive the data sets latched by the FF circuits 31 to 34. Further, the selectors 41 to 44 receive a clock CLK1 and a clock CLK1 delayed by the delay device 21. The selectors 41 to 44 sequentially select, based on the two incoming clocks CLK1, the data sets Din latched by the FF circuits 31 to 34 and supply the selected data to the logic circuit 81.
In the case of an example of
Therefore, according to the above-described example where the FF circuits 31 to 34 receive the data sets Din including data (a), data (b), data (c), data (d), data (e), the selectors 41 to 44 supply the data sets including data (a), data (b), data (c), data (d), data (e), . . . having a frequency (f×4) to the logic circuit 81.
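As a rough illustration of how two clocks can steer a four-input selection, the sketch below decodes the levels of the clock CLK1 and of a copy delayed by a quarter cycle into one of four inputs. It is an idealized model only: the decode table `SELECT`, the 50% duty cycle, and the helper names are assumptions, and the four latched values are treated as already stable when they are selected.

```python
def clock_level(quarter, delay_quarters):
    """Level (0 or 1) of a 50%-duty clock in a given quarter of the cycle.

    The clock is assumed high for the first two quarters after its
    rising edge; delay_quarters shifts it by that many quarter cycles.
    """
    return 1 if (quarter - delay_quarters) % 4 < 2 else 0


# Assumed decode table: (CLK1 level, delayed CLK1 level) -> input index.
# Index 0 stands for the FF circuit 31, index 3 for the FF circuit 34.
SELECT = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (0, 0): 3}


def serialize(latched_per_cycle):
    """Emit four values per clock cycle in the order set by the two clocks."""
    out = []
    for latched in latched_per_cycle:  # four values held during one cycle
        for quarter in range(4):
            key = (clock_level(quarter, 0), clock_level(quarter, 1))
            out.append(latched[SELECT[key]])
    return out


stream = serialize([list("abcd"), list("efgh")])
print(stream)
```

The two clock levels step through (1,0), (1,1), (0,1), (0,0) over one cycle, so each quarter cycle maps to a distinct input without any clock faster than f.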
For selection signals supplied to the selectors 41 to 44, CEIL(log2(N)) clocks are selected from N clocks CLK1 having different phases. For example, in the case of N=2, one clock is selected from two clocks having different phases and used as a selection signal. In the case of N=4 (in the case of the example of
In the logic circuit 81, the output side has the same structure and performs the same operation as those of the above-described input side. Delay devices 51 to 53 delay phases of the incoming clocks CLK1 by 2π/N and produce the delayed clocks CLK1. In the case of the example of
The FF circuits 61 to 64 receive a clock CLK1 and clocks CLK1 with phases delayed by the delay devices 51 to 53, respectively. The FF circuits 61 to 64 sequentially latch, in synchronization with the incoming clocks CLK1, the data sets Din supplied from the logic circuit 81. In the case of the example of
The selectors 71 to 74 receive the data sets latched by the FF circuits 61 to 64. Further, the selectors 71 to 74 receive a clock CLK1 and a clock CLK1 delayed by the delay device 51. The selectors 71 to 74 sequentially select, based on the clock CLK1 and the delayed clock CLK1, the data sets Din latched by the FF circuits 61 to 64 and supply the selected data to the peripheral circuit 12.
In
Further, the delay devices 21 to 23, the FF circuits 31 to 34, and the selectors 41 to 44 may be inserted in the logic circuit 81.
The FF circuits 91 and 94 operate in synchronization with the same clock, while the FF circuits 92 and 95 operate in synchronization with the same clock. The clock with which the FF circuits 91 and 94 operate is phase-shifted by π from the clock with which the FF circuits 92 and 95 operate.
The selector 93 receives the same clock as that supplied to the FF circuits 92 and 95. The selector 93 sequentially selects the incoming data sets based on the state ‘0 or 1’ of this clock and supplies the selected data to the logic circuit 96.
When synthesizing data sets on the output side of the logic circuit 96, a timing control must be performed such that the data sets propagate only between the FF circuits that operate with clocks having the same phase, while the data sets are prevented from propagating between the FF circuits that operate with clocks having different phases.
Referring, for example, to
Specifically, when a data set propagating from one FF circuit arrives at another FF circuit that operates with a clock having a different phase, the minimum propagation time must be longer than the hold time of the destination FF circuit. Further, when a data set selected by a selector arrives at the FF circuit that is to latch it, the arrival timing must satisfy the setup conditions, whereas when it arrives at an FF circuit that is not to latch it, the arrival timing must satisfy the hold conditions.
Meanings of the respective symbols in
T: clock cycle
Ts: setup time of FF circuit
Th: hold time of FF circuit
tij: signal propagation time between FF circuit i and FF circuit j
tsj: signal propagation time between selector and FF circuit j
Twij: clock skew of FF circuit j to FF circuit i
Twsj: clock skew of FF circuit j to selector
Ts1s: setup time between selector 93 and FF circuit 94
Ts2s: setup time between selector 93 and FF circuit 95
Ts1h: hold time between selector 93 and FF circuit 94
Ts2h: hold time between selector 93 and FF circuit 95
The setup conditions are as described below.
t11<T+Tw11−Ts
t22<T+Tw22−Ts
Ts1s<T+Tws2−Tw12+Tw11−Ts
Ts2s<T+Tws1−Tw21+Tw22−Ts
The hold conditions are as described below.
t12>Tw12+Th
t21>Tw21+Th
Ts1h>Tws1+Th
Ts2h>Tws2+Th
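The setup and hold conditions listed above can be checked mechanically once delay values are known. The following sketch simply evaluates the eight inequalities as written; the function name `check_timing` and all numeric values (in nanoseconds) are hypothetical placeholders, not figures from the specification.

```python
def check_timing(p):
    """Evaluate the four setup and four hold inequalities; return {label: bool}."""
    T, Ts, Th = p["T"], p["Ts"], p["Th"]
    return {
        # Setup conditions
        "setup t11": p["t11"] < T + p["Tw11"] - Ts,
        "setup t22": p["t22"] < T + p["Tw22"] - Ts,
        "setup Ts1s": p["Ts1s"] < T + p["Tws2"] - p["Tw12"] + p["Tw11"] - Ts,
        "setup Ts2s": p["Ts2s"] < T + p["Tws1"] - p["Tw21"] + p["Tw22"] - Ts,
        # Hold conditions
        "hold t12": p["t12"] > p["Tw12"] + Th,
        "hold t21": p["t21"] > p["Tw21"] + Th,
        "hold Ts1h": p["Ts1h"] > p["Tws1"] + Th,
        "hold Ts2h": p["Ts2h"] > p["Tws2"] + Th,
    }


# Hypothetical values in nanoseconds, chosen only to exercise the checks.
example = dict(
    T=10.0, Ts=0.5, Th=0.3,
    Tw11=0.0, Tw22=0.0, Tw12=0.1, Tw21=0.1, Tws1=0.1, Tws2=0.1,
    t11=8.0, t22=8.0, t12=1.0, t21=1.0,
    Ts1s=7.0, Ts2s=7.0, Ts1h=1.0, Ts2h=1.0,
)

results = check_timing(example)
for label, ok in results.items():
    print(f"{label}: {'met' if ok else 'VIOLATED'}")
```

A path between FF circuits of different phases that is too short, for example t12 reduced to 0.2 with the other values unchanged, would be reported as a hold violation.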
As described above, the logic circuit 96 receives the data sets in the number equal to the number of clocks (two clocks in
In
The following formula (1) shows power consumption of the circuit in
F: clock frequency
αi: operation rate of gate i
Pi: charge and discharge power of gate i
In a case of converting the logic circuit 103 in
NF: number of original FFs
MF: number of FFs inserted for super pipeline processing
NL: number of basic gates in combinational circuit
Nc: number of buffers during each clock cycle
F: original clock frequency
αi(F): operation rate of i-th FF
αi(AF): operation rate of i-th added FF
αi(L): operation rate of i-th gate in combinational circuit
βi(C): operation rate of i-th clock buffer
Pi(F): power consumption of i-th FF
Pi(AF): power consumption of i-th added FF
Pi(L): power consumption of i-th gate in combinational circuit
Pi(C): power consumption of i-th clock buffer
A first term of the formula (2) represents power consumption in the original FF circuits of the logic circuit. A second term represents power consumption in the FF circuits added by the pipeline processing. A third term represents power consumption in the logic circuit. A fourth term represents power consumption in the clock circuit.
Comparing the formula (2) with the formula (1), the following will be seen: in the formula (2), the second term is added due to addition of FF circuits. Since an operating frequency of the entire circuit increases by n times, each term is multiplied by n.
In the case of providing in parallel the FF circuits on the input and output sides of the logic circuit to realize serial data processing as shown in
Ns: number of added selectors
αi(S): operation rate of i-th selector
Pi(S): power consumption of i-th selector
A first term of the formula (3) represents power consumption in the FF circuits. A second term represents power consumption in the logic circuit. A third term represents power consumption in the selectors. A fourth term represents power consumption in the clock circuit.
Comparing the formula (3) with the formula (1), the following will be seen: in the formula (3), the third term representing the power consumption in the selectors is added. Since the number of FF circuits increases by n times, the first term is multiplied by n. Since n data sets propagate through the logic circuit during each clock cycle, an operation rate of the circuit increases by n times and therefore, the power consumption represented by the second term increases by n times.
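The scaling behavior described for the two approaches can be compared numerically with a deliberately simplified model. The sketch below is not formula (2) or formula (3); it only encodes the scaling stated in the prose (every term multiplied by n for super pipelining; the FF and logic terms multiplied by n plus a selector term for the parallel arrangement). Each argument is assumed to be an already-summed "operation rate × power" quantity, the selector and clock terms of the parallel arrangement are assumed to run at the original frequency F, and all numbers are hypothetical.

```python
def super_pipeline_power(F, n, ff, added_ff, logic, clock):
    """Simplified model: clock frequency n*F, so every term scales by n."""
    return n * F * (ff + added_ff + logic + clock)


def parallel_ff_power(F, n, ff, logic, selectors, clock):
    """Simplified model: n times the FF circuits and n data sets per cycle
    in the logic, plus selectors, with the clock kept at frequency F
    (assumption for the selector and clock terms)."""
    return F * (n * ff + n * logic + selectors + clock)


# Hypothetical summed "operation rate x power" values in arbitrary units.
F, n = 1, 2
p_super = super_pipeline_power(F, n, ff=10, added_ff=10, logic=50, clock=5)
p_parallel = parallel_ff_power(F, n, ff=10, logic=50, selectors=3, clock=5)

print("super pipeline:", p_super)
print("parallel FF circuits:", p_parallel)
print("difference:", p_parallel - p_super)
```

Under these placeholder values the parallel arrangement comes out lower because it avoids the added pipeline FF circuits and the n-times-faster clock network, at the cost of the selector term.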
The following formula (4) represents a difference between the power consumption represented by the formula (3) and that represented by the formula (2).
A first term represents an increase in power consumption due to addition of the selectors. A second term represents a power consumption difference due to differences in clock frequencies. Specifically, the clock frequency increases by n times in the pipeline processing of
In formula (4), the operation rate is designated as an average operation rate α. The number of the added FF circuits, which is supposed to be n times (n stages) the number of original FF circuits, is designated as n·NF. Power consumption of the FF circuits is estimated at q·Pε. As a result, the formula (4) is transformed to the following formula (5).
Normally, k is two in the two-input selector, three in the three-input selector, and six in the six-input selector, whereas q is ten. Accordingly, the first term has a negative value, which decreases with increase of the factor n. Meanwhile, the second term has a value equal to or smaller than zero.
Thus, in the case of providing in parallel the FF circuits on the input and output sides of the logic circuit to realize a serial data processing circuit, a significant power reduction is achieved as compared with the case of serially inserting the FF circuits 111 to 113 between the logic circuits 103a to 103d of
As shown in
The clock phase shifter 121 supplies to the selectors 131a, . . . , 131n a clock phase-shifted by 2π/N. Based on this clock, the selectors 131a, . . . , 131n convert N data sets Din supplied from the peripheral circuit 12 into N serial data sets during each clock cycle and supply the parallel-to-serial converted data sets to the serial data processing circuit 11.
The clock phase shifter 122 supplies to the FF circuit groups 141a, . . . , 141n a clock phase-shifted by 2π/N. Based on this clock, the FF circuit groups 141a, . . . , 141n convert N serial data sets included in the serial data processing circuit 11 during each clock cycle into parallel data sets and supply the serial-to-parallel converted data sets Dout to the peripheral circuit 12.
Thus, the data sets supplied to and from the serial data processing circuit 11 are subjected to parallel-to-serial conversion and serial-to-parallel conversion. As a result, the present embodiment operates the serial data processing circuit 11 and the peripheral circuit 12 with a clock CLK 21 having the same frequency.
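The parallel-to-serial and serial-to-parallel conversions can be pictured at the data level as follows. This is a behavioral sketch only, with hypothetical function names; it ignores clock phases and models each clock cycle of the peripheral circuit as one tuple of N values.

```python
def parallel_to_serial(words):
    """Flatten per-cycle tuples of N values into one serial sequence."""
    return [value for word in words for value in word]


def serial_to_parallel(stream, n):
    """Regroup a serial sequence into tuples of n values, one tuple per cycle."""
    if len(stream) % n != 0:
        raise ValueError("stream length must be a multiple of n")
    return [tuple(stream[i:i + n]) for i in range(0, len(stream), n)]


# Two clock cycles of N = 4 parallel data sets from the peripheral side.
din = [("a", "b", "c", "d"), ("e", "f", "g", "h")]
serial = parallel_to_serial(din)       # what the serial processing circuit sees
dout = serial_to_parallel(serial, 4)   # what is handed back in parallel
print(serial)
print(dout)
```

Because both conversions preserve ordering, the round trip returns the original per-cycle grouping, which is why both circuits can share a clock of the same frequency.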
The FF circuits 151 to 157 receive a clock CLK31 via the clock buffer 158. The FF circuit 151 receives 8-bit data to be filtered. The FF circuits 151 to 157 supply the sequentially received data to the downstream FF circuits 151 to 157 in synchronization with the clock CLK31.
The adder 160 receives the currently supplied data and the data supplied one clock ago. The adder 160 adds these data sets together and supplies the addition result to the adder 162.
The adder 161 receives the data supplied two clocks ago and the data supplied three clocks ago. The adder 161 adds these data sets together and supplies the addition result to the adder 162.
The adder 162 adds the data supplied from the adder 160 and the data supplied from the adder 161 together, and supplies the addition result to the FF circuit 166.
The adder 163 receives the data supplied four clocks ago and the data supplied five clocks ago. The adder 163 adds these data sets together and supplies the addition result to the adder 165.
The adder 164 receives the data supplied six clocks ago and the data supplied seven clocks ago. The adder 164 adds these data sets together and supplies the addition result to the adder 165.
The adder 165 adds the data supplied from the adder 163 and that supplied from the adder 164 together, and supplies the addition result to the FF circuit 167.
The FF circuits 166 and 167 receive the clock CLK31 via the clock buffer 159. The FF circuits 166 and 167 latch the data sets supplied from the adder 162 and the adder 165, respectively, and supply the latched data sets to the adder 168.
The adder 168 adds the data sets supplied from the FF circuits 166 and 167, and supplies the addition result to the shifter 169. The shifter 169 shifts by 3 bit positions toward the LSB (Least Significant Bit) direction the 11-bit data supplied from the adder 168. That is, the shifter 169 divides by 8 the sum of the eight data sets supplied to the FIR filter, thereby obtaining an average value.
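Functionally, the adder tree and shifter described above compute an eight-sample moving average. The sketch below reproduces only that arithmetic; it ignores the one-clock latency of the FF circuits 166 and 167, and the function name and the assignment of window positions to adders are illustrative assumptions.

```python
def moving_average_8(samples):
    """Average each window of eight 8-bit samples using adds and a 3-bit shift."""
    out = []
    for t in range(7, len(samples)):
        w = samples[t - 7:t + 1]                # w[7] is the current sample
        recent = (w[7] + w[6]) + (w[5] + w[4])  # adders 160 and 161 into adder 162
        older = (w[3] + w[2]) + (w[1] + w[0])   # adders 163 and 164 into adder 165
        total = recent + older                  # adder 168
        assert total < 2 ** 11                  # eight 8-bit values fit in 11 bits
        out.append(total >> 3)                  # shifter 169: divide by 8
    return out


print(moving_average_8([255] * 8))
print(moving_average_8(list(range(16))))
```

The 11-bit width follows from the worst case of eight samples of 255, whose sum is 2040, and the 3-bit shift returns the result to 8 bits.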
When associating the adder 195 and shifter 196 in
That is, the FIR filter in
The FF circuits 171, 173, 175, and 177 receive a clock CLK41 via the clock buffer 179. The FF circuit 171 receives 8-bit data to be filtered. The FF circuits 171, 173, 175, and 177 supply the sequentially received data sets to the downstream FF circuits 171, 173, 175, and 177 in synchronization with the clock CLK41.
The FF circuits 172, 174, 176, and 178 receive the clock CLK41 via the clock buffer 180. The FF circuit 172 receives 8-bit data to be filtered. The FF circuits 172, 174, 176, and 178 supply the sequentially received data sets to the downstream FF circuits 172, 174, 176, and 178 in synchronization with the clock CLK41.
The FF circuits 171, 173, 175, and 177 are positive edge-triggered FF circuits, whereas the FF circuits 172, 174, 176, and 178 are negative edge-triggered FF circuits. That is, the FF circuits 171 and 172 receive the clocks CLK41 having a phase difference of π. When the FF circuits 171 to 178 are all triggered on the same edge, a delay circuit that introduces a phase delay of π is provided on the output side of one of the clock buffers 179 and 180.
The adder 183 receives the data supplied from the FF circuit 171 and the data supplied from the FF circuit 172. The adder 183 adds these data sets together and supplies the addition result to the adder 185.
The adder 184 receives the data supplied from the FF circuit 173 and the data supplied from the FF circuit 174. The adder 184 adds these data sets together and supplies the addition result to the adder 185.
The adder 185 adds the data supplied from the adder 183 and the data supplied from the adder 184 together, and supplies the addition result to the FF circuits 189 and 190.
The adder 186 receives the data supplied from the FF circuit 175 and the data supplied from the FF circuit 176. The adder 186 adds these data sets together and supplies the addition result to the adder 188.
The adder 187 receives the data supplied from the FF circuit 177 and the data supplied from the FF circuit 178. The adder 187 adds these data sets together and supplies the addition result to the adder 188.
The adder 188 adds the data supplied from the adder 186 and the data supplied from the adder 187, and supplies the addition result to the FF circuits 192 and 193.
The FF circuit 189 receives the clock CLK41 via the clock buffer 181. In synchronization with the clock CLK41, the FF circuit 189 supplies to the selector 191 the data supplied from the adder 185.
The FF circuit 190 receives the clock CLK41 via the clock buffer 182. In synchronization with the clock CLK41, the FF circuit 190 supplies to the selector 191 the data supplied from the adder 185.
The FF circuit 189 is a positive edge-triggered FF circuit, whereas the FF circuit 190 is a negative edge-triggered FF circuit. That is, the FF circuits 189 and 190 receive the clocks CLK41 having a phase difference of π. When the FF circuits 189 and 190 are both triggered on the same edge, a delay circuit that introduces a phase delay of π is provided on the output side of one of the clock buffers 181 and 182.
The selector 191 receives the clock CLK41 via the clock buffer 182. The selector 191 selects any one of the data sets supplied from the FF circuits 189 and 190, and supplies the selected data to the adder 195 based on the state ‘0 or 1’ of the clock CLK41.
The FF circuit 192 receives the clock CLK41 via the clock buffer 181. In synchronization with the clock CLK41, the FF circuit 192 supplies to the selector 194 the data supplied from the adder 188.
The FF circuit 193 receives the clock CLK41 via the clock buffer 182. In synchronization with the clock CLK41, the FF circuit 193 supplies to the selector 194 the data supplied from the adder 188.
The FF circuit 192 is a positive edge-triggered FF circuit, whereas the FF circuit 193 is a negative edge-triggered FF circuit. That is, the FF circuits 192 and 193 receive the clocks CLK41 having a phase difference of π. When the FF circuits 192 and 193 are both triggered on the same edge, a delay circuit that introduces a phase delay of π is provided on the output side of one of the clock buffers 181 and 182.
The selector 194 receives the clock CLK41 via the clock buffer 182. The selector 194 selects any one of the data sets supplied from the FF circuits 192 and 193, and supplies the selected data to the adder 195 based on the state ‘0 or 1’ of the clock CLK41.
The adder 195 adds the data sets supplied from the selectors 191 and 194, and supplies the addition result to the shifter 196. The shifter 196 shifts by 3 bit positions toward the LSB direction the 11-bit data supplied from the adder 195 to produce 8-bit data.
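The two-phase structure described above can be modeled at half-cycle granularity. The sketch below is a behavioral approximation under stated assumptions: one input sample arrives per clock edge, even steps are positive edges and odd steps are negative edges, all FF circuits start at zero, and each selector passes the output of whichever FF circuit has just latched. The function name and these timing conventions are not taken from the specification.

```python
def simulate_two_phase_fir(samples):
    """Half-cycle behavioral model of the FIR filter using both clock edges."""
    pos = [0, 0, 0, 0]  # FF circuits 171, 173, 175, 177 (positive edge)
    neg = [0, 0, 0, 0]  # FF circuits 172, 174, 176, 178 (negative edge)
    ff189 = ff190 = ff192 = ff193 = 0
    outputs = []
    for step, x in enumerate(samples):
        # Adder outputs as seen just before this clock edge.
        add185 = (pos[0] + neg[0]) + (pos[1] + neg[1])  # adders 183, 184 -> 185
        add188 = (pos[2] + neg[2]) + (pos[3] + neg[3])  # adders 186, 187 -> 188
        if step % 2 == 0:  # positive edge of CLK41
            pos = [x] + pos[:3]
            ff189, ff192 = add185, add188
            sel191, sel194 = ff189, ff192  # assumed: select the FF just updated
        else:              # negative edge of CLK41
            neg = [x] + neg[:3]
            ff190, ff193 = add185, add188
            sel191, sel194 = ff190, ff193
        outputs.append((sel191 + sel194) >> 3)  # adder 195, shifter 196
    return outputs


samples = list(range(0, 160, 8))
outputs = simulate_two_phase_fir(samples)
print(outputs)
```

In this model the output at each edge equals the average of the eight preceding samples, so the filter produces one result per half cycle of CLK41 while every FF circuit is clocked only once per cycle.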
The FIR filter in
The FF circuit 171 latches the data (in) on the positive edge of the clock CLK41. Accordingly, the FF circuit 171 has an output as indicated at “ff1.o” in
The FF circuits 173, 175, and 177 sequentially latch the data supplied from the FF circuit 171 while delaying the output by one clock cycle. Accordingly, the FF circuits 173, 175, and 177 have outputs as indicated at “ff3.o”, “ff5.o”, and “ff7.o” in
The FF circuit 172 latches the data (in) on the negative edge of the clock CLK41. Accordingly, the FF circuit 172 has an output as indicated at “ff2.o” in
The FF circuits 174, 176, and 178 sequentially latch the data supplied from the FF circuit 172 while delaying the output by one clock cycle. Accordingly, the FF circuits 174, 176, and 178 have outputs as indicated at “ff4.o”, “ff6.o”, and “ff8.o” in
The adder 183 adds the outputs of the FF circuits 171 and 172. Accordingly, the adder 183 has an output as indicated at “add1.o” in
The adder 184 adds the outputs of the FF circuits 173 and 174. Accordingly, the adder 184 has an output as indicated at “add2.o” in
The adder 186 adds the outputs of the FF circuits 175 and 176. Accordingly, the adder 186 has an output as indicated at “add3.o” in
The adder 187 adds the outputs of the FF circuits 177 and 178. Accordingly, the adder 187 has an output as indicated at “add4.o” in
The adder 185 adds the outputs of the adders 183 and 184. Accordingly, the adder 185 has an output as indicated at “add5.o” in
The adder 188 adds the outputs of the adders 186 and 187. Accordingly, the adder 188 has an output as indicated at “add6.o” in
The FF circuit 189 latches, on the positive edge of the clock CLK41, the data supplied from the adder 185. Accordingly, the FF circuit 189 has an output as indicated at “ff9.o” in
The FF circuit 190 latches, on the negative edge of the clock CLK41, the data supplied from the adder 185. Accordingly, the FF circuit 190 has an output as indicated at “ff10.o” in
The FF circuit 192 latches, on the positive edge of the clock CLK41, the data supplied from the adder 188. Accordingly, the FF circuit 192 has an output as indicated at “ff11.o” in
The FF circuit 193 latches, on the negative edge of the clock CLK41, the data supplied from the adder 188. Accordingly, the FF circuit 193 has an output as indicated at “ff12.o” in
The selector 191 selects any one of the data sets supplied from the FF circuits 189 and 190, and supplies the selected data to the adder 195 in synchronization with the clock CLK41. Accordingly, the selector 191 has an output as indicated at “sel1.o” in
The selector 194 selects any one of the data sets supplied from the FF circuits 192 and 193, and supplies the selected data to the adder 195 in synchronization with the clock CLK41. Accordingly, the selector 194 has an output as indicated at “sel2.o” in
The adder 195 adds the outputs of the selectors 191 and 194. Accordingly, the adder 195 has an output as indicated at “add7.o” in
The shifter 196 shifts by 3 bits the output of the adder 195. Accordingly, the shifter 196 has an output as indicated at “out” in
Thus, the FIR filter for processing two serial data sets shown in
In the FIR filter of
When associating the circuit of
The FF circuits 203 to 206 receive a clock CLK51 via a clock buffer 201. The FF circuit 207 receives the clock CLK51 via a clock buffer 202.
A frequency of the clock CLK51 is twice that of the clock CLK31 of
A reduction effect of power consumption in the FIR filter of
Suppose that power consumption in the selector having two inputs is 1.5 times that of Basic Cell (Basic Cell: NAND circuit or NOR circuit having two inputs) and power consumption in the FF circuit is ten times that of Basic Cell. In the formula (5), the first term representing an increment of power consumption in the selector and a reduction of power consumption in the FF circuit is as represented by the following formula (6).
Since formula (5) uses an approximation, the formula (6) is not directly derived from the formula (5) but derived from the formulas (2) and (3).
Further, the FIR filter of
A value obtained by adding formulas (6) and (7) together is the power reduction amount of the FIR filter of
Thus, by inserting in parallel the FF circuits 189, 190, 192, and 193 and the selectors 191 and 194 in the logic circuit of the FIR filter in
The foregoing is considered as illustrative only of the principles of the present invention. Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the invention to the exact construction and applications shown and described, and accordingly, all suitable modifications and equivalents may be regarded as falling within the scope of the invention in the appended claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
2008-084468 | Mar 2008 | JP | national |