The present application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2010-085470 filed on Apr. 1, 2010, with the Japanese Patent Office, the entire contents of which are incorporated herein by reference.
The disclosures herein generally relate to electronic circuits, and particularly relate to a reconfigurable circuit that can be dynamically reconfigured.
A dynamic reconfigurable circuit (hereinafter referred to as a reconfigurable circuit) includes a data execution unit inclusive of a plurality of execution elements having various execution functions, and also includes a data selecting unit serving as a network for connecting between the execution elements. Based on configuration data that are externally set, settings are made in a reconfigurable manner with respect to execution instructions for the execution elements of the data execution unit and connections provided by the data selecting unit between the execution elements. The configuration data may be updated while the reconfigurable circuit is operating, thereby dynamically modifying execution instructions and connections between the execution elements. This allows the execution units to be shared in a time-division fashion, thereby reducing the size of hardware of the entire circuit. Further, a reconfigurable circuit may perform a pipeline operation, and is thus capable of processing a data stream at high speed.
The execution elements that provide execution functions exhibit different execution latencies, depending on the type of execution. In general, the number of execution cycles for completing an execution is relatively small for simple computation whereas the number of execution cycles for completing an execution is relatively large for complex computation. A given execution element may receive execution results from two preceding execution elements. In such a case, these execution results from the two execution elements may not become available at the same time due to different execution latencies, the order of executions, etc. In order to align the timings, a delay may be introduced to a data path along which the earlier execution result is supplied. Further, when a desired computation is implemented by combining various execution functions, the function to count the number of execution cycles for completing the computation may be provided for the purpose of notifying an external device of the completion of all the computation.
The delay circuit that introduces the above-noted delay is preferably designed such that the number of delay stages is adjustable to a desired number of cycles in order to ensure latitude in circuit configuration. The provision of such a delay circuit, however, gives rise to a problem of an increase in circuit size of the data execution unit. As the number of delay circuits increases, further, circuit size increases both in the data execution unit and in the data selecting unit. This causes reduction in the operating speed of paths passing through the data selecting unit. In order to implement the above-noted function to count the number of execution cycles for completing computation, a memory may be provided to store data indicative of the number of execution cycles for the maximum number of possible computations, and, also, a counter may be provided to count the number of execution cycles. This also adds to the problem of increases in circuit size. Such increases in circuit size grow larger as the number of possible combinations of execution functions and the number of types of execution functions increase, i.e., as desired versatility increases.
According to an aspect of the embodiment, a reconfigurable circuit includes a data execution unit including a plurality of execution elements, each of which performs execution with respect to plural data upon the plural data being all in a valid state, and holds valid-state output data indicative of a result of the execution at an output node thereof while all the plural data are in the valid state, a data selecting unit configured to connect between the execution elements in a reconfigurable manner, and a data input unit configured to store data as input data that are supplied to a series of execution elements connected through the data selecting unit to perform a series of executions, wherein a valid or invalid state of given data is specified by a valid signal indicative of a valid or invalid state, the valid signal accompanying and forming a pair with the given data, and the input data supplied from the data input unit to the data execution unit are fixed to valid-state constant data while the series of executions are performed.
According to an aspect of the embodiment, a method is provided to operate a reconfigurable circuit which includes a data execution unit including a plurality of execution elements, each of which performs execution with respect to plural data upon the plural data being all in a valid state, and holds valid-state output data indicative of a result of the execution at an output node thereof while all the plural data are in the valid state, a data selecting unit configured to connect between the execution elements in a reconfigurable manner, and a data input unit configured to store data as input data that are supplied to a series of execution elements connected through the data selecting unit to perform a series of executions. The method includes causing the input data supplied from the data input unit to the data execution unit to be fixed to valid-state constant data while the series of executions are performed.
According to an aspect of the embodiment, a reconfigurable circuit includes a data execution unit including a plurality of execution elements, each of which latches plural data in a simultaneous or sequential manner in an order in which the plural data become valid, and performs execution with respect to the plural data upon the plural data being all latched, thereby outputting valid-state data indicative of a result obtained by the execution, a data selecting unit configured to connect between the execution elements in a reconfigurable manner, and a data input unit configured to store data as input data that are supplied to a series of execution elements connected through the data selecting unit to perform a series of executions, wherein a valid or invalid state of given data is specified by a valid signal indicative of a valid or invalid state, the valid signal accompanying and forming a pair with the given data,
The object and advantages of the embodiment will be realized and attained by means of the elements and combinations particularly pointed out in the claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
A description will first be given of a general configuration of a reconfigurable circuit.
Executions performed by the execution function units 23-1 through 23-7 are specified by execution instructions included in configuration data. Connections between the execution function units 23-1 through 23-7 are specified by connection data included in configuration data. Such configuration data are stored in the configuration memory 21A. From a plurality of configuration data pieces stored in the configuration memory 21A, the sequencer 21B selects a configuration data piece indicative of a current operation type of the data selecting unit 13 and the data execution unit 14. The configuration data piece selected by the sequencer 21B is supplied to the data selecting unit 13 and the data execution unit 14, so that the data selecting unit 13 and the data execution unit 14 operate according to the operation type corresponding to the configuration data. The sequencer 21B successively selects configuration data pieces for provision to the data selecting unit 13 and the data execution unit 14, thereby successively updating the type of operation (i.e., contexts) of the data selecting unit 13 and the data execution unit 14.
When at least one of the valid signal inputs VALID0 and VALID1 is “0” indicative of a data invalid state at a rising edge of the clock signal CLOCK, the valid signal output VALID_OUT becomes “0” indicative of an invalid state. In the example illustrated in
The execution function unit may be configured such that the execution data output and the valid signal output are changed in synchronization with the clock signal CLOCK. When the execution function unit performs execution in response to the rising edge T3, for example, the execution data output DATA_OUT may be updated to a new execution result, and the valid signal output VALID_OUT may be set equal to “1” at the next rising edge T4. When at least one of the valid signal inputs VALID0 and VALID1 is “0” indicative of a data invalid state at the rising edge T4, the valid signal output VALID_OUT may change from “1” to “0” at the next rising edge T5. In this example, the latency of execution is equal to one clock cycle. However, the latency of execution is not limited to one clock cycle, and may be equal to any number of clock cycles. When the execution function unit performs execution in response to a given rising edge, for example, the execution data output DATA_OUT may be updated to a new execution result, and the valid signal output VALID_OUT may be set equal to “1” at a rising edge of the n-th following cycle. When at least one of the valid signal inputs VALID0 and VALID1 is “0” indicative of a data invalid state at the rising edge of the next following cycle, the valid signal output VALID_OUT may change from “1” to “0” at the rising edge of the n+1-th following cycle.
The control unit 10 illustrated in
The internally provided signals may be signals supplied from the execution output ports (PORT10 through PORT17 illustrated in
The control unit 10 makes settings to the data selecting unit 13 and the data execution unit 14 according to desired configuration data, and, also, writes input data to be processed to the data input output unit 12. After this, the control unit 10 sets the selection control signal of the data input output unit 12 illustrated in
Data I20 through I22 of the execution input port PORT2 illustrated in (c) and data I30 through I32 of the execution input port PORT3 illustrated in (d) are executed by the execution function unit 23-2. As illustrated in (e), execution results C20 through C22 are output with a delay of two cycles from the input data where the delay is equal to the latency of the execution function unit 23-2. In order to match the timing of the execution results C20 through C22, data I10 through I12 of the execution input port PORT1 illustrated in (f) are input into the delay unit 24-1 to be delayed by two cycles as illustrated in (g). The execution results C20 through C22 illustrated in (e) and the input data I10 through I12 illustrated in (g) are executed by the execution function unit 23-5. As illustrated in (h), execution results C50 through C52 are output with a delay of one cycle which is equal to the latency of the execution function unit 23-5. The execution results C50 through C52 are output from the execution output port PORT17 as illustrated in (i).
Further, data I10 through I12 of the execution input port PORT1 illustrated in (f) and data I20 through I22 of the execution input port PORT2 illustrated in (c) are executed by the execution function unit 23-1. As illustrated in (j), execution results C10 through C12 are output with a delay of one cycle from the input data where the delay is equal to the latency of the execution function unit 23-1. In order to match the timing of the execution results C20 through C22, the execution results C10 through C12 illustrated in (j) are input into the delay unit 24-2 to be delayed by one additional cycle as illustrated in (k). The execution results C20 through C22 illustrated in (e) and the execution results C10 through C12 illustrated in (k) are executed by the execution function unit 23-4. As illustrated in (l), execution results C40 through C42 are output with a delay of one cycle which is equal to the latency of the execution function unit 23-4. The execution results C40 through C42 are output from the execution output port PORT14 as illustrated in (m).
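The delay-stage settings in the walk-through above follow from the unit latencies alone. The following is a minimal sketch, not part of the described circuit, that derives them under the assumption that all execution input ports supply data in the same cycle (cycle 0) and that each unit's output appears exactly its latency after its latest input. The `GRAPH` encoding and the function name `required_delays` are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of the example data flow:
# unit name -> (latency in cycles, names of its input sources).
GRAPH: Dict[str, Tuple[int, List[str]]] = {
    "23-2": (2, ["PORT2", "PORT3"]),
    "23-1": (1, ["PORT1", "PORT2"]),
    "23-5": (1, ["23-2", "PORT1"]),
    "23-4": (1, ["23-2", "23-1"]),
}


def required_delays(graph: Dict[str, Tuple[int, List[str]]]):
    """Return (ready, delays).

    ready[unit] is the cycle at which the unit's result is output.
    delays[(source, consumer)] is the number of delay stages needed on
    that path so that all inputs of the consumer arrive together.
    """
    ready: Dict[str, int] = {}
    delays: Dict[Tuple[str, str], int] = {}

    def ready_time(node: str) -> int:
        if node not in graph:
            return 0  # assumption: input ports supply data at cycle 0
        if node not in ready:
            latency, sources = graph[node]
            arrivals = {s: ready_time(s) for s in sources}
            start = max(arrivals.values())  # wait for the latest input
            for source, t in arrivals.items():
                if start > t:
                    delays[(source, node)] = start - t
            ready[node] = start + latency
        return ready[node]

    for node in graph:
        ready_time(node)
    return ready, delays


ready, delays = required_delays(GRAPH)
print(delays)
```

Under these assumptions the sketch yields a two-stage delay on the path from PORT1 to the unit 23-5 and a one-stage delay on the path from the unit 23-1 to the unit 23-4, which correspond to the roles of the delay units 24-1 and 24-2 in the description.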
The execution completion detection signal PREDICATE illustrated in (h) is generated by the counter 25 illustrated in
In response to the assertion of the execution completion detection signal PREDICATE, the control unit 10 sets the selection control signal of the data input output unit 12 to a value indicative of the selection of external signals. Further, the control unit 10 places the external write enable in a negated state and supplies a read address, thereby reading data from the data memories 32 of the data input output unit 12.
In the configuration described above, the number of delay stages of the delay units 24-1 through 24-4 is preferably settable to a desired number of cycles as previously described. The provision of such delay units, however, gives rise to a problem of an increase in circuit size of the data execution unit. Further, as the number of delay circuits increases, circuit size increases both in the data input output unit 12 and in the data selecting unit 13. This causes reduction in the operating speed of paths passing through the data selecting unit 13. Also, the provision of the counter 25 gives rise to a problem of an increase in circuit size. Such increases in circuit size grow larger as the number of possible combinations of execution functions and the number of types of execution functions increase, i.e., as desired versatility increases.
In the following, embodiments will be described with reference to the accompanying drawings.
The reconfigurable circuit illustrated in
The data execution unit 64 includes a plurality of execution function units 71-1 through 71-8. The number of the execution function units is only an example, and is not limited to this example. The network circuit 22 of the data selecting unit 13 connects between the execution function units 71-1 through 71-8 of the data execution unit 64 in a reconfigurable manner. The data input output unit 62 includes a plurality of register units 70-1 through 70-8 and an execution completion detection unit 72. The number of the register units 70-1 through 70-8 is only an example, and is not limited to this example. The data selecting unit 13 includes execution output ports PORT10 through PORT17 connected to the execution completion detection unit 72. Signals indicative of a valid or invalid state of one or more execution results obtained by a series of executions are supplied to the execution completion detection unit 72 through one or more of the execution output ports PORT10 through PORT17. Output data O0 through O7 of the execution output ports PORT10 through PORT17 are also supplied to the register units 70-1 through 70-8, respectively.
Executions performed by the execution function units 71-1 through 71-8 are specified by execution instructions included in configuration data. Connections between the execution function units 71-1 through 71-8 are specified by connection data included in configuration data. Such configuration data are stored in the configuration memory 21A. From a plurality of configuration data pieces stored in the configuration memory 21A, the sequencer 21B selects a configuration data piece indicative of a current operation type of the data selecting unit 13 and the data execution unit 64. The configuration data piece selected by the sequencer 21B is supplied to the data selecting unit 13 and the data execution unit 64, so that the data selecting unit 13 and the data execution unit 64 operate according to the operation type corresponding to the configuration data. The sequencer 21B successively selects configuration data pieces for provision to the data selecting unit 13 and the data execution unit 64, thereby successively updating the type of operation (i.e., contexts) of the data selecting unit 13 and the data execution unit 64.
An execution function unit performs the operation described in connection with
The latency of execution may be the number of clock cycles responsive to the type of execution (i.e., execution instruction). The latency of execution is not limited to one clock cycle, and may be equal to any number of clock cycles. When the execution function unit performs execution in response to a given rising edge, for example, the execution data output DATA_OUT may be updated to a new execution result, and the valid signal output VALID_OUT may be set equal to “1” at a rising edge of the n-th following cycle. When at least one of the valid signal inputs VALID0 and VALID1 is “0” indicative of a data invalid state at the rising edge of the next following cycle, the valid signal output VALID_OUT may change from “1” to “0” at the rising edge of the n+1-th following cycle. The latency of a valid signal is equal to the latency for outputting an execution result. Such latency of a valid signal may be implemented by using a shift register having stages that are equivalent in number to the latency for performing execution.
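The shift-register implementation of the valid-signal latency can be sketched in software as follows. This is an illustrative model only: the class name `ExecutionFunctionUnit`, the method `clock`, and the use of `None` for the data output while the valid signal is "0" are assumptions, not details taken from the description. Each call to `clock` stands for one rising edge of CLOCK, and an execution performed at a given edge appears at the output at the n-th following edge.

```python
from collections import deque
from typing import Callable, Deque, Optional, Tuple

Stage = Tuple[Optional[int], int]  # (data, valid)


class ExecutionFunctionUnit:
    """Two-input execution unit whose result and valid bit travel
    through a shift register with one stage per cycle of latency."""

    def __init__(self, op: Callable[[int, int], int], latency: int) -> None:
        if latency < 1:
            raise ValueError("latency must be at least one clock cycle")
        self.op = op
        # All stages start in the invalid state.
        self.stages: Deque[Stage] = deque([(None, 0)] * latency)

    def clock(self, data0: int, valid0: int, data1: int, valid1: int) -> Stage:
        """Model one rising edge; return (DATA_OUT, VALID_OUT) after it."""
        # The oldest stage becomes the output at this edge.
        data_out, valid_out = self.stages.pop()
        # Execute only when both inputs are valid at this edge.
        valid = valid0 & valid1
        result = self.op(data0, data1) if valid else None
        self.stages.appendleft((result, valid))
        return data_out, valid_out


unit = ExecutionFunctionUnit(lambda a, b: a + b, latency=2)
trace = [unit.clock(1, 1, 2, v1) for v1 in (1, 1, 0, 0, 0)]
print(trace)
```

In this usage example VALID1 drops to "0" at the third edge, and the output valid bit follows two edges later, so the valid signal and the execution result share the same latency.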
The control unit 60 illustrated in
Internally provided signals are the signals O0 through O7 supplied from the execution output ports PORT10 through PORT17 of the data selecting unit 13. Each of the signals O0 through O7 includes a data signal and a data valid signal. The data valid signal serves to indicate a valid or invalid state of the data signal. With the selection control signal being “0” indicative of the selection of internal signals, execution result data obtained as a result of a series of executions are written to a specified write address in the data memory 82. Specifically, the rising-edge detection circuit 85 generates a HIGH pulse for the length of one cycle upon the valid signal of the execution result being changed to “1”. This HIGH pulse is supplied to the data memory 82 as a write enable signal WE, so that the execution result data is written to the data memory 82 as write data WD. With the selection control signal being “0” indicative of the selection of internal signals, the start indicating signal START_TRIGGER is asserted, so that this asserted signal is supplied to the data memory 82 as an enable signal EN. When this happens, the write enable signal WE is in a negated state, so that data is read from the specified read address in the data memory 82. Further, in response to the start indicating signal START_TRIGGER, the flip-flop 84 is set to “1”, so that the output data valid is set to “1” indicative of a valid state. Namely, in response to the assertion of the start indicating signal START_TRIGGER, the data input output unit 62 starts supplying valid data to the data execution unit 64. The flip-flop 84 is reset to “0” in response to the assertion of the execution completion detection signal PREDICATE supplied from the execution completion detection unit 72, which will be described later. Namely, in response to the assertion of the execution completion detection signal PREDICATE, the data input output unit 62 stops supplying valid data to the data execution unit 64.
The data memory 82 of the register unit 70-1 is configured to hold read data in an output state (i.e., to maintain the read data in a state in which the read data is being output at the output node). Namely, when a data read operation is performed with respect to a specified address, read data RD is output from the data memory 82, and is thereafter maintained in such an output state. The data input output unit 62 has data stored in the data memory 82 where this data is to be applied to execution function units connected through the data selecting unit 13 to perform a series of executions. While the series of executions are being performed, the data output of the data memory 82 is held in an output state as described above, so that the input data supplied from the data input output unit 62 to the data execution unit 64 is fixed to a constant value in a valid state. Namely, unlike the case illustrated in
The execution completion detection unit 72 includes selectors 91-1 through 91-8, an AND gate 92, and a rising-edge detection circuit 93. Selection control signals to the selectors 91-1 through 91-8 are supplied from the control unit 60. These selection control signals control whether to supply the valid signals O0.VALID through O7.VALID to the AND gate 92 through the selectors 91-1 through 91-8, respectively. Specifically, a selector that receives a selection control signal being “1” selects the valid signal for provision to the AND gate 92. Further, a selector that receives a selection control signal being “0” selects a fixed value of “1” for provision to the AND gate 92. The AND gate 92 sets its output to “1” when all the selected valid signals are “1”. The rising-edge detection circuit 93 detects a rising edge upon the output of the AND gate 92 being changed from “0” to “1”, thereby generating a HIGH pulse for the length of one cycle. This HIGH pulse for one cycle corresponds to the asserted state of the execution completion detection signal PREDICATE. The execution completion detection signal PREDICATE is supplied to the control unit 60 and the data input output unit 62. In this manner, the selectors in the execution completion detection unit 72 are controlled by the selection control signals, thereby selecting output ports that are taken into consideration when deciding whether relevant signals indicative of a valid or invalid state all indicate a valid state simultaneously.
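The behavior of the selectors, the AND gate, and the rising-edge detection circuit can be illustrated with a short behavioral model. This is a sketch only: the class name `ExecutionCompletionDetector` and its method `clock` are hypothetical, and the model evaluates the logic once per clock cycle rather than describing actual gate timing.

```python
from typing import Sequence


class ExecutionCompletionDetector:
    """Behavioral model: per-port selectors, an AND gate, and a
    rising-edge detector that produces a one-cycle PREDICATE pulse."""

    def __init__(self, select: Sequence[int]) -> None:
        # select[i] == 1: port i's valid signal takes part in the AND.
        # select[i] == 0: a fixed "1" is supplied instead.
        self.select = list(select)
        self.prev_and = 0  # AND-gate output seen in the previous cycle

    def clock(self, valids: Sequence[int]) -> int:
        """Evaluate one cycle; return PREDICATE (1 for one cycle only)."""
        selected = [v if s else 1 for v, s in zip(valids, self.select)]
        and_out = int(all(selected))
        predicate = int(and_out == 1 and self.prev_and == 0)
        self.prev_and = and_out
        return predicate


# Watch only O4.VALID and O7.VALID (PORT14 and PORT17).
detector = ExecutionCompletionDetector([0, 0, 0, 0, 1, 0, 0, 1])
cycles = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],  # only one watched result is valid
    [0, 0, 0, 0, 1, 0, 0, 1],  # both watched results are valid
    [0, 0, 0, 0, 1, 0, 0, 1],  # still valid: no second pulse
    [0, 0, 0, 0, 0, 0, 0, 0],
]
print([detector.clock(v) for v in cycles])
```

Because unselected ports contribute a fixed "1", the pulse depends only on the ports chosen by the selection control signals, and it is produced once when the last of those ports becomes valid.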
The control unit 60 makes settings to the data selecting unit 13 and the data execution unit 64 according to desired configuration data, and, also, writes input data to be processed to the data input output unit 62. After this, the control unit 60 sets the selection control signal of the data input output unit 62 described in connection with
Data I21 of the execution input port PORT2 illustrated in (c) and data I31 of the execution input port PORT3 illustrated in (d) are executed by the execution function unit 71-2. As illustrated in (e), an execution result C21 is output with a delay of two cycles from the input data where the delay is equal to the latency of the execution function unit 71-2. An AND operation is performed between a valid signal for the execution result C21 illustrated in (e) and a valid signal for the data I11 of the execution input port PORT1 illustrated in (f), and the resulting AND logic value is illustrated in (g). As the AND logic value illustrated in (g) becomes “1”, the execution result C21 illustrated in (e) and the input data I11 illustrated in (f) are executed by the execution function unit 71-5. As illustrated in (h), an execution result C51 is output with a delay of one cycle which is equal to the latency of the execution function unit 71-5. This execution result C51 is output from the execution output port PORT17 as illustrated in (i). The output value of the execution output port PORT17 is written to the register unit 70-7 as illustrated in (j).
Further, data I11 of the execution input port PORT1 illustrated in (f) and data I21 of the execution input port PORT2 illustrated in (c) are executed by the execution function unit 71-1. As illustrated in (k), an execution result C11 is output with a delay of one cycle from the input data where the delay is equal to the latency of the execution function unit 71-1. An AND operation is performed between a valid signal for the execution result C11 illustrated in (k) and a valid signal for the execution result C21 illustrated in (e), and the resulting AND logic value is illustrated in (l). As the AND logic value illustrated in (l) becomes “1”, the execution result C21 illustrated in (e) and the execution result C11 illustrated in (k) are executed by the execution function unit 71-4. As illustrated in (m), an execution result C41 is output with a delay of one cycle which is equal to the latency of the execution function unit 71-4. This execution result C41 is output from the execution output port PORT14 as illustrated in (n). The output value of the execution output port PORT14 is written to the register unit 70-4 as illustrated in (o).
The execution completion detection unit 72 asserts the execution completion detection signal PREDICATE illustrated in (p) upon one or more execution results (i.e., the output of PORT14 and the output of PORT17 in the example of
In response to the assertion of the execution completion detection signal PREDICATE, the data input output unit 62 stops supplying valid data to the data execution unit 64. Namely, at each of the execution input ports PORT2, PORT3, and PORT1 illustrated in FIG. 9-(c), (d), and (f), respectively, the valid signal changes from “1” to “0” in response to the assertion of the execution completion detection signal PREDICATE. At each of the execution results illustrated in (e), (h), (k), and (m), also, the valid signal changes from “1” to “0” in the next clock cycle in response to the above-noted changes in the valid signals.
In response to the assertion of the execution completion detection signal PREDICATE, the control unit 60 updates the read address supplied to the data input output unit 62 to a next read address for the purpose of starting a next series of executions. After this, the control unit 60 asserts the start indicating signal START_TRIGGER at the HIGH level as illustrated in (b). As a result, a series of executions identical to the above-described series of executions will be performed with respect to next data. It may be noted that the contexts may also be updated at this time to perform a new series of executions different from the previous one. The control unit 60 also reads the read data C51 from the register unit 70-7 illustrated in (j) and the read data C41 from the register unit 70-4 illustrated in (o) as execution results of the series of executions.
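The series of executions walked through above can be reproduced with a small cycle-by-cycle simulation in which the input data are held valid and each unit starts only when the AND of its input valid signals is "1". This is a sketch under stated assumptions: the `UNITS` encoding, the function name `simulate`, and the arithmetic operations assigned to the units are hypothetical (the description does not state which operations the units perform), and the timing is idealized so that an output becomes valid exactly one latency after all of its inputs are valid.

```python
from typing import Callable, Dict, List, Optional, Tuple

Unit = Tuple[int, Callable[[int, int], int], Tuple[str, str]]

# Hypothetical encoding: unit -> (latency, operation, (input A, input B)).
# The operations are placeholders chosen only for illustration.
UNITS: Dict[str, Unit] = {
    "71-2": (2, lambda a, b: a * b, ("PORT2", "PORT3")),
    "71-1": (1, lambda a, b: a + b, ("PORT1", "PORT2")),
    "71-5": (1, lambda a, b: a + b, ("71-2", "PORT1")),
    "71-4": (1, lambda a, b: a - b, ("71-2", "71-1")),
}


def simulate(units: Dict[str, Unit], held_inputs: Dict[str, int], cycles: int):
    """Return (first_valid, data) after the given number of cycles.

    first_valid[unit] is the first cycle in which the unit's output is
    valid, counted from the cycle in which the held inputs become valid.
    """
    valid: Dict[str, int] = {name: 1 for name in held_inputs}
    data: Dict[str, Optional[int]] = dict(held_inputs)
    for name in units:
        valid[name], data[name] = 0, None
    # One valid-bit shift register per unit, one stage per latency cycle.
    shift: Dict[str, List[int]] = {n: [0] * u[0] for n, u in units.items()}
    first_valid: Dict[str, int] = {}
    for cycle in range(1, cycles + 1):
        prev_valid, prev_data = dict(valid), dict(data)
        for name, (_, op, (a, b)) in units.items():
            fire = prev_valid[a] & prev_valid[b]  # AND of input valid bits
            shift[name] = [fire] + shift[name][:-1]
            valid[name] = shift[name][-1]
            if valid[name]:
                # Inputs are held constant, so they are still valid here.
                data[name] = op(prev_data[a], prev_data[b])
                first_valid.setdefault(name, cycle)
    return first_valid, data


first_valid, data = simulate(UNITS, {"PORT1": 1, "PORT2": 2, "PORT3": 3}, 4)
print(first_valid)
```

In this sketch no delay stage appears on any path: the unit 71-5 simply waits until the result of the unit 71-2 is valid, because the data at PORT1 remains valid in the meantime.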
In the reconfigurable circuit illustrated in
The reconfigurable circuit illustrated in
Data I21 of the execution input port PORT2 illustrated in (c) and data I31 of the execution input port PORT3 illustrated in (d) are executed by the execution function unit 71-2. As illustrated in (e), an execution result C21 is output with a delay of two cycles from the input data where the delay is equal to the latency of the execution function unit 71-2. The execution function unit 71-5, for example, performs execution with respect to the execution result C21 illustrated in (e) and the data I11 of the execution input port PORT1 illustrated in (f). In so doing, the input latch latches the execution result C21 and the data I11, and, then, the execution is performed to generate the execution result C51 as illustrated in (g). As illustrated in (g), the execution result C51 is output with a delay of one cycle which is equal to the latency of the execution function unit 71-5. In the execution function unit 71-5, the operation to latch the execution result C21 is immediately performed in response to “1” of the valid signal for the execution result C21. Also, the operation to latch the data I11 is immediately performed in response to “1” of the valid signal for the data I11.
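The latching behavior described above can be sketched as follows. This is an illustrative model only: the class name `LatchingExecutionUnit` and the method `clock` are hypothetical, the execution latency is omitted for brevity, and the model keeps its latched values indefinitely because the passage above does not describe when the latches are cleared.

```python
from typing import Callable, List, Optional, Tuple


class LatchingExecutionUnit:
    """Two-input unit that latches each input as soon as its valid
    signal is "1" and executes once both inputs have been latched."""

    def __init__(self, op: Callable[[int, int], int]) -> None:
        self.op = op
        self.latched: List[Optional[int]] = [None, None]
        self.has: List[bool] = [False, False]

    def clock(self, data0: int, valid0: int,
              data1: int, valid1: int) -> Tuple[Optional[int], int]:
        """Model one cycle; return (data out, valid out)."""
        for i, (d, v) in enumerate(((data0, valid0), (data1, valid1))):
            if v and not self.has[i]:
                self.latched[i] = d  # latch immediately when valid
                self.has[i] = True
        if all(self.has):
            return self.op(self.latched[0], self.latched[1]), 1
        return None, 0


unit = LatchingExecutionUnit(lambda a, b: a + b)
print(unit.clock(5, 1, 0, 0))  # first input valid only: latched, no result
print(unit.clock(0, 0, 0, 0))  # neither input valid: the latch keeps 5
print(unit.clock(0, 0, 7, 1))  # second input arrives: execution proceeds
```

The usage lines show the two inputs becoming valid in different cycles and never simultaneously; the unit still produces a result once the second input has been latched.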
In the configuration described above, an execution function unit latches plural data in a simultaneous or sequential manner in the order in which they become valid. When all the plural data are latched, the execution function unit performs execution with respect to these plural data, thereby to output valid data indicative of a result obtained from the execution. Further, the execution completion detection unit may be configured to latch and hold the valid signals O0.VALID through O7.VALID supplied thereto. The execution completion detection unit may then assert the execution completion detection signal PREDICATE upon one or more execution results of a series of executions being all in a valid state. Unlike the reconfigurable circuit illustrated in
According to at least one embodiment, data supplied from the data input unit in the reconfigurable circuit is maintained in a signal activated state (i.e., data supply state) while a series of executions are being performed. With this arrangement, each execution function unit for performing execution based on data supplied from the data input unit can maintain its output in a state in which the execution result in a valid state is being output. Accordingly, for any given execution function unit that receives plural data, each of the plural data is held in a valid state without being inactivated, so that it suffices for the given execution function unit to start execution after waiting for all the data to be simultaneously in a valid state. A delay circuit for achieving timing alignment is thus no longer provided.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiment(s) of the present inventions have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2010-085470 | Apr 2010 | JP | national |