This invention relates to signal processing by a processor. In particular, this invention concerns processing of signals implementing periodic behaviors using one reference.
In the prior art, when a hardware block is implemented after a reset, at cycle 4i+0, the hardware block drives a signal from a first register A. Then, at cycle 4i+2, the hardware drives the signal from a second register B. At all other times, the hardware drives the signal to 0.
As should be appreciated by those skilled in the art, this is an example of periodic behavior, with a period of 4 cycles. In other words, within each group of 4 cycles, the hardware block repeats the same behavior.
As may be appreciated by those skilled in the art, a standard way of implementing the processing in this 4-cycle hardware block is with an external control to drive the block. An example of the implementation of the standard approach is provided below in Code Segment #1.
This implementation takes no advantage of the periodicity incorporated into the code segment. Code Segment #1 relies on externally generated control.
As a result, there has developed an interest in the art for implementations that do avail themselves of the periodicity incorporated into a particular code segment.
This need remains unaddressed by the prior art.
The invention is directed to at least this failing in the prior art by implementing an instruction set that capitalizes on the periodicity incorporated into a particular code segment.
One aspect of the invention implements a hardware block in an environment that exhibits a periodic behavior with a period of N cycles. Within the N cycles, at various points, the hardware block executes functions that may differ from one another.
Contrary to the standard implementation technique, the invention provides “enables” for each of the separate executions of the functions within the hardware block with a period of N.
In one embodiment of the invention, the hardware block is implemented via a technique where a single reference is provided according to a 0 . . . N−1 count. All controls are generated locally based on that reference.
Other aspects of the invention will become apparent from the description that follows and the drawings appended hereto.
The invention will now be described in connection to the figures appended hereto, in which:
In connection with the description of the invention, one or more embodiments are described. As should be appreciated by those skilled in the art, the embodiments are intended to be exemplary only. Those skilled in the art will readily appreciate that there are equivalents and variations that also may be implemented without departing from the scope of the invention. Those equivalents and variations are intended to fall within the scope of the invention as described herein.
As discussed above, in the prior art, when a hardware block is implemented after a reset, at cycle 4i+0, the hardware block drives a signal from a first register A. Then, at cycle 4i+2, the hardware drives the signal from a second register B. At all other times, the hardware drives the signal to 0. As noted above, a standard way of implementing the processing in this 4-cycle hardware block is with an external control to drive the block.
The invention avoids implementing the processing in this 4-cycle hardware block with an external control to drive the hardware block. The invention, at least in one or more of its embodiments, internalizes control into the hardware block, at least partially.
Specifically, the invention incorporates a counter that is the length of the period. In one contemplated embodiment, the counter operates from 0 . . . 3. In the generic embodiment, the counter operates from 0 . . . N−1. Relying on the counter, the hardware block implements its control based on the value of the counter. An example of this implementation is provided by Code Segment #2, below.
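By way of illustration only, the locally decoded control described above may be modeled behaviorally as follows. This is a sketch in Python rather than a hardware description language, and it is not Code Segment #2. The names drive and simulate, and the register values, are hypothetical.

```python
def drive(phase, reg_a, reg_b):
    """Return the signal for one cycle, decoded locally from the counter value.

    Assumption: register A is selected at phase 0 and register B at phase 2,
    matching the 4i+0 / 4i+2 example in the text; all other phases drive 0.
    """
    if phase == 0:
        return reg_a
    if phase == 2:
        return reg_b
    return 0


def simulate(cycles, reg_a, reg_b, period=4):
    """Model a free-running 0..period-1 counter feeding the block."""
    return [drive(cycle % period, reg_a, reg_b) for cycle in range(cycles)]


if __name__ == "__main__":
    print(simulate(8, reg_a=5, reg_b=7))
```

In this model the only input besides the register values is the counter value; no per-behavior enable is supplied from outside the block.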
In this embodiment, external control is still required to generate an id counter so that 0 corresponds to the point at which A is read, and so on.
While acceptable, this approach is inconvenient for several reasons. For example, it is conceivable that there may be multiple blocks with the same period, but which need to initiate at different start times. As a result, a more efficient approach is to have only one counter feeding all of the blocks.
In addition, during development, it may be necessary to shift the start point of one or more of the blocks by one cycle.
In another contemplated variation, it may be necessary to move the example block by two cycles so that A is selected at 4i+2, and B at 4i+4=4*(i+1)+0.
Taking this into account, the invention contemplates an approach where the behaviors of one or more of the blocks are shifted by some offset from the counter 0. One straightforward way of doing this is by decrementing the counter by the offset. An example of this approach is provided in Code Segment #3, below.
As is apparent, in this approach, the 0 of the id is moved back to where the block expects it to be. Thus, if the id is 4i+2, id_adj will be 4i+0, and A will be selected.
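The adjustment described above may be sketched as follows. This is a hedged behavioral model in Python, not Code Segment #3. The names adjusted_id and drive_shifted are hypothetical, and the model assumes the subtraction wraps modulo the period.

```python
PERIOD = 4  # assumption: the 4-cycle example from the text


def adjusted_id(counter_id, offset, period=PERIOD):
    """Move the counter's 0 back to where the block expects it (id_adj)."""
    # Python's % yields a non-negative result for a positive modulus,
    # so the subtraction wraps around the period as intended.
    return (counter_id - offset) % period


def drive_shifted(counter_id, offset, reg_a, reg_b):
    """Decode the unmodified block logic from the adjusted counter."""
    id_adj = adjusted_id(counter_id, offset)
    if id_adj == 0:
        return reg_a
    if id_adj == 2:
        return reg_b
    return 0


if __name__ == "__main__":
    # With an offset of 2, A is selected when the shared counter reads 2.
    print([drive_shifted(c % PERIOD, 2, 5, 7) for c in range(8)])
```

The decode logic of the block is untouched in this sketch; only the counter value it sees is changed.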
An alternative to this approach is to move the phase in which the behavior will be selected. Thus, the code is changed so that A will now be triggered when id is 2 (i.e., “10”) instead of 0. This approach is detailed in Code Segment #4, below. For this approach, it is noted that the code as written is not valid VHSIC Hardware Description Language (“VHDL”). Valid VHDL would make the example much larger and the illustration of this approach awkward. For this reason, the code is presented in the simplified manner provided in Code Segment #4.
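As a rough behavioral illustration of this alternative, the enable phases themselves may be recomputed while the counter is left untouched. The Python sketch below is not Code Segment #4. The names shifted_enables and drive_with_enables, and the dictionary layout, are assumptions made for the example.

```python
def shifted_enables(base_enables, offset, period=4):
    """Move every enable phase forward by the offset, wrapping at the period."""
    return {name: (phase + offset) % period for name, phase in base_enables.items()}


def drive_with_enables(counter_id, enables, regs):
    """Drive the register whose enable phase matches the counter, else 0."""
    for name, phase in enables.items():
        if counter_id == phase:
            return regs[name]
    return 0


if __name__ == "__main__":
    base = {"A": 0, "B": 2}           # enable phases of the unshifted block
    moved = shifted_enables(base, 2)  # A now enabled at 2, B at 0
    regs = {"A": 5, "B": 7}           # hypothetical register contents
    print(moved, [drive_with_enables(c % 4, moved, regs) for c in range(4)])
```

In a hardware description the shifted phases would typically be constants fixed at elaboration time, so this variation would add no logic to the counter path.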
The invention will now be described in connection with what is referred to as “state access.”
In one contemplated example of the invention, a state includes four (4) identical banks, A, B, C, and D. The banks are accessed periodically, with A, B, C, and D being read at phases 0, 1, 2, and 3 and being written at phase 3, 0, 1, and 2. This arrangement exists in a multi-threaded processor with 4 threads, where the threads are barrel-threaded (i.e., they are processed in a fixed sequence). In this example, the 4 banks correspond to the state of each of the threads, where the state for a thread t is read at 4i+t and written at 4i+t+3. A sample implementation for this example is provided below in Code Segment #5.
In the example provided in Code Segment #5, the phase of the counter is assumed to be in a condition such that phase 0 corresponds to the read of thread 0.
Suppose the phase of the counter changes so that the phase 0 of the counter is moved by one cycle, so that the phase corresponds to a read of thread 1. In this particular instance, the logic set forth above does not change. Instead, the state for thread 1 will be stored in regs(0), and so on, with the state for thread 0 being stored in regs(3). The function, however, remains the same.
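The insensitivity of the state access to the origin of the counter may be checked with a small behavioral model. The Python sketch below is not Code Segment #5. The function run_banks, the origin_shift parameter, and the "increment the state on every pass" workload are all hypothetical. The model assumes that, at phase p, bank p is read and bank (p+1) mod 4 is written, which follows from reading at 4i+t and writing at 4i+t+3.

```python
from collections import deque


def run_banks(cycles, origin_shift=0, n=4):
    """Simulate n barrel-threaded states held in n banks.

    Returns a log of (thread, bank_read, value_read) per cycle.
    """
    regs = [0] * n
    in_flight = deque()  # (cycle at which to write back, value)
    log = []
    for cycle in range(cycles):
        thread = cycle % n                  # thread actually issued this cycle
        phase = (cycle - origin_shift) % n  # counter value seen by the block
        # Write back the result of the read issued n-1 cycles earlier.
        if in_flight and in_flight[0][0] == cycle:
            _, value = in_flight.popleft()
            regs[(phase + 1) % n] = value
        value = regs[phase]                 # read the bank selected by the phase
        log.append((thread, phase, value))
        in_flight.append((cycle + n - 1, value + 1))
    return log


if __name__ == "__main__":
    for shift in (0, 1):
        print(shift, run_banks(8, origin_shift=shift))
```

With origin_shift set to 1, thread 1 lands in bank 0 and thread 0 in bank 3, yet each thread observes the same sequence of state values as in the unshifted case.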
As should be appreciated by those skilled in the art, the embodiments discussed so far have focused on loops comprising 4 steps, enabled at counters 0, 1, 2, and 3, in the simplest example. The invention, however, is not limited to four steps, but may encompass N steps. When N steps are incorporated into the loop, the counter will enable the functions from 0 to N−1. As also may be appreciated by those skilled in the art, the loop will contain at least two functions because one aspect of the invention is to capitalize on the periodicity of certain processing schema.
Taking this into account, a loop may contain N steps with at least a first function and a second function being enabled at steps X and Y respectively within the loop. Steps other than the first and second functions drive the signal to 0.
As is apparent from the foregoing, the start of one or more of the periods for a processing scheme may be shifted by one or more counts of the counter. In other words, the start of the period is altered to begin at a different cycle within the loop. This is referred to as an offset.
In cases where several hardware blocks assist with the processing scheme, the counter applies to each of the hardware blocks. In addition, one or more of the hardware blocks may be offset from others of the hardware blocks.
However, the hardware blocks need not write to these registers. Instead, they may be shifted out of phase from the default condition. For example, in a phase shift where the shift amount S=1, data processed from block B is written to regs(0), block C to regs(1), block D to regs(2), and block A to regs(3). The phase shift may be more than 1, as should be appreciated by those skilled in the art.
With reference to
In the discussion above, the id counter is viewed as providing an N-valued phase reference Φi. As is apparent from the foregoing discussion, 0 to N−1 have been used herein to identify phase reference values, but any sequence of N distinct values within the range Φ0 . . . ΦN−1 may be used without departing from the scope of the invention. In one contemplated embodiment, a Gray counter (values 00, 01, 11, 10) may be employed to provide the phase reference values.
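As an illustration of a non-binary phase reference, the standard reflected Gray code may be generated as shown in the Python sketch below. The helper names gray, phase_sequence, and one_bit_apart are hypothetical. The one-bit-per-step property, including at the wraparound, holds when N is a power of two.

```python
def gray(i):
    """Return the i-th value of the reflected binary Gray code."""
    return i ^ (i >> 1)


def phase_sequence(n):
    """Phase reference values for one period of n steps."""
    return [gray(i) for i in range(n)]


def one_bit_apart(a, b):
    """True when a and b differ in exactly one bit position."""
    return bin(a ^ b).count("1") == 1


if __name__ == "__main__":
    print([format(g, "02b") for g in phase_sequence(4)])
```

A block decoding such a reference compares against the coded value for its enable point rather than against the step index; the N values remain distinct, which is all the scheme requires.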
As should be apparent, different behaviors in the block are triggered when the phase reference reaches a particular predetermined value. From a generic perspective, function A is enabled when the phase reference value is ΦA. As may be appreciated by those skilled in the art, ΦA ∈ {Φ0 . . . ΦN−1}. Accordingly, function A is triggered at a predetermined phase reference point within the total phase range.
Separately, it is contemplated that the same function may be enabled more than once during the processing period. For example, function A may be enabled at phase reference values ΦA1 and ΦA2. It should be apparent to those skilled in the art that ΦA1 ∈ {Φ0 . . . ΦN−1} and that ΦA2 ∈ {Φ0 . . . ΦN−1}. In most cases, it is contemplated that the total phase range corresponds to a single processing period. As should be apparent, the phase range alternatively may correspond to more than one processing period without departing from the invention.
Not only may the same function be repeated during a processing period, it is also contemplated that a plurality of functions, X . . . Y, may be processed at different predetermined phase reference values, ΦA.
Additionally, the invention contemplates using an offset δ. Use of an offset δ within a block is equivalent to moving the enable point (ΦA) of the function A by δ with respect to the phase reference. The value of the offset δ equates with a displacement with respect to the original phase. This may be implemented in one of two ways:
(1) A second phase reference sequence may be generated by rotating the original phase reference sequence by δ, so that Φ′i=Φj, where j=(i+δ) mod N; or
(2) The enable point may be moved by δ, so that the function A is enabled when the phase reference value is Φ′A=Φk, where k=(A−δ) mod N.
Other offsets may be employed as well, these two examples merely being illustrative of two approaches contemplated by the invention.
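That the two approaches enumerated above enable function A on the same cycles may be confirmed by exhaustive comparison. The Python sketch below assumes, for simplicity, that the phase reference values are the indices themselves (Φi = i); the function names are hypothetical.

```python
def enabled_by_rotated_reference(i, a, delta, n):
    """Approach (1): rotate the reference by delta, keep the enable point A."""
    internal = (i + delta) % n  # index of the internal phase reference
    return internal == a


def enabled_by_moved_enable_point(i, a, delta, n):
    """Approach (2): keep the reference, move the enable point to A - delta."""
    return i == (a - delta) % n


def approaches_agree(n):
    """Compare both approaches over every phase, enable point and offset."""
    return all(
        enabled_by_rotated_reference(i, a, d, n)
        == enabled_by_moved_enable_point(i, a, d, n)
        for i in range(n)
        for a in range(n)
        for d in range(-n, 2 * n)
    )


if __name__ == "__main__":
    print(all(approaches_agree(n) for n in range(1, 9)))
```

The agreement follows from the fact that (i+δ) mod N = A exactly when i = (A−δ) mod N for 0 ≤ i, A < N.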
In some blocks, such as the register example above, the absolute position of a behavior with respect to the phase reference is not determinative. Instead, there may be two (or more) behaviors, and it is the phase difference or distance between them that is determinative. Thus, in the register example above, the invariant that Φread=Φ(write+1) mod N is maintained. Such blocks are immune to changes in the origin of the phase reference.
Reference is now made to
In one contemplated variation of the method 30, the phase reference is sequentially incremented by a counter from 0 to N−1, thereby defining at least one processing period. In another contemplated variation, the method 30 includes receiving a reset signal. Upon receipt of the reset signal, the phase reference is initialized to 0. In still another variation of the method 30, a clock signal is received. The phase reference, Φi, is incremented in response to receipt of the clock signal.
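The reset and clock variations described above may be modeled as follows. The Python class below is a hedged behavioral sketch; the class name PhaseReference and its method names are hypothetical, and the model assumes the count wraps from N−1 back to 0.

```python
class PhaseReference:
    """Behavioral model of a 0..N-1 phase reference counter."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("the period N must be at least 1")
        self.n = n
        self.phase = 0

    def reset(self):
        """Model receipt of the reset signal: initialize the phase to 0."""
        self.phase = 0

    def clock(self):
        """Model receipt of one clock signal: advance the phase, wrapping at N."""
        self.phase = (self.phase + 1) % self.n
        return self.phase


if __name__ == "__main__":
    ref = PhaseReference(4)
    print([ref.clock() for _ in range(6)])
    ref.reset()
    print(ref.phase)
```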
As may be appreciated from the foregoing, while the method 30 may be employed for a single function, it may alternatively be employed for multiple functions. In such a case, the plurality of functions may be enabled at different ones of the plurality of the predetermined phase reference values ΦA.
In still another contemplated variation, one or more of the functions may be enabled multiple times during the processing period. If implemented, the function or functions may be enabled at two or more of the predetermined phase reference values ΦA.
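Multiple functions, and functions enabled more than once per period, may both be captured by a table mapping each function to its set of enable phases. The Python sketch below is illustrative only; the function enabled_functions and the table contents are hypothetical.

```python
def enabled_functions(phase, enable_table):
    """Return the names of the functions enabled at the given phase."""
    return sorted(name for name, phases in enable_table.items() if phase in phases)


if __name__ == "__main__":
    # Assumed example: A is enabled twice per period, B once, with N = 4.
    table = {"A": {0, 2}, "B": {1}}
    for phase in range(4):
        print(phase, enabled_functions(phase, table))
```

Each entry in the table is decoded locally from the same phase reference, so adding an enable point does not require an additional external control.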
The invention also contemplates phase shifting.
With reference to
As is apparent from the foregoing, once the internal phase reference, Φ′i is substituted for the phase reference, Φi the method 30, illustrated in
An alternative phase shift is illustrated in
As should be appreciated by those skilled in the art, when various aspects of the invention are practiced, they may be employed during a single processing period. Alternatively, the various methods may be enabled on more than one processing period executed sequentially or in parallel. When executed in parallel, the processing periods may be executed on several separate hardware blocks at the same time.
When multiple hardware blocks process information at the same time, it is contemplated that not all of the processing blocks will operate in the same fashion. For example, in one contemplated embodiment, one or more of the processing blocks may operate without a phase change or offset while others of the hardware blocks may enable functions in a phase-shifted or offset manner. The phase shifts may be enabled according to one or both of methods 40 and 52. As should be apparent, in yet another variation, some of the hardware blocks may employ the method 40, while others employ the method 52.
As discussed above with reference to the method 10 illustrated in
When separate hardware blocks are employed, they may be numbered 0 through N. If so, they may include write functions that write to regs(0) through regs(N), respectively. When shifted, the separate hardware blocks operate according to a shift by S such that the write functions write to regs(0−S) through regs(N−S), where regs(0−S) through regs(−1) correspond to regs(N−S+1) through regs(N), respectively.
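The wraparound of the shifted register indices may be sketched as a modular subtraction. In the Python sketch below, the function write_target is hypothetical, and count denotes the number of registers (N+1 when the blocks are numbered 0 through N).

```python
def write_target(block, shift, count):
    """Index of the register written by a block after a shift by S.

    Negative indices wrap to the top of the register range.
    """
    return (block - shift) % count


if __name__ == "__main__":
    # Four blocks A..D numbered 0..3, shifted by S = 1.
    print([write_target(k, 1, 4) for k in range(4)])
```

For four blocks and S=1, block 1 writes to regs(0), block 2 to regs(1), block 3 to regs(2), and block 0 wraps to regs(3), in agreement with the S=1 example given earlier for blocks A through D.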
As should be appreciated by those skilled in the art, there are numerous equivalents and variations of the embodiments described herein in connection with the invention. The equivalents and variations are intended to be encompassed by the invention.
This International Application relies for priority on U.S. Provisional Patent Application Ser. No. 60/989,416, which was filed on Nov. 20, 2007, the contents of which are incorporated herein by reference.
Number | Date | Country
---|---|---
60989416 | Nov 2007 | US