The present application is generally related to systems and methods for permuting a vector.
In a number of applications, it is desirable to process an input vector to permute the vector elements in a random manner to generate an output vector. Also, it is desirable to perform the permutation at very high speeds. An optimal method to perform the permutation is factorial permutation. Factorial permutation uses a source of independent, uniformly distributed discrete random variables of arbitrary span or modulus, i.e. uniform over 0 to N−1 where N is an arbitrary integer. Also, it is assumed that the vector to be permuted is of length M. In factorial permutation, the first input element of the input vector is assigned to one of the M positions of the output vector using a random variable of span M−1. The second element is then assigned to one of the M−1 remaining positions using a random variable of span M−2. The assignment continues in a similar manner until the final element of the input vector is assigned to a position in the output vector. Randomization of the vector entries in this manner enables M! permutations.
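The sequential assignment described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `factorial_permute`, the `free` list of unassigned output positions, and the use of Python's `random` module as the source of uniform random variables are assumptions made for the example and do not come from the application.

```python
import random

def factorial_permute(x, rng=random):
    """Assign each input element to one of the output positions still free.

    Hypothetical helper: the name and the list-based bookkeeping are
    illustrative assumptions, not part of the described method.
    """
    m = len(x)
    free = list(range(m))      # output positions not yet assigned
    out = [None] * m
    for element in x:
        # Uniform draw over 0 .. len(free) - 1; the modulus shrinks by one
        # on every step, which is why arbitrary moduli are required.
        k = rng.randrange(len(free))
        out[free.pop(k)] = element
    return out
```

Each step depends on the positions consumed by the previous steps, which illustrates why the method is inherently sequential.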
Factorial permutation has limitations when applied to high speed applications. The first problem is that factorial permutation is a sequential algorithm. Although pipelining may be applied to adapt factorial permutation for high speed applications, such adaptation imposes significant complexity and latency in the integrated circuitry. The second and more difficult problem is obtaining uniform random numbers of arbitrary modulus. Some existing algorithms that enable such uniform random numbers to be generated are not generally amenable to high-speed operation. Another existing algorithm involves repeated trials to obtain a value in the allowable range and, hence, is not deterministic in time.
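The repeated-trials approach can be illustrated with a rejection-sampling sketch, which is one common form of such an algorithm; the application does not identify the specific algorithm it has in mind. The function name `uniform_mod_n` and the returned trial count are assumptions added for illustration.

```python
import random

def uniform_mod_n(n, rng=random):
    """Draw a value uniform over 0 .. n-1 from raw random bits.

    Illustrative assumption: candidates are drawn from the smallest
    power-of-two range covering 0 .. n-1 and rejected when out of range.
    Returns (value, number_of_trials).
    """
    bits = max(1, (n - 1).bit_length())   # width covering 0 .. n-1
    trials = 0
    while True:
        trials += 1
        candidate = rng.getrandbits(bits)
        if candidate < n:                 # accept only values in range
            return candidate, trials
```

The number of loop iterations varies from call to call, which is the sense in which such a method is not deterministic in time.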
Some representative embodiments are directed to systems and methods that permute an input vector using a “butterfly” structure. The butterfly structure is similar to the butterfly structure used by the fast Fourier transform (FFT) and the fast Hadamard transform (FHT) algorithms. In one embodiment, the vector to be permuted comprises M vector entries and the corresponding butterfly structure comprises $\log_2 M$ stages. The individual butterfly elements of the structure enable two respective vector entries to switch positions as the entries are routed between butterfly stages. Specifically, in each stage (denoted by “s”), the vector entries are grouped in groups of $2^s$ entries. In each stage, the arrangement of the butterfly elements enables the $i$th vector element to switch positions with the $(2^s - i)$th vector element.
Some representative embodiments differ from the butterfly structures used by the FFT and FHT algorithms by implementing the butterfly elements to controllably route the vector entries. In particular, the routing of entries according to FFT and FHT algorithms occurs in a deterministic manner that is defined by the mathematics of the underlying transform. In contrast, some representative embodiments provide a control structure for each butterfly element. Depending upon the state of the control structure, two corresponding vector elements of a group will switch positions or will continue to the next stage without changing positions. The permutation of the input vector occurs by loading the states of the control structures using a randomization algorithm. By implementing the butterfly elements in this manner, any individual vector element can be routed to any position in the output vector depending upon the randomization of the control structures.
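A single controllable butterfly element can be modeled as a pair of 2-to-1 multiplexers sharing one control bit. The sketch below is a behavioral model only; the names `mux2` and `butterfly_element` are hypothetical, and an actual embodiment would be realized in logic rather than software.

```python
def mux2(select, a, b):
    """2-to-1 multiplexer: pass a when select is 0, b when select is 1."""
    return b if select else a

def butterfly_element(control, upper, lower):
    """Route a pair of entries according to one control bit.

    control == 0: both entries keep their positions.
    control == 1: the two entries switch positions.
    """
    return mux2(control, upper, lower), mux2(control, lower, upper)
```

For instance, `butterfly_element(0, "a", "b")` returns `("a", "b")`, while `butterfly_element(1, "a", "b")` returns `("b", "a")`.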
By implementing a vector permuter in this manner, some representative embodiments may provide a relatively large amount of randomness. Specifically, the butterfly structure can yield $2^{(M/2)\log_2 M}$ permutations. Additionally, the butterfly elements can be implemented using 2-to-1 multiplexors as an example. Accordingly, the butterfly structure can be readily pipelined and operated at very high speeds. Also, if the vector to be randomized has a number of vector entries that is a power of two, the generation of bits for the control structures may occur using algorithms that are well-suited for high speed operation.
Referring now to the drawings,
The vector to be permuted comprises sixteen vector entries (denoted by x(0)-x(15)). The entries to be permuted can be single bit values or digital words. The number of stages in butterfly structure 100 is four. In the general case, to enable any input vector entry to be routed to any output vector entry, $\log_2 M$ stages are employed, where M represents the total number of vector entries. In each stage of butterfly structure 100, eight (M/2) butterfly elements (not shown) are used to switch corresponding vector elements. Accordingly, the total number of butterfly elements and the total number of control bits each equal 32, i.e., $(M \log_2 M)/2$.
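The counts given above follow directly from M. The short sketch below recomputes them for a power-of-two M; the function name `butterfly_dimensions` is an assumption introduced for illustration.

```python
def butterfly_dimensions(m):
    """Return (stages, elements_per_stage, total_control_bits) for M = m.

    Assumes m is a power of two, as in the sixteen-entry example.
    """
    if m < 2 or m & (m - 1):
        raise ValueError("m must be a power of two and at least 2")
    stages = m.bit_length() - 1          # log2(M)
    per_stage = m // 2                   # M/2 butterfly elements per stage
    return stages, per_stage, stages * per_stage

print(butterfly_dimensions(16))  # (4, 8, 32)
```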
For the general case, the vector entries in stage s are grouped in groups of $2^s$ entries. In stage 101, there are eight groups (110-1 through 110-8) of two vector entries. In stage 102, there are four groups (120-1 through 120-4) of four vector entries. In stage 103, there are two groups (130-1 and 130-2) of eight entries and, in stage 104, there is only one group 140 of sixteen entries. Depending upon the state of the control structure of a butterfly element, corresponding vector elements will switch positions or will continue to the next stage at the same positions. Specifically, the $i$th vector entry of a respective group will exchange positions with the $(2^s - i)$th vector entry or these two vector entries will maintain their positions.
In reference to stage 101, the vector entries are grouped in respective groups (110-1 through 110-8) of two entries each. For group 110-1, vector entries 111-1 and 111-2 can change positions depending upon the state of the control structure. For example, if the control structure of the corresponding butterfly element is set to “zero,” vector entry 111-1 would be routed to entry 121-1 of stage 102 and entry 111-2 would be routed to entry 121-2. Alternatively, if the control structure is set to “one,” entry 111-1 would be routed to entry 121-2 and entry 111-2 would be routed to entry 121-1. The other entries of the various groups are routed in a similar manner.
In reference to stage 102, the vector entries are grouped in respective groups (120-1 through 120-4) of four entries each. For group 120-1, vector entries 121-1 and 121-4 can change positions depending upon the state of the control structure. If the control structure of the corresponding butterfly element is set to “zero,” vector entry 121-1 would be routed to entry 131-1 of stage 103 and entry 121-4 would be routed to entry 131-4. Alternatively, if the control structure is set to “one,” entry 121-1 would be routed to entry 131-4 and entry 121-4 would be routed to entry 131-1. The other entries of the various groups are routed in a similar manner.
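The stage-by-stage routing can be modeled in software as follows. This is a behavioral sketch under stated assumptions, not the circuit itself: it assumes M is a power of two, and it assumes that within each group the first entry pairs with the last, the second with the second-to-last, and so on, which matches the pairing of entries 121-1 and 121-4 in the four-entry example. The names `butterfly_permute` and `random_controls`, and the use of Python's `random` module to load the control bits, are hypothetical.

```python
import random

def butterfly_permute(x, controls):
    """Permute x through log2(M) stages of controllable butterfly elements.

    controls[s-1] holds the M/2 control bits of stage s (0 = keep, 1 = swap).
    Assumed pairing: position i of a group exchanges with position
    group_size - 1 - i (counting from zero within the group).
    """
    m = len(x)
    stages = m.bit_length() - 1
    if m < 2 or (1 << stages) != m:
        raise ValueError("length must be a power of two and at least 2")
    if len(controls) != stages or any(len(c) != m // 2 for c in controls):
        raise ValueError("expected log2(M) stages of M/2 control bits each")
    y = list(x)
    for s in range(1, stages + 1):
        size = 1 << s                        # group size for stage s
        bits = iter(controls[s - 1])
        for base in range(0, m, size):       # one pass per group
            for i in range(size // 2):
                lo, hi = base + i, base + size - 1 - i
                if next(bits):               # control bit set: switch positions
                    y[lo], y[hi] = y[hi], y[lo]
    return y

def random_controls(m, rng=random):
    """Load every control bit with an independent random bit (assumption)."""
    stages = m.bit_length() - 1
    return [[rng.getrandbits(1) for _ in range(m // 2)] for _ in range(stages)]
```

With all control bits at zero the input passes through unchanged. For a four-entry vector with every control bit set, `butterfly_permute([0, 1, 2, 3], [[1, 1], [1, 1]])` returns `[2, 3, 0, 1]`.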
The routing of entries continues in a similar manner to stage 104 and then to the output of the butterfly structure (denoted by output vector entries X(0)-X(15)). From the paths shown in
Variations upon butterfly structure 100 may be implemented according to other representative embodiments. For example, the arrangement of butterfly structure 100 could be inverted to form a mirror image of the interconnections in a manner similar to the “decimation-in-frequency” implementation of the FFT. Also, although the discussion of butterfly structure 100 has described the implementation of the routing when the number of vector entries in the input vector is a power of two, other embodiments may permute vectors of other sizes. Specifically, the butterfly structure may be extended to a composite number M of vector entries in the same manner as the FFT structure has been extended to composite numbers.
Although the description of butterfly structure 100 relies on routing only two corresponding vector entries in a dependent manner at each routing location, other routing mechanisms may be employed. Instead of butterfly element 200 shown in
By implementing a vector permuter using suitable permuting structures, some representative embodiments may provide a relatively large amount of randomness with a relatively low degree of circuit complexity. In some embodiments, a butterfly structure can yield $2^{(M/2)\log_2 M}$ permutations. Additionally, the butterfly elements can be implemented using 2-to-1 multiplexors or other low complexity logic devices as examples. Accordingly, butterfly structures can be readily pipelined and operated at very high speeds. Also, if the vector to be randomized has a number of vector entries that is a power of two, the generation of bits for the control structures may occur using algorithms that are well-suited for high speed operation.