This disclosure relates to combinational circuits, and more specifically to a method of optimizing combinational circuits.
Combinational circuits have many possible applications. One such application is computing an inverse in a Galois Field, which is a field containing a finite number of elements. A combinational circuit is a circuit having an output value determined by the values of its inputs. Combinational circuits can be represented as circuit schematics using Boolean logic gates (such as AND gates and XOR gates), or can be represented mathematically using formulas having operations corresponding to logic gates. For example, an AND gate corresponds to a field multiplication operation, and an XOR gate corresponds to a field addition operation. Logic gates can be arranged to calculate functions, and the binary string output of a function may be referred to as a “target signal.” In a typical truth table for a function, the target signal corresponding to the function is the last column of the truth table.
A combinational circuit may have both linear and non-linear portions, where the “non-linear” portions contain AND gates and XOR gates, and the “linear” portions contain only XOR gates. A quantity of AND gates of a combinational circuit may be referred to as the “multiplicative complexity” of the circuit. Combinational circuits and their associated formulas can be extremely large and complex in certain applications, such as microprocessors.
A method of simplifying a combinational circuit establishes an initial combinational circuit operable to calculate a target signal. A quantity of multiplication operations performed in a first portion of the initial combinational circuit is reduced to create a first, simplified combinational circuit. The first portion includes only multiplication operations and addition operations. A quantity of addition operations performed in a second portion of the first, simplified combinational circuit is reduced to create a second, simplified combinational circuit. The second portion includes only addition operations. Also, the second, simplified combinational circuit is operable to calculate the target signal using fewer operations than the initial combinational circuit.
A computer-implemented method of simplifying a plurality of formulas establishes a plurality of formulas. The formulas include only addition operations, and the formulas correspond to a portion of a combinational circuit including only addition operations. A basis set including a plurality of input signals is defined. Using a computer, a distance vector is determined that includes one value for each of the plurality of formulas, the one value corresponding to a number of addition operations necessary to calculate a corresponding formula using signals from the basis set. Using the computer, two basis vectors are determined whose sum, when added to the distance vector, reduces at least one value in the distance vector, and the sum is added to the basis set. The steps of determining two basis vectors whose sum, when added to the basis set, reduces at least one value in the distance vector, and adding the sum to the basis set may be selectively repeated until the basis set includes sums corresponding to each of the plurality of formulas.
A combinational circuit for a Substitution-Box for the Advanced Encryption Standard having a total of 115 Boolean gates comprises a first, input portion, a second portion coupled to the first, input portion, and a third, output portion coupled to the second portion. The first, input portion has 23 XOR gates. The second portion has 30 XOR gates and 32 AND gates, and computes the non-linear component of inversion in GF(256). Also, in the second portion 11 of the 30 XOR gates and 5 of the 32 AND gates are operable to perform inversion in GF(16). The third, output portion has 26 XOR gates and 4 XNOR gates.
These and other features of the present invention can be best understood from the following specification and drawings, of which the following is a brief description.
A first, non-linear portion of the combinational circuit is identified (step 104). A method 200 (see
A second, linear portion of the combinational circuit is identified (step 110) that includes only addition operations (XOR gates). A method 300 (see
$y_1 = x_2 x_3 x_4 + x_1 x_3 + x_2 x_3 + x_3 + x_4$ formula #1
$y_2 = x_1 x_3 x_4 + x_1 x_3 + x_2 x_3 + x_2 x_4 + x_4$ formula #2
$y_3 = x_1 x_2 x_4 + x_1 x_3 + x_1 x_4 + x_1 + x_2$ formula #3
$y_4 = x_1 x_2 x_3 + x_1 x_3 + x_1 x_4 + x_2 x_4 + x_2$ formula #4
Formulas 5-8, shown below, show four example inputs x1-x4 that may be used with formulas 1-4.
$x_1 = 0000000011111111$ formula #5
$x_2 = 0000111100001111$ formula #6
$x_3 = 0011001100110011$ formula #7
$x_4 = 0101010101010101$ formula #8
Inputting the values for x1-x4 shown in formulas 5-8 into formulas 1-4 yields the values for signals y1-y4 shown in formulas 9-12 below.
$y_1 = 0110010001010111$ formula #9
$y_2 = 0101001101110001$ formula #10
$y_3 = 0000111110010011$ formula #11
$y_4 = 0000101001101111$ formula #12
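As an illustrative check of formulas 9-12, the truth-table columns of formulas 5-8 may be packed into 16-bit integers, so that `&` performs the field multiplication (AND) and `^` performs the field addition (XOR) for all 16 input combinations at once. The following is only a sketch; the helper name `bits` is an assumption introduced here and is not part of the disclosure.

```python
# Truth-table columns of formulas 5-8, packed as 16-bit integers.
x1 = int("0000000011111111", 2)
x2 = int("0000111100001111", 2)
x3 = int("0011001100110011", 2)
x4 = int("0101010101010101", 2)

# Formulas 1-4: '&' is field multiplication (AND), '^' is field addition (XOR).
y1 = (x2 & x3 & x4) ^ (x1 & x3) ^ (x2 & x3) ^ x3 ^ x4
y2 = (x1 & x3 & x4) ^ (x1 & x3) ^ (x2 & x3) ^ (x2 & x4) ^ x4
y3 = (x1 & x2 & x4) ^ (x1 & x3) ^ (x1 & x4) ^ x1 ^ x2
y4 = (x1 & x2 & x3) ^ (x1 & x3) ^ (x1 & x4) ^ (x2 & x4) ^ x2


def bits(value):
    """Format a packed truth-table column as a 16-character binary string."""
    return format(value, "016b")


for name, value in (("y1", y1), ("y2", y2), ("y3", y3), ("y4", y4)):
    print(name, "=", bits(value))
```

Each printed string may be compared against the corresponding target signal in formulas 9-12.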
If one calculated formulas 1-4 separately, starting from scratch each time, 18 multiplications (AND operations) and 16 additions (XOR operations) would be required. This would be inefficient because certain terms such as x1x3 and x1x4 appear more than once, and recalculating those terms would be a waste of resources, whether those resources were computer processor calculations or wasted space occupied by excess logic gates in a circuit.
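The operation counts stated above may be reproduced with a short sketch that treats each formula as a list of monomials. The representation (tuples of input indices) and the function name `naive_cost` are assumptions chosen for illustration only.

```python
# Each formula is a list of monomials; each monomial lists the input indices it multiplies.
FORMULAS = {
    "y1": [(2, 3, 4), (1, 3), (2, 3), (3,), (4,)],
    "y2": [(1, 3, 4), (1, 3), (2, 3), (2, 4), (4,)],
    "y3": [(1, 2, 4), (1, 3), (1, 4), (1,), (2,)],
    "y4": [(1, 2, 3), (1, 3), (1, 4), (2, 4), (2,)],
}


def naive_cost(monomials):
    """Return (ANDs, XORs) when every monomial is recomputed from scratch."""
    ands = sum(len(m) - 1 for m in monomials)  # k inputs need k-1 ANDs
    xors = len(monomials) - 1                  # n terms need n-1 XORs
    return ands, xors


total_ands = sum(naive_cost(f)[0] for f in FORMULAS.values())
total_xors = sum(naive_cost(f)[1] for f in FORMULAS.values())
print(total_ands, total_xors)
```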
The method 200 may be used to simplify formulas 1-4 and reduce a quantity of multiplications (AND operations) performed in formulas 1-4. In one example, the method 200 could be first applied to formula 4 for y4. The initial formula 4 for y4 (reproduced below) uses 5 multiplications and 4 additions. However, it can be seen that formula 4 can be simplified by factoring out x1x3 from the first two terms and x4 from the next two terms, as shown in formula 13 below, which only requires 3 multiplications and 4 additions (a reduction of 2 multiplications). The question then becomes whether y4 can be processed using fewer than 3 multiplications.
$y_4 = x_1 x_2 x_3 + x_1 x_3 + x_1 x_4 + x_2 x_4 + x_2$ formula #4
$y_4 = (x_1 x_3)(x_2 + 1) + (x_1 + x_2) x_4 + x_2$ formula #13
To apply the method 200 to y4, a polynomial (formula) to be simplified is obtained (step 202), which in this case will be formula 4 for y4. The formula is representative of one output of a non-linear portion of the combinational circuit.
A set K of known input signals x1-x4 is obtained (step 204). Pairs of the input signals x1-x4 are added together to determine at least one sum using a computer (step 206), and K is expanded to include the sums (step 208), forming an expanded set K′. Because step 208 adds only sums (here, randomly chosen sums) to the set, the signals in K′ at this point do not require any additional multiplications. Signals in the set K′ are then multiplied to determine at least one product using the computer (step 210), and K′ is expanded to include the at least one product (step 212). Steps 210-212 yield signals that require at most one more multiplication than the original set of known signals. Steps 206-212 are then selectively repeated (step 214) until either a desired target signal is found or a maximum number of multiplications is reached (which in the case of formula 4 is 3 multiplications). A new formula may then be obtained (step 216) and steps 206-214 may be selectively repeated for the new formula.
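The alternation of sums (steps 206-208) and products (steps 210-212) may be illustrated by the following sketch. It is not the randomized procedure described above: for reproducibility it enumerates every sum and every product exhaustively, which is only practical for very small examples. The function names `span`, `reachable` and `min_multiplications` are hypothetical, and signals are represented as packed truth-table columns.

```python
from itertools import combinations


def span(signals):
    """Every XOR combination of the given signals (costs no multiplications)."""
    reach = {0}
    for s in signals:
        reach |= {r ^ s for r in reach}
    return reach


def reachable(target, known, budget):
    """True if target can be computed from known using at most `budget` ANDs."""
    reach = span(known)
    if target in reach:
        return True
    if budget == 0:
        return False
    tried = set()
    for a, b in combinations(sorted(reach), 2):
        product = a & b
        if product in reach or product in tried:
            continue  # nothing new, or already explored
        tried.add(product)
        if reachable(target, known + [product], budget - 1):
            return True
    return False


def min_multiplications(target, known, max_mults):
    """Smallest number of ANDs (up to max_mults) that reaches target, else None."""
    for budget in range(max_mults + 1):
        if reachable(target, list(known), budget):
            return budget
    return None


# Inputs x1-x4 (formulas 5-8) and the target signal y4 (formula 12).
K = [int(s, 2) for s in ("0000000011111111", "0000111100001111",
                         "0011001100110011", "0101010101010101")]
y4 = int("0000101001101111", 2)
print(min_multiplications(y4, K, 3))
```

Under these assumptions the search reports the smallest multiplication count within the stated maximum; a randomized variant would instead sample sums and products rather than enumerate them.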
For y4 the method 200 can yield the following simplified formula:
$y_4 = (x_1 + x_2)(x_4 + x_1 x_3) + x_2$ formula #14
A circuit specification is then generated (step 218) including each addition performed in step 206 and each multiplication performed in step 210, as shown above in formula 14. In one example, step 218 may include creating a set of short equations, or “straight line program,” as shown in formulas 15-19 shown below. Although the term “straight line program” is used throughout this application, it is understood that a straight line program is just one type of a circuit specification. It is understood that other types of circuit specifications could be used, such as Verilog code.
$t_1 = x_1 + x_2$ formula #15
$t_2 = x_1 \cdot x_3$ formula #16
$t_3 = x_4 + t_2$ formula #17
$t_4 = t_1 \cdot t_3$ formula #18
$y_4 = x_2 + t_4$ formula #19
The improved formula 14 for y4 requires only 2 multiplications and 3 additions, instead of 3 multiplications and 4 additions as shown in formula 13. Thus, although inputting the values of inputs x1-x4 into formula 4, formula 13 or formula 14 will yield the same result for y4, formula 14 is the most efficient way to achieve this result. As described above, we know that it is not possible to compute y4 using fewer than 2 multiplications, so once formula 14 is determined (which uses two multiplications) step 116 is complete with regard to y4.
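The equivalence of the three expressions for y4 may be confirmed by evaluating each one on all 16 input combinations, as in the following sketch. The function names are assumptions used only for this illustration.

```python
from itertools import product


def formula_4(x1, x2, x3, x4):
    return (x1 & x2 & x3) ^ (x1 & x3) ^ (x1 & x4) ^ (x2 & x4) ^ x2


def formula_13(x1, x2, x3, x4):
    return ((x1 & x3) & (x2 ^ 1)) ^ ((x1 ^ x2) & x4) ^ x2


def formula_14(x1, x2, x3, x4):
    return ((x1 ^ x2) & (x4 ^ (x1 & x3))) ^ x2


# All 16 assignments in truth-table order (x4 varies fastest).
assignments = list(product((0, 1), repeat=4))
agree = all(
    formula_4(*a) == formula_13(*a) == formula_14(*a) for a in assignments
)
column = "".join(str(formula_14(*a)) for a in assignments)
print(agree, column)
```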
The method 200 may then be applied to formula 2 for y2. Looking at formula 2 for y2 (reproduced below) we see that formula 2 has a degree, or δ, of 3 (step 104). So if we can compute y2 using two multiplications the method 100 has succeeded.
$y_2 = x_1 x_3 x_4 + x_1 x_3 + x_2 x_3 + x_2 x_4 + x_4$ formula #2
At this point the expanded set of known signals K′ (step 204) would be as follows:
$K' = \{x_1, x_2, x_3, x_4, t_1, t_2, t_3, t_4, y_4\}$ formula #20
Represented in binary notation the new signals are as follows:
$t_1 = 0000111111110000$ formula #21
$t_2 = 0000000000110011$ formula #22
$t_3 = 0101010101100110$ formula #23
$t_4 = 0000010101100000$ formula #24
Performing steps 206-212 for formula 2 yields the following formula for y2 that only requires two multiplications:
$y_2 = x_4 + (x_2 + x_1 x_3)(x_3 + x_4)$ formula #25
These steps may be repeated to obtain simplified versions of y1 and y3, as shown in the straight line program of formulas 26-41 below.
$t_1 = x_1 + x_2$ formula #26
$t_2 = x_1 \cdot x_3$ formula #27
$t_3 = x_4 + t_2$ formula #28
$t_4 = t_1 \cdot t_3$ formula #29
$y_4 = x_2 + t_4$ formula #30
$t_5 = x_3 + x_4$ formula #31
$t_6 = x_2 + t_2$ formula #32
$t_7 = t_6 \cdot t_5$ formula #33
$y_2 = x_4 + t_7$ formula #34
$t_8 = x_3 + y_2$ formula #35
$t_9 = t_3 + y_2$ formula #36
$t_{10} = x_4 \cdot t_9$ formula #37
$y_1 = t_{10} + t_8$ formula #38
$t_{11} = t_3 + t_{10}$ formula #39
$t_{12} = y_4 \cdot t_{11}$ formula #40
$y_3 = t_{12} + t_1$ formula #41
As described above, if one used formulas 1-4 for calculating y1-y4 separately, 18 multiplications (AND operations) and 16 additions (XOR operations) would be required. However, using the straight line program for calculating y1-y4 shown in formulas 26-41, only 5 multiplications and 11 additions are required. So, in the example of formulas 1-4 applying method 200 can yield a reduction of 13 multiplications and 5 additions.
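The straight line program of formulas 26-41 may be checked by a small interpreter that executes each line in order and tallies the operations. The tuple layout of `PROGRAM` and the function name `run` are assumptions made for this sketch.

```python
INPUTS = {
    "x1": int("0000000011111111", 2),
    "x2": int("0000111100001111", 2),
    "x3": int("0011001100110011", 2),
    "x4": int("0101010101010101", 2),
}

# Formulas 26-41 as (destination, operator, left operand, right operand).
PROGRAM = [
    ("t1", "+", "x1", "x2"), ("t2", "*", "x1", "x3"),
    ("t3", "+", "x4", "t2"), ("t4", "*", "t1", "t3"),
    ("y4", "+", "x2", "t4"), ("t5", "+", "x3", "x4"),
    ("t6", "+", "x2", "t2"), ("t7", "*", "t6", "t5"),
    ("y2", "+", "x4", "t7"), ("t8", "+", "x3", "y2"),
    ("t9", "+", "t3", "y2"), ("t10", "*", "x4", "t9"),
    ("y1", "+", "t10", "t8"), ("t11", "+", "t3", "t10"),
    ("t12", "*", "y4", "t11"), ("y3", "+", "t12", "t1"),
]


def run(program, inputs):
    """Execute the program on packed truth tables; return signals and op counts."""
    signals = dict(inputs)
    counts = {"+": 0, "*": 0}
    for dest, op, left, right in program:
        a, b = signals[left], signals[right]
        signals[dest] = (a & b) if op == "*" else (a ^ b)
        counts[op] += 1
    return signals, counts


signals, counts = run(PROGRAM, INPUTS)
for name in ("y1", "y2", "y3", "y4"):
    print(name, "=", format(signals[name], "016b"))
print(counts)
```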
$z_0 = w_0 + w_1 + w_2$ formula #42
$z_1 = w_1 + w_3 + w_4$ formula #43
$z_2 = w_0 + w_2 + w_3 + w_4$ formula #44
$z_3 = w_1 + w_2 + w_3$ formula #45
$z_4 = w_0 + w_1 + w_3$ formula #46
$z_5 = w_1 + w_2 + w_3 + w_4$ formula #47
Formulas 42-47 for z0-z5 can also be represented in the form of matrix M shown in formula 48 (shown below), with each row of M representing one of the formulas for z0-z5. For example, the first row of M corresponds to z0, and includes a “1” for each of w0, w1 and w2 (which are all included in formula 42) and a “0” for each of w3 and w4 (neither of which is present in formula 42).
$M = \begin{bmatrix}
1 & 1 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 \\
0 & 1 & 1 & 1 & 1
\end{bmatrix}$ formula #48
If each of formulas z0-z5 were calculated separately from scratch, 14 additions (XOR operations) would be required. One may apply the method 300 to the matrix M to see if a simplified straight line program to solve for z0-z5 using a reduced number of additions can be determined by using formula 49 below as a reference.
f(w)=Mw formula #49
where M is the matrix of formula 48; and
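The matrix form may be illustrated with the following sketch, which evaluates f(w) = Mw over GF(2) and counts the additions needed when each row is computed from scratch. The rows are taken from the formulas for z0-z5; the row for z5 is assumed to be [0 1 1 1 1], consistent with the distance value of 3 in formula 56 and with formula 75. The function name `apply_matrix` and the sample input are hypothetical.

```python
# Rows of M for z0..z5; column j is 1 when w_j appears in the formula.
M = [
    [1, 1, 1, 0, 0],  # z0 = w0 + w1 + w2
    [0, 1, 0, 1, 1],  # z1 = w1 + w3 + w4
    [1, 0, 1, 1, 1],  # z2 = w0 + w2 + w3 + w4
    [0, 1, 1, 1, 0],  # z3 = w1 + w2 + w3
    [1, 1, 0, 1, 0],  # z4 = w0 + w1 + w3
    [0, 1, 1, 1, 1],  # z5 = w1 + w2 + w3 + w4 (assumed, see lead-in)
]


def apply_matrix(matrix, w):
    """Compute f(w) = M w over GF(2): each output is the XOR of the selected inputs."""
    return [sum(m * x for m, x in zip(row, w)) % 2 for row in matrix]


# A row with n ones needs n-1 XOR gates when nothing is shared.
naive_xors = sum(sum(row) - 1 for row in M)
print(naive_xors)
print(apply_matrix(M, [1, 0, 1, 1, 0]))
```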
An input vector S is shown below in formula 50 and includes the values shown in formulas 51-55. The vector S acts as a set of signals to serve as a basis for the method 300 (step 304). As shown in formulas 51-55, each of the values w0-w4 is a row of an identity matrix.
$S = \{w_0, w_1, w_2, w_3, w_4\}$ formula #50
$w_0 = 10000$ formula #51
$w_1 = 01000$ formula #52
$w_2 = 00100$ formula #53
$w_3 = 00010$ formula #54
$w_4 = 00001$ formula #55
The following distance vector is then determined (step 306), as shown in formula 56:
D=[2 2 3 2 2 3] formula #56
Each value in the distance vector D corresponds to a quantity of additions needed to compute a zn value. For example, computing z0 requires 2 additions, computing z1 requires 2 additions, computing z2 requires 3 additions, etc.
Two basis vectors are then chosen (step 308) whose sum, when added to the basis S, minimizes the sum of the new distances. In one example w1+w3 may be selected, as shown in formulas 57 and 58 below. The sum is then added to the input vector S to form Supdated (step 310), as shown in formula 59 below.
$t_{100} = w_1 + w_3$ formula #57
$w_1 + w_3 = [0\ 1\ 0\ 1\ 0]$ formula #58
$S_{\text{updated}} = \{w_0, w_1, w_2, w_3, w_4, t_{100}\}$ formula #59
As shown in formulas 42-47, the formulas for z1, z3, z4 and z5 all include the term w1+w3, which can now be replaced by t100, reducing the number of additions required to determine z1, z3, z4 and z5 by one, and thus also reducing the corresponding values of the distance vector D to form an updated distance vector Dupdated (step 312) as shown in formula 60 below:
$D_{\text{updated}} = [2\ 1\ 3\ 1\ 1\ 2]$ formula #60
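Steps 306-312 may be sketched as follows for the first iteration. The distance of a target is computed here by brute force, as one less than the smallest number of basis signals whose XOR equals the target. This is an illustrative choice that is only practical for small bases and is not necessarily how a production implementation would compute it. The names `distance`, `distances` and `best_pair` are hypothetical, ties are resolved by taking the first pair found, and the target z5 is taken as 01111 per formula 75.

```python
from itertools import combinations


def distance(target, basis):
    """Fewest XORs needed to build target from basis signals (0 if already present)."""
    for size in range(1, len(basis) + 1):
        for subset in combinations(basis, size):
            acc = 0
            for s in subset:
                acc ^= s
            if acc == target:
                return size - 1
    raise ValueError("target is not in the span of the basis")


def distances(targets, basis):
    return [distance(t, basis) for t in targets]


def best_pair(targets, basis):
    """Pair of basis signals whose sum yields the smallest total distance."""
    best = None
    for a, b in combinations(basis, 2):
        total = sum(distances(targets, basis + [a ^ b]))
        if best is None or total < best[0]:
            best = (total, a, b)
    return best


W = [0b10000, 0b01000, 0b00100, 0b00010, 0b00001]            # w0..w4
Z = [0b11100, 0b01011, 0b10111, 0b01110, 0b11010, 0b01111]   # z0..z5

total, a, b = best_pair(Z, W)
print(distances(Z, W))
print(format(a, "05b"), format(b, "05b"), distances(Z, W + [a ^ b]))
```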
Steps 308-312 may then be repeated until the distance vector D is minimized, if possible to include all zeros, as shown by formulas 61-87 below.
$t_{101} = w_0 + t_{100}$ formula #61
$w_0 + t_{100} = [1\ 1\ 0\ 1\ 0]$ formula #62
$[1\ 1\ 0\ 1\ 0] = z_4$ formula #63
$D_{\text{updated}} = [2\ 1\ 3\ 1\ 0\ 2]$ formula #64
At this point we have found signal z4, so the sums of formulas 57 and 61 are saved in a straight line program.
$t_{102} = w_2 + t_{100}$ formula #65
$w_2 + t_{100} = [0\ 1\ 1\ 1\ 0]$ formula #66
$[0\ 1\ 1\ 1\ 0] = z_3$ formula #67
$D_{\text{updated}} = [2\ 1\ 3\ 0\ 0\ 1]$ formula #68
At this point we have found z3, so formula 65 is added to the straight line program.
$t_{103} = w_4 + t_{100}$ formula #69
$w_4 + t_{100} = [0\ 1\ 0\ 1\ 1]$ formula #70
$[0\ 1\ 0\ 1\ 1] = z_1$ formula #71
$D_{\text{updated}} = [2\ 0\ 3\ 0\ 0\ 1]$ formula #72
At this point we have found z1, so formula 69 is added to the straight line program.
$t_{104} = w_2 + t_{103}$ formula #73
$w_2 + t_{103} = [0\ 1\ 1\ 1\ 1]$ formula #74
$[0\ 1\ 1\ 1\ 1] = z_5$ formula #75
$D_{\text{updated}} = [2\ 0\ 2\ 0\ 0\ 0]$ formula #76
At this point we have found z5, so formula 73 is added to the straight line program.
$t_{105} = w_0 + w_1$ formula #77
$w_0 + w_1 = [1\ 1\ 0\ 0\ 0]$ formula #78
$D_{\text{updated}} = [1\ 0\ 1\ 0\ 0\ 0]$ formula #79
$t_{106} = w_2 + t_{105}$ formula #80
$w_2 + t_{105} = [1\ 1\ 1\ 0\ 0]$ formula #81
$[1\ 1\ 1\ 0\ 0] = z_0$ formula #82
$D_{\text{updated}} = [0\ 0\ 1\ 0\ 0\ 0]$ formula #83
At this point we have found z0, so formulas 77 and 80 are added to the straight line program.
$t_{107} = t_{103} + t_{106}$ formula #84
$t_{103} + t_{106} = [1\ 0\ 1\ 1\ 1]$ formula #85
$[1\ 0\ 1\ 1\ 1] = z_2$ formula #86
$D_{\text{updated}} = [0\ 0\ 0\ 0\ 0\ 0]$ formula #87
At this point we have found z2, so formula 84 is added to the straight line program. Also, since the distance vector Dupdated now includes only zeros, we are finished. Notice that this last operation added [0 1 0 1 1] and [1 1 1 0 0] to obtain [1 0 1 1 1], so there was a cancellation in the second entry, adding two ones to get a zero. This possibility makes this technique different from prior techniques. For example, under the Paar algorithm, no cancellation of elements is allowed.
Combined together, here is the straight line program for computing z0-z5, which only requires 8 XOR operations, instead of the 14 XOR operations required if z0-z5 are calculated separately.
$t_{100} = w_1 + w_3$ formula #57
$t_{101} = w_0 + t_{100}$ formula #61
$t_{102} = w_2 + t_{100}$ formula #65
$t_{103} = w_4 + t_{100}$ formula #69
$t_{104} = w_2 + t_{103}$ formula #73
$t_{105} = w_0 + w_1$ formula #77
$t_{106} = w_2 + t_{105}$ formula #80
$t_{107} = t_{103} + t_{106}$ formula #84
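The combined straight line program may be verified by executing its eight XOR operations on the basis signals of formulas 51-55 and comparing each result with the corresponding row vector. The tuple layout and the `OUTPUTS` mapping are assumptions made for this sketch; the mapping follows the points at which each z signal is stated to have been found.

```python
# Basis signals w0..w4 as 5-bit values: w0 = 10000, ..., w4 = 00001.
W = {f"w{i}": 1 << (4 - i) for i in range(5)}

# The eight XOR operations, as (destination, left operand, right operand).
PROGRAM = [
    ("t100", "w1", "w3"),
    ("t101", "w0", "t100"),
    ("t102", "w2", "t100"),
    ("t103", "w4", "t100"),
    ("t104", "w2", "t103"),
    ("t105", "w0", "w1"),
    ("t106", "w2", "t105"),
    ("t107", "t103", "t106"),
]

# Which intermediate signal equals which output.
OUTPUTS = {"z0": "t106", "z1": "t103", "z2": "t107",
           "z3": "t102", "z4": "t101", "z5": "t104"}

signals = dict(W)
for dest, left, right in PROGRAM:
    signals[dest] = signals[left] ^ signals[right]

result = {z: format(signals[t], "05b") for z, t in OUTPUTS.items()}
print(len(PROGRAM), result)
```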
In one example, if during step 308 there is a tie between multiple pairs of basis vectors (i.e. the sum of two sets of basis vectors achieves a reduction in D of the same magnitude), then the tie may be resolved by using one of a plurality of tie-breaking techniques that utilize a Euclidean norm of the updated distance vector. The Euclidean norm is calculated by calculating a square root of a sum of squares of elements of the updated distance vector.
In a first tie-breaking technique, a pair of basis vectors is selected whose sum induces the largest Euclidean norm in the new distance vector. For example, if a sum of a first pair of basis vectors resulted in a distance vector of [0 0 3 1] (which has a Euclidean norm of $\sqrt{0^2+0^2+3^2+1^2} \approx 3.16$) and a sum of a second pair of basis vectors resulted in a distance vector of [1 1 1 1] (which has a Euclidean norm of $\sqrt{1^2+1^2+1^2+1^2} = 2.00$), the first pair would be chosen because it induces a higher Euclidean norm. Of course, the step of actually calculating the square root could be omitted, as $3.16^2$ would still be greater than $2.00^2$.
In a second tie-breaking technique, a pair of basis vectors is selected that has the greatest value of a square of the Euclidean norm minus the largest element in the distance vector.
In a third tie-breaking technique, a pair of basis vectors is selected that has the greatest value for a square of the Euclidean norm minus the difference between the largest two elements of the distance vector.
In a fourth tie-breaking technique, if a pair of basis vectors induces a Euclidean norm larger than a previous pair of basis vectors, then one of the pairs is randomly chosen (with a probability of ½).
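The first three tie-breaking techniques may be expressed as scoring functions over a candidate distance vector, with the highest score winning the tie. The function names below are hypothetical, and the square root is omitted as discussed above. The fourth, randomized technique is not shown.

```python
def norm_squared(d):
    """Square of the Euclidean norm of a distance vector."""
    return sum(v * v for v in d)


def score_norm(d):
    """First technique: largest Euclidean norm (compared via its square)."""
    return norm_squared(d)


def score_norm_minus_largest(d):
    """Second technique: squared norm minus the largest element."""
    return norm_squared(d) - max(d)


def score_norm_minus_diff(d):
    """Third technique: squared norm minus the gap between the two largest elements."""
    top = sorted(d, reverse=True)
    return norm_squared(d) - (top[0] - top[1])


# The two tied candidates from the example above.
candidates = {"first pair": [0, 0, 3, 1], "second pair": [1, 1, 1, 1]}
for score in (score_norm, score_norm_minus_largest, score_norm_minus_diff):
    winner = max(candidates, key=lambda name: score(candidates[name]))
    print(score.__name__, {n: score(d) for n, d in candidates.items()}, winner)
```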
Although the methods 200 and 300 have been described as applied to separate sets of formulas, it is understood that they could be applied to a single circuit or set of formulas. For example, method 200 could be applied first with the aim of reducing non-linear components of a circuit while possibly extending linear components. Then method 300 could be applied to optimize the linear components. Also, it is understood that if the circuit contained multiple linear portions and multiple non-linear portions, the methods 200 and 300 could be applied to each of those portions to attempt to reduce the total number of gates in the circuit.
Although a preferred embodiment of this invention has been disclosed, a worker of ordinary skill in this art would recognize that certain modifications would come within the scope of this invention. For that reason, the following claims should be studied to determine the true scope and content of this invention.
This application is a divisional of U.S. application Ser. No. 12/367,660 filed on Feb. 9, 2009.
| | Number | Date | Country |
|---|---|---|---|
| Parent | 12367660 | Feb 2009 | US |
| Child | 13615795 | | US |