Embodiments of the inventive subject matter generally relate to the field of computers, and, more particularly, to a Single Instruction Multiple Data (SIMD) accelerator for data comparison.
Comparing two sets of data to determine whether they are equal is generally computationally intensive, especially as the size of the two sets of data increases. For example, two alphanumeric strings can be compared to determine if they are the same. Applications of such string comparisons include determining whether two documents are the same, whether a certain word is located in a document, etc.
Some example embodiments include an apparatus for comparing a first operand to a second operand. The apparatus includes a SIMD accelerator configured to compare first multiple parts (e.g., bytes) of the first operand to second multiple parts (e.g., bytes) of the second operand. The SIMD accelerator includes an input logic configured to input the first operand and the second operand. The SIMD accelerator includes a ones' complement subtraction logic configured to perform a first group of logic operations on the first multiple parts of the first operand and the second multiple parts of the second operand to generate a first group of carry out and propagate data across bits of the first multiple parts and the second multiple parts. The SIMD accelerator also includes a twos' complement subtraction logic configured to perform a second group of logic operations on the first multiple parts of the first operand and the second multiple parts of the second operand to determine a second group of carry out and propagate data across bits of the first multiple parts and the second multiple parts. At least a portion of the first group of carry out and propagate data is reused in the second group of logic operations, and at least a portion of the second group of carry out and propagate data is reused in the first group of logic operations. The SIMD accelerator includes an output logic configured to output a result to indicate whether the first operand is equal to the second operand based on the first group of logic operations and the second group of logic operations.
The present embodiments may be better understood, and numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.
The description that follows includes exemplary systems, methods, techniques, instruction sequences and computer program products that embody techniques of the present inventive subject matter. However, it is understood that the described embodiments may be practiced without these specific details. For instance, although examples refer to the data as bytes or half-words of an operand for processing by a Single Instruction Multiple Data (SIMD) accelerator, some example embodiments can process data of any size. In other instances, well-known instruction instances, protocols, structures and techniques have not been shown in detail in order not to obfuscate the description.
Some example embodiments use a SIMD accelerator to determine whether two sets of data (e.g., alphanumeric strings) are equal. Each set of data can be defined as an operand that is input into the SIMD accelerator. The SIMD accelerator can compare subparts of each operand to each other. For example, the operands can each comprise multiple bytes (e.g., 16), wherein the SIMD accelerator can compare any byte of a first operand to any byte of the second operand. While described such that the SIMD accelerator performs a byte comparison, in some other example embodiments, the subparts that are compared can include half-words, words, etc. One such example wherein half-words are compared is described in reference to
In some example embodiments, the SIMD accelerator can determine whether a byte from a first operand (Operand A) is greater than, less than or equal to a byte from a second operand (Operand B). Similarly, the SIMD accelerator can determine whether a half-word from Operand A is greater than, less than or equal to a half-word from Operand B. The SIMD accelerator can compare either aligned or unaligned half-words, words, etc. For the formulas provided herein, it is assumed that Operand B is inverted (for ease of use, the inversion is not explicitly shown by an overbar on Operand B).
In some example embodiments, the SIMD accelerator performs both a ones' complement subtraction and a twos' complement subtraction as part of the byte comparison. In particular, assume that Operand B is subtracted from Operand A (A-B). Operand B can be inverted and added to Operand A to provide for the subtraction of Operand B from Operand A. For a ones' complement subtraction where there is no carry in, there is a carry out of 1 if A>B (otherwise, if there is no carry out, A<=B). For a twos' complement subtraction that includes a carry in of 1, there is a carry out of 1 if A>=B (otherwise, if there is no carry out, A<B).
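By way of illustration only, the carry-out behavior described above can be checked with a short Python sketch. The function names and the 8-bit default width are assumptions made for this example and do not appear in the description; the sketch models the arithmetic, not the gate-level logic of the accelerator.

```python
def carry_out(a: int, b: int, carry_in: int, width: int = 8) -> int:
    """Carry out of a + (inverted b) + carry_in over `width` bits."""
    mask = (1 << width) - 1
    total = (a & mask) + (~b & mask) + carry_in
    return total >> width  # 1 if the sum overflowed `width` bits, else 0


def greater_than(a: int, b: int) -> bool:
    # Ones' complement subtraction: no carry in.
    return carry_out(a, b, 0) == 1


def greater_or_equal(a: int, b: int) -> bool:
    # Twos' complement subtraction: carry in of 1.
    return carry_out(a, b, 1) == 1
```

For 8-bit values, `greater_than(a, b)` agrees with `a > b` and `greater_or_equal(a, b)` agrees with `a >= b` for every pair of inputs.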
As further described below, some example embodiments reuse results from logical functions that include both ones' complement and twos' complement subtraction in a SIMD accelerator to determine whether two bytes, half-words, etc. are less than, greater than or equal to each other. In some example embodiments, generate and propagate results from bit operations of the ones' complement and twos' complement subtraction are reused. A generate (g) is defined as occurring if a carry out occurs to the next set of bits. A propagate (p) is defined as occurring if a carry in is carried forward to the next set of bits.
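The generate and propagate definitions can be illustrated with a minimal Python sketch. It assumes the AND form for generate and the OR form for propagate, and it assumes the second argument is the already-inverted Operand B; the function names are hypothetical. The loop is a simple ripple from the least significant bit and is intended only to show how g and p determine the carry, not to model the tree of logical functions described below.

```python
def bit_generate_propagate(a_bit: int, b_bit: int) -> tuple[int, int]:
    g = a_bit & b_bit  # both inputs are 1: a carry out occurs regardless of carry in
    p = a_bit | b_bit  # at least one input is 1: a carry in is carried forward
    return g, p


def carry_from_gp(a: int, b_inverted: int, carry_in: int, width: int = 8) -> int:
    """Carry out of a + b_inverted + carry_in, derived only from g and p."""
    carry = carry_in
    for i in range(width):  # least significant bit first
        g, p = bit_generate_propagate((a >> i) & 1, (b_inverted >> i) & 1)
        carry = g | (p & carry)
    return carry
```

With `carry_in` set to 0 the result corresponds to the ones' complement case (A>B), and with `carry_in` set to 1 it corresponds to the twos' complement case (A>=B).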
Applications of this SIMD accelerator can include database searches wherein alphanumeric strings are compared for matches of text, documents, etc. Some example embodiments reduce the fan-in into the SIMD accelerator because the less than, greater than, and equal to operations are performed across the entire two operands for each of the bytes, half-words, etc.
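As a software analogy only, a string comparison of the kind mentioned above could be organized around fixed-size operands. The 16-byte operand size follows the example given earlier; the zero padding and the function names are assumptions made for this sketch, and each per-operand comparison stands in for one accelerator operation.

```python
OPERAND_BYTES = 16  # example operand size; other sizes are possible


def to_operands(data: bytes) -> list[bytes]:
    """Split data into fixed-size operands, zero-padding the last one."""
    return [
        data[i:i + OPERAND_BYTES].ljust(OPERAND_BYTES, b"\0")
        for i in range(0, len(data), OPERAND_BYTES)
    ]


def strings_equal(x: bytes, y: bytes) -> bool:
    """Compare two byte strings one operand at a time."""
    if len(x) != len(y):
        return False
    return all(a == b for a, b in zip(to_operands(x), to_operands(y)))
```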
The computer system 100 includes a SIMD accelerator 125 that can perform the operations for comparison of two sets of data (as further described below). While illustrated as being coupled to the processor 101 through the bus 103, in some other example embodiments, the SIMD accelerator 125 can be coupled to the processor 101 through a dedicated bus or connection. In some example embodiments, the processor 101 can receive instructions to compare two sets of data. For example, the processor 101 can receive instructions to compare two documents to determine if the documents are equal based on comparison of alphanumeric strings within the documents. The processor 101 can then send an instruction to the SIMD accelerator 125 to compare the two sets of data to determine if they are equal. As further described below, the two sets of data can be input as two operands that each comprises multiple sections of data (e.g., multiple bytes, half-words, words, etc.). In some example embodiments, the SIMD accelerator 125 performs a single instruction on these multiple sections of data between the two operands. For example, assume the two operands each comprise two bytes. The SIMD accelerator 125 compares byte 1 of operand A to byte 1 of operand B; compares byte 2 of operand A to byte 1 of operand B; compares byte 1 of operand A to byte 2 of operand B; and compares byte 2 of operand A to byte 2 of operand B.
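The two-byte example above, in which every byte of operand A is compared with every byte of operand B, can be sketched in Python as follows. The function name and the use of the characters '>', '<' and '=' to report each relation are assumptions made for illustration.

```python
def compare_all_pairs(operand_a: bytes, operand_b: bytes) -> list[list[str]]:
    """result[i][j] is '>', '<' or '=' for operand_a[i] versus operand_b[j]."""
    def relation(x: int, y: int) -> str:
        if x > y:
            return ">"
        if x < y:
            return "<"
        return "="

    return [[relation(x, y) for y in operand_b] for x in operand_a]
```

For two-byte operands this yields a 2x2 grid that covers the four comparisons listed in the example.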
In some example embodiments, the SIMD accelerator 125 performs three different comparisons for the same two operands: 1) greater than, 2) less than, and 3) equal to. In some example embodiments, the SIMD accelerator 125 reuses the generate and propagate results for both 1's complement and 2's complement operations for the two operands to determine if the two operands are greater than, less than, or equal to each other.
Further, realizations may include fewer or additional components not illustrated in
gp0007=
EQ=(not A>B)*(A>=B) (Equation 0)
In other words, A is equal to B if the following two conditions are true: 1) A is not greater than B and 2) A is greater than or equal to B.
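Equation 0 can be checked exhaustively for byte values with a few lines of Python. The function name is hypothetical, and the two Boolean variables merely stand in for the ones' complement and twos' complement carry outs.

```python
def equal_from_comparisons(a: int, b: int) -> bool:
    gt = a > b   # stands in for the ones' complement carry out (A>B)
    ge = a >= b  # stands in for the twos' complement carry out (A>=B)
    return (not gt) and ge  # Equation 0
```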
In this example, the SIMD accelerator 200 is comparing a byte (8 bits) of Operand A to a byte (8 bits) of Operand B. The SIMD accelerator 200 inputs the Operand A and Operand B. The SIMD accelerator 200 includes a number of logical functions (a logical function 202, a logical function 204, a logical function 206, a logical function 208, a logical function 210, a logical function 214, a logical function 216, a logical function 218, a logical function 220, a logical function 222, a logical function 224, a logical function 226, a logical function 228, a logical function 230, a logical function 232, a logical function 234, a logical function 236, a logical function 238, a logical function 240, and a logical function 242). As further described below, the logical functions 202-242 comprise different logical gates that create propagate bits and generate (carry) bits during the logical operation. Also, some example embodiments comprise both ones' complement and twos' complement logic during these logic operations. In this example, the following logic functions perform a ones' complement operation: the logical function 202, the logical function 206, the logical function 208, the logical function 210, the logical function 214, the logical function 216, the logical function 218, the logical function 220, the logical function 222, the logical function 226, the logical function 228, the logical function 230, the logical function 232, the logical function 236, and the logical function 238. In this example, the following logic functions are add-ons to perform a twos' complement operation: the logical function 204, the logical function 224, the logical function 234 and the logical function 240. Additionally, the logical function 242 performs the equal function.
In some example embodiments, the logical functions at the same depths in
As shown in Equation 1 below, the logical function 202 performs a logical AND (inverted) of Operand A (byte 0, bit 7) and Operand B (byte 0, bit 7) to determine whether there is a carry out (generate-g0707) from bit 7 (inverted):
NOT g0707 = NOT (A7 AND B7) (Equation 1)
As described above, for the formulas provided herein, it is assumed that the operand B is inverted (for ease of use this is not explicitly shown by an overbar for operand B). As shown in Equation 2 below, the logical function 204 performs a logical OR (inverted) of Operand A (byte 0, bit 7) and Operand B (byte 0, bit 7) to determine whether there is a propagate from bit 7 (inverted):
NOT p0707 = NOT (A7 OR B7) (Equation 2)
As shown in Equations 3 below, the logical function 206 performs a logical AND (inverted) of Operand A (byte 0, bit 6) and Operand B (byte 0, bit 6) to determine a generate (g0606) and propagate (p0606) for bit 6 (inverted):
NOT g0606 = NOT (A6 AND B6); NOT p0606 = NOT (A6 OR B6) (Equations 3)
As shown in Equations 4 below, the logical function 208 performs a logical AND (inverted) of Operand A (byte 0, bit 5) and Operand B (byte 0, bit 5) to determine a generate (g0505) and propagate (p0505) for bit 5 (inverted):
NOT g0505 = NOT (A5 AND B5); NOT p0505 = NOT (A5 OR B5) (Equations 4)
As shown in Equations 5 below, the logical function 210 performs a logical AND (inverted) of Operand A (byte 0, bit 4) and Operand B (byte 0, bit 4) to determine a generate (g0404) and propagate (p0404) for bit 4 (inverted):
NOT g0404 = NOT (A4 AND B4); NOT p0404 = NOT (A4 OR B4) (Equations 5)
As shown in Equations 6 below, the logical function 214 performs a logical AND (inverted) of Operand A (byte 0, bit 3) and Operand B (byte 0, bit 3) to determine a generate (g0303) and propagate (p0303) for bit 3 (inverted):
NOT g0303 = NOT (A3 AND B3); NOT p0303 = NOT (A3 OR B3) (Equations 6)
As shown in Equations 7 below, the logical function 216 performs a logical AND (inverted) of Operand A (byte 0, bit 2) and Operand B (byte 0, bit 2) to determine a generate (g0202) and propagate (p0202) for bit 2 (inverted):
NOT g0202 = NOT (A2 AND B2); NOT p0202 = NOT (A2 OR B2) (Equations 7)
As shown in Equations 8 below, the logical function 218 performs a logical AND (inverted) of Operand A (byte 0, bit 1) and Operand B (byte 0, bit 1) to determine a generate (g0101) and propagate (p0101) for bit 1 (inverted):
NOT g0101 = NOT (A1 AND B1); NOT p0101 = NOT (A1 OR B1) (Equations 8)
As shown in Equations 9 below, the logical function 220 performs a logical AND (inverted) of Operand A (byte 0, bit 0) and Operand B (byte 0, bit 0) to determine a generate (g0000) and propagate (p0000) for bit 0 (inverted):
NOT g0000 = NOT (A0 AND B0); NOT p0000 = NOT (A0 OR B0) (Equations 9)
In some example embodiments, the subsequent logical functions to be executed in the SIMD accelerator 200 reuse the propagate and generate that were previously determined (as is now described). Accordingly, the SIMD accelerator 200 can leverage the determinations for the propagate and generate of the prior bits. These logical functions are provided to the multi-bit logical functions for reuse therein (as described below). As shown in
As shown in Equation 10, the logical function 222 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit position 6:
g0607 = NOT (NOT g0606 AND (NOT g0707 OR NOT p0606)) (Equation 10)
In particular, the logical function 222 determines whether there is a carry out from bit 6 (g0606) (inverted) and either a carry out (generate) from bit 7 (g0707) (inverted) OR a propagate from bit 6 (p0606) (inverted). This determination is then inverted. Such logic converts into a carry out from bit 6 (g0606) OR a carry out (generate) from bit 7 (g0707) AND a propagate from bit 6 (p0606). If these conditions are true, then there is a carry out (generate) from bits 6 and 7. As shown, the generate from bit 7 (g0707) from the logical function 202 is reused to determine the generate from bits 6 and 7. Also, the logical function 222 uses the generate result from the logical function 206 (g0606) for making its determination.
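The conversion described above, from the inverted-input form to a carry out from bit 6 OR (a carry out from bit 7 AND a propagate from bit 6), is an application of De Morgan's laws. The following Python sketch checks the equivalence over all input combinations; the function names are hypothetical, and each signal is modeled as an integer 0 or 1.

```python
from itertools import product


def g0607_from_inverted_inputs(g0606_n: int, g0707_n: int, p0606_n: int) -> int:
    """Inverted (active-low) inputs are ANDed/ORed and the result is inverted."""
    return 1 - (g0606_n & (g0707_n | p0606_n))


def g0607_direct(g0606: int, g0707: int, p0606: int) -> int:
    """Equivalent form on non-inverted signals."""
    return g0606 | (g0707 & p0606)


def forms_agree() -> bool:
    """True if both forms match for every combination of g0606, g0707, p0606."""
    return all(
        g0607_from_inverted_inputs(1 - g6, 1 - g7, 1 - p6) == g0607_direct(g6, g7, p6)
        for g6, g7, p6 in product((0, 1), repeat=3)
    )
```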
As shown in Equation 11 below, the logical function 224 performs a logical OR operation to determine whether there is a propagate from bit positions 6-7 based on a twos' complement operation where there is carry in of 1:
p0607 = NOT (NOT p0606 OR NOT p0707) (Equation 11)
In particular, the logical function 224 determines whether there is a propagate from bit 6 (p0606) (inverted) OR a propagate from bit 7 (p0707) (inverted). This determination is then inverted. Such logic converts into having a propagate from bits 6-7 if there is a propagate from bit 6 (p0606) AND a propagate from bit 7 (p0707). As shown, the logical function 224 reuses results from other logical functions to make this determination: 1) the logical function 222 to determine the propagate from bit 6 (p0606) and 2) the logical function 204 to determine the propagate from bit 7 (p0707).
As shown in Equation 12, the logical function 226 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 4 and 5 based on a ones' complement operation where there is no carry in:
g0405 = NOT (NOT g0404 AND (NOT g0505 OR NOT p0404)) (Equation 12)
In particular, the logical function 226 determines whether there is a carry out from bit 4 (g0404) (inverted) and either a carry out (generate) from bit 5 (g0505) (inverted) OR a propagate from bit 4 (p0404) (inverted). If these conditions are true, then there is a carry out (generate) from bits 4 and 5. As shown, the logical function 226 reuses results from other logical functions to make this determination: 1) the logical function 210 to determine the carry out (generate) from bit 4 (g0404) and 2) the logical function 208 to determine the carry out (generate) from bit 5 (g0505).
As shown in Equation 13, the logical function 228 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 2 and 3 based on a ones' complement operation where there is no carry in:
g0203 = NOT (NOT g0202 AND (NOT g0303 OR NOT p0202)) (Equation 13)
In particular, the logical function 228 determines an inverted result of a carry out from bit 2 (g0202) (inverted) and either a carry out (generate) from bit 3 (g0303) (inverted) OR a propagate from bit 2 (p0202) (inverted). If these conditions are true, then there is a carry out (generate) from bits 2 and 3. As shown, the logical function 228 reuses results from other logical functions to make this determination: 1) the logical function 216 to determine the carry out (generate) from bit 2 (g0202) and 2) the logical function 214 to determine the carry out (generate) from bit 3 (g0303).
As shown in Equation 14, the logical function 230 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 0 and 1 based on a ones' complement operation where there is no carry in:
g0001 = NOT (NOT g0000 AND (NOT g0101 OR NOT p0000)) (Equation 14)
In particular, the logical function 230 determines an inverted result of a carry out from bit 0 (g0000) (inverted) and either a carry out (generate) from bit 1 (g0101) (inverted) OR a propagate from bit 0 (p0000) (inverted). If these conditions are true, then there is a carry out (generate) from bits 0 and 1. As shown, the logical function 230 reuses results from other logical functions to make this determination: 1) the logical function 220 to determine the carry out (generate) from bit 0 (g0000) and 2) the logical function 218 to determine the carry out (generate) from bit 1 (g0101).
As shown in
NOT g0407 = NOT (g0405 OR (g0607 AND p0405)) (Equation 15)
In particular, the logical function 232 determines an inverted result of a carry out from bits 4 and 5 (g0405) OR a carry out (generate) from bits 6 and 7 (g0607) AND a propagate from bits 4 and 5 (p0405). As shown, the logical function 232 reuses results from other logical functions to make this determination: 1) the logical function 226 to determine the carry out (generate) from bits 4 and 5 (g0405) and 2) the logical function 222 to determine the carry out (generate) from bits 6 and 7 (g0607).
As shown in Equation 16, the logical function 232 also performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 4-7 AND a propagate from bit positions 4-7:
NOT gp0407 = NOT (g0405 OR (gp0607 AND p0405)) (Equation 16)
In particular, the logical function 232 determines an inverted result of a carry out from bits 4 and 5 (g0405) OR a generate/propagate from bits 6 and 7 (gp0607) AND a propagate from bits 4 and 5 (p0405). As shown, the logical function 232 reuses results from other logical functions to make this determination: 1) the logical function 226 to determine the carry out (generate) from bits 4 and 5 (g0405) and 2) the logical function 224 to determine the propagate from bits 6 and 7 (p0607). Also, the logical function 232 can reuse its previous determination regarding the propagate from bits 4 and 5 (p0405) from Equation 15.
As shown in Equation 17 below, the logical function 234 performs logical OR and AND operations to determine whether there is a propagate from bit positions 4-7 based on a twos' complement operation where there is carry in of 1:
NOT p0407 = NOT (p0405 AND p0607) (Equation 17)
In particular, the logical function 234 determines an inverted result of a propagate from bits 4-5 (p0405) AND a propagate from bits 6-7 (p0607). As shown, the logical function 234 reuses results from other logical functions to make this determination: 1) the logical function 232 to determine the propagate from bits 4-5 (p0405) and 2) the logical function 224 to determine the propagate from bits 6-7 (p0607).
As shown in Equations 18, the logical function 236 performs logical OR and AND operations to determine whether there is a carry out (generate) and propagate from bit positions 0-3:
NOT g0003 = NOT (g0001 OR (g0203 AND p0001)) (Equations 18)
In particular, the logical function 236 determines an inverted result of a carry out from bits 0 and 1 (g0001) OR a carry out (generate) from bits 2 and 3 (g0203) AND a propagate from bits 0 and 1 (p0001). As shown, the logical function 236 reuses results from other logical functions to make this determination: 1) the logical function 228 to determine the carry out (generate) from bits 2 and 3 (g0203) and 2) the logical function 230 to determine the carry out (generate) from bits 0 and 1 (g0001).
As shown in
g0007 = NOT (NOT g0003 AND (NOT g0407 OR NOT p0003)) (Equation 19)
In particular, the logical function 238 determines an inverted result of a carry out from bits 0-3 (g0003) (inverted) AND a carry out (generate) from bits 4-7 (g0407) (inverted) OR a propagate from bits 0-3 (p0003) (inverted). If these conditions are true, then there is a carry out (generate) from bits 0-7. As shown, the logical function 238 reuses results from other logical functions to make this determination: 1) the logical function 232 to determine the carry out (generate) from bits 4-7 (g0407) and 2) the logical function 236 to determine the carry out (generate) from bits 0-3 (g0003).
As shown in Equation 20, the logical function 238 also performs a logical OR operation to determine whether there is a propagate from bit positions 0-7 based on a twos' complement operation where there is carry in of 1:
p0007 = NOT (NOT p0003 OR NOT p0407) (Equation 20)
In particular, the logical function 238 determines an inverted result of a propagate from bits 0-3 (p0003) (inverted) OR a propagate from bits 4-7 (p0407) (inverted). Such logic converts into a propagate from bits 0-3 (p0003) AND a propagate from bits 4-7 (p0407); if both conditions are true, then there is a propagate from bits 0-7. As shown, the logical function 240 reuses results from other logical functions to make this determination: 1) the logical function 238 to determine the propagate from bits 0-3 (p0003) and 2) the logical function 234 to determine the propagate from bits 4-7 (p0407).
As shown in Equation 21 below, the logical function 240 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 0-7 AND a propagate from bit positions 0-7:
gp0007=
In particular, the logical function 240 determines an inverted result of a carry out from bits 0-3 (g0003) (inverted) OR a carry out (generate) OR a propagate from bits 4-7 (gp0407) (inverted) AND a propagate from bits 0-3 (p0003) (inverted). If these conditions are true, then there is a carry out (generate) AND propagate from bits 0-7. As shown, the logical function 238 reuses results from other logical functions to make this determination: 1) the logical function 232 to determine the carry out (generate) AND propagate from bits 4-7 (gp0407) and 2) the logical function 236 to determine the carry out (generate) from bits 0-3 (g0003). Also, the logical function 232 can reuse its previous determination regarding the propagate from bits 0-3 (p0003) from Equation 19.
As shown in Equation 22 below, the logical function 242 determines that the byte of Operand A is greater than the byte of Operand B if there is a carry out (generate) from bits 0-7:
GTBy=g0007 (Equation 22)
As shown in Equation 23 below, the logical function 242 also determines that the byte of Operand A is less than the byte of Operand B if there is a carry out (generate) AND propagate from bits 0-7 (inverted):
LTBy = NOT gp0007 (Equation 23)
As shown in Equation 24 below, the logical function 242 also determines that the byte of Operand A is equal to the byte of Operand B if there is no carry out (generate) from bits 0-7 (g0007) (inverted) AND there is a carry out (generate) AND propagate from bits 0-7 (gp0007), consistent with Equation 0:
EQBy = (NOT g0007) AND gp0007 (Equation 24)
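For illustration, the byte compare walked through above can be modeled in software as a three-level tree that merges per-bit generate and propagate values into pairs, then groups of four, then the full byte. The sketch below makes several assumptions that are not stated in the description: bit 0 is treated as the most significant bit and bit 7 as the least significant, the signals are modeled in non-inverted form, and a "gp" value (the carry out assuming a carry in of 1) is tracked for every span rather than only for the spans that end at bit 7. The function names are hypothetical.

```python
def byte_compare(a: int, b: int) -> tuple[bool, bool, bool]:
    """Return (GT, LT, EQ) for two 8-bit values via a generate/propagate tree."""
    nb = ~b & 0xFF  # Operand B is inverted before the addition
    a_bits = [(a >> (7 - i)) & 1 for i in range(8)]   # index 0 = most significant
    b_bits = [(nb >> (7 - i)) & 1 for i in range(8)]

    def combine(hi, lo):
        """Merge (g, p, gp) of a more significant span with a less significant span."""
        g_hi, p_hi, _ = hi
        g_lo, p_lo, gp_lo = lo
        return (
            g_hi | (g_lo & p_hi),   # carry out with no carry in (ones' complement)
            p_hi & p_lo,            # propagate across the merged span
            g_hi | (gp_lo & p_hi),  # carry out with carry in of 1 (twos' complement)
        )

    # Single-bit level: g = AND, p = OR, and with a carry in of 1 the carry out is g OR p.
    leaves = [(x & y, x | y, x | y) for x, y in zip(a_bits, b_bits)]
    pairs = [combine(leaves[i], leaves[i + 1]) for i in range(0, 8, 2)]  # 0-1, 2-3, 4-5, 6-7
    quads = [combine(pairs[0], pairs[1]), combine(pairs[2], pairs[3])]   # 0-3, 4-7
    g0007, p0007, gp0007 = combine(quads[0], quads[1])

    gt = bool(g0007)                       # corresponds to Equation 22
    lt = not gp0007                        # no carry out with carry in of 1
    eq = (not g0007) and bool(gp0007)      # corresponds to Equation 0
    return gt, lt, eq
```

Under these assumptions, `byte_compare(a, b)` returns `(a > b, a < b, a == b)` for every pair of byte values, which shows how the ones' complement and twos' complement results can share the same generate and propagate terms.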
A byte compare for two operands according to some other example embodiments is now described. In particular,
EQ=(
In this example, the SIMD accelerator 300 is comparing a byte (8 bits) of Operand A to a byte (8 bits) of Operand B. The SIMD accelerator 300 inputs the Operand A and Operand B. The SIMD accelerator 300 includes a number of logical functions (a logical function 302, a logical function 304, a logical function 306, a logical function 308, a logical function 310, a logical function 314, a logical function 316, a logical function 318, a logical function 320, a logical function 322, a logical function 324, a logical function 326, a logical function 328, a logical function 330, a logical function 332, a logical function 334, a logical function 336, a logical function 338, a logical function 340, a logical function 342, a logical function 360, a logical function 362, a logical function 364, a logical function 366, a logical function 368, a logical function 370, and a logical function 372). As further described below, the logical functions 302-342 and 360-372 comprise different logical gates that create propagate bits and generate (carry) bits during the logical operation. Also, some example embodiments comprise both ones' complement and twos' complement logic during these logic operations. In this example, the following logic functions perform a ones' complement operation: the logical function 302, the logical function 306, the logical function 308, the logical function 310, the logical function 314, the logical function 316, the logical function 318, the logical function 320, the logical function 322, the logical function 326, the logical function 328, the logical function 330, the logical function 332, the logical function 336, the logical function 338, and the logical function 342. In this example, the following logic functions perform a twos' complement operation: the logical function 304, the logical function 324, the logical function 334 and the logical function 340. 
The SIMD accelerator 300 includes the additional logical functions 360-372 (that are not included in the SIMD accelerator 200).
In some example embodiments, the logical functions at the same depths in
As shown in Equation 26 below, the logical function 302 performs a logical AND (inverted) of Operand A (byte 0, bit 7) and Operand B (byte 0, bit 7) to determine whether there is a carry out (generate-g0707) from bit 7 (inverted):
NOT g0707 = NOT (A7 AND B7) (Equation 26)
As shown in Equation 27 below, the logical function 304 performs a logical OR (inverted) of Operand A (byte 0, bit 7) and Operand B (byte 0, bit 7) to determine whether there is a propagate from bit 7 (inverted):
NOT p0707 = NOT (A7 OR B7) (Equation 27)
As shown in Equations 28 below, the logical function 306 performs a logical AND (inverted) of Operand A (byte 0, bit 6) and Operand B (byte 0, bit 6) to determine a generate (g0606) and a propagate (p0606) for bit 6 (inverted):
NOT g0606 = NOT (A6 AND B6); NOT p0606 = NOT (A6 OR B6) (Equations 28)
As shown in Equations 29 below, the logical function 308 performs a logical AND (inverted) of Operand A (byte 0, bit 5) and Operand B (byte 0, bit 5) to determine a generate (g0505) and a propagate (p0505) for bit 5 (inverted):
NOT g0505 = NOT (A5 AND B5); NOT p0505 = NOT (A5 OR B5) (Equations 29)
As shown in Equations 30 below, the logical function 310 performs a logical AND (inverted) of Operand A (byte 0, bit 4) and Operand B (byte 0, bit 4) to determine a generate (g0404) and a propagate (p0404) for bit 4 (inverted):
NOT g0404 = NOT (A4 AND B4); NOT p0404 = NOT (A4 OR B4) (Equations 30)
As shown in Equations 31 below, the logical function 314 performs a logical AND (inverted) of Operand A (byte 0, bit 3) and Operand B (byte 0, bit 3) to determine a generate (g0303) and a propagate (p0303) for bit 3 (inverted):
NOT g0303 = NOT (A3 AND B3); NOT p0303 = NOT (A3 OR B3) (Equations 31)
As shown in Equations 32 below, the logical function 316 performs a logical AND (inverted) of Operand A (byte 0, bit 2) and Operand B (byte 0, bit 2) to determine a generate (g0202) and a propagate (p0202) for bit 2 (inverted):
NOT g0202 = NOT (A2 AND B2); NOT p0202 = NOT (A2 OR B2) (Equations 32)
As shown in Equations 33 below, the logical function 318 performs a logical AND (inverted) of Operand A (byte 0, bit 1) and Operand B (byte 0, bit 1) to determine a generate (g0101) and a propagate (p0101) for bit 1 (inverted):
NOT g0101 = NOT (A1 AND B1); NOT p0101 = NOT (A1 OR B1) (Equations 33)
As shown in Equations 34 below, the logical function 320 performs a logical AND (inverted) of Operand A (byte 0, bit 0) and Operand B (byte 0, bit 0) to determine a generate (g0000) and a propagate (p0000) for bit 0 (inverted):
NOT g0000 = NOT (A0 AND B0); NOT p0000 = NOT (A0 OR B0) (Equations 34)
In some example embodiments, the subsequent logical functions to be executed in the SIMD accelerator 300 reuse the propagate and generate that were previously determined (as is now described). Accordingly, the SIMD accelerator 300 can leverage the determinations for the propagate and generate of the prior bits. These logical functions are provided to the multi-bit logical functions for reuse therein (as described below). As shown in
As shown in Equations 35, the logical function 322 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit position 6:
g0607 = NOT (NOT g0606 AND (NOT g0707 OR NOT p0606)); p0607 = NOT (NOT p0606 OR NOT p0707) (Equations 35)
In particular, the logical function 322 determines whether there is a carry out from bit 6 (g0606) (inverted) and either a carry out (generate) from bit 7 (g0707) (inverted) OR a propagate from bit 6 (p0606) (inverted). This determination is then inverted. If these conditions are true, then there is a carry out (generate) from bits 6 and 7. As shown, the generate from bit 7 (g0707) from the logical function 302 is reused to determine the generate from bits 6 and 7. Also, the logical function 322 uses the generate result from the logical function 306 (g0606) for making its determination.
Also, the logical function 322 performs a logical OR operation to determine whether there is a propagate from bit positions 6-7 based on a twos' complement operation where there is carry in of 1. In particular, the logical function 322 determines whether there is a propagate from bit 6 (p0606) (inverted) OR a propagate from bit 7 (p0707) (inverted). This determination is then inverted. If the determination is true, then there is a propagate from bits 6 and 7. As shown, the logical function 322 reuses results from other logical functions to make this determination: 1) the logical function 322 to determine the propagate from bit 6 (p0606) and 2) the logical function 304 to determine the propagate from bit 7 (p0707).
As shown in Equation 36 below, the logical function 324 performs logical OR and AND operations to determine whether there is a generate/propagate from bit positions 6-7:
gp0607 = NOT (NOT g0606 AND (NOT p0707 OR NOT p0606)) (Equation 36)
In particular, the logical function 324 determines whether there is a generate from bit 6 (g0606) (inverted) and either a propagate from bit 7 (p0707) (inverted) OR a propagate from bit 6 (inverted). This determination is then inverted. If the determination is true, then there is a generate/propagate from bits 6 and 7. As shown, the logical function 324 reuses results from other logical functions to make this determination: 1) the logical function 322 to determine the propagate from bit 6 (p0606) and 2) the logical function 304 to determine the propagate from bit 7 (p0707).
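One way to read the generate/propagate term is as the carry out of bits 6 and 7 when a carry in of 1 is applied, which is the twos' complement case. The Python sketch below checks that reading over all 2-bit inputs. It assumes that bit 7 is the less significant of the two bits, that the `b` arguments are bits of the already-inverted Operand B, and that the signals are in non-inverted form; the function names are hypothetical.

```python
from itertools import product


def gp0607(a6: int, a7: int, b6: int, b7: int) -> int:
    """Generate from bit 6 OR (propagate from bit 7 AND propagate from bit 6)."""
    g6, p6 = a6 & b6, a6 | b6
    p7 = a7 | b7
    return g6 | (p7 & p6)


def carry_out_with_carry_in(a6: int, a7: int, b6: int, b7: int) -> int:
    """Carry out of the 2-bit sum (a6 a7) + (b6 b7) + 1."""
    total = (2 * a6 + a7) + (2 * b6 + b7) + 1
    return total >> 2


def gp_matches_carry() -> bool:
    """True if gp0607 equals the arithmetic carry out for all 16 input cases."""
    return all(
        gp0607(*bits) == carry_out_with_carry_in(*bits)
        for bits in product((0, 1), repeat=4)
    )
```

Because the propagate is formed with OR, a generate at bit 7 already implies a propagate at bit 7, which is why no separate g0707 term is needed when the carry in is 1.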
As shown in Equation 37 below, the logical function 360 performs a logical AND operation to determine whether there is a generate from both bit positions 6 and 7:
G0607=
In particular, the logical function 360 determines whether there is a generate from bit 6 (g0606) (inverted) AND a generate from bit 7 (g0707) (inverted). As shown, the logical function 360 reuses results from other logical functions to make this determination: 1) the logical function 306 to determine the generate from bit 6 (g0606) and 2) the logical function 302 to determine the generate from bit 7 (g0707). As further described below and in contrast to the SIMD accelerator 200 of
As shown in Equations 38, the logical function 326 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 4 and 5:
g0405 = NOT (NOT g0404 AND (NOT g0505 OR NOT p0404)); p0405 = NOT (NOT p0505 OR NOT p0404) (Equations 38)
In particular, the logical function 326 determines whether there is a carry out from bit 4 (g0404) (inverted) and either a carry out (generate) from bit 5 (g0505) (inverted) OR a propagate from bit 4 (p0404) (inverted). If these conditions are true, then there is a carry out (generate) from bits 4 and 5. As shown, the logical function 326 reuses results from other logical functions to make this determination: 1) the logical function 310 to determine the carry out (generate) from bit 4 (g0404) and 2) the logical function 308 to determine the carry out (generate) from bit 5 (g0505).
Also, the logical function 326 performs a logical OR operation to determine whether there is a propagate from bit positions 4-5 based on a twos' complement operation where there is carry in of 1. In particular, the logical function 326 determines whether there is a propagate from bit 5 (p0505) (inverted) OR a propagate from bit 4 (p0404) (inverted). This determination is then inverted. If the determination is true, then there is a propagate from bits 4 and 5. As shown, the logical function 326 reuses results from other logical functions to make this determination: 1) the logical function 326 to determine the propagate from bit 5 (p0505).
As shown in Equation 39 below, the logical function 362 performs a logical AND operation to determine whether there is a generate from both bit positions 4 and 5:
G0405=
In particular, the logical function 362 determines whether there is a generate from bit 4 (g0404) (inverted) AND a generate from bit 5 (g0505) (inverted). As shown, the logical function 362 reuses results from other logical functions to make this determination: 1) the logical function 310 to determine the generate from bit 4 (g0404) and 2) the logical function 308 to determine the generate from bit 5 (g0505). As further described below and in contrast to the SIMD accelerator 200 of
As shown in Equations 40, the logical function 328 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 2 and 3:
g0203=
In particular, the logical function 328 determines an inverted result of a carry out from bit 2 (g0202) (inverted) and either a carry out (generate) from bit 3 (g0303) (inverted) OR a propagate from bit 2 (p0202) (inverted). If these conditions are true, then there is a carry out (generate) from bits 2 and 3. As shown, the logical function 328 reuses results from other logical functions to make this determination: 1) the logical function 316 to determine the carry out (generate) from bit 2 (g0202) and 2) the logical function 314 to determine the carry out (generate) from bit 3 (g0303).
Also, the logical function 328 performs a logical OR operation to determine whether there is a propagate from bit positions 2-3 based on a twos' complement operation where there is carry in of 1. In particular, the logical function 328 determines whether there is a propagate from bit 2 (p0202) (inverted) OR a propagate from bit 3 (p0303) (inverted). This determination is then inverted. If the determination is true, then there is a propagate from bits 2 and 3. As shown, the logical function 328 reuses results from other logical functions to make this determination: 1) the logical function 328 to determine the propagate from bit 2 (p0202).
As shown in Equation 41 below, the logical function 364 performs a logical AND operation to determine whether there is a generate from both bit positions 2 and 3:
G0203=
In particular, the logical function 364 determines whether there is a generate from bit 2 (g0202) (inverted) AND a generate from bit 3 (g0303) (inverted). As shown, the logical function 364 reuses results from other logical functions to make this determination: 1) the logical function 316 to determine the generate from bit 2 (g0202) and 2) the logical function 314 to determine the generate from bit 3 (g0303). As further described below and in contrast to the SIMD accelerator 200 of
As shown in Equations 42, the logical function 330 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 0 and 1 based on a ones' complement operation where there is no carry in:
g0001=
In particular, the logical function 330 determines an inverted result of a carry out from bit 0 (g0000) (inverted) and either a carry out (generate) from bit 1 (g0101) (inverted) OR a propagate from bit 0 (p0000) (inverted). If these conditions are true, then there is a carry out (generate) from bits 0 and 1. As shown, the logical function 330 reuses results from other logical functions to make this determination: 1) the logical function 320 to determine the carry out (generate) from bit 0 (g0000) and 2) the logical function 318 to determine the carry out (generate) from bit 1 (g0101).
Also, the logical function 330 performs a logical OR operation to determine whether there is a propagate from bit positions 0-1 based on a twos' complement operation where there is carry in of 1. In particular, the logical function 330 determines whether there is a propagate from bit 0 (p0000) (inverted) OR a propagate from bit 1 (p0101) (inverted). This determination is then inverted. If the determination is true, then there is a propagate from bits 0 and 1. As shown, the logical function 330 reuses results from other logical functions to make this determination: 1) the logical function 330 to determine the propagate from bit 0 (p0000).
As shown in Equation 43 below, the logical function 366 performs a logical AND operation to determine whether there is a generate from both bit positions 0 and 1:
G0001=
In particular, the logical function 366 determines whether there is a generate from bit 0 (g0000) (inverted) AND a generate from bit 1 (g0101) (inverted). As shown, the logical function 366 reuses results from other logical functions to make this determination: 1) the logical function 320 to determine the generate from bit 0 (g0000) and 2) the logical function 318 to determine the generate from bit 1 (g0101). As further described below and in contrast to the SIMD accelerator 200 of
As shown in
g0407
In particular, the logical function 332 determines an inverted result of a carry out from bits 4 and 5 (g0405) OR a carry out (generate) from bits 6 and 7 (g0607) AND a propagate from bits 4 and 5 (p0405). As shown, the logical function 332 reuses results from other logical functions to make this determination: 1) the logical function 326 to determine the carry out (generate) from bits 4 and 5 (g0405) and 2) the logical function 322 to determine the carry out (generate) from bits 6 and 7 (g0607).
As shown in Equation 45 below, the logical function 332 also performs a logical AND operation to determine whether there is a propagate from bit positions 4-7 based on a twos' complement operation where there is carry in of 1:
p0407
In particular, the logical function 332 determines an inverted result of a propagate from bits 4-5 (p0405) AND a propagate from bits 6-7 (p0607). As shown, the logical function 334 reuses results from other logical functions to make this determination: 1) the logical function 332 to determine the propagate from bits 4-5 (p0405) and 2) the logical function 324 to determine the propagate from bits 6-7 (p0607).
As shown in Equation 46, the logical function 334 also performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 4-7 AND a propagate from bit positions 4-7:
gp0407
In particular, the logical function 334 determines an inverted result of a carry out from bits 4 and 5 (g0405) OR a generate/propagate from bits 6 and 7 (p0607) AND a propagate from bits 4 and 5 (p0405). As shown, the logical function 334 reuses results from other logical functions to make this determination: 1) the logical function 326 to determine the carry out (generate) from bits 4 and 5 (g0405) and 2) the logical function 324 to determine the propagate from bits 6 and 7 (p0607). Also, the logical function 332 can reuse its previous determination regarding the propagate from bits 4 and 5 (p0405) from Equation 44.
As shown in Equation 47 below, the logical function 368 performs logical OR and/or AND operations to determine whether there is a generate from all of bit positions 4-7:
G0407=G0405*G0607=
In particular, the logical function 368 determines a result of a generate from bit 4 (g0404) (inverted) AND a generate from bit 5 (g0505) (inverted) AND a generate from bit 6 (g0606) (inverted) AND a generate from bit 7 (g0707) (inverted). As further described below and in contrast to the SIMD accelerator 200 of
As shown in Equations 48, the logical function 336 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 0-3:
g0003
In particular, the logical function 336 determines an inverted result of a carry out from bits 0 and 1 (g0001) OR a carry out (generate) from bits 2 and 3 (g0203) AND a propagate from bits 0 and 1 (p0001). As shown, the logical function 336 reuses results from other logical functions to make this determination: 1) the logical function 328 to determine the carry out (generate) from bits 2 and 3 (g0203) and 2) the logical function 330 to determine the carry out (generate) from bits 0 and 1 (g0001). The logical function 336 also determines an inverted result of a propagate from bits 0 and 1 (p0001) AND a propagate from bits 2 and 3 (p0203).
As shown in Equation 49 below, the logical function 370 performs logical OR and/or AND operations to determine whether there is a generate from all of bit positions 0-3:
G0003=G0001*G0203=
In particular, the logical function 370 determines a result of a generate from bit 0 (g0000) (inverted) AND a generate from bit 1 (g0101) (inverted) AND a generate from bit 2 (g0202) (inverted) AND a generate from bit 3 (g0303) (inverted). As further described below and in contrast to the SIMD accelerator 200 of
As shown in
g0007=
In particular, the logical function 338 determines a result of a carry out (generate) from bits 0-3 (g0003) (inverted) AND a carry out (generate) from bits 4-7 (g0407) (inverted) OR a propagate from bits 0-3 (p0003) (inverted). This result is then inverted. If these conditions are true, then there is a carry out (generate) from bits 0-7. As shown, the logical function 338 reuses results from other logical functions to make this determination: 1) the logical function 332 to determine the carry out (generate) from bits 4-7 (g0407) and 2) the logical function 336 to determine the carry out (generate) from bits 0-3 (g0003).
As shown in Equation 51, the logical function 338 also performs a logical OR operation to determine whether there is a propagate from bit positions 0-7 based on a twos' complement operation where there is carry in of 1:
p0007=
In particular, the logical function 338 determines an inverted result of a propagate from bits 0-3 (p0003) (inverted) OR a propagate from bits 4-7 (p0407) (inverted). If neither of these inverted conditions is true, then there is a propagate from bits 0-7. As shown, the logical function 338 reuses results from other logical functions to make this determination: 1) the logical function 338 to determine the propagate from bits 0-3 (p0003) and 2) the logical function 334 to determine the propagate from bits 4-7 (p0407).
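The pair, nibble, and byte stages described above form a merge tree over per-bit generate and propagate signals. The sketch below models that tree in software. It rests on assumptions this section does not restate: the per-bit signals are taken to be g = a AND NOT b and p = a OR NOT b (the usual signals for adding Operand A to the inverted Operand B), bit 0 is taken as the most significant bit, and all function names are hypothetical.

```python
def bit(x, i):
    """Return bit i of an 8-bit value, with bit 0 as the most significant bit."""
    return (x >> (7 - i)) & 1

def leaf_signals(a, b):
    """Per-bit generate and propagate for a + NOT(b) (assumed definitions)."""
    g = [bit(a, i) & (bit(b, i) ^ 1) for i in range(8)]
    p = [bit(a, i) | (bit(b, i) ^ 1) for i in range(8)]
    return g, p

def merge(hi, lo):
    """Combine (g, p) of a more significant group with a less significant group."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)

def byte_generate_propagate(a, b):
    """Return (g0007, p0007) by merging bits into pairs, nibbles, then the byte."""
    g, p = leaf_signals(a, b)
    level = list(zip(g, p))
    while len(level) > 1:
        level = [merge(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def carry_out_no_carry_in(a, b):
    """Carry out of the 8-bit sum a + NOT(b) with no carry in."""
    return (a + (b ^ 0xFF)) >> 8
```

With these definitions, the g0007 value from the tree can be compared against carry_out_no_carry_in for any pair of byte values, which is one way to sanity-check the staged reuse of pair-level and nibble-level results.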
As shown in Equation 52, the logical function 340 also performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 0-7 AND a propagate from bit positions 0-7:
gp0007=
In particular, the logical function 340 determines an inverted result of a carry out from bits 0-3 (g0003) (inverted) AND a carry out (generate) OR a propagate from bits 4-7 (gp0407) (inverted) AND a propagate from bits 0-3 (p0003) (inverted). If these conditions are true, then there is a carry out (generate) AND propagate from bits 0-7. As shown, the logical function 338 reuses results from other logical functions to make this determination: 1) the logical function 332 to determine the carry out (generate) AND propagate from bits 4-7 (gp0407) and 2) the logical function 336 to determine the carry out (generate) from bits 0-3 (g0003). Also, the logical function 332 can reuse its previous determination regarding the propagate from bits 0-3 (p0003) from Equation 50.
As shown in Equation 53 below, the logical function 372 performs a logical AND operation to determine whether there is a generate from all of bit positions 0-7:
G0007=G0003*G0407 (Equation 53)
In particular, the logical function 372 determines a result of a generate from bits 0-3 (G0003) AND a generate from bits 4-7 (G0407). As shown, the logical function 372 reuses results from other logical functions to make this determination: 1) the logical function 370 to determine the generate from bits 0-3 (G0003) and 2) the logical function 368 to determine the generate from bits 4-7 (G0407).
As shown in Equation 54 below, the logical function 342 determines that the byte of Operand A is greater than the byte of Operand B if there is a carry out (generate) from bits 0-7:
GTBy=g0007 (Equation 54)
As shown in Equation 55 below, the logical function 342 also determines that the byte of Operand A is less than the byte of Operand B if there is a carry out (generate) AND propagate from bits 0-7 (inverted):
LTBy=
As shown in Equation 56 below, the logical function 342 also determines that the byte of Operand A is equal to the byte of Operand B if there is a carry out (generate) from bits 0-7 (G0007) AND propagate from bits 0-7:
EQBy=G0007*p0007 (Equation 56)
In particular, the logical function 342 determines that Operand A is equal to Operand B if there is not a carry out (generate) of bit 0 (
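The three byte-level outputs can be modeled in software as follows. This is a sketch under stated assumptions rather than the circuit of the logical function 342: the per-bit signals are assumed to be g = a AND NOT b and p = a OR NOT b, G0007 is read as "no bit position generates" (the AND of the inverted per-bit generates), gp0007 is read as the carry out when there is a carry in of 1, and LTBy is read as the inverse of gp0007. The name compare_byte is hypothetical.

```python
def compare_byte(a, b):
    """Return GT, LT and EQ flags (0 or 1) for two 8-bit values."""
    g, p, no_gen = 0, 1, 1
    for i in range(8):                  # here i = 0 is the least significant bit
        a_i = (a >> i) & 1
        nb_i = ((b >> i) & 1) ^ 1       # inverted bit of the second operand
        g_i, p_i = a_i & nb_i, a_i | nb_i
        g = g_i | (p_i & g)             # carry out with no carry in (g0007)
        p &= p_i                        # every bit propagates (p0007)
        no_gen &= g_i ^ 1               # no bit generates (G0007)
    gp = g | p                          # carry out with a carry in of 1 (gp0007)
    return {"GT": g, "LT": gp ^ 1, "EQ": no_gen & p}
```

For example, compare_byte(0x41, 0x41) sets only the EQ flag, while compare_byte(0x41, 0x42) sets only the LT flag.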
To illustrate,
The SIMD accelerator 400 includes four different byte comparisons and a half-word comparison based on the byte comparisons. Each of the four different byte comparisons includes a different group of logical functions (that are similar to the logical functions illustrated in
In some example embodiments, the logical functions 401-415 are equal to the logical functions 202-242 in the SIMD accelerator 200 of
In this example illustrated in
GTBy=g0007 (Equation 57)
As shown in Equation 58 below, the logical function 415 also determines that Operand A is less than Operand B if there is a carry out (generate) AND propagate from bits 0-7 (inverted):
LTBy=
As shown in Equation 59 below, the logical function 415 also determines that Operand A is equal to Operand B if there is a carry out (generate) from bits 0-7 (G0007) AND propagate from bits 0-7:
EQBy=G0007*p0007 (Equation 59)
In some example embodiments, the logical functions 416-429 are equal to the logical functions 202-242 in the SIMD accelerator 200 of
In this example illustrated in
GTBy=g0015 (Equation 60)
As shown in Equation 61 below, the logical function 429 also determines that Operand A is less than Operand B if there is a carry out (generate) AND propagate from bits 0-7 of Operand A and bits 8-15 of Operand B (inverted):
LTBy=
As shown in Equation 62 below, the logical function 429 also determines that Operand A is equal to Operand B if there is a carry out (generate) from bits 0-7 of Operand A and bits 8-15 of Operand B (G0015) AND propagate from bits 0-7 of Operand A and bits 8-15 of Operand B (p0015):
EQBy=G0015*p0015 (Equation 62)
In this example illustrated in
GTBy=g0807 (Equation 63)
As shown in Equation 64 below, the logical function 443 also determines that Operand A is less than Operand B if there is a carry out (generate) AND propagate from bits 8-15 of Operand A and bits 0-7 of Operand B (inverted):
LTBy=
As shown in Equation 65 below, the logical function 443 also determines that Operand A is equal to Operand B if there is a carry out (generate) from bits 8-15 of Operand A and bits 0-7 of Operand B (G0807) AND propagate from bits 8-15 of Operand A and bits 0-7 of Operand B (p0807):
EQBy=G0807*p0807 (Equation 65)
In this example illustrated in
GTBy=g0815 (Equation 66)
As shown in Equation 67 below, the logical function 457 also determines that Operand A is less than Operand B if there is a carry out (generate) AND propagate from bits 8-15 of Operand A and bits 8-15 of Operand B (inverted):
LTBy=
As shown in Equation 68 below, the logical function 457 also determines that Operand A is equal to Operand B if there is a carry out (generate) from bits 8-15 of Operand A and bits 8-15 of Operand B (G0815) AND propagate from bits 8-15 of Operand A and bits 8-15 of Operand B (p0815):
EQBy=G0815*p0815 (Equation 68)
Additionally, the SIMD accelerator 400 is configured such that results of the byte comparisons can be reused for a half-word comparison. In particular, a logical function 458 can reuse the results from the logical function 415, the logical function 429, the logical function 443, and the logical function 457 to compare a first half-word (bytes 0-1) of Operand A to a first half-word (bytes 0-1) of Operand B.
As shown in Equation 69, the logical function 458 performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 0-15:
g0015
In particular, the logical function 458 determines an inverted result of a carry out from bits 0-7 (g0007) OR a carry out (generate) from bits 8-15 (g0815) AND a propagate from bits 0-7 (p0007). As shown, the logical function 458 reuses results from other logical functions to make this determination: 1) the logical function 415 to determine the carry out (generate) from bits 0-7 (g0007) and 2) the logical function 457 to determine the carry out (generate) from bits 8-15 (g0815).
As shown in Equation 70, the logical function 458 also performs logical OR and AND operations to determine whether there is a carry out (generate) from bit positions 0-15 AND a propagate from bit positions 0-15:
gp0015
In particular, the logical function 458 determines an inverted result of a generate AND propagate from bits 0-15 (gp0015) (inverted) OR a carry out (generate) AND a propagate from bits 8-15 (gp0815) (inverted) AND a propagate from bits 0-7 (p0007) (inverted). As shown, the logical function 458 reuses results from another logical function to make this determination: the logical function 463 to determine the carry out (generate) AND propagate from bits 8-15 (gp0815).
As shown in Equation 71 below, the logical function 458 performs a logical OR operation to determine whether there is a propagate from bit positions 0-15 based on a twos' complement operation where there is carry in of 1:
p0015
In particular, the logical function 458 determines an inverted result of a propagate from bits 0-7 (p0007) OR a propagate from bits 8-15 (p0815). As shown, the logical function 443 reuses results from its previous determination regarding the propagate from bits 0-7 (p0007).
As shown in Equation 72 below, the logical function 458 also performs a logical AND operation to determine whether there is a generate from all of bit positions 0-15:
G0015=G0007*G0815 (Equation 72)
In particular, the logical function 458 determines a result of a generate from bits 0-7 (G0007) AND a generate from bits 8-15 (G0815). As shown, the logical function 458 reuses results from other logical functions to make this determination: 1) the logical function 460 to determine the generate from bits 0-7 (G0007) and 2) the logical function 463 to determine the generate from bits 8-15 (G0815).
As shown in Equation 73 below, the logical function 458 determines that the half-word of Operand A is greater than the half-word of Operand B if there is a carry out (generate) from bits 0-15:
GTHW=g0015 (Equation 73)
As shown in Equation 74 below, the logical function 458 also determines that the half-word of Operand A is less than the half-word of Operand B if there is a carry out (generate) AND propagate from bits 0-15 (inverted):
LTHW=
As shown in Equation 75 below, the logical function 458 also determines that the half-word of Operand A is equal to the half-word of Operand B if there is a carry out (generate) from bits 0-15 (G0015) AND propagate (p0015) from bits 0-15:
EQHW=G0015*p0015 (Equation 75)
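The reuse of byte-level results for the half-word comparison can be sketched in software as shown below. The sketch is an illustration under assumptions, not the circuit of the logical function 458. It models only the two same-position byte comparisons (bits 0-7 and bits 8-15), assumes per-bit signals g = a AND NOT b and p = a OR NOT b, reads gp as the carry out given a carry in of 1, and reads G as "no bit position generates". The names byte_terms and compare_halfword are hypothetical.

```python
def byte_terms(a, b):
    """Return (g, p, gp, G) for one byte of a + NOT(b)."""
    g, p, no_gen = 0, 1, 1
    for i in range(8):                  # least significant bit first
        a_i = (a >> i) & 1
        nb_i = ((b >> i) & 1) ^ 1
        g_i, p_i = a_i & nb_i, a_i | nb_i
        g = g_i | (p_i & g)
        p &= p_i
        no_gen &= g_i ^ 1
    return g, p, g | p, no_gen

def compare_halfword(a, b):
    """Combine two byte-level results into GT, LT and EQ flags for 16-bit values."""
    g_hi, p_hi, _, G_hi = byte_terms(a >> 8, b >> 8)            # bits 0-7
    g_lo, p_lo, gp_lo, G_lo = byte_terms(a & 0xFF, b & 0xFF)    # bits 8-15
    g0015 = g_hi | (p_hi & g_lo)        # carry out with no carry in
    gp0015 = g_hi | (p_hi & gp_lo)      # carry out with a carry in of 1
    p0015 = p_hi & p_lo
    G0015 = G_hi & G_lo
    return {"GT": g0015, "LT": gp0015 ^ 1, "EQ": G0015 & p0015}
```

The half-word stage in this model adds only one merge level on top of the byte-level terms, which mirrors the reuse described above: no per-bit signal is recomputed for the wider comparison.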
At block 602, a SIMD accelerator receives a first operand having first multiple parts and a second operand having second multiple parts. For example, the first operand and the second operand can comprise multiple bytes, half-words, words, etc. (as described above). Operations of the flowchart 600 continue at block 604.
At block 604, the SIMD accelerator performs, based on ones' complement logic, a first group of logic operations on the first multiple parts of the first operand and the second multiple parts of the second operand to generate a first group of carry out and propagate data across bits of the first multiple parts and the second multiple parts. For example, the SIMD accelerator can perform the logical functions described in reference to
At block 606, the SIMD accelerator performs, based on twos' complement logic, a second group of logic operations on the first multiple parts of the first operand and the second multiple parts of the second operand to determine a second group of carry out and propagate data across bits of the first multiple parts and the second multiple parts. At least a portion of the first group of carry out and propagate data is reused in the second group of logic operations. Also, at least a portion of the second group of carry out and propagate data is reused in the first group of logic operations. For example, the SIMD accelerator can perform the logical functions described in reference to
At block 608, the SIMD accelerator outputs a result to indicate whether the first operand is equal to the second operand based on the first group of logic operations and the second group of logic operations. As described above, this result can be based on multiple byte, half-word, word, etc. comparisons across each of the two operands. Operations of the flowchart 600 are complete.
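The operations of the flowchart 600 can be mirrored by a short arithmetic model. This is a sketch only: it uses ordinary integer addition in place of the gate-level logic, treats each part as one byte, and does not model the sharing of intermediate carry out and propagate data between the two groups of operations. The function name simd_equal and the choice to reject operands of different lengths are assumptions made for illustration.

```python
def simd_equal(op_a: bytes, op_b: bytes) -> bool:
    """Return True when every byte of op_a equals the matching byte of op_b."""
    # Block 602: receive two operands made of the same number of byte-wide parts.
    if len(op_a) != len(op_b):
        raise ValueError("operands must have the same number of parts")
    all_equal = True
    for a, b in zip(op_a, op_b):
        not_b = b ^ 0xFF
        # Block 604: ones' complement subtraction (no carry in);
        # a carry out means a > b.
        carry_ones = (a + not_b) >> 8
        # Block 606: twos' complement subtraction (carry in of 1);
        # a carry out means a >= b.
        carry_twos = (a + not_b + 1) >> 8
        # A part is equal when a >= b holds but a > b does not.
        all_equal = all_equal and carry_twos == 1 and carry_ones == 0
    # Block 608: output whether the first operand equals the second operand.
    return all_equal
```

In this model, a part is reported equal exactly when the twos' complement carry out is 1 and the ones' complement carry out is 0, so the final result depends on both groups of operations, as block 608 describes.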
As will be appreciated by one skilled in the art, aspects of the present inventive subject matter may be embodied as a system, method or computer program product. Accordingly, aspects of the present inventive subject matter may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present inventive subject matter may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present inventive subject matter may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present inventive subject matter are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the inventive subject matter. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
While the embodiments are described with reference to various implementations and exploitations, it will be understood that these embodiments are illustrative and that the scope of the inventive subject matter is not limited to them. In general, techniques for operand comparison as described herein may be implemented with facilities consistent with any hardware system or hardware systems. Many variations, modifications, additions, and improvements are possible.
Plural instances may be provided for components, operations or structures described herein as a single instance. Finally, boundaries between various components, operations and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the inventive subject matter. In general, structures and functionality presented as separate components in the exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the inventive subject matter.