The present application claims the benefit under 35 U.S.C. § 119 of German Patent Application No. DE 10 2013 208 836.1 filed on May 14, 2013, which is expressly incorporated herein by reference in its entirety.
The present invention relates to a method for generating a hash value as a function of digital input data. The present invention also relates to a device for generating such a hash value.
Hash functions that supply one or more hash values as output values are used in particular in the area of cryptography, specifically for security-relevant applications such as digital signatures, the storage of passwords, and the integrity testing of data files and the like. A widely used group of cryptographic hash functions is based on the so-called Secure Hash Algorithm Version 2 (SHA-2) Standard, described inter alia in the publication “Federal Information Processing Standards Publication, Secure Hash Standard, FIPS PUB 180-3, 2008” and accessible via the Internet at the address http://csrc.nist.gov/publications/fips/180-3. A corresponding U.S. patent is U.S. Pat. No. 6,829,355 B2.
In general, a cryptographic hash function receives a digital input data stream of arbitrary length and generates therefrom a so-called hash value, i.e., digital output data having a specifiable, in particular fixed, length. The hash value is sometimes also referred to as a digital fingerprint.
A particularly important property of the hash value is that even a slight change in the input data of the hash function causes a very large change in the hash value calculated therefrom.
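By way of illustration only, this property can be observed with a short Python sketch. It assumes the SHA-256 implementation of the Python standard library (hashlib) as the hash function; the two example messages and the helper name differing_bits are hypothetical and not part of the present description.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Two example messages that differ only in their last character.
msg1 = b"hash input data"
msg2 = b"hash input datb"

h1 = hashlib.sha256(msg1).digest()
h2 = hashlib.sha256(msg2).digest()

changed = differing_bits(h1, h2)
print(f"{changed} of {len(h1) * 8} hash value bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to differ, although the exact count depends on the messages chosen.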
In addition, cryptographic hash algorithms can have three specific properties:
1. The so-called “preimage resistance,” which means that, for any given output value of the hash algorithm, it is realistically not possible, given finite available computing power, to find an associated input data value.
2. The so-called “second preimage resistance,” which means that given knowledge of a data pair made up of an input data value and the associated output data value (hash value) of a hash function, it is realistically not possible to find a second input data value that results in the same output data value, i.e., hash value.
3. “Collision resistance,” which means that it is realistically not possible to find two input data values that result in the same hash value.
An object of the present invention is to improve a method and a device of the type named above in such a way that a simpler, efficient implementation is enabled.
This object may be achieved, for example, with a method including the following steps:
According to an example embodiment of the present invention, it has been recognized that the above-defined rules for modifying the input data blocks and the working data blocks enable a particularly efficient technical implementation of the method for generating the hash value. Particularly advantageously, in this way implementations can be realized that have a much lower requirement for gate equivalents than the conventional implementations, based for example on U.S. Pat. No. 6,829,355 B2.
In addition, in the design of the present invention, it may be particularly advantageous if per working cycle only one input data block has to be modified, and that the functions G, F proposed according to the present invention act only on two working data blocks, namely W0, n+1, W4, n+1.
In a preferred specific embodiment, the steps of division of the input data into 16 input data blocks and of the initialization of eight working data blocks can take place simultaneously. Alternatively, these steps can also be carried out in succession or in overlapping fashion.
In an advantageous specific embodiment, it is provided that
Particularly preferably, the functions ROTRy (x), SHRy (x) are defined in the same way as in “Federal Information Processing Standards Publication, Secure Hash Standard, FIPS PUB 180-3, 2008.”
In another advantageous specific embodiment, it is provided that
B) in the case where m=2
In variant A) of the above-named specific embodiment, the input data from which the hash value is generated are thus divided into 16 input data blocks each having a length of 32 bits. Variant A) of the present specific embodiment represents a starting point for indicating a method for generating a hash value that is compatible with the SHA-2 standard of type SHA 256, as shown below.
Variant B) of the above-named specific embodiment is a variant of the present invention representing a basis for hash value formation according to the SHA-2 standard type SHA512.
With regard to the shift and rotation operations on the various input data blocks and working data blocks, reference is further made to the FIPS standard identified above. The corresponding functions are defined there in detail, and are preferably used in the same way in the present context.
In a further preferred specific embodiment, eight hash data blocks are provided, each of the eight hash data blocks having a length of 32*m bits, and where after execution (r*N) times of step c) according to patent claim 1, the content of the working data blocks is added, preferably blockwise, to the content of the hash data blocks, where r is a whole number greater than or equal to 1. In this way, after each execution of step c), i.e., the modification of the input data blocks and of the working data blocks according to the rules proposed according to the present invention, a hash value is iteratively formed that is stored in the hash data blocks. In a preferred specific embodiment, N=64 and m=1, so that each hash data block has a length of 32 bits.
As long as a length of the input data from which the hash value is to be formed does not exceed for example 512 bits, according to a specific embodiment it is adequate to write the input data completely to the input data blocks and to carry out the method according to the present invention. After the execution N times of step c) of modification, data are then already present in the working data blocks that can be used as the hash value.
For the case in which the hash value is to be formed using the design of the present invention from input data longer than 512 bits, after the step c) of modification has been executed N times it is however possible, as proposed above, to first copy the content of the working data blocks to the hash data blocks, or add it thereto, and to subsequently carry out at least one further execution N times of step c) of the method, so that a hash value is formed, or accumulated, iteratively in the hash data blocks that is a function of the total input data (greater than 512 bits).
According to a further advantageous specific embodiment, the step of addition of the content of the working data blocks to the content of the hash data blocks advantageously has the following steps: assignment of a sum of first data block W7,n and hash data block H7,n to hash data block H0, n+1. In other words, the content of working data block W7 of the current clock cycle n and the content of hash data block H7 of the current working or clock cycle n are used as input quantities for the adder, and the sum thereof is assigned to hash data block H0 for the following working cycle n+1. In addition, a respective value of hash data block HI-1 of the current clock cycle n is assigned to hash data block HI of the following clock cycle n+1, for I=1 through 7.
In a further advantageous specific embodiment, it is provided that m=1 and/or N=64 and/or in the step of initialization of the eight working data blocks the following assignment takes place: W0,0=0×6a09e667, W1,0=0×bb67ae85, W2,0=0×3c6ef372, W3,0=0×a54ff53a, W4,0=0×510e527f, W5,0=0×9b05688c, W6,0=0×1f83d9ab, W7,0=0×5be0cd19, and/or the eight hash data blocks are initialized using the following assignment: H0,0=0×6a09e667, H1,0=0×bb67ae85, H2,0=0×3c6ef372, H3,0=0×a54ff53a, H4,0=0×510e527f, H5,0=0×9b05688c, H6,0=0×1f83d9ab, H7,0=0×5be0cd19. In this specific embodiment, the method according to the present invention corresponds to the SHA-2 method of the type SHA 256 with regard to the result of the hash value. Thus, despite a calculation method deviating significantly from the existing art, the same hash values are obtained as in SHA 256.
Particularly advantageously, this variant of the present invention accordingly enables complete compatibility with the standardized SHA 256 method, although at the same time a significantly more efficient implementation is advantageously possible than in the known devices.
In a further advantageous specific embodiment, it is provided that m=2 and/or N=80 and/or in the step of initialization of the eight working data blocks the following assignment takes place: W0,0=0×6a09e667f3bcc908, W1,0=0×bb67ae8584caa73b, W2,0=0×3c6ef372fe94f82b, W3,0=0×a54ff53a5f1d36f1, W4,0=0×510e527fade682d1, W5,0=0×9b05688c2b3e6c1f, W6,0=0×1f83d9abfb41bd6b, W7,0=0×5be0cd19137e2179, and/or the eight hash data blocks are initialized using the following assignment: H0,0=0×6a09e667f3bcc908, H1,0=0×bb67ae8584caa73b, H2,0=0×3c6ef372fe94f82b, H3,0=0×a54ff53a5f1d36f1, H4,0=0×510e527fade682d1, H5,0=0×9b05688c2b3e6c1f, H6,0=0×1f83d9abfb41bd6b, H7,0=0×5be0cd19137e2179.
In this variant of the present invention, a compatibility with the SHA 512 standard is advantageously provided, and again a particularly efficient implementation is enabled that requires a lower number of gate equivalents than the conventional systems.
In further specific embodiments, compatibility can also be produced to the existing standards SHA224 and SHA384. For this purpose, instead of the above-mentioned initialization values for the working data blocks and/or the hash data blocks in SHA256 or SHA512, the values from chapters 5.3.2 (SHA224) or 5.3.4 (SHA384) of the document “FIPS PUB 180-4 FEDERAL INFORMATION PROCESSING STANDARDS PUBLICATION Secure Hash Standard (SHS) CATEGORY: COMPUTER SECURITY SUBCATEGORY: CRYPTOGRAPHY Information Technology Laboratory National Institute of Standards and Technology, Gaithersburg, Md. 20899-8900, March 2012” are to be used, where the parameter m=1 is to be chosen for SHA224 and m=2 is to be chosen for SHA384.
In addition, in a specific embodiment for SHA224 compatibility it can be provided to use only seven of the eight hash data blocks as output hash value (7*32 bits (m=1) yields 224 bits).
In addition, in a specific embodiment for SHA384 compatibility it can be provided to use only six of the eight hash data blocks as output hash value (6*64 bits (m=2) yields 384 bits).
In a further advantageous specific embodiment, it is provided that a first shift register is used for the at least temporary storage of the input data blocks. Alternatively or as a supplement, a second shift register can be used for the at least temporary storage of the working data blocks. Further alternatively or as a supplement, advantageously a third shift register can be used for the at least temporary storage of the hash data blocks.
The use of one or more shift registers for storing the corresponding data blocks is particularly advantageous because the method according to the present invention having the functions T, G, and F can be implemented very efficiently using shift registers. In particular, in this way it is also possible to omit numerous multiplexers or address decoders etc., as are required in conventional implementations, again significantly reducing the complexity and thus also the costs of a corresponding implementation in circuitry of the present invention.
In a further advantageous specific embodiment, it is provided that the step of assignment of the content of input data block Mi,n to input data block Mi-1,n+1 for i=1 through 15 has a preferably blockwise shifting of the content of input data block Mi,n in the first shift register, and/or the step of assignment of the content of working data block Wk,n to working data block Wk+1,n+1 for k=0, k=1, k=2 and for k=4, k=5, k=6 has a preferably blockwise shifting of the content of working data block Wk,n in the second shift register, and/or the step of assignment of the value of hash data block HI-1,n to hash data block HI,n+1 for I=1 through 7 has a preferably blockwise shifting of the content of hash data block HI-1,n in the third shift register.
Particularly advantageously, in a further specific embodiment it is provided that in a first operating phase the first shift register and the second shift register are clocked together for N clock cycles in order to control the preferably blockwise shifting of the content of the first shift register and the preferably blockwise shifting of the content of the second shift register.
During this first operating phase, if a third shift register is provided for the storage of the hash data blocks, this third shift register does not yet have to be clocked. In a second operating phase that preferably directly follows the first operating phase, it is provided that the second shift register and the third shift register are clocked together for 8 clock cycles, no clocking of the first shift register preferably taking place during the second operating phase.
In this way, a particularly efficient and energy-saving operation is enabled of the shift registers that can be used according to the present invention.
In a further advantageous specific embodiment, it is provided that
The above calculations enable a particularly efficient determination of the corresponding terms for evaluating the functions proposed according to the present invention, and avoid unnecessary multiple calculations of the same expressions.
Below, exemplary specific embodiments of the present invention are explained with reference to the figures.
a,b,c schematically each show input data blocks and working data blocks according to a specific embodiment, in different clock or working cycles.
A further hash value formation, this time using second input data MSG2 which are different from first input data MSG1, but using the same hash algorithm, is designated by arrow A2 in
In a following step 210, cf.
In a preferred specific embodiment, the steps of division 200 of the input data and of the initialization 210 of eight working data blocks can also take place simultaneously. Alternatively, these steps can also take place in succession or in overlapping fashion.
In a following step 220, the input data blocks, or at least one of the input data blocks, and the working data blocks, or at least one of the working data blocks, are modified according to the rules described in detail below, in order to generate a hash value.
For this purpose,
Device 100 has at the input side a message MSG, and as a function of this message MSG forms a hash value HW, using the method according to the present invention, and outputs it at its output. The hash value formation takes place for example in processing unit 110, which is fashioned for the execution of the method according to the present invention.
Optionally, device 100 can also have a data division unit 120, indicated by a dashed rectangle, which conditions message MSG before this message is supplied to processing unit 110 in the form of digital input data M.
For example, according to a specific embodiment of the present invention in which the parameter m is selected to be m=1, the above-described 16 input data blocks can overall accommodate 512 bits of data. If the message MSG from which hash value HW is to be formed has exactly 512 bits, then message MSG can be supplied directly as digital input data M to processing unit 110 for the hash value formation.
If message MSG has a length less than 512 bits, it is for example possible to adapt the length of message MSG to the reference length of 512 bits by filling bit locations in a predefined manner, in particular using padding. The padding can for example include the annexing of pre-determinable bit sequences at the beginning or at the end of message MSG. In this case, input data M are thus obtained from message MSG using padding, where the padding can be carried out for example by unit 120.
If message MSG has a length greater than 512 bits, the method of the present invention is also applicable; in this case, preferably first a decomposition of message MSG takes place into blocks each having 512 bits plus, if warranted, a remaining data block having a length less than 512 bits, which are supplied in succession to processing unit 110 for hash value formation.
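The conditioning of message MSG can be sketched, purely as an example, in Python. The sketch assumes the padding scheme of the Secure Hash Standard (a single 1 bit, zero bits, and a trailing length field), which is only one of the possible padding variants mentioned above; the function names pad_message and split_blocks are hypothetical.

```python
def pad_message(msg: bytes, m: int = 1) -> bytes:
    """Pad msg to a multiple of 512*m bits (assumed scheme: Secure Hash Standard)."""
    block_bytes = 64 * m           # 512 bits for m=1, 1024 bits for m=2
    length_bytes = 8 * m           # 64-bit or 128-bit length field
    bit_length = len(msg) * 8
    padded = msg + b"\x80"         # a single 1 bit followed by seven 0 bits
    # Zero-fill so that the length field exactly completes the last block.
    padded += b"\x00" * ((-len(padded) - length_bytes) % block_bytes)
    padded += bit_length.to_bytes(length_bytes, "big")
    return padded

def split_blocks(padded: bytes, m: int = 1) -> list:
    """Split padded data into groups of 16 input data blocks M0..M15 of 32*m bits."""
    word_bytes = 4 * m
    block_bytes = 16 * word_bytes
    blocks = []
    for off in range(0, len(padded), block_bytes):
        chunk = padded[off:off + block_bytes]
        blocks.append([int.from_bytes(chunk[i:i + word_bytes], "big")
                       for i in range(0, block_bytes, word_bytes)])
    return blocks

blocks = split_blocks(pad_message(b"abc"))
print(len(blocks), "group(s) of", len(blocks[0]), "input data blocks")
```

In this sketch, a message shorter than 448 bits (for m=1) yields exactly one group of 16 input data blocks, and longer messages yield several groups that are processed one after the other.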
For the following description, it is assumed that message MSG (
Working data blocks W0 through W7 can also be initialized using pre-determinable values. In a further specific embodiment, however, this is not required; that is, the initialization can take place using the null value or random values or the like.
However, particularly preferably, for the initialization of working blocks W0 through W7 in a further specific embodiment the following values are used for the initialization:
W0,0=0×6a09e667, W1,0=0×bb67ae85, W2,0=0×3c6ef372, W3,0=0×a54ff53a, W4,0=0×510e527f, W5,0=0×9b05688c, W6,0=0×1f83d9ab, W7,0=0×5be0cd19.
In the present notation, the prefix “0×” indicates that the initialization values for the working data blocks are given as hexadecimal numbers. The first index indicates which of the eight working data blocks is concerned, and the second index indicates a working cycle for the execution of the hash value formation. For example, working data block W0 is thus initialized for the 0th working cycle (n=0) with the hexadecimal number 6a09e667 (W0,0=0×6a09e667), and so forth.
After the initialization, there results the state shown schematically in
Working data blocks W0 through W7 are for example initialized with their initialization values W0,0 through W7,0 according to the statements made above.
After the initialization, which corresponds to the zeroth working or clock cycle, i.e. n=0 (cf.
Step 222a (
In a further step 222b (
If a second shift register is used for the at least temporary storage of working data blocks Wk, then the shift operation corresponding to method step 222b according to the present invention can preferably be carried out simultaneously or synchronously with the shift operation according to step 222a, so that the same control signals can be used for the relevant shift registers.
As can be seen from a comparison of input data blocks M0 through M15 and working data blocks W0 through W7 at the first working cycle n=0 with input data blocks M0 through M15 and working data blocks W0 through W7 according to the following working cycle n=1 (
This can be implemented particularly efficiently using shift registers.
Only input data block M15, and working data blocks W0, W4, obtain their new (present at working cycle n=1) content not through a shift operation but rather through the evaluation of functions T, G, F provided according to the present invention.
Accordingly, input data block M15 according to
The assignment of the function values of function T, G, F to the corresponding data blocks takes place, in the flow diagram according to
Particularly preferably, in a specific embodiment the method sequence described above with reference to
In a further advantageous specific embodiment, in particular after execution N times of steps 222a through 228 according to
The new evaluation of function T according to the present invention, this time based on input values of working cycle n=1, has resulted in the assignment of a corresponding functional value T1 only to input data block M15, i.e. M15, 2=T1. The same holds for working data blocks W0, W4, to which the new function values G1, F1 have been assigned, i.e. W0,2 =G1 and W4, 2=F1.
In a particularly preferred specific embodiment, the method sequence according to the present invention of steps 222a through 228 is repeated N=64 times, as indicated by the dashed arrow from step 228 to step 222a in
In a particularly preferred specific embodiment, for which m=1 is selected, there results the following for the definition of functions T, G, F according to the present invention:
T=M0,n+M9,n+(ROTR17(M14,n) XOR ROTR19(M14,n) XOR SHR10(M14,n))+(ROTR7(M1,n) XOR ROTR18(M1,n) XOR SHR3(M1,n)), [Equation 1],
where ROTRy (x) is a bitwise rotation of operand x by y bits to the right, where SHRy(x) is a bitwise logical shift of operand x by y bits to the right, where XOR is an exclusive OR operation;
G=T0+T1 [Equation 2],
where
T0=M0,n+W7,n+(ROTR6(W4,n) XOR ROTR11(W4,n) XOR ROTR25(W4,n))+((W4,n AND W5,n) XOR (NOT(W4,n) AND W6,n))+Kn, [Equation 3],
where T1=(ROTR2(W0,n) XOR ROTR13(W0,n) XOR ROTR22(W0,n))+((W0,n AND W1,n) XOR (W0,n AND W2,n) XOR (W1,n AND W2,n)), [Equation 4],
where AND is a bitwise AND operation, NOT is a bitwise negation, Wk,n is the kth working data block of processing cycle n, Kn is a specifiable constant (preferably, for different working cycles n, in each case a different value is indicated for constant Kn); the function F is defined as:
F=W3,n+T0 [Equation 5].
It is to be noted that values T0, T1 according to Equation 3 and Equation 4 are auxiliary quantities for the calculation of functions G, F, and that in particular auxiliary quantity T0 according to Equation 3 is not to be confused with quantity Tn=0, abbreviated T0, which quantity T0 represents the initial value of the function T (Equation 1) at working cycle n=0.
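Equations 1 through 5 and a single working cycle for m=1 can be illustrated by the following Python sketch. It is a software model under stated assumptions, not a description of the circuitry: the lists M and W stand in for the input data blocks and working data blocks, additions are taken modulo 2^32, and the function names (func_T, aux_T0, aux_T1, working_cycle) are hypothetical.

```python
MASK = 0xFFFFFFFF  # 32-bit blocks (m=1); all additions are modulo 2**32

def rotr(x, y):
    """ROTRy(x): bitwise rotation of a 32-bit operand x by y bits to the right."""
    return ((x >> y) | (x << (32 - y))) & MASK

def shr(x, y):
    """SHRy(x): bitwise logical shift of operand x by y bits to the right."""
    return x >> y

def func_T(M):
    # Equation 1
    return (M[0] + M[9]
            + (rotr(M[14], 17) ^ rotr(M[14], 19) ^ shr(M[14], 10))
            + (rotr(M[1], 7) ^ rotr(M[1], 18) ^ shr(M[1], 3))) & MASK

def aux_T0(M, W, K_n):
    # Equation 3 (auxiliary quantity T0, not the value of function T)
    return (M[0] + W[7]
            + (rotr(W[4], 6) ^ rotr(W[4], 11) ^ rotr(W[4], 25))
            + ((W[4] & W[5]) ^ (~W[4] & W[6]))
            + K_n) & MASK

def aux_T1(W):
    # Equation 4
    return ((rotr(W[0], 2) ^ rotr(W[0], 13) ^ rotr(W[0], 22))
            + ((W[0] & W[1]) ^ (W[0] & W[2]) ^ (W[1] & W[2]))) & MASK

def working_cycle(M, W, K_n):
    """Return the input and working data blocks of working cycle n+1."""
    t0 = aux_T0(M, W, K_n)
    G = (t0 + aux_T1(W)) & MASK           # Equation 2
    F = (W[3] + t0) & MASK                # Equation 5
    new_M = M[1:] + [func_T(M)]           # Mi -> Mi-1, function T -> M15
    new_W = [G, W[0], W[1], W[2], F, W[4], W[5], W[6]]  # Wk -> Wk+1, G -> W0, F -> W4
    return new_M, new_W
```

Only M15, W0, and W4 receive computed values; every other block simply takes over the content of its neighbor, which is what makes a shift register realization attractive.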
Thus, in order for example to determine an output value T0 of function T for working cycle n=0 (initialization state), the contents of input data blocks M0,0, M9,0, M14,0, M1,0 are supplied to the function T as input quantities, and are subjected to corresponding shift or rotation operations. Output value T0 of function T for working cycle n=0 results as the sum of the individual expressions according to the above definition. According to the present invention, this value is assigned to input data block M15 of the next following working cycle (here: n=1; cf.
For the at least temporary storage of the 16 input data blocks M0 through M15, in device 1100 according to
the unit of a corresponding input data block. This means that after a shift operation for example the content of block M15 has been shifted into block M14. This is indicated in
A second shift register SR_W is provided for the at least temporary storage of working data blocks W0 through W7. Second shift register SR_W correspondingly has a total of eight data blocks each having a bit width of 32 bits (in the present case, m=1 is selected). A shift operation takes place for second shift register SR_W in a manner corresponding to first shift register SR_M, i.e., using corresponding controlling by a control signal not shown in
In addition,
The control signals for the above-described shift registers can be generated by a control unit (not shown) of device 1100, realized for example in the form of a state machine or also by an ASIC and/or FPGA or the like.
Device 1100 further has a first function block 1110 that is provided for the execution of first function T. For this purpose, first function block 1110 has an input 1112 via which the relevant input data can be supplied to first function block 1110. In the case in which parameter m=1 is selected, these are for example the contents of input data blocks M0, M1, M9, M14. The supplying of the corresponding input data to input 1112 of function block 1110 is symbolized in
First function block 1110 correspondingly evaluates first function T and outputs, at its output 1114, a corresponding function value that is supplied to shift register SR_M, specifically to the data block that corresponds to input data block M15. In this way, the above-described step c3) of the assignment of an output value of first function T to input data block M15 is realized. For this purpose, output 1114 of first function unit 1110 is preferably connected directly to a preferably parallel input M15E of input data block M15. In terms of circuitry, this can be realized for example by a 32-bit parallel data bus from output 1114 of first function unit 1110 to input M15E of input data block M15.
In a comparable manner, the input data (input data blocks M0, M1, M9, M14) can be supplied via parallel data buses from first shift register SR_M to input 1112 of first function unit 1110.
The output value of function T or of function block 1110, formed in the nth working cycle based on input data M0,n, M1,n, M9,n, M14,n, is designated Tn (cf. also
Advantageously, the connection of components 1114, M15E of the specific embodiment according to
Also shown in
As a function of these input data, second function unit 1120 evaluates second function G according to the present invention and outputs, at output 1124, a corresponding output value (Gn for working cycle n) of function G. In a particularly preferred specific embodiment, this output value is supplied directly to the data block of second shift register SR_W, which corresponds to first working data block W0. For this purpose, preferably a direct data connection is provided between output 1124 of second function unit 1120 and input W0E of the relevant data block W0 of second shift register SR_W, which can be fashioned for example in the form of a 32-bit-wide parallel data bus.
A third function unit 1130 is also shown in
The structure shown in
In a particularly preferred specific embodiment, in a first step input data blocks M0 through M15 (
W0,0=0×6a09e667, W1,0=0×bb67ae85, W2,0=0×3c6ef372, W3,0=0×a54ff53a, W4,0=0×510e527f, W5,0=0×9b05688c, W6,0=0×1f83d9ab, W7,0=0×5be0cd19;
cf. step 210 from
In a particularly preferred specific embodiment, hash data blocks H0 through H7 (
H0,0=0×6a09e667, H1,0=0×bb67ae85, H2,0=0×3c6ef372, H3,0=0×a54ff53a, H4,0=0×510e527f, H5,0=0×9b05688c, H6,0=0×1f83d9ab, H7,0=0×5be0cd19.
In a preferred specific embodiment, device 1100 also has, besides the components described above with reference to
In order to generate a hash value as a function of digital input data written to input data blocks M0 through M15 in the context of the initialization process, according to a particularly preferred specific embodiment the method described in the following is carried out.
Beginning from the initialization state (working cycle n=0, working data blocks and hash data blocks initialized as above with nonzero values, in the present case indicated as hexadecimal numbers), there takes place a modification of input data blocks M0 through M15, in the present case implemented using shift register SR_M, or its controlling or clocking, according to the method sequence according to
Analogously to this, working data blocks W0 through W7 are also modified according to the method sequence from
In a particularly preferred specific embodiment, the sequence according to steps 222a, 222b, 224, 226, 228 from
After the 64th execution of the method sequence according to
If, however, message MSG forming the basis of the hash value formation (
In a particularly preferred specific embodiment, the addition of the content of second shift register SR_W to the content of third shift register SR_H takes place using adder 1200 and a clocking eight times of shift register SR_W, SR_H, by which second operating phase BP2 (
During this second operating phase BP2, there does not take place a clocking of first shift register SR_M, thus reducing the electrical consumption of energy. In a specific embodiment, the clocking of the shift registers can take place in such a way that in the first operating phase a first shift enable signal SE1 is supplied to shift registers SR_M, SR_W, causing a synchronous clocking of these shift registers SR_M, SR_W, and that in the second operating phase a second shift enable signal SE2 is supplied to shift registers SR_W, SR_H, causing a synchronous clocking of these shift registers SR_W, SR_H.
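The clocking scheme with the two shift enable signals can be modeled, as a sketch only, by a small Python generator. It assumes N=64 clock cycles for the first operating phase and eight clock cycles for second operating phase BP2, as described; the function name shift_enable_schedule and the representation of SE1 and SE2 as 0/1 values are hypothetical.

```python
def shift_enable_schedule(num_blocks: int, N: int = 64):
    """Yield the pair (SE1, SE2) for every clock cycle of the hash value formation."""
    for _ in range(num_blocks):
        for _ in range(N):      # first operating phase: SR_M and SR_W are clocked
            yield (1, 0)
        for _ in range(8):      # second operating phase BP2: SR_W and SR_H are clocked
            yield (0, 1)

cycles = list(shift_enable_schedule(num_blocks=2))
se1_active = sum(se1 for se1, _ in cycles)
se2_active = sum(se2 for _, se2 in cycles)
print(len(cycles), se1_active, se2_active)
```

In this model, first shift register SR_M is idle during every cycle in which SE2 is active, which corresponds to the reduced energy consumption mentioned above.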
In the following, the addition of the content of second shift register SR_W to the content of third shift register SR_H is described in more detail.
At the beginning of second operating phase BP2 (
This process, which for example can also include a synchronous clocking of shift registers SR_W, SR_H, is repeated a total of eight times, so that effectively a “256-bit addition” has taken place of the contents of shift registers SR_W, SR_H, whose result is now present in third shift register SR_H in the form of hash data blocks H0 through H7. In contrast to a true 256-bit addition (or 512-bit addition for m=2), in a specific embodiment the addition however preferably takes place in block-by-block fashion per 32-bit block (m=1) or per 64-bit block (m=2), preferably without carrying between adjacent blocks. To this extent, there exists a difference from the addition of true, 256-bit-wide data words.
At the end of second operating phase BP2 (
If a further time index ν is introduced for second operating phase BP2 that is different from the time index, or working cycle index, n of the first operating phase, then the presently described addition of the contents of shift registers SR_W, SR_H in second operating phase BP2 can be indicated by the following rules.
At the beginning of second operating phase BP2, which corresponds to the end of the first operating phase, the index indicating the working cycle of the input data register has the value n=63. At the same time, time index ν is initialized for second operating phase BP2: ν=0. In working data blocks W0 through W7, data W0,n=63, . . . , W7,n=63 are present, also designated in the following as W0,ν=0, . . . , W7, ν=0. Likewise, hash data blocks H0 through H7 are in the following also designated H0,ν, . . . , H7, ν.
During second operating phase BP2, time index ν is incremented up to its maximum value of ν=7, thus defining eight addition cycles. In each addition cycle, the following assignments take place:
H0,ν+1=W7,ν+H7,ν
Hk,ν+1=Hk−1,ν for k=1, . . . , 7
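That eight such addition cycles amount to a blockwise addition of the working data blocks to the hash data blocks can be checked with the following Python sketch for m=1. The lists W and H model shift registers SR_W and SR_H; the value shifted into W0 during this phase is not relevant for the result and is set to zero here as an assumption. The function name second_phase is hypothetical.

```python
MASK = 0xFFFFFFFF  # block width of 32 bits (m=1), no carry between blocks

def second_phase(W, H):
    """Clock SR_W and SR_H eight times and return the resulting hash data blocks."""
    W, H = list(W), list(H)
    for _ in range(8):
        s = (W[7] + H[7]) & MASK   # adder output: W7 + H7
        H = [s] + H[:7]            # H0 <- sum, Hk <- Hk-1 for k=1..7
        W = [0] + W[:7]            # SR_W shifts as well (input to W0: assumption)
    return H

W = [1, 2, 3, 4, 5, 6, 7, 0xFFFFFFFF]
H = [10, 20, 30, 40, 50, 60, 70, 1]
result = second_phase(W, H)
print(result == [(w + h) & MASK for w, h in zip(W, H)])
```

Each sum re-enters the register at H0 and reaches its final position after the remaining shifts, so that after eight cycles block Hk holds the sum of the original contents of Wk and Hk.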
First, according to a specific embodiment, in first operating phase BP1 shift registers SR_M, SR_W are thus clocked N=64 times in order to form at least temporary hash values in working data blocks W, and subsequently in second operating phase BP2 shift registers SR_W, SR_H are clocked eight times in order to carry out the addition of the at least temporary hash values from the working data blocks to the values of the hash data blocks. Thus, up to this point a total of 64+8=72 working or clock cycles of device 1100 are required.
After the above-described addition of the content of second shift register SR_W to third shift register SR_H, in a preferred specific embodiment there advantageously takes place a renewed initialization of input data blocks M0 through M15, in which a next data block, including 512 bits length of message MSG, is written to first shift register SR_M. Subsequently, the two shift registers SR_M, SR_W are in turn operated for 64 clock pulses (n=64 to n=127) in the manner described above (cf. for example
Details concerning the division of a message MSG having more than 512 bits (for m=1) (and having more than 1024 bits for m=2), and its decomposition into 512-bit blocks (1024-bit blocks), or concerning a padding, can be learned for example from standard document FIPS180-2, described above.
If the values proposed according to a specific embodiment are used for the initialization of working data blocks W and hash data blocks H, the values also being the basis of standard document FIPS180-2, then the example method according to the present invention, using the implementation according to
The above-indicated initialization values can for example also be found in the document “FIPS PUB 180-4 FEDERAL INFORMATION PROCESSING STANDARDS PUBLICATION Secure Hash Standard (SHS) CATEGORY: COMPUTER SECURITY SUBCATEGORY: CRYPTOGRAPHY Information Technology Laboratory National Institute of Standards and Technology Gaithersburg, Md. 20899-8900 March 2012,” and for the initialization of the working data blocks and/or hash data blocks according to chapter 5.3.3 of the document, and for the initialization of constants Kn from Equation 3 of the present application according to chapter 4.2.2 of the document, namely for K0 , . . . , K63 these are the values 0×428a2f98, 0×71374491, 0×b5c0fbcf, 0×e9b5dba5, 0×3956c25b, 0×59f111f1, 0×923f82a4, 0×ab1c5ed5, 0×d807aa98, 0×12835b01, 0×243185be, 0×550c7dc3, 0×72be5d74, 0×80deb1fe, 0×9bdc06a7, 0×c19bf174, 0×e49b69c1, 0×efbe4786, 0×0fc19dc6, 0×240ca1cc, 0×2de92c6f, 0×4a7484aa, 0×5cb0a9dc, 0×76f988da, 0×983e5152, 0×a831c66d, 0×b00327c8, 0×bf597fc7, 0×c6e00bf3, 0×d5a79147, 0×06ca6351, 0×14292967, 0×27b70a85, 0×2e1b2138, 0×4d2c6dfc, 0×53380d13, 0×650a7354, 0×766a0abb, 0×81c2c92e, 0×92722c85, 0×a2bfe8a1, 0×a81a664b, 0×c24b8b70, 0×c76c51a3, 0×d192e819, 0×d6990624, 0×f40e3585, 0×106aa070, 0×19a4c116, 0×1e376c08, 0×2748774c, 0×34b0bcb5, 0×391c0cb3, 0×4ed8aa4a, 0×5b9cca4f, 0×682e6ff3, 0×748f82ee, 0×78a5636f, 0×84c87814, 0×8cc70208, 0×90befffa, 0×a4506ceb, 0×bef9a3f7, 0×c67178f2, i.e. for example K0=0×428a2f98, K7=0×ab1c5ed5, K8=0×d807aa98. These values for Kn also hold for SHA224; however, for SHA224 other values are to be selected for the initialization of the working data blocks and/or hash data blocks than for SHA256.
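The compatibility with SHA256 can be checked, for illustration, with the following Python software model of both operating phases for m=1. Several points are assumptions of this sketch rather than statements of the present description: the constants Kn are recomputed from the cube roots of the first 64 prime numbers, as the cited standard document describes them, instead of being typed in; the padding follows that standard document; and the sum formed in the second operating phase is also fed back into working data block W0, so that the working data blocks start the next 512-bit block from the current hash data blocks. All function names are hypothetical.

```python
import hashlib

MASK = 0xFFFFFFFF  # 32-bit blocks (m=1)
IV = [0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
      0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19]

def icbrt(n):
    """Floor of the cube root of a non-negative integer (binary search)."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

# First 64 primes; Kn = first 32 bits of the fractional part of the cube root.
PRIMES = [p for p in range(2, 312) if all(p % q for q in range(2, p))]
K = [icbrt(p << 96) & MASK for p in PRIMES]

def rotr(x, y):
    return ((x >> y) | (x << (32 - y))) & MASK

def cycle(M, W, Kn):
    """One working cycle: shift both registers and insert T, G, F."""
    T = (M[0] + M[9] + (rotr(M[14], 17) ^ rotr(M[14], 19) ^ (M[14] >> 10))
         + (rotr(M[1], 7) ^ rotr(M[1], 18) ^ (M[1] >> 3))) & MASK
    T0 = (M[0] + W[7] + (rotr(W[4], 6) ^ rotr(W[4], 11) ^ rotr(W[4], 25))
          + ((W[4] & W[5]) ^ (~W[4] & W[6])) + Kn) & MASK
    T1 = ((rotr(W[0], 2) ^ rotr(W[0], 13) ^ rotr(W[0], 22))
          + ((W[0] & W[1]) ^ (W[0] & W[2]) ^ (W[1] & W[2]))) & MASK
    M = M[1:] + [T]
    W = [(T0 + T1) & MASK, W[0], W[1], W[2], (W[3] + T0) & MASK, W[4], W[5], W[6]]
    return M, W

def hash_m1(msg: bytes) -> bytes:
    data = msg + b"\x80"                                  # padding (assumed scheme)
    data += b"\x00" * ((-len(data) - 8) % 64) + (8 * len(msg)).to_bytes(8, "big")
    W, H = list(IV), list(IV)
    for off in range(0, len(data), 64):
        M = [int.from_bytes(data[off + 4 * i:off + 4 * i + 4], "big") for i in range(16)]
        for n in range(64):                               # first operating phase
            M, W = cycle(M, W, K[n])
        for _ in range(8):                                # second operating phase BP2
            s = (W[7] + H[7]) & MASK
            H = [s] + H[:7]
            W = [s] + W[:7]                               # assumption: sum fed back into W0
    return b"".join(h.to_bytes(4, "big") for h in H)

print(hash_m1(b"abc") == hashlib.sha256(b"abc").digest())
```

The comparison against hashlib serves only as a plausibility check of the model; it says nothing about the gate count of a hardware realization.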
If, for the initialization of the working data block W and hash data blocks H, the values proposed according to a further specific embodiment are used, also forming the basis of standard document FIPS180-2, then the method according to the present invention, using the implementation according to
These values for Kn also hold for SHA384; however, for SHA384 other values are to be selected for the initialization of the working data blocks and/or hash data blocks than for SHA512.
Differing from conventional SHA-2 implementations, device 1100 according to the example embodiment of the present invention however uses a less complex design, which is reflected in particular in a substantial reduction of the number of required gate equivalents for the implementation of device 1100 according to the present invention. In particular, the example implementation according to
In a further preferred specific embodiment, the value 2 is chosen for the parameter m. In this case, input data blocks M0 through M15, working data blocks W0 through W7, and, if warranted, hash data blocks H0 through H7 each have a data width of 64 bits. The same preferably holds for data buses that may be present between the relevant data blocks, or registers containing them. The number of data blocks itself does not change. To this extent, the structure shown in
Differing from the above-described specific embodiment, which relates to the 32-bit implementation with m=1, for a 64-bit implementation (m=2) the following definitions are to be selected for the functions T, G, F:
T = M0,n + M9,n + (ROTR19(M14,n) XOR ROTR61(M14,n) XOR SHR6(M14,n)) + (ROTR1(M1,n) XOR ROTR8(M1,n) XOR SHR7(M1,n)),
G = T0 + T1,
where
T0 = M0,n + W7,n + (ROTR14(W4,n) XOR ROTR18(W4,n) XOR ROTR41(W4,n)) + ((W4,n AND W5,n) XOR (NOT(W4,n) AND W6,n)) + Kn,
T1 = (ROTR28(W0,n) XOR ROTR34(W0,n) XOR ROTR39(W0,n)) + ((W0,n AND W1,n) XOR (W0,n AND W2,n) XOR (W1,n AND W2,n)),
and
F = W3,n + T0.
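The functions T, G, F for m=2 can be illustrated by the following Python sketch. It is an illustration under stated assumptions, not the implementation of the present application: the third rotation in T1 is taken as 39, as in the SHA-512 definition of the Secure Hash Standard; the way in which T, G and F are fed back into the registers (T into M15, G into W0, F into W4, all other blocks shifted by one position) is assumed from the conventional SHA-2 round structure; and the helper compress, together with its parameters, is hypothetical.

```python
MASK = (1 << 64) - 1  # all additions are modulo 2**64 for m=2

def rotr(x, r):
    """Rotate the 64-bit word x right by r bits."""
    return ((x >> r) | (x << (64 - r))) & MASK

def T(M):
    """Next message-schedule word computed from input data blocks M0..M15."""
    s1 = rotr(M[14], 19) ^ rotr(M[14], 61) ^ (M[14] >> 6)
    s0 = rotr(M[1], 1) ^ rotr(M[1], 8) ^ (M[1] >> 7)
    return (M[0] + M[9] + s1 + s0) & MASK

def T0(M, W, k):
    ch = (W[4] & W[5]) ^ (~W[4] & W[6])
    rot = rotr(W[4], 14) ^ rotr(W[4], 18) ^ rotr(W[4], 41)
    return (M[0] + W[7] + rot + ch + k) & MASK

def T1(W):
    maj = (W[0] & W[1]) ^ (W[0] & W[2]) ^ (W[1] & W[2])
    # Third rotation assumed to be 39, as in SHA-512.
    return ((rotr(W[0], 28) ^ rotr(W[0], 34) ^ rotr(W[0], 39)) + maj) & MASK

def G(M, W, k):
    return (T0(M, W, k) + T1(W)) & MASK

def F(M, W, k):
    return (W[3] + T0(M, W, k)) & MASK

def compress(block, init, k_table):
    """Hypothetical helper: process one block of 16 words in len(k_table) cycles."""
    M, W = list(block), list(init)
    for k in k_table:
        g, f = G(M, W, k), F(M, W, k)
        M = M[1:] + [T(M)]                              # assumed shift of M0..M15
        W = [g, W[0], W[1], W[2], f, W[4], W[5], W[6]]  # assumed shift of W0..W7
    # Final sum of the initial hash data blocks and the working data blocks.
    return [(h + w) & MASK for h, w in zip(init, W)]
```

Note that G and F read only M0 from the input data blocks, whereas T additionally reads M1, M9 and M14. When compress is supplied with one padded 1024-bit block, the SHA512 initialization values and the 80 SHA512 round constants, the concatenated result can be compared with the output of an existing SHA512 implementation.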
Given the selection of the parameter m=2, N=80 and the initialization values for working data blocks W0 through W7 and hash data blocks H0 through H7 according to the following equations, it is advantageously ensured that the method according to the present invention is completely compatible, with regard to the obtained hash values, with the SHA-2 standard of type SHA512.
W0,0=0x6a09e667f3bcc908, W1,0=0xbb67ae8584caa73b, W2,0=0x3c6ef372fe94f82b, W3,0=0xa54ff53a5f1d36f1, W4,0=0x510e527fade682d1, W5,0=0x9b05688c2b3e6c1f, W6,0=0x1f83d9abfb41bd6b, W7,0=0x5be0cd19137e2179,
H0,0=0x6a09e667f3bcc908, H1,0=0xbb67ae8584caa73b, H2,0=0x3c6ef372fe94f82b, H3,0=0xa54ff53a5f1d36f1, H4,0=0x510e527fade682d1, H5,0=0x9b05688c2b3e6c1f, H6,0=0x1f83d9abfb41bd6b, H7,0=0x5be0cd19137e2179.
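The initialization values listed above can be cross-checked with the following Python sketch. It relies on the definition given in the Secure Hash Standard, according to which the SHA512 initial values are the first 64 bits of the fractional parts of the square roots of the first eight prime numbers; the function name sha512_initial_values is hypothetical.

```python
from math import isqrt

def sha512_initial_values():
    """Recompute the eight 64-bit initialization values from their definition."""
    first_eight_primes = [2, 3, 5, 7, 11, 13, 17, 19]
    # floor(sqrt(p) * 2**64) equals isqrt(p * 2**128); the low 64 bits are
    # the first 64 fractional bits of the square root of p.
    return [isqrt(p << 128) & (2**64 - 1) for p in first_eight_primes]

for index, value in enumerate(sha512_initial_values()):
    print(f"W{index},0 = H{index},0 = 0x{value:016x}")
```

Exact integer square roots are used so that no rounding error can affect the low-order bits.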
In further advantageous specific embodiments, initialization values can also be used that deviate from the above-proposed initialization values, and/or other values can be chosen for constants Kn; in this case, complete compatibility with the SHA-2 methods is then not present. However, here as well powerful implementations result for the determination of hash values with low hardware complexity.
The use of the design according to the present invention may result in a reduction of approximately 40% in the required number of gate equivalents. In addition, when there is compatibility with SHA256 only 72 working cycles are required, and when there is compatibility with SHA512 only 88 working cycles are required.
If, instead of the “serial” addition brought about by eight clockings of shift registers SR_W, SR_H and of the 32-bit-wide, or 64-bit-wide, adder 1200 (
In a further advantageous specific embodiment, in which message MSG forming the basis of the hash value formation is longer than 512 bits, a subsequent initialization of first shift register SR_M (for cycles n=64 through n=127) with a following block M of message MSG can take place already 16 working cycles earlier than for the above-described specific embodiment. This is possible because functions G, F proposed according to the present invention advantageously require only the content of input data block M0, but not the content of further input data blocks. Due to the topology of the specific embodiment according to
A further advantage of the shift register-based architecture illustrated by
In a further advantageous case of application, in which the length of message MSG forming the basis of the hash value formation is exactly 512 bits, the sum formation (cf. e.g. step 229 from
The design according to the present invention can be realized in the form of an ASIC and/or FPGA and/or microcontroller and/or DSP (digital signal processor), or through direct implementation in circuitry, resulting in the particular advantages of low complexity and efficient hash value formation. According to investigations carried out by applicant, the design according to the present invention can be realized for example in a standard CMOS process, using a maximum of approximately 12,000 gates or gate equivalents. For example, device 1100 according to
In a further particularly preferred specific embodiment, the round constants Kn can also be stored in an SRAM (static random access memory), further significantly reducing the complexity of the implementation in circuitry. This is recommended in particular in the case of an at least partial implementation of the present invention in an FPGA.
The design according to the present invention can for example also be realized in the form of VHDL code; in this case, planned electronic circuits can be extended with the functionality according to the present invention by adding corresponding VHDL code.
Number | Date | Country | Kind |
---|---|---|---|
102013208836.1 | May 2013 | DE | national |