This invention relates to computing, communications, data security, and hash functions (hashing).
Hash functions are well known in the field of data security. The principle is to take data (a digital message, digital signature, etc.) and use it as the input to a hash function, producing an output called a “digest” of predetermined length that is intended to uniquely identify (“fingerprint”) the message. A secure (cryptographic) hash is such that any alteration in the message results, with overwhelming probability, in a different digest, even though the digest is much shorter than the message. Secure hash functions are also collision resistant and one-way, that is, preimage resistant and second preimage resistant.
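The fingerprint principle described above can be illustrated with a minimal Python sketch. It uses SHA-256 from the standard library purely as a stand-in for a cryptographic hash; it is not the hash function disclosed here, and the helper name digest_hex is hypothetical.

```python
import hashlib


def digest_hex(message: bytes) -> str:
    """Return the SHA-256 digest of `message` as a hexadecimal string.

    SHA-256 is only a stand-in for "a secure hash function".
    """
    return hashlib.sha256(message).hexdigest()


# A short message and a much longer one both yield a digest of the same
# predetermined length (256 bits, i.e. 64 hexadecimal characters).
short_digest = digest_hex(b"abc")
long_digest = digest_hex(b"abc" * 100_000)

# Altering a single character of the message changes the digest.
altered_digest = digest_hex(b"abd")

print(len(short_digest), len(long_digest))
print(short_digest != altered_digest)
```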
Cryptography and data security deal with digital signatures, encryption, document authentication, and hashing. In all of these fields, there is a set of basic tools/functions which are widely used, for instance hash functions. Several properties are required for the use of hash functions in cryptographic applications: preimage resistance, second preimage resistance and collision resistance.
In recent years, much effort has been expended on finding new hash functions, since collisions (weaknesses or successful attacks) have been found in the widely used SHA-0/1 and MD5 standard hashes.
In cryptography, hash functions are essential for many primitives and protocols. After the above-mentioned security crisis for MD5 and SHA-0/1, two hash standards used for a long time without much concern about their security, the U.S. NIST (National Institute of Standards and Technology) launched an international competition to define the new standard for hash functions. The competition started in 2008. Among the competitors, many were broken easily in the first round, since the developers were not aware of the cryptographic issues. Among the submissions remaining in the second round, one called “Shabal” is one of the fastest-running (in terms of execution time).
Shabal has two parts: there is a mode of operation (see the published Shabal specification at: http://www.shabal.com/wp-content/uploads/Shabal.pdf, at Sec. 2.2), which is defined for a sufficiently secure permutation or function; and there is a proposed permutation designated P itself, described in Sec. 2.32 of the same document.
Certain cryptanalyses have been published in the course of the NIST competition; they do not break the Shabal hash function, but they show a certain non-randomness of the permutation P (see http://ehash.iaik.tugraz.at/wiki/Shabal).
Shabal makes use of three 32-bit word arrays designated A, B and C shown in
When the last block of message Mk has been processed, the process arrives at what is called the final rounds (see
The sizes of A, B, C, and M can be changed; the Shabal developers proposed B, C, and M of 16 32-bit words each, and A of 12 32-bit words.
Disclosed here is a new cryptographic (secure) hash function or process. The goal is a highly modular hash function that is also computationally efficient. The present hash function can conventionally be used for document integrity for exchanges and signatures. It can also be used as a derivation function or as an HMAC (hash-based message authentication code) by adding a key conventionally (as in, for instance, the well known HMAC-SHA1). The term “hash” as used herein is intended to encompass all these uses, both keyed and non-keyed.
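As a hedged illustration of the keyed use mentioned above, the following Python sketch builds an HMAC with the standard library. SHA-256 stands in for the underlying hash (the disclosed function is not available as a library), and the helper name keyed_digest is hypothetical.

```python
import hashlib
import hmac


def keyed_digest(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC over `message` under `key`.

    The underlying hash is SHA-256 here, chosen only as a stand-in.
    """
    return hmac.new(key, message, hashlib.sha256).digest()


tag_a = keyed_digest(b"key-one", b"transfer 100")
tag_b = keyed_digest(b"key-two", b"transfer 100")

# The same message under different keys gives different tags, so only a
# holder of the key can produce or check a valid tag.
print(tag_a != tag_b)
```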
A hash function is a deterministic procedure that accepts an arbitrary input value, and returns a hash value. The input value is called the message, and the resulting output hash value is called the digest. The message is authenticated by comparing the computed digest to an expected digest associated with the message.
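The authentication step described above, comparing a computed digest with an expected digest, might be sketched as follows. This is an illustrative sketch only: SHA-256 is a stand-in hash, and the function name is_authentic is hypothetical. The constant-time comparison is a common precaution rather than something the text requires.

```python
import hashlib
import hmac


def is_authentic(message: bytes, expected_digest: bytes) -> bool:
    """Recompute the digest of `message` and compare it to `expected_digest`."""
    computed = hashlib.sha256(message).digest()
    # compare_digest runs in time independent of where the inputs differ.
    return hmac.compare_digest(computed, expected_digest)


original = b"signed contract, version 1"
expected = hashlib.sha256(original).digest()

print(is_authentic(original, expected))
print(is_authentic(b"signed contract, version 2", expected))
```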
Thus disclosed here is a new function using in one embodiment the Shabal mode of operation.
The presently disclosed exemplary embodiments use the Shabal mode of operation, since it is quite natural, simple and has been proven secure. Some of the present modifications to Shabal are in the definition of the permutation P. In their specification, the Shabal developers tried to be efficient for both hardware (e.g., dedicated integrated circuits such as FPGA or ASIC) and software implementations, and to avoid using too much memory, so that the function could practically be embedded as hardware in low-cost devices such as smart cards or RFID. Being efficient in that way is less of a concern here. Furthermore, the first year of external cryptanalysis showed weaknesses in the current Shabal P permutation. The present P function is believed to be free of these defects. Finally, the present hash function uses a function instead of a permutation, and remains compatible with the Shabal mode of operation. A function is not necessarily a permutation; a permutation is invertible, while a function may be noninvertible.
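The distinction between a permutation and a general function can be checked exhaustively on a small domain. The Python sketch below uses 8-bit values and two toy maps chosen only for illustration; neither is the Shabal P permutation nor the present P function, and all names are hypothetical.

```python
def is_permutation(f, domain_size: int) -> bool:
    """Return True if `f` maps range(domain_size) onto itself one-to-one."""
    outputs = sorted(f(x) for x in range(domain_size))
    return outputs == list(range(domain_size))


def rotate_left_8(x: int) -> int:
    """Rotate an 8-bit value left by one bit (invertible)."""
    return ((x << 1) | (x >> 7)) & 0xFF


def and_with_shift(x: int) -> int:
    """AND an 8-bit value with itself shifted right (not invertible)."""
    return x & (x >> 1)


print(is_permutation(rotate_left_8, 256))
print(is_permutation(and_with_shift, 256))
```

The second map sends both 0 and 1 to 0, so it cannot be inverted, which is the sense in which a function may be noninvertible.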
In the present hash function, A, B, C and M are each arrays of 32 32-bit words (i.e., 1024 bits). Having such large blocks has certain advantages (a security parameter called the ‘capacity’ is larger in this case than in Shabal), but at the same time, it is more complicated to build a sufficiently random function when the blocks are large. To minimize this difficulty, the present hash function uses a relatively large security parameter for the hash function, and may include more than 3 blank (final) rounds at the end.
For the present P function, one goal is to make array A very hard for the user (e.g., an attacker) to control; notably, the user cannot directly insert message words into array A, since here the modifications of array A are only performed indirectly through modifications of the other arrays. This increases security against known message attacks. Note that an attacker typically manipulates inputs to find collisions, which means attempting to break the hash function using a known message approach.
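The array sizes can be verified with simple arithmetic. The Python sketch below compares the sizes stated for the present hash function with those proposed by the Shabal developers; the constant and function names are hypothetical.

```python
WORD_BITS = 32

# Word counts per array: present hash function versus Shabal as proposed.
PRESENT_WORDS = {"A": 32, "B": 32, "C": 32, "M": 32}
SHABAL_WORDS = {"A": 12, "B": 16, "C": 16, "M": 16}


def size_in_bits(word_counts: dict) -> dict:
    """Convert a per-array word count into a per-array size in bits."""
    return {name: count * WORD_BITS for name, count in word_counts.items()}


present_bits = size_in_bits(PRESENT_WORDS)
shabal_bits = size_in_bits(SHABAL_WORDS)

print(present_bits)
print(shabal_bits)
```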
The new P function is as follows, expressed in computer software pseudo-code for ease of understanding. (Pseudo-code is conventionally similar to actual code but is less detailed and is not executable.) This P function includes (1) pre-steps, (2) left shift steps, and (3) final steps, as marked by the comments set off with the notation /* and */. This P function is used in the Shabal mode of operation explained above, in place of the Shabal P permutation.
Here operator “^” indicates the XOR (exclusive OR) logical operation. “mod” indicates modulo. The prefix “0x” indicates a hexadecimal number. Operator “&” indicates the Boolean logic bit-by-bit AND operation.
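The notation can be tried out on 32-bit words in Python, where “mod” is written %. This is only an illustration of the operators; the helper names and the sample values are hypothetical and are not taken from the P function.

```python
MASK32 = 0xFFFFFFFF  # keeps results within one 32-bit word


def xor_words(a: int, b: int) -> int:
    """Bit-by-bit exclusive OR of two 32-bit words."""
    return (a ^ b) & MASK32


def and_words(a: int, b: int) -> int:
    """Bit-by-bit AND of two 32-bit words."""
    return a & b & MASK32


def wrap_index(i: int, array_length: int) -> int:
    """Reduce an index modulo the array length so it wraps around."""
    return i % array_length


print(hex(xor_words(0xF0F0F0F0, 0x0F0F0F0F)))
print(hex(and_words(0xF0F0F0F0, 0x0F0F0F0F)))
print(wrap_index(33, 32))
```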
The following per the pseudo-code is done in the pre-steps of function P different than in the Shabal P permutation:
The following is done in the P function left shift steps different than in the Shabal P permutation. Here the P function uses the LFSR-operation as in the Shabal P permutation, but with several possible modifications (depending on the embodiment) listed immediately below to make the P function more secure:
The following is done in the final steps of the P function different than in the P permutation in Shabal:
Two other possible modifications of the Shabal P permutation used in the present P function are the following:
The computer code is conventionally stored in code memory (computer readable storage medium, e.g., ROM) 40 (as object code or source code) associated with processor 38 for execution by processor 38. The incoming message to be hashed is received at port 32 and stored in computer readable storage medium (memory, e.g., RAM) 36 where it is coupled to processor 38. Processor 38 conventionally partitions the message into suitable sized blocks at software partitioning module 42. Other software (code) modules in processor 38 include the hash function algorithm module 46 which carries out the code functionality set forth above for the P function 39 and further includes code for the Shabal mode of operation 41. Coding software for the Shabal mode of operation 41 would be routine.
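The partitioning step performed by module 42 might look like the following Python sketch, which splits a message into blocks of 32 32-bit words. The padding rule (a single 0x80 byte followed by zero bytes) and the little-endian word order are assumptions made for illustration; the text above does not specify either, and the function name partition is hypothetical.

```python
import struct
from typing import List, Tuple

BLOCK_WORDS = 32               # words per message block M
BLOCK_BYTES = BLOCK_WORDS * 4  # 128 bytes, i.e. 1024 bits


def partition(message: bytes) -> List[Tuple[int, ...]]:
    """Pad `message` and split it into blocks of 32 32-bit words.

    Assumed padding: append 0x80, then zero bytes up to a block boundary.
    Assumed word order: little-endian.
    """
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK_BYTES)
    return [
        struct.unpack("<32I", padded[offset:offset + BLOCK_BYTES])
        for offset in range(0, len(padded), BLOCK_BYTES)
    ]


blocks = partition(b"a" * 128)
print(len(blocks), len(blocks[0]))
```

Each resulting block would then be passed in turn to the hash function algorithm module.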
Also coupled to processor 38 are the P function computer readable storage medium (memory) 52, as well as a third storage 58 for the resulting hash digest. Storage locations 36, 52, 58 may be in one or several conventional physical memory devices (such as semiconductor RAM or its variants or a hard disk drive).
Electric signals conventionally are carried between the various elements of
Computing system 60 can also include a main memory 68 (equivalent to memories 36, 58), such as random access memory (RAM) or other dynamic memory, for storing information and instructions to be executed by processor 64. Main memory 68 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 64. Computing system 60 may likewise include a read only memory (ROM) or other static storage device coupled to bus 62 for storing static information and instructions for processor 64.
Computing system 60 may also include information storage system 70, which may include, for example, a media drive 72 and a removable storage interface 80. The media drive 72 may include a drive or other mechanism to support fixed or removable storage media, such as flash memory, a hard disk drive, a floppy disk drive, a magnetic tape drive, an optical disk drive, a compact disk (CD) or digital versatile disk (DVD) drive (R or RW), or other removable or fixed media drive. Storage media 78 may include, for example, a hard disk, floppy disk, magnetic tape, optical disk, CD or DVD, or other fixed or removable medium that is read by and written to by media drive 72. As these examples illustrate, the storage media 78 may include a computer-readable storage medium having stored therein particular computer software or data.
In alternative embodiments, information storage system 70 may include other similar components for allowing computer programs or other instructions or data to be loaded into computing system 60. Such components may include, for example, a removable storage unit 82 and an interface 80, such as a program cartridge and cartridge interface, a removable memory (for example, a flash memory or other removable memory module) and memory slot, and other removable storage units 82 and interfaces 80 that allow software and data to be transferred from the removable storage unit 82 to computing system 60.
Computing system 60 can also include a communications interface 84 (equivalent to port 32 in
In this disclosure, the terms “computer program product,” “computer-readable medium” and the like may be used generally to refer to media such as, for example, memory 68, storage device 78, or storage unit 82. These and other forms of computer-readable media may store one or more instructions for use by processor 64, to cause the processor to perform specified operations. Such instructions, generally referred to as “computer program code” (which may be grouped in the form of computer programs or other groupings), when executed, enable the computing system 60 to perform functions of embodiments of the invention. Note that the code may directly cause the processor to perform specified operations, be compiled to do so, and/or be combined with other software, hardware, and/or firmware elements (e.g., libraries for performing standard functions) to do so.
In an embodiment where the elements are implemented using software, the software may be stored in a computer-readable medium and loaded into computing system 60 using, for example, removable storage drive 74, drive 72 or communications interface 84. The control logic (in this example, software instructions or computer program code), when executed by the processor 64, causes the processor 64 to perform the functions of embodiments of the invention as described herein.
This disclosure is illustrative and not limiting. Further modifications will be apparent to those skilled in the art in light of this disclosure and are intended to fall within the scope of the appended claims.