1. Field
The present application relates, generally, to data encryption and, more particularly, to conditioning of data to be substantially immune to frequency analysis.
2. Description of the Prior Art
In an age that depends on the private exchange of sensitive information, it is critical to have encryption methods that are fundamentally secure and relatively easy to implement. Such methods should also be reasonably immune from ever-increasing computational power; brute force and nuanced cryptanalysis have become relatively easy to apply. To combat these and other attacks, the key lengths have become longer, and encryption/decryption algorithms have become significantly more complicated.
Encryption methods date back to at least the time of the ancient Greeks, and now take a multitude of forms. The most robust modern approaches (e.g., Advanced Encryption Standard and Triple Data Encryption Standard) increase effectiveness by combining basic methods, often including cryptographic primitives such as hash functions and cryptographically secure pseudorandom number generators (CSPRNGs), into cryptographic systems. Indeed, given the often complicated combinations of techniques and operations comprising modern ciphers, it is sometimes difficult to categorize methods that combine fundamental algorithmic concepts, such as those underlying block ciphers, stream ciphers, substitution, transposition, and permutation ciphers.
It is generally accepted that there is an inverse relation between what is secure in a provable sense and what is secure from a practical standpoint. It is, therefore, a common goal in cryptography to find methods that can rigorously demonstrate security, while at the same time being practical to implement. It is also critical that such methods not rely on algorithmic secrecy, but rather remain open to inspection and evaluation.
Accordingly, a system and method are provided for data encryption. In an embodiment, at least one computing device has instructions that, when executed, cause the at least one computing device to receive at least one block of the data, modify the at least one block of data to cause each unique data element within the at least one block to appear with a respective predetermined frequency ratio, and to encrypt the block of data into ciphertext based at least on an encryption key.
In one or more embodiments, the at least one computing device may generate the encryption key or may receive the encryption key. Further, the encryption key may be generated based at least on a key exchange algorithm.
Moreover, the at least one computing device may be configured to decrypt at least one other block of ciphertext and to modify the decrypted ciphertext so that each unique data element of the decrypted ciphertext appears in the ratio in which it appeared before being modified to the respective predetermined frequency ratio.
In one or more embodiments, the at least one block of data comprises a plurality of blocks of data, and further wherein the encryption key is different for at least one of the plurality of blocks of data. Further, in one or more embodiments, the at least one computing device has instructions that, when executed, cause the at least one computing device to derive the different encryption key from a respective one of the plurality of blocks of data.
In one or more embodiments, the at least one computing device further receives instructions that, when executed, cause the at least one computing device to modify a character code associated with the data, wherein the respective predetermined frequency ratio is achieved by at least using the modified character code. The at least one computing device may have further instructions that, when executed, cause the at least one computing device to receive the modified character code. Moreover, the modified character code may include a plurality of characters that contain a substantially same distribution of elements.
In another embodiment, a system and method for encrypting data is provided that includes constructing at least one character code to cause each character in the at least one character code to have substantially a same distribution of elements. Further, the at least one character code is stored on at least one computing device, and the data are encoded with the stored at least one character code. Moreover, the encoded data are encrypted into ciphertext by the at least one computing device, based at least on one encryption key.
Other features and advantages of the present application will become apparent from the following description, which refers to the accompanying drawings.
For the purpose of illustration, there is shown in the drawings an embodiment which is presently preferred; it being understood, however, that the teachings herein are not limited to the precise arrangements and instrumentalities shown.
The present application regards conditioning data to provide encryption methods that are substantially impervious to cryptanalytic attack, such as via frequency analysis.
Referring to
Although the example shown in and described with reference to
In an embodiment, eight pseudorandom streams are generated, each being composed of 128 32-bit numbers, which may be integers or decimals.
Continuing with reference to
In an alternate embodiment, the CSPRNG 9 simply outputs 1019 32-bit values, so that no elements require removal.
To ensure a bijective encryption function (one with a unique inverse), the resulting 1019-element random array is then indexed from 1 to 1019 by array processor 11 to produce array T:
$T=\{\{1,d_1\},\{2,d_2\},\ldots,\{1019,d_{1019}\}\}$.
Thereafter, T is sorted on $d_i$ and only the resulting sequence of indices is extracted. This yields an unpredictable and unknowable sequence of 1019 unique position pointers. This sequence is referred to herein, generally, as the pre-key, J 12.
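The index-and-sort step can be sketched in Python as follows. This is a minimal illustration only: the function name `pre_key_from_values` is hypothetical, and Python's `secrets` module stands in for the output of CSPRNG 9, whose construction is not reproduced here.

```python
import secrets

BLOCK_LEN = 1019  # block and pre-key length used in the example above


def pre_key_from_values(values):
    """Index the values from 1, sort the pairs on the value, keep the indices."""
    # T = {{1, d1}, {2, d2}, ...}
    T = [(i, d) for i, d in enumerate(values, start=1)]
    # Sort on d_i; the sort is stable, so equal values keep their index order.
    T.sort(key=lambda pair: pair[1])
    return [i for i, _ in T]


# Assumption: 1019 random 32-bit values stand in for the CSPRNG 9 output.
random_values = [secrets.randbits(32) for _ in range(BLOCK_LEN)]
J = pre_key_from_values(random_values)
```

Because every index from 1 to 1019 appears exactly once in T, the extracted sequence is always a permutation of the position pointers, whatever the random values are.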
If the CSPRNG 9 is properly implemented and has not been compromised by an adversary, the non-linearity of the hash function ensures that, without knowing the exact value of DHK 1, an adversary is highly unlikely to determine J 12.
It is envisioned herein that the generation of J 12 is not limited to the precise methods or techniques with respect to the use of cryptographic primitives, described above. Many suitable alternative combinations and configurations are useable in accordance with the teachings herein. For example, J 12 may be generated using a Knuth-type shuffle or a hardware-based source of entropy and thereafter directly shared using a secure channel.
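As one illustration of the Knuth-type alternative, the following Python sketch produces a sequence of 1019 position pointers with a Fisher-Yates (Knuth) shuffle. The function name `knuth_shuffle_pointers` is hypothetical, and `secrets.randbelow` is assumed here as the entropy source; a hardware-based source could be substituted.

```python
import secrets


def knuth_shuffle_pointers(n=1019):
    """Return a uniformly shuffled list of the position pointers 1..n."""
    pointers = list(range(1, n + 1))
    # Walk from the last position down, swapping each element with a
    # randomly chosen element at or before it (Fisher-Yates / Knuth).
    for i in range(n - 1, 0, -1):
        j = secrets.randbelow(i + 1)  # 0 <= j <= i
        pointers[i], pointers[j] = pointers[j], pointers[i]
    return pointers


J = knuth_shuffle_pointers()
```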
Although J 12 may be used to directly shuffle blocks of data, it is possible to incorporate more than one shuffle algorithm. If the block and pre-key lengths are a prime number, then the sequence of position pointers may be autoshuffled in a bijective fashion. As used herein, the term, autoshuffle, refers generally to an algorithm that rearranges the order of a sequence in a non-linear manner and based on the content and number-theoretic characteristics of the sequence itself. Moreover, the encryption method described herein is also suitable for use with non-prime block and pre-key lengths. In cases of non-prime lengths, however, having fewer elements whose values are coprime to the sequence length is not as efficient and requires greater care in implementation. Depending on the specific design of the CSPRNG 9, autoshuffling J 12 helps to erase any statistical properties that might yield useful information, such as in a cryptanalytic attack.
If the J1th element is equal to 1, 510, 1018, or 1019, then a subsequent element is preferably used in lieu thereof. This process of examining subsequent elements may be repeated, as necessary, in order to establish a value for ε that is not equal to 1, 510, 1018, or 1019.
To construct G1, generator sequence and matrix processor 13 preferably begins with the number “1.” Successive elements are obtained by adding ε to the current value and taking the result modulo 1019. For example, with ε=273,
G1={1,274,547,820,74,347,620,893,147,420,693, . . . ,747}.
Finally, the value “0” is replaced with the value “1019.”
Next, a base matrix M 14 is preferably generated by setting the first row to G1. Each successive row Gi rotates the previous row one position to the left. Thus, for the present example,
G2={274,547,820,74,347,620,893,147,420,693,966, . . . ,1}.
The last row G1019 is equivalent to G1 rotated 1018 positions to the left, or in this example:
G1019={747,1,274,547,820,74,347,620,893,147,420, . . . ,474}.
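The construction of G1 and the rotated rows of base matrix M can be checked with a short Python sketch. The helper names (`generator_sequence`, `rotate_left`, `base_matrix_row`) are hypothetical; the arithmetic follows the description above with ε=273 and a length of 1019.

```python
def generator_sequence(eps, n=1019):
    """Start at 1, repeatedly add eps modulo n, then replace 0 with n."""
    g = [(1 + k * eps) % n for k in range(n)]
    return [n if v == 0 else v for v in g]


def rotate_left(row, k):
    """Rotate a row k positions to the left."""
    k %= len(row)
    return row[k:] + row[:k]


def base_matrix_row(g1, i):
    """Row G_i (1-based) is G1 rotated i-1 positions to the left."""
    return rotate_left(g1, i - 1)


G1 = generator_sequence(273)
G2 = base_matrix_row(G1, 2)
G1019 = base_matrix_row(G1, 1019)
```

Rows are computed on demand here rather than stored, since the full matrix holds 1019 × 1019 entries; whether to materialize M is an implementation choice.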
Continuing with reference to
Array processor 16 operates to shuffle J 12 using the rows of S 15 as follows. First, array processor 16 shuffles J using S1 (row 1 of S), and then shuffles the output of this operation on S2. Array processor 16 continues to shuffle each output on the following row, Si+1. After 1019 shuffle operations, the final output is encryption key E 17, which is an unpredictable and unknowable sequence of 1019 position pointers with standard statistical measures and fixed point distribution that are substantially indistinguishable from those of a true random permutation of 1019 elements.
Although the example method shown in and described with reference to
Continuing with reference to the example method in
In accordance with the teachings herein, two basic methods for conditioning blocks of plaintext or data, herein referred to as “P” (19
In an embodiment, frequency-conditioning is accomplished by a form of coding referred to herein, generally, as parity adjusted character code (PACC). As described in greater detail below, and with reference to Table 1A, Table 1B, and Table 2, a PACC is particularly useful to deter a frequency analysis attack.
Table 1A shows Example 1 of a Parity Adjusted Character Code. Table 1B shows a mapping of extended characters for Example 1 of a Parity Adjusted Character Code. Table 2 shows Example 2 of a Parity Adjusted Character Code.
Referring now to Table 1A and Table 1B, below, one example of PACC is provided. The foundation of PACC is relatively simple and can be implemented in a variety of different ways. PACC may be adapted to any language, including, as set forth in Tables 1A and 1B, English. As with ASCII, each character is encoded by one byte of information (a Unicode-style, 16-bit format or, alternatively, a custom length format may be employed). For example, given w as the ordered set of values for one byte,
$w=\{b_1,b_2,\ldots,b_8\}$, for $b_i\in\{0,1\}$.
Thereafter, all upper and lower case letters, all numbers, and the most common punctuation are assigned to the 8-bit numbers represented by w for which
$\sum_{i=1}^{8} b_i = 4$.
There are 70 (8C4) of these values. The 27 other printable ASCII characters may be carefully assigned by dividing them between the two groups for which
$\sum_{i=1}^{8} b_i = 3$ and $\sum_{i=1}^{8} b_i = 5$.
For example, if the code for the left bracket symbol “[” contains a combination of 3 ones and 5 zeros, the code for the right bracket symbol “]” would contain a combination of 5 ones and 3 zeros. Because these symbols are most often used in pairs, this approach helps to preserve parity in a given block of data.
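The size of each code group can be confirmed by enumeration. The Python sketch below lists every 8-bit word with a given number of ones; the function name `codes_with_ones` is hypothetical, and the sketch says nothing about which character is assigned to which code.

```python
from itertools import combinations


def codes_with_ones(length, ones):
    """All binary words of the given length containing exactly `ones` ones."""
    words = []
    for positions in combinations(range(length), ones):
        bits = ["0"] * length
        for p in positions:
            bits[p] = "1"
        words.append("".join(bits))
    return sorted(words)


balanced = codes_with_ones(8, 4)  # primary codes: four ones, four zeros
light = codes_with_ones(8, 3)     # three ones, e.g. a code for "["
heavy = codes_with_ones(8, 5)     # five ones, e.g. a code for "]"
```

The balanced group contains 70 words, and the two unbalanced groups contain 56 words each, so paired symbols can be split between the latter two groups.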
Tables 1A and 1B show a sample character mapping for the common English symbols. Of course, many other mapping schemes are supported by the present application, any of which can be designed and chosen, for example, to maximize encoding efficiency for a given computing environment and language.
Table 2, below, shows an alternative and more robust mapping that includes one byte for converting lowercase to uppercase, and a second byte for mapping to an alternate character set. Here, the CAP byte is used to convert lowercase letters to uppercase. Upon decoding, the conversion effectively subtracts 32 from the subsequent ASCII character code, thereby converting the lowercase letter to a capital letter. The ALT byte indicates that the subsequent byte(s) or block(s) are to be mapped using a different, predetermined and possibly custom character set. Indeed, using this scheme, the ALT byte may be employed to indicate the use of a different permutation of character-to-code mappings within the current PACC system.
By extension and in an alternate embodiment, any block can specify the precise permutation of character-to-code mappings to be used within the established PACC. This may take many forms such as:
1) an explicit assignment list;
2) an explicit permutation list;
3) a seed value to be used in conjunction with a CSPRNG for rearranging the assignments using, for example, a Knuth-type shuffle; or
4) a precoded instruction sequence for how to reassign the character-to-code mappings for a given section of the plaintext (e.g., parameters for performing an autoshuffle).
For example, using only the first 5 entries in Table 2,
a=00001111
b=00010111
c=00011011
d=00011101
e=00011110
in an embodiment using an explicit assignment list, the first block of plaintext might begin
{0001110100011110000101110000111100011011}
indicating that the following character-to-code assignments are being used:
a=00011101
b=00011110
c=00010111
d=00001111
e=00011011.
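Reading such an explicit assignment list amounts to cutting the header into code-length pieces and pairing them with the characters in their agreed order. The Python sketch below assumes that order is simply "abcde"; the function name `parse_assignment_list` and its duplicate check are illustrative additions.

```python
def parse_assignment_list(header_bits, characters, code_len=8):
    """Split a header into fixed-length codes and pair them with characters."""
    needed = len(characters) * code_len
    if len(header_bits) < needed:
        raise ValueError("header too short for the requested characters")
    codes = [header_bits[i:i + code_len] for i in range(0, needed, code_len)]
    # Assumption: a valid list never assigns one code to two characters.
    if len(set(codes)) != len(codes):
        raise ValueError("duplicate code in assignment list")
    return dict(zip(characters, codes))


header = "0001110100011110000101110000111100011011"
assignments = parse_assignment_list(header, "abcde")
```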
In the specific case of a binary-based PACC using an odd-length code, there cannot be a precisely equal distribution of elements. For example, a 9-bit code contains either 5 ones and 4 zeros, or 4 ones and 5 zeros. Codes comprising a PACC contain the substantially same distribution of elements.
Thus, and in accordance with an embodiment that uses this method, any alphanumeric character may be transformed into any other character by a simple permutation rule. Encrypting frequency-conditioned data in conjunction with the block permutation cipher, substantially as described above, is equivalently secure to the known and provably secure one-time pad method. Unlike the one-time pad which requires the key to be a random string, the key employed according to the teaching herein is a random permutation. Similar to the case of a one-time pad, if the key is a truly random permutation, then, theoretically, any given ciphertext can be deciphered into any plaintext message of the same length.
By extension, any word or message can be transformed into any other word or message of the same length, simply by rearranging its constituent ones and zeros. For example, the following two messages are both 18 characters (144 bits) in length (including spaces and punctuation):
Message 1: “Attack immediately”
Message 2: “Quick, run away!!!”
Using the PACC shown in Table 1, 72 of the constituent bits for each message are ones and the other 72 are zeros. Either message can be bitwise rearranged and thereby transformed into the other. There are $(72!)^2$, or approximately $3.7\times10^{207}$, different ways to map Message 1 to Message 2. Moreover, any 144-bit long ciphertext can be decrypted into either message depending on what key is chosen by a potential adversary.
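The rearrangement argument can be illustrated in Python. The sketch below builds one permutation that maps a bit string onto another with the same number of ones, and counts how many such permutations exist. The function names are hypothetical, and the short 8-bit strings are arbitrary stand-ins for the encoded messages, whose actual codes depend on the tables.

```python
import math


def rearrangement(src, dst):
    """Return perm such that dst[i] == src[perm[i]] for every position i."""
    if len(src) != len(dst) or src.count("1") != dst.count("1"):
        raise ValueError("strings must have equal length and equal bit counts")
    ones = iter([i for i, b in enumerate(src) if b == "1"])
    zeros = iter([i for i, b in enumerate(src) if b == "0"])
    # Pair the k-th one (zero) wanted by dst with the k-th one (zero) in src.
    return [next(ones) if b == "1" else next(zeros) for b in dst]


def count_rearrangements(ones, zeros):
    """Number of distinct position permutations mapping src onto dst."""
    return math.factorial(ones) * math.factorial(zeros)


src, dst = "00001111", "10100101"
perm = rearrangement(src, dst)
total = count_rearrangements(72, 72)  # 144-bit messages with 72 ones
```

The ones may be matched among themselves in 72! ways and the zeros in 72! ways, which is where the count for two 144-bit balanced messages comes from.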
Base representations other than base 2 may also be used in accordance with the teachings herein. Provided that all digits (or symbols) appear with substantially equal frequency, any such frequency-conditioning technique can be employed in a manner consistent with the present application.
Moreover and in an alternative embodiment, character processing methods other than PACC may be used in accordance with the teachings herein. For example, a compression algorithm (e.g., gzip) may be used on plaintext prior to encryption. In this case, the compressed file may be padded to achieve parity, as discussed below.
A block length of 1019 bits can accommodate 1016 bits of 8-bit PACC-encoded plaintext or up to 1019 bits of a general purpose parity-balanced ASCII or compressed file. Thus, when employing 8-bit PACC, 508 of the bits in each block should be configured to be ones.
Referring now to
If plaintext message P 19 contains a fully frequency-balanced encoding (e.g., the 70 primary PACC characters from Tables 1A and 1B, or the PACC encoding shown in Table 2), P 19 need only be padded to a multiple of 1016 bits using an equal number of ones and zeros (or ⌊1019/n⌋ bits for an n-bit PACC, where “⌊ ⌋” denotes the floor function).
In cases where the plaintext message or file P 19 does not have a monobit frequency of ½ (e.g., ASCII or compressed formats), the length to which the message is to be padded to achieve parity is calculated. This value is referred to herein, generally, as the parity point,
If
For example, for the 12-bit data block p={10101110110} that contains 7 ones and 5 zeros,
The same process of determining
When using 8-bit PACC with 1019-bit blocks, three bits (e.g., {101}) must be added to the 127-byte frequency-conditioned plaintext block before encrypting the frequency-conditioned plaintext in order to have 1019 elements. For general purpose parity-balanced data, one bit is preferably added. Depending on the specific implementation, this bit may be either padding or an additional data bit.
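For the 8-bit PACC case, block preparation can be sketched in Python as below. The alternating "10" padding pattern and the function name `pad_pacc_block` are assumptions made for illustration; the text requires only that the padding contain equal numbers of ones and zeros, and the sketch does not address how padding is recognized and stripped on decoding.

```python
DATA_BITS = 1016  # 127 bytes of 8-bit PACC per 1019-bit block
TAIL = "101"      # the three extra bits given as an example above


def pad_pacc_block(bits):
    """Pad balanced 8-bit PACC data to 1016 bits, then append three bits."""
    if len(bits) > DATA_BITS or len(bits) % 8 != 0:
        raise ValueError("expected at most 127 whole 8-bit codes")
    pad_len = DATA_BITS - len(bits)  # a multiple of 8, hence even
    # Assumption: alternating bits give an equal number of ones and zeros.
    padding = "10" * (pad_len // 2)
    return bits + padding + TAIL


# Ten copies of one balanced code stand in for encoded plaintext.
block = pad_pacc_block("00001111" * 10)
```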
Although the above-described examples employ a block cipher technique, frequency-conditioning data in accordance with the teachings herein is not limited to use in block ciphers. Other encryption techniques, such as stream cipher techniques, may be used. For example, blocks may be of any arbitrary size, including the full length of the plaintext P 19, and the output of a CSPRNG may be XORed with a frequency-conditioned block of data in a manner consistent with the teachings herein.
After block encoder/processor 18 converts P 19 into a frequency-conditioned format and appropriately pads each block, key manager 20 may use output from block encoder/processor 18 to determine a length of the key sequence, such that each block may be encrypted with a unique key of the appropriate length. Said key sequence can be obtained by establishing a new DHK 1, via a secure channel or, alternatively, by expanding the current key using various stages of key establishment. In a preferred embodiment, the key sequence for additional blocks can be derived from a frequency-conditioned seed encrypted within a block of data. This seed may then be used in conjunction with a CSPRNG, in a manner similar to that described above and with reference to
In an example that includes 1019-bit blocks of data with the 8-bit PACC shown in Table 2, the first block may simply contain:
{{560 bits explicit PACC assignments}, {459 bits CSPRNG seed}}.
Subsequent blocks may then contain the plaintext message which has been encoded by block encoder/processor 18 with the designated PACC assignments and shuffled by the output of a CSPRNG initialized using the specified seed.
Generally, any block may substantially take the form of a collection of one or more substrings, such as in the example below:
{{v},{w},{x},{y},{z}}
where the substrings
v=control or instruction character(s) (e.g., “ALT”)
w=specification of character-to-code mappings
x=CSPRNG seed
y=data
z=padding
may appear in a suitably arranged order and in multiple instances. One skilled in the art will recognize that data blocks may contain other suitable substrings, as well.
Moreover, it is possible to generate new key material in numerous ways consistent with the teachings herein. Beginning with E 17, key manager 20 generates key sequence K 21. For a one block message, K will be equal to E (K=E). The binary data B 22 is encrypted by array processor 23 using K 21 in the following manner.
For each block, an array Q is generated by attaching each element of K 21 to the corresponding bit $b_i$ in B 22. Using the first block as an example,
$Q=\{\{b_1,K_1\},\{b_2,K_2\},\ldots,\{b_{1019},K_{1019}\}\}$.
Sorting Q on K 21 shuffles B 22 which, when extracted, yields ciphertext C 24. If F is the bijective function that applies K 21 to B 22, then F(B)=C.
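The sort-based shuffle can be illustrated with a toy three-bit block in Python. The function name `encrypt_block` is hypothetical; a real block would carry 1019 bits and a 1019-element key.

```python
def encrypt_block(bits, key):
    """Attach each key element to its bit, sort on the key, extract the bits."""
    if len(bits) != len(key):
        raise ValueError("block and key must have the same length")
    Q = list(zip(bits, key))          # Q = {{b1, K1}, {b2, K2}, ...}
    Q.sort(key=lambda pair: pair[1])  # sort on K
    return "".join(b for b, _ in Q)


B = "110"
K = [3, 1, 2]
C = encrypt_block(B, K)  # bit 2 moves first, then bit 3, then bit 1
```

Since the operation only moves bits, the ciphertext always contains the same number of ones and zeros as the frequency-conditioned block.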
Referring to
$R=\{\{1,K_1\},\{2,K_2\},\ldots,\{1019,K_{1019}\}\}$.
Sorting R on K and extracting the shuffled indices produces the decryption key sequence, K′ 26.
Thereafter, an array, Q′, is generated by attaching each element of K′ to the corresponding bit $c_i$ in C 24:
$Q'=\{\{c_1,K'_1\},\{c_2,K'_2\},\ldots,\{c_{1019},K'_{1019}\}\}$.
Sorting Q′ on K′ unshuffles C 24 which, when extracted, yields B 22. If $F^{-1}$ is the inverse function that applies the decryption key K′ to C, then $F^{-1}(C)=B$.
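The decryption steps can be sketched in the same way. In the Python below, `decryption_key` derives K′ by sorting the index array on K, and `decrypt_block` applies K′ to the ciphertext; both names are hypothetical, and the three-bit values are a toy example rather than a full 1019-bit block.

```python
def decryption_key(key):
    """Index the key from 1, sort on the key values, keep the indices (K')."""
    R = list(enumerate(key, start=1))  # R = {{1, K1}, {2, K2}, ...}
    R.sort(key=lambda pair: pair[1])
    return [i for i, _ in R]


def decrypt_block(cipher_bits, key):
    """Attach K' to the ciphertext bits, sort on K', extract the bits."""
    k_inv = decryption_key(key)
    if len(cipher_bits) != len(k_inv):
        raise ValueError("block and key must have the same length")
    Q = list(zip(cipher_bits, k_inv))
    Q.sort(key=lambda pair: pair[1])
    return "".join(c for c, _ in Q)


K = [3, 1, 2]
K_inv = decryption_key(K)     # position pointers that undo the shuffle
B = decrypt_block("101", K)   # recovers the block that K shuffled into "101"
```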
Once B 22 is retrieved, block decoder/processor 27 strips any post-data padding and converts the remaining binary code to plaintext P 19 consistent with the chosen method of data conditioning.
Although the present application has been described in relation to particular embodiments thereof, many other variations and modifications and other uses will become apparent to those skilled in the art. It is preferred, therefore, that the present application be limited not by the specific disclosure herein, but only by the appended claims.
This application is based on and claims priority to U.S. Provisional Patent Application Ser. No. 61/297,722, filed on Jan. 22, 2010 and entitled GRANULAR PERMUTATION CIPHER, the entire contents of which are hereby incorporated by reference.
Number | Date | Country | |
---|---|---|---|
20110194687 A1 | Aug 2011 | US |
Number | Date | Country | |
---|---|---|---|
61297722 | Jan 2010 | US |