The present invention relates to data security techniques. More particularly, some embodiments of the present invention relate to variable-length ciphers used for access authorization.
Token systems have been in use in modern civilization in various implementations to provide and control many forms of access. Access that can be and often times is controlled by tokens can include physical access to rooms, buildings, areas and so on; electronic access to servers and data files; electronic account access; and so on. Another form of access controlled by tokens is the ability to conduct transactions such as, for example, credit, debit and other financial transactions. Credit cards, charge cards, debit cards, loyalty cards and other purchase-related tokens are used to provide the consumers with ready access to funds. Such transactions can enhance convenience of purchases, extend credit to customers, and so on.
As modern society has evolved, so have our tokens. Early tokens included physical objects such as coins, documents, and other physical objects. One example of a simple physical object token is the subway token made famous by the New York subway system. This simple token resembled a coin, could be purchased at kiosks, and was used to control access to the subway system. Another example of a simple physical token for granting access was the early railway token developed in the 19th century for the British railway system. This token was a physical object, such as a coin, that a locomotive engineer was required to have before entering a particular section of the railway. When the train reached the end of the section, the driver left the token at a drop point so it could be used by the next train going the other way. Because there was only one token for a given section of railway, the token system helped to ensure that only one train would be on that section of the track at a given time.
The railway token system minimized the likelihood of head on collisions, but this simple token also limited the ability for trains to follow one another along a given section. As such, the system evolved into a token and ticket system. In this system, if a train reached a checkpoint and the token was present, the driver was given a ticket to pass, leaving the token in place in case another train approached that section traveling in the same direction. Safeguards were implemented to ensure that tickets were correctly issued. As technology evolved, the physical token and ticket system evolved to include electronic signaling to control access to sections of the railway.
Another example of tokens to grant access is charge cards, credit cards and debit cards. Some attribute the ‘invention’ of credit cards to Edward Bellamy, who described them in his 19th-century novel Looking Backward. Early cards were reportedly used in the early 20th-century United States by fuel companies and by Western Union. By mid-century, Diners Club produced a charge card for merchant purchases, which was followed shortly thereafter by American Express. These cards, now ubiquitous in our society, allow customers to make purchases and conduct transactions with relative ease. Early cards were embossed with a customer account number, which was manually transferred to a receipt via a carbon transfer process. Modern cards, or tokens, have evolved to use electronic mechanisms of storing data including, for example, magnetic stripes, RFID tags, and smart card and chip card technologies.
Other examples of tokens include government issued IDs such as driver's licenses and passports. Such tokens can also be used to control access in various forms. For example, a passport can be used to control access to countries and regions. Passports can also be used to access employment and licensing opportunities as a document to prove the holder's citizenship. A driver's license is another form of token, allowing access to driving privileges, and to establishments requiring proof of identity, residency or age. Still other examples of tokens can include bank drafts, stock certificates, currency and other token items relating to finance. Still further token examples can include tokens for physical access and security such as keys, card keys, RF or LC cards, RFID tokens, toll road transponders, and the like.
As these examples illustrate, the use of tokens for various forms of access has gained popularity in various businesses and industries and has evolved to embrace newly developed technologies. Tokens are not limited to these examples, but can take on various forms, use various instrumentalities, and control, govern or arbitrate various forms of access in a variety of different ways. Tokens can be static tokens, where the token data does not change, or dynamic tokens, where the data changes over time or with each token use. An example of a static token is a magnetic stripe bankcard whose data remains the same with each swipe. An example of a dynamic token is a garage door opener employing rolling codes, wherein the code changes with each use. Dynamic tokens are generally thought to be more secure than static tokens because, with dynamic tokens, although data might be copied from a given use, that data is not valid for subsequent uses. Likewise, there can be two types of token-based access systems: static access systems and dynamic access systems.
One downside of token access, however, is the opportunity to defraud the system. For example, stolen or counterfeited tokens are often used to gain unauthorized access. In fact, the Federal Trade Commission reports that credit and charge card fraud costs cardholders and issuers hundreds of millions of dollars each year. As the importance of token access has grown so has the ability of those seeking to defraud the system. These attackers often seek to gain access to valuable data through multiple means including operating system and application security weaknesses and often use sophisticated computer algorithms to attack token security. Such attacks may take the form of repetitive attempts to access the protected system, with each attempt providing additional information. The security of the data is improved when an attacker must make a tremendous number of encryption queries or invest an unreasonable amount of computation time to gain access to encrypted information.
However, simple static tokens, such as bank cards, typically do not require sophisticated algorithms for attack. Because these tokens are static and the data does not change from use to use, the token can be compromised simply by copying the token data to another token. Indeed, bankcard data is often copied or skimmed by attackers who gain access to the cards and swipe them through a card reader that stores the information, or who attach their own counterfeit card reader to a legitimate card reader (such as at an ATM terminal) to skim the data from an unwitting user when he or she uses the ATM terminal.
Token systems are not the only data systems that are susceptible to attacks. Accordingly, a number of encryption, ciphering or other obfuscation techniques have been developed to secure blocks of data in a number of applications and environments. For example, the Data Encryption Standard (DES) is an encryption technique based on a symmetric-key algorithm that uses a 56-bit key to encrypt data. DES was adopted as an official Federal Information Processing Standard (FIPS) for the United States and has enjoyed widespread use internationally as well. In more recent applications, the Advanced Encryption Standard (AES) cipher has also been used.
All of these techniques require that the data be transmitted in one or more fixed-size blocks. Each block is typically, for example, 64, 128, 256 or 512 bits in length. If the data does not conform to the block size used by the cipher, the remaining portion of the block must still be sent for the data to be recovered. Accordingly, data strings are often padded to fill out the data block, resulting in inefficiencies. In addition, these techniques do not restrict the output to a defined symbol set. In the case of DES, each of the eight bytes contained within a sixty-four bit block can take any value from 0 to 255. Many existing transmission formats, however, require that a byte value be limited to the digits zero through nine or the letters A-Z.
In operation, DES takes an input comprising a fixed-length string of cleartext bits and encodes it to form a ciphertext bitstring of the same length. Like many encryption techniques, DES uses a key to perform the encryption, adding a measure of security in that the key is typically required to decrypt the ciphertext string. The DES algorithm divides the data block into two halves and processes them using a Feistel routine, one half at a time.
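The Feistel structure described above can be illustrated with a minimal sketch. This is not DES itself: the round function below is a hypothetical stand-in built from SHA-256, and the names `round_fn`, `feistel_encrypt` and `feistel_decrypt` are chosen only for illustration. It shows how a block is split into two halves, how one half at a time is processed with XOR, and why the same round function can be reused to reverse the process.

```python
import hashlib


def round_fn(half: bytes, key: bytes, rnd: int) -> bytes:
    # Hypothetical round function (NOT the DES round function): a truncated
    # SHA-256 over the key, the round index and the input half.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def feistel_encrypt(block: bytes, key: bytes, rounds: int = 16) -> bytes:
    if len(block) % 2 or not block:
        raise ValueError("block must have a non-zero, even number of bytes")
    n = len(block) // 2
    left, right = block[:n], block[n:]
    for r in range(rounds):
        # One half passes through unchanged; the other is XORed with F(half).
        left, right = right, xor(left, round_fn(right, key, r))
    return left + right


def feistel_decrypt(block: bytes, key: bytes, rounds: int = 16) -> bytes:
    n = len(block) // 2
    left, right = block[:n], block[n:]
    for r in reversed(range(rounds)):
        # Undo the rounds in reverse order; F itself never needs inverting.
        left, right = xor(right, round_fn(left, key, r)), left
    return left + right
```

Because each round only XORs one half with a function of the other, the output always has the same length as the input, but every output byte can take any value from 0 to 255.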
The Advanced Encryption Standard is also a block cipher that works on fixed-length blocks of data. Like DES, AES takes an input block of a certain size, usually 128 bits, produces a corresponding output block of the same size, and uses a secret key to perform the encryption. Unlike DES, which is based on the Feistel scheme, AES is a substitution-permutation network, which is a series of mathematical operations that use substitutions and permutations.
According to one or more embodiments of the invention, various features and functionality can be provided to provide improved security for various forms of token transactions. Particularly, in accordance with one aspect of the invention, data security techniques such as, for example, various forms of variable-length ciphers, can be implemented for data transmission, including data transmission for use with token systems to provide an increased measure of security in the token data. In one embodiment, variable-length ciphers can be implemented while maintaining a fully deterministic system where any encrypted data decrypts to only the original data.
Accordingly, in some embodiments, a general cipher is used to capture encryption preserving arbitrary formats using a format-preserving Feistel such that the encryption can be format-preserving so that if the plaintext has some prescribed format, the encrypted ciphertext will have the same format. Consider a simple example of a cipher to map a name and address and credit card number in a predefined format. The cipher in this example can be configured to map an input (plaintext) of the form Name, Addr, CC to an output (ciphertext) of the form Name*, Addr*, CC*. Name*, like Name, must consist of two strings of letters each beginning with an upper case letter. Addr*, like Addr, must consist of alphanumeric characters, spaces, or commas followed by a valid postal code. CC*, like CC, must consist of 8-19 decimal digits followed by the output of the function L when applied to those digits. Furthermore, in this example, the ciphertext must be of the same length as the plaintext. For example, the ciphertext must occupy the same space as the plaintext and have the format necessary to be accepted by the software.
In some embodiments, a new primitive referred to as a general cipher is used. Unlike a conventional cipher, a general cipher has associated to it a family {Dom(ℓ)}_{ℓ∈I} of domains, where I is some index set. For every key K, index ℓ, and tweak T, the cipher specifies a permutation E_K^{T,ℓ} on Dom(ℓ). Making the general ciphers tweakable can provide enhanced security. A construction (called format-preserving Feistel) of a general cipher is provided that is able to produce a cipher over an arbitrary given family of domains {Dom(ℓ)}_{ℓ∈I}, which enables format-preserving encryption of arbitrary formats.
Consider the example of format-preserving encryption of credit card numbers. In this example, the goal is for the ciphertext CC*, like the plaintext CC, to be a sequence of digits whose last digit is the value of the function L applied to the other digits. Likewise, the length len(CC*) of CC* should be the same as the length len(CC) of CC. Assume the length ranges from 8 to 20 digits. Let I={8, 9, . . . , 20} and let Dom(ℓ) be the set of all ℓ-digit numbers whose last digit is the value of L applied to the other digits. Now a general cipher E over {Dom(ℓ)}_{ℓ∈I} can be used. Encrypt CC (under key K and tweak T) by letting ℓ=len(CC) and letting the ciphertext be CC*=E_K^{T,ℓ}(CC), which has the desired format.
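The domain used in this example can be made concrete with a short membership test. The sketch below assumes that the function L is the Luhn check-digit function commonly used for card numbers; the text does not say so explicitly, and the names `luhn_check_digit` and `in_domain` are hypothetical.

```python
def luhn_check_digit(body: str) -> int:
    """Return the Luhn check digit for a string of decimal digits.

    Assumption: this plays the role of the function L in the text.
    """
    total = 0
    for i, ch in enumerate(reversed(body)):
        d = int(ch)
        if i % 2 == 0:
            # Double every second digit, starting with the rightmost body digit.
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def in_domain(cc: str, index_set=range(8, 21)) -> bool:
    """True if cc has an allowed length and its last digit equals L(other digits)."""
    return (
        cc.isascii()
        and cc.isdigit()
        and len(cc) in index_set
        and int(cc[-1]) == luhn_check_digit(cc[:-1])
    )
```

A format-preserving cipher for this example has to map every string for which `in_domain` is true to another string of the same length for which it is also true.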
Moreover, this approach can be extended to cover more complex formats and a domain can be specified that captures the full example discussed above. The format-preserving Feistel in some embodiments is able to provide a general cipher over an arbitrary, given domain. The starting point can be the arbitrary domain cipher of Black and Rogaway, which combines a generalization of an unbalanced Feistel network with a technique called cycle walking. A format-preserving Feistel can be implemented to extend this to handle multiple domains with the same key and also to incorporate tweaks, and can be customizable. The round function is a parameter and can be based on a block cipher such as AES or DES, or on a cryptographic hash function such as SHA-256.
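Cycle walking, mentioned above, can be sketched in a few lines. The idea is to take a permutation over a larger set that contains the target domain and to apply it repeatedly until the result falls back inside the domain. In the sketch below the keyed permutation is a toy lookup table produced by a seeded shuffle, which is only a stand-in for a real cipher and is not secure; the sizes `SUPERSET` and `DOMAIN` and all function names are hypothetical.

```python
import random

SUPERSET = 8192   # size of the set the underlying permutation acts on
DOMAIN = 6000     # target domain is {0, ..., DOMAIN - 1}, a subset of it


def make_toy_permutation(key: int, size: int):
    # Stand-in for a keyed cipher over {0, ..., size - 1}. NOT secure.
    table = list(range(size))
    random.Random(key).shuffle(table)
    inverse = [0] * size
    for i, v in enumerate(table):
        inverse[v] = i
    return table, inverse


def in_domain(v: int) -> bool:
    return 0 <= v < DOMAIN


def cycle_walk(x: int, perm) -> int:
    # Apply the permutation until the value lands inside the domain again.
    # Starting from a domain point, the walk follows a cycle that contains
    # that point, so it always terminates.
    y = perm[x]
    while not in_domain(y):
        y = perm[y]
    return y


table, inverse = make_toy_permutation(key=2024, size=SUPERSET)


def encrypt(x: int) -> int:
    return cycle_walk(x, table)


def decrypt(y: int) -> int:
    return cycle_walk(y, inverse)
```

Walking with the inverse table retraces the same cycle backwards, so decryption needs no extra bookkeeping about how many steps encryption took.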
In various embodiments, some information about the plaintext (namely the format) is leaked by the ciphertext. One notion of security adapts the traditional PRP notion to general ciphers to capture no more than the format being leaked. Others use a weaker notion, message privacy (MP), and a still weaker notion, message recovery (MR), because MP and MR can be achieved more efficiently and may provide better security than PRP. (In the latter case a lot of security is often lost due to birthday attacks that do not threaten MP or MR.) This is particularly important in a context where domains may be small, for example, when encrypting only 12 digits of a credit card number.
In an embodiment of the invention, a method is provided for enciphering data such as, for example, token information or other data. The method can be configured to use DES, AES or any other block cipher as the randomizing element of a modified Feistel network, and can also be implemented where the transmitted data is not limited to a fixed size block. Accordingly, the data can be of any length. For example, in some embodiments the string length ranges from one digit to 19 digits for VDES, and one to 66 digits for VAES. In accordance with one aspect of the invention, any secure randomizing function such as a deterministic random number generator could be used in place of the described block cipher randomizing function where the transmitted block size is related to the key length.
In an embodiment of the invention, the modified Feistel network is configured to use modulo addition or subtraction rather than XOR functions in each round of the encryption. Modulo addition and subtraction allow any symbol set to be encrypted while providing ciphertext in a block size that is equal to the plaintext block size. For example, 10 decimal digits encrypt to 10 decimal digits while 10 alphanumeric characters encrypt to 10 alphanumeric characters. This can be advantageous, for example, in environments where encryption is added to legacy systems that are expecting the data to be delivered in predetermined block sizes. This is of particular value in the above-described example environment of encrypting bankcard token information in an existing transaction network, where the length of the encrypted data and the resultant symbol set must match the data to be transmitted using existing infrastructures.
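The difference between XOR and modulo combination can be seen in a small example. The pad values below are arbitrary illustrative numbers standing in for the output of a round's randomizing function, and the helper names are hypothetical.

```python
def add_mod(symbols, pad, radix):
    """Combine symbol values with pad values using addition modulo radix."""
    return [(s + p) % radix for s, p in zip(symbols, pad)]


def sub_mod(symbols, pad, radix):
    """Reverse add_mod using subtraction modulo radix."""
    return [(s - p) % radix for s, p in zip(symbols, pad)]


plain = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]   # ten decimal digits
pad = [6, 7, 1, 9, 5, 8, 2, 4, 3, 7]     # illustrative round output, not real cipher output

cipher = add_mod(plain, pad, 10)          # every value stays in 0..9
recovered = sub_mod(cipher, pad, 10)      # subtraction undoes the addition
xored = [s ^ p for s, p in zip(plain, pad)]  # XOR can leave the digit range
```

Here `cipher` contains only decimal digits and `recovered` equals `plain`, whereas `xored` contains values such as 15 that are not decimal digits, which is why XOR alone cannot preserve a digits-only format.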
In an embodiment of the invention, a method for deterministically encrypting a plaintext symbol set having a variable block size includes the steps of dividing the plaintext symbol set into first and second portions; applying a first encryption key to encrypt a data string and generate a second encryption key, wherein the data string includes a tweak; computing a determined number of encryption rounds using the second encryption key to create an enciphered symbol set, wherein the encryption rounds comprise successive encryption and modulo combination of alternating portions of the symbol set; and providing the enciphered symbol set in the same form as the plaintext symbol set. In some embodiments, computing a determined number of encryption rounds using the second encryption key can include: in a first encryption round, encrypting the first portion of the symbol set using the second key and combining the encrypted first portion with the second portion of the symbol set using a modulo operation; in a second encryption round, encrypting the second portion of the symbol set using the second key and combining the encrypted second portion with the output of the first round using a modulo operation; and in a subsequent encryption round, encrypting the output of the previous round using the second key and combining the encrypted output of the previous round with the output of a round prior to the previous round using a modulo operation. The enciphered symbol set is then provided in the same form as the plaintext symbol set.
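The overall shape of such a method can be sketched as follows, under several stated assumptions. HMAC-SHA256 is used as a stand-in for the block-cipher randomizing element, the data string used to derive the second key is a hypothetical encoding of the input length followed by the tweak, the symbol set is limited to decimal digits, and the input must have at least two digits. The rounds follow the conventional alternating Feistel pattern, which may differ in detail from the rounds recited above. All names are illustrative.

```python
import hashlib
import hmac


def _prf(key: bytes, data: bytes) -> int:
    # Stand-in randomizing element (HMAC-SHA256), returned as an integer.
    return int.from_bytes(hmac.new(key, data, hashlib.sha256).digest(), "big")


def _second_key(first_key: bytes, tweak: bytes, length: int) -> bytes:
    # Hypothetical data string: the input length followed by the tweak.
    data = length.to_bytes(2, "big") + tweak
    return hmac.new(first_key, data, hashlib.sha256).digest()


def _run(digits: str, first_key: bytes, tweak: bytes, rounds: int, decrypt: bool) -> str:
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of at least two decimal digits")
    split = len(digits) // 2
    parts = [digits[:split], digits[split:]]      # first and second portions
    key2 = _second_key(first_key, tweak, len(digits))
    order = reversed(range(rounds)) if decrypt else range(rounds)
    for r in order:
        src, dst = r % 2, 1 - r % 2               # portions alternate roles each round
        width = len(parts[dst])
        modulus = 10 ** width
        pad = _prf(key2, bytes([r]) + parts[src].encode("ascii")) % modulus
        value = int(parts[dst])
        # Modulo addition when encrypting, modulo subtraction when decrypting.
        value = (value - pad) % modulus if decrypt else (value + pad) % modulus
        parts[dst] = str(value).zfill(width)
    return parts[0] + parts[1]


def encrypt_digits(digits: str, first_key: bytes, tweak: bytes, rounds: int = 8) -> str:
    return _run(digits, first_key, tweak, rounds, decrypt=False)


def decrypt_digits(digits: str, first_key: bytes, tweak: bytes, rounds: int = 8) -> str:
    return _run(digits, first_key, tweak, rounds, decrypt=True)
```

For any digit string of two or more digits, `encrypt_digits` returns a digit string of the same length, and `decrypt_digits` with the same key, tweak and round count returns the original, including for odd lengths where the two portions differ in size.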
Applying the first key can include defining first and second parameters based on the tweak, and encrypting a combination of the first and second parameters using the first key to generate the second key. The encrypting can be done by encrypting a first parameter with the first key to obtain an encrypted first parameter, combining the encrypted first parameter with the second parameter and encrypting the combination of the encrypted first parameter and second parameter using the first encryption key.
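The chained derivation described above can be sketched as below. The text does not specify how the two parameters are formed from the tweak or how they are combined, so this sketch makes assumptions: the parameters are the two halves of a SHA-256 digest of the tweak, the combination is a bytewise XOR, and a truncated HMAC-SHA256 stands in for a single block-cipher encryption. The names `E` and `derive_second_key` are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # bytes; assumed block size of the stand-in cipher


def E(key: bytes, block: bytes) -> bytes:
    # Stand-in for one block-cipher encryption under `key` (not a real cipher).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def derive_second_key(first_key: bytes, tweak: bytes) -> bytes:
    digest = hashlib.sha256(tweak).digest()
    p1, p2 = digest[:BLOCK], digest[BLOCK:]        # assumed first and second parameters
    enc_p1 = E(first_key, p1)                      # encrypt the first parameter
    combined = bytes(a ^ b for a, b in zip(enc_p1, p2))  # assumed combination: XOR
    return E(first_key, combined)                  # encrypt the combination
```

The derivation is deterministic, so both ends of a transaction that share the first key and the tweak arrive at the same second key without transmitting it.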
The plaintext symbol set can include a variety of different types of information. One example includes token information. For example, the plaintext symbol set can comprise bankcard track data in a standard bankcard track data format, wherein the enciphered symbol set is provided in the same format as the plaintext bankcard track data. Similarly, the plaintext symbol set can comprise bankcard track data comprised of symbols selected from the group consisting of decimal digits zero through nine, the modulo combination comprises modulo 10 addition or subtraction, and the enciphered symbol set is comprised of symbols selected from the group consisting of decimal digits zero through nine. As another example, the modulo combination can comprise modulo 62 addition or subtraction to encipher a plaintext symbol set comprised of symbols selected from the group consisting of alphanumeric upper and lower case characters and decimal digits zero through nine, and to output an enciphered symbol set comprised of symbols selected from the same group.
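The relationship between the symbol set and the modulus can be sketched with a helper that works for any alphabet. The ordering of the 62-symbol alphabet and the pad values are assumptions made for illustration; the pad stands in for the encrypted portion produced in a round.

```python
import string

DIGITS = string.digits                                                   # radix 10
ALNUM62 = string.digits + string.ascii_uppercase + string.ascii_lowercase  # radix 62


def combine(text: str, pad, alphabet: str, subtract: bool = False) -> str:
    """Add (or subtract) pad values to symbol indices modulo len(alphabet)."""
    if len(text) != len(pad):
        raise ValueError("text and pad must have the same length")
    radix = len(alphabet)
    index = {ch: i for i, ch in enumerate(alphabet)}
    sign = -1 if subtract else 1
    return "".join(alphabet[(index[ch] + sign * p) % radix] for ch, p in zip(text, pad))
```

With `DIGITS` the combination is modulo 10 and the output contains only decimal digits; with `ALNUM62` it is modulo 62 and the output contains only letters and digits. In both cases calling `combine` again with `subtract=True` and the same pad restores the input.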
Other features and aspects of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, which illustrate, by way of example, the features in accordance with embodiments of the invention. The summary is not intended to limit the scope of the invention, which is defined solely by the claims attached hereto.
The present invention, in accordance with one or more various embodiments, is described in detail with reference to the following figures. The drawings are provided for purposes of illustration only and merely depict typical or example embodiments of the invention. These drawings are provided to facilitate the reader's understanding of the invention and shall not be considered limiting of the breadth, scope, or applicability of the invention. It should be noted that for clarity and ease of illustration these drawings are not necessarily made to scale.
The figures are not intended to be exhaustive or to limit the invention to the precise form disclosed. It should be understood that the invention can be practiced with modification and alteration, and that the invention be limited only by the claims and the equivalents thereof.
Various embodiments described herein are directed toward systems and methods for increasing the security of transactions in various forms. In one embodiment, systems and methods are described for using variable-length ciphers across a communication medium.
Before describing the invention in detail, it is useful to describe an example environment with which the invention can be implemented. One such example is that of a transaction card network including a token used to facilitate purchases or other transactions.
The token data is then sent to the appropriate financial institution or institutions, or other entities for processing. Processing can include, in one or more steps, authorization, approval and settlement of the account. As the example in
As only one example of a token 101, a credit card can be used with a conventional magnetic stripe included on one side thereof. Conventional magnetic stripes can include three tracks of data. Further to this example, the ISO/IEC standard 7811, which is used by banks, specifies: that track one is 210 bits per inch (bpi), and holds 79 six-bit plus parity bit read-only characters; track two is 75 bpi, and holds 40 four-bit plus parity bit characters; and track three is 210 bpi, and holds 107 four-bit plus parity bit characters. Most conventional credit cards use tracks one and two for financial transactions. Track three is a read/write track (that includes an encrypted PIN, country code, currency units, amount authorized), but its usage is not standardized among banks.
In a conventional credit card token, the information on track one is contained in two formats. Format A is reserved for proprietary use of the card issuer. Format B includes the following:
The format for track two can be implemented as follows:
Although a credit card with magnetic stripe data is only one example of a token that can be used in this and other environments, this example environment is often described herein in terms of a credit card implementation for clarity and for ease of discussion. Upon entering into a transaction, a merchant may ask the customer to present his or her form of payment, which in this example is the credit card. The customer presents the token 101 (e.g., credit card) to the merchant for use in the transaction terminal 104. In one embodiment, the credit card can be swiped by a magnetic stripe reader or otherwise placed to be read by the data capture device 103. In the current example where a credit card utilizing a magnetic stripe is the token 101, data capture device 103 can include any of a variety of forms of magnetic stripe readers to extract the data from the credit card. Other forms of data capture devices 103, or readers, may also be used to obtain the information from token 101. For example, bar code scanners, smart card readers, RFID readers, near-field devices, and other mechanisms can be used to obtain some or all of the data associated with token 101 and used for the transaction.
The data capture device is in communicative contact with a terminal 104, which can include any of a number of terminals including, for example, a point of sale terminal, point of access terminal, an authorization station, automated teller machine, computer terminal, personal computer, work station, cell phone, PDA, handheld computing device and other data entry devices. Although in many applications the data capture device 103 is physically separated from, but in communicative contact with, the terminal 104, in other environments these items can be in the same housing or in integrated housings. Examples include terminals such as those available from companies such as Ingenico, Verifone, Apriva, Linkpoint, Hypercom and others.
Continuing with the credit card example, the customer or cashier can swipe the customer's credit card using the card-swipe device, which reads the card data and forwards it to the cashier's cash register or other terminal 104. In one embodiment, the magnetic stripe reader or other data capture device 103 is physically separated from, but in communicative contact with, the terminal 104. In other environments these items can be in the same housing or in integrated housings. For example, in current implementations in retail centers, a magnetic stripe reader may be placed on a counter in proximity to a customer, and electronically coupled to the cash register terminal. The cash register terminal may also have a magnetic stripe reader for the sales clerk's use.
The customer may be asked to present a form of ID to verify his or her identity as imprinted on the token 101. For other transactions such as debit card transactions, the user may be required to key in a PIN or other authentication entry.
Continuing with the current credit card example, the terminal 104 can be configured to print out a receipt (or may display a signature page on a display screen) and the customer may be required to sign for his or her purchases, thus providing another level of authentication for the purchase. In some environments, terminal 104 can be configured to store a record of the transaction for recordkeeping and reporting purposes. Further, in some environments, a record of the transaction may be kept for later account settlement.
Typically, before the transaction is approved, terminal 104 seeks authorization from one or more entities in a transaction-processing network 123. For example, the merchant may seek approval from the acquiring bank, the issuing bank, a clearing house, or other entity that may be used to approve such transactions. Thus, depending on the token type, institutions involved and other factors, the transaction-processing network 123 can be a single entity or institution, or it can be a plurality of entities or institutions. As a further example, in one embodiment, transaction-processing network may include one or more processors or clearing houses to clear transactions on behalf of issuing banks and acquiring banks. The transaction-processing network also includes those issuing banks and acquiring banks. For example, one or more entities such as Global Payments, Visa, American Express, and so on, might be a part of transaction-processing network. Each of these entities may have one or more processing servers to handle transactions.
As illustrated in
Although transaction-processing network 123 is illustrated using only one block in the example block diagram environment of
Having thus described an example environment, the present invention is from time-to-time described herein in terms of this example environment. Description in terms of this environment is provided to allow the various features and embodiments of the invention to be portrayed in the context of an exemplary application. After reading this description, it will become apparent to one of ordinary skill in the art how the invention can be implemented in different and alternative environments, including environments where it is necessary or desirable to encrypt data for transmission or storage. Indeed, the invention is not limited to bank card environments and can be implemented for numerous different forms of data encryption.
Various systems and methods for utilizing variable-length ciphers for arbitrary format data are described. These are described in terms of examples of token access and of providing enhanced security measures for token access. Particularly, in terms of the example and related environments, embodiments provide security measures for financial transactions. One embodiment in this example application provides for variable length enciphering of some or all of the token data (credit card, charge card, debit card or other tokens) using a Variable Data Encryption Standard (VDES) algorithm. Another embodiment in this example application provides for variable length enciphering of some or all of the token data (credit card, charge card, debit card or other tokens) using a Variable Advanced Encryption Standard (VAES) algorithm. VAES may also be known as Rijndael encryption. Decryption can be performed at one or more appropriate points along the transaction path to reverse the enciphering using a predefined secret key. There are a number of conventional block cipher formats that have been developed and used in addition to DES and AES, including Blowfish, RC2, Skipjack, LOKI, RC5 and GOST. After reading the description contained herein, one of ordinary skill in the art will appreciate how the systems and methods described herein can be implemented with these and other alternative cipher algorithms.
Additionally, in embodiments of the invention, the data encoded in token 111 can be encrypted using the variable-length ciphers VDES or VAES. Although token data may be referred to as being “on” or “in” a token, or encoded “onto” or “into” a token, such as token 111, these terms are not meant to imply or require a particular physical structure for encoding the token with data.
Accordingly, in a step 88, an encryption module 132, which can include one or more encryption algorithms, is used to encrypt some or all of the token data. Although the encryption in accordance with the invention can take place at a number of different points along the data stream, it is preferable for security purposes that the encryption take place as soon as possible or practical in the data read cycle. Therefore, in one embodiment of the invention, the encryption module is in the data path immediately following the data capture. For example, encryption can take place in the read head or elsewhere in the token-reader terminal. Preferably, then, the data can be encrypted as soon as it is read to enhance the security of the system.
In a step 94, the data captured by data capture device 113, and encrypted with encryption module 132, is forwarded to terminal 114 in furtherance of the transaction. In an application in accordance with the example environment, terminal 114 can include a cash register or other point of sale station or terminal, as an example. In other environments terminal 114 can be implemented as appropriate including, for example, checkpoint terminals, customs station terminals, point of access terminals, point of sale terminals, or other terminal appropriate for the given application.
In the application of a point of sale terminal, the terminal 114 can, in one embodiment, be a card-swipe terminal such as, for example, portable or countertop card-swipe terminals found as retail point-of-sale terminals. Other point of sale terminals might include, for example, gas pumps, ATM machines, vending machines, remote pay terminals, and so on. As another example, a terminal might include a token reader in communicative contact with a personal computer or other computing device for purchases such as, for example, internet purchases or for online banking. As a further example, in one embodiment, the terminal can include a magnetic stripe reader (including one or more read heads), a keypad (for example, for PIN entry, or other user entry), and a display. Thus, in this embodiment, the terminal 114 is integrated into the same package or housing as the data capture device 113. The terminal can also be integrated with or in communicative contact with a cash register or other point-of-sale or point-of-access station.
Illustrated in
In a step 96, terminal 114 routes the data to the transaction-processing network 123 to obtain authorization or approval for the transaction from one or more entities as appropriate. The data stream 137 routed by terminal 114 can include some or all of the data provided in the secure data stream 135, and can be augmented to provide additional data as may be appropriate for the environment or type of transaction.
Illustrated in the example provided in
As also discussed above with reference to
Gateways can be implemented using hardware, software, or a combination thereof. In one embodiment, gateway 120 is implemented as one or more processing devices configured to run software applications for the gateway functionality. In one or more embodiments discussed in this document, functions such as encryption, decryption, key storage and other related functions are at times discussed as being performed at or by a gateway. This description encompasses implementations where functions are performed using a separate module or appliance called by or otherwise accessed by the gateway. For example, in one or more embodiments, these functions are described as being performed by a secure transaction module that can be either a part of the gateway or accessed by the gateway. As will be apparent to one of ordinary skill in the art after reading this description, such discussion can indicate that the same devices that perform gateway functionality can also include hardware or software modules used to perform the encryption, decryption or other functions as well.
Alternatively, separate modules can be in communicative contact with the gateways and their functions called, accessed or used by the gateway to perform the encryption, decryption or other related functions. Indeed, in one embodiment, one or more separate appliances are provided to perform various decryption, encryption, key storage and updating and other functions, and the appropriate transaction data routed to the appropriate appliance for processing. Such appliances can themselves be implemented using hardware, software or a combination thereof, and can be coupled in communicative contact with the gateway. As discussed herein, such appliances (sometimes also referred to as secure transaction modules) can be associated with entities other than the gateway, including issuing banks, acquiring banks, clearing houses, merchants and other entities that may be associated with the transaction-processing network 123.
In a step 98, the encrypted information is decrypted for processing of the transaction. In the example illustrated in
As another example, connections between the gateway 120 and the transaction-processing network 123 may themselves be secure connections. In such situations, it may be desirable to decrypt some or all of the transaction data stream at gateway 120 prior to routing to the transaction-processing network 123. In furtherance of this example, consider a credit card transaction in which the entire account information is encrypted. It may be desirable in such a situation to have the gateway decrypt the account information to obtain the bank identification number to facilitate routing. With a secure connection, the decrypted information can be left in the clear for transfer to the transaction-processing network 123. In another embodiment, the gateway can be configured to re-encrypt some or all of the decrypted information prior to routing.
As another example, even where the routing data is clear, it may be desirable to have a secure transaction module available at the gateway to decrypt the transactions routed by that gateway. As such, a centralized (or somewhat centralized in the case of multiple gateways) decryption process can be implemented to handle decryption in one location (or in predetermined locations) for multiple transactions for multiple merchants and multiple issuers. In such an application, centralized decryption can be implemented to provide centralized key management or centralized management of other encryption algorithms or information. Likewise, information can be encrypted for storage in a database or other storage facility. Because different information fields that might be transmitted or stored can be of varying data lengths, it may be preferable to use a variable length cipher for such data.
Thus, to illustrate two of the possible decryption-placement scenarios, a decryption module is illustrated as decryption module 122A associated with transaction-processing network 123 and a decryption module 122B associated with gateway 120. As these examples serve to illustrate, decryption of some or all of the information can be performed at one or more points along the network as may be appropriate for a given transaction. As also discussed in further detail below, various levels of encryption and decryption using one or more keys for portions of the data can be included to facilitate routing and handling of transactions in a secure manner.
In a step 99, an authorization response is provided from the transaction-processing network 123 indicating the status of the authorization. For example, where the transaction is approved, such authorization is transmitted to terminal 114 and can be stored at the terminal or in a storage device associated with the terminal for record-keeping purposes or further transactions. For example, considering again the application in a credit card transaction, when the initial transaction is carried out, terminal 114 typically seeks authorization from the transaction-processing network 123 and, once authorized, stores transaction information in a data file or other database for later settlement of the transaction. Thus, in this example, terminal 114 could store information associated with the authorized transaction for later settlement as illustrated by step 100.
In one embodiment, the encrypted information is stored as data in one of the tracks on the card. Typically, magnetic stripe cards, including most bank or credit cards, have three tracks of data. For the access information, Tracks 1, 2 or 3 can be used. Note that in one environment, a conventional bank card may have traditional track 2 information encoded at 75 BPI. The tracks may not be perfectly timed and may include variations or jitter. In other words, the spacing between the transitions may vary from transition to transition and from card to card. Additionally, because of these variations and the characteristics of the flux patterns on the magnetic stripe, it is difficult to accurately recreate, or copy, magnetic stripe data from an original token to a new token and maintain the same characteristics. These transition characteristics create a level of uniqueness in the magnetic stripe data. Furthermore, because of these variations, the relationship of the tracks to one another may be affected. Therefore, it may be useful to encipher the jitter data using a variable length block encoding cipher for increased security as well as coping with the inherent variability of the magnetic stripe record.
VDES and VAES are tweakable, variable input length block ciphers that encipher decimal numbers such as those found in account numbers for credit cards, charge cards, debit cards, and the like. The primitive used for embodiments of the invention is a tweakable block cipher for enciphering numbers up to m digits in length. The value of m depends on the algorithm.
Let Dom be the set of all strings over Z10={0, . . . , 9} that have a length between 2 and m. The primitive is specified by an enciphering function
ε: TwSp×KeySp×Dom→Dom
and associated deciphering function
D: TwSp×KeySp×Dom→Dom
The notation means that encryption utilizes in this example three inputs: a tweak T, which is selected from the set TwSp of possible tweaks; a key K, which is selected from the set KeySp of possible keys; and the plaintext data P, which in this case is a d-digit number, where 2≦d≦m. The output, denoted ε(T, K, P), is another d-digit number. In an embodiment of the invention, the tweak T is the bank identification number (BIN), the last four digits of the account number, plus the expiration date of the token. In VDES, the key K comprises a DES key and two auxiliary 64-bit keys. In another embodiment of the invention, the key K comprises a single 128-bit AES key.
The deciphering algorithm reverses the enciphering, therefore,
D(T, K, ε(T, K, P))=P
The variable-length ciphers in embodiments of the present invention are designed to guard against an attacker who has access to the merchant terminal. In attempting to decrypt a target enciphered card, the attacker can create and swipe tokens or cards of its choice through the token reading device. This is known as a chosen plaintext attack in cryptographic terms.
To begin the scenario described above, a target tweak T* and a target plaintext P* are selected. Next, a target key K is chosen at random from KeySp. Ciphertext C* is computed where C*=ε(T*, K, P*). It is assumed that the attacker has the target tweak T* and target ciphertext C*, and an enciphering oracle or device, Enc. The enciphering oracle may be viewed as a device that the attacker can call on for any inputs T, P of choice. The enciphering oracle returns ε(T, K, P). In this example the enciphering is done under the target key K, which the attacker does not have. To succeed, the adversary must output P*, the target plaintext.
The situation described above is comparable to that of an identity thief who has an enciphering of some of the digits of a user's account number, and also has the BIN, last four digits of the user's account and the expiration date of the account. The target plaintext P* is the digits of the user's account that are enciphered and which the thief does not have. The target ciphertext C* is the enciphered digits, where the enciphering is under the target tweak, and the target key K is the key used in the merchant's terminal. The enciphering oracle models the possibility that an attacker can create and swipe, or read, cards of the attacker's choice. When the attacker does this, he effectively provides himself some tweak T and plaintext P and receives in return the enciphering ε(T, K, P) under the target key K. Generally, an encryption method provides more effective security the more encryption queries, or the more computer running time, it forces the attacker to spend.
The algorithms in accordance with embodiments of the invention are now described. First, some functions auxiliary to the algorithms are described and defined. These functions are used in the algorithms as set forth herein.
NtSl takes as input an integer N in the range 0≦N<2^l and returns its encoding as a binary string of exactly l bits. For example, NtS6(3)=000011.
StNl takes as input an l-bit string y and returns the integer N (0≦N<2^l) whose binary representation is y. For example, StN6(000011)=3.
NtDl takes as input an integer N in the range 0≦N<10^l and returns its representation as an l-digit number. For example, NtD5(326)=00326. This operation consists merely of prepending sufficient zeros to bring the number of digits to exactly l. In addition, this provides for the variable block length in the cipher.
DtN takes as input an l-digit number y and returns the corresponding integer N (0≦N<10^l). For example, DtN(00326)=326. This operation removes any leading zeros and is a further example of the variable block length of the cipher.
|X|10 denotes the number of digits in a digit-represented number. For example, |00326|10=5. The leading zeros are counted.
N div D returns the quotient obtained when integer N is divided by integer D. For example, 17 div 2=8 and 14 div 2=7.
N mod D returns the remainder obtained when integer N is divided by integer D. This is an integer in the range 0, . . . , D-1. This operation may be applied even when N is negative. For example (30−70) mod 100=60.
If s1, s2, . . . , sn are strings then s1∥s2∥ . . . ∥sn denotes the concatenation of those strings.
If x is a 64-bit string, then x↓56 denotes the first 56 bits of x.
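The auxiliary functions defined above can be sketched in a few lines of Python. This is an illustrative sketch only; the function names (nts, stn, ntd, dtn, digit_count, first_bits) are hypothetical and do not appear in the specification, and bit strings and digit strings are represented here as Python str values.

```python
def nts(n: int, l: int) -> str:
    """NtS_l: encode integer n (0 <= n < 2**l) as a binary string of exactly l bits."""
    if not 0 <= n < 2 ** l:
        raise ValueError("n does not fit in l bits")
    return format(n, "b").zfill(l)


def stn(y: str) -> int:
    """StN_l: return the integer whose binary representation is the bit string y."""
    return int(y, 2)


def ntd(n: int, l: int) -> str:
    """NtD_l: represent integer n (0 <= n < 10**l) as exactly l decimal digits."""
    if not 0 <= n < 10 ** l:
        raise ValueError("n does not fit in l digits")
    return str(n).zfill(l)  # prepend zeros up to length l


def dtn(y: str) -> int:
    """DtN: return the integer value of a digit string, dropping leading zeros."""
    return int(y)


def digit_count(y: str) -> int:
    """|X|_10: number of digits, leading zeros included."""
    return len(y)


def first_bits(x: str, k: int) -> str:
    """x followed by a down-arrow k: the first k bits of bit string x."""
    return x[:k]


# The examples given in the text:
print(nts(3, 6), stn("000011"), ntd(326, 5), dtn("00326"), digit_count("00326"))
# "div" and "mod" map onto Python's floor division and modulo operators.
print(17 // 2, 14 // 2, (30 - 70) % 100)
```

Python's % operator already returns a value in the range 0, . . . , D-1 for a positive modulus D, which matches the behavior described for negative N.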
The tweak T[0] . . . T[t−1] is a t-digit number. For a token access security application, such as described above, one example for the tweak can be defined to include the BIN, the last four digits of the account number and the expiration date of the card or token. In such a case t would be equal to fourteen—i.e., the tweak would be a 14-digit number. Although t can be chosen based on the needs of a given application, in various embodiments, the example algorithm is configured to handle a tweak of up to t=16. With the example algorithm, the tweak of length t is decided at the time the key is chosen. However, the algorithm can be extended to handle tweaks of variable length.
In various embodiments, the tweak can be an important aspect of the implementation. The actual tweak chosen can take a variety of forms, and flexibility can be used when choosing the tweak to implement. An example of the importance of the tweak can be illustrated in terms of a bank card environment where relatively short strings of symbols are encrypted. Indeed, for short PANs, the system might only be encrypting a few digits. In such uses, it would be trivial to build a dictionary with only a few digits' worth of encrypted values. Accordingly, the tweak can be an important element of security in that it changes the encrypted values based on unencrypted values. This makes the size of the useful dictionary much larger, thereby improving the security of the cipher.
The process, 400, begins with the step 402 of identifying operators based on the number of digits, d, in the plaintext. The plaintext P[0] . . . P[d−1] is a d-digit number where d can range in some embodiments from 2 to 28. The algorithm can be implemented to split it into two parts roughly equal in size. The first part has d(0) digits and the second part has d(1) digits. For example, if d=5, then d(0)=2 and d(1)=3. Thus, in one embodiment, these operators are defined as
d(0)=d div 2
d(1)=d−d(0).
Accordingly, d(0) is defined as the whole number quotient obtained when the number of digits, d, is divided by two, and d(1) is the number of digits, d, less d(0). Therefore, d(1) and d(0) together represent the entire data length, d.
In steps 404 and 406, the parameters w1 and w2 are defined based on the tweak, and represented as bit strings. In this example, the number of digits t in the tweak and the number of digits d in the data are each converted to a string. These strings are padded with zeroes to create bit strings w1 and w2 of desired length. In one embodiment, the tweak is a one-byte string and padded with 48 zeroes to create a bit string for w1 that is 56 bits in length. Particularly, in one embodiment, the tweak is first modified to remove leading zeroes. This can be done, for example, using the DtN operation described above, which takes as input an l-digit number (in this case the tweak) and returns the corresponding integer value with leading zeroes removed. Then, the modified tweak is further modified to create the binary string for the desired number of bits, which in this example is 56. This can be done, for example, using the NtSl operation described above (where l=56):
w1=NtS56(DtN(T[0] . . . T[t−1]))
In step 406, the number d of plaintext digits is converted to a fixed-length string, w2. For example, in one embodiment, d is converted to a one-byte string. This can be done, for example, using:
w2=NtS8(d)
where w2 is fixed-length data that is eight bits long, and l=8.
In a step 408 a key is defined. For example, in one embodiment the encryption algorithm DES is applied to the 64-bit string w1∥w2 with key, K0, to produce a 64-bit string. The first 56 bits of the latter form the DES round key K1.
K1=DES(K0, w1∥w2)↓56
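The derivation of the round key from the tweak and the data length can be sketched as follows. Python's standard library has no DES implementation, so this sketch swaps in a truncated HMAC-SHA256 as a 64-bit keyed function; that substitution is an assumption made only to keep the example runnable and is not the DES operation described above. The names stand_in_block and derive_round_key, and the sample tweak, are likewise hypothetical.

```python
import hashlib
import hmac


def nts(n: int, l: int) -> str:
    """NtS_l: integer n as a binary string of exactly l bits."""
    if not 0 <= n < 2 ** l:
        raise ValueError("n does not fit in l bits")
    return format(n, "b").zfill(l)


def stand_in_block(key: bytes, block: bytes) -> bytes:
    """NOT DES: a 64-bit keyed function standing in for DES(key, block)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:8]


def derive_round_key(k0: bytes, tweak: str, d: int) -> bytes:
    """Sketch of K1 = DES(K0, w1 || w2) truncated to its first 56 bits."""
    w1 = nts(int(tweak), 56)  # NtS56(DtN(T)): tweak value as 56 bits
    w2 = nts(d, 8)            # NtS8(d): number of plaintext digits as one byte
    block = int(w1 + w2, 2).to_bytes(8, "big")  # the 64-bit string w1 || w2
    return stand_in_block(k0, block)[:7]        # first 56 bits = 7 bytes


# Hypothetical 14-digit tweak: BIN (6) + last four (4) + expiration date (4).
k1 = derive_round_key(b"base-key", "41111112341225", 16)
print(k1.hex())
```

Because d is folded into the input block, the same tweak yields a different round key for each plaintext length, which keeps encipherings of different lengths independent of one another.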
In step 410, the data set is operated on in two sections, L0 and R0, and the leading zeros are removed from the plaintext strings for each section of the dataset. In one embodiment, the dataset is divided in half. The division of the dataset can be such that L0 is P[0] to P[d(0)−1], and R0 is P[d(0)] to P[d−1], for a plaintext data set of length d. Removing the leading zeroes can be accomplished with the DtN operation as
L0←DtN(P[0] . . . P[d(0)−1])
R0←DtN(P[d(0)] . . . P[d−1])
In steps 412 through 418, Li and Ri are determined for a number of rounds r, where i=1, . . . , r. The number r chosen is the number of rounds of processing for steps 412-418 described below. Security is typically increased with a greater number of rounds, r. An embodiment provides for an r of at least seven. In this case, with an r of seven, a total of nine DES operations is required. In one embodiment, the maximum value of r is 2^8−1, which comes from the fact that a one-byte round count (2^8 possible values) is included in the round function. Increasing the number of rounds is computationally expensive, so preferably the number of rounds is chosen so as to not exceed that required to achieve the desired level of security.
Referring now to steps 412-418, in a step 412, the current value of the left half L, Li, is set to the previous value of the right half R, Ri−1.
Li=Ri−1.
Step 414 computes s from the remainder of i divided by two, where
s=1−(i mod 2).
In step 416, x is defined as a 48 bit binary representation of Ri−1. This can use the NtSl operation, where
x=NtS48(Ri−1).
Next, in step 418 Ri is calculated as a function. In one embodiment, this is a function combined with Li−1 where
Ri=(F(i, d, x)+Li−1) mod 10^d(s).
The computation of the Feistel F(i, d, x) in one embodiment is described in
As stated above, steps 412 through 418 are repeated for r iterations. Accordingly, the value of i is checked in step 420. If the value returned for i is less than or equal to r, then i is incremented, the process returns to step 412, and the next iteration of Ri is calculated. If, on the other hand r rounds have already been run, the process continues to step 422.
In step 422 a function s is defined as the remainder of r divided by two
s=r mod 2.
At this point the first half of the output may be computed in step 424. In this step, Lr is converted to a number of d(s) digits to form the first half of the output data according to the formula:
Cout1=C[0] . . . C[d(s)−1]=NtDd(s)(Lr)
Next the second part of the output is computed in step 426. In this step, Rr is converted to a number of d(1−s) digits to form the second half of the output data according to the formula:
Cout2=C[d(s)] . . . C[d−1]=NtDd(1−s)(Rr)
At this point, the process concludes at step 428 and returns the d-digit ciphertext string
C[0] . . . C[d−1].
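The enciphering steps 402 through 428 can be sketched as a short Python routine. This is a sketch under stated assumptions, not the specified implementation: the round function stand_in_f is a hypothetical HMAC-SHA256 construction used in place of the DES-based F(i, d, x), and the key is an arbitrary byte string rather than a derived round key K1.

```python
import hashlib
import hmac


def stand_in_f(key: bytes, i: int, d: int, x: int) -> int:
    """Hypothetical round function; NOT the DES-based F of the specification."""
    msg = bytes([i, d]) + x.to_bytes(6, "big")  # 8-bit i, 8-bit d, 48-bit x
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:8], "big")


def encipher(key: bytes, plaintext: str, rounds: int = 7) -> str:
    """Unbalanced Feistel network over decimal digits (steps 402-428)."""
    d = len(plaintext)
    if not (plaintext.isdigit() and 2 <= d <= 28 and 1 <= rounds <= 255):
        raise ValueError("expected 2 to 28 decimal digits and 1 to 255 rounds")
    size = (d // 2, d - d // 2)        # d(0) and d(1)
    left = int(plaintext[:size[0]])    # L0 = DtN(P[0] .. P[d(0)-1])
    right = int(plaintext[size[0]:])   # R0 = DtN(P[d(0)] .. P[d-1])
    for i in range(1, rounds + 1):
        s = 1 - (i % 2)                # selects the modulus for this round
        f = stand_in_f(key, i, d, right)
        # L_i = R_{i-1};  R_i = (F(i, d, x) + L_{i-1}) mod 10^d(s)
        left, right = right, (f + left) % 10 ** size[s]
    s = rounds % 2
    # C = NtD_{d(s)}(L_r) || NtD_{d(1-s)}(R_r)
    return str(left).zfill(size[s]) + str(right).zfill(size[1 - s])


ciphertext = encipher(b"demo-key", "4111111111111111")
print(ciphertext, len(ciphertext))
```

The output always has exactly as many digits as the input, including for odd lengths, because the two halves swap sizes from round to round and the final formatting step uses d(s) and d(1−s) accordingly.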
a=NtS8(i)∥NtS8(d)∥x.
The next step, 504, performs the DES encryption operations to obtain y. In one embodiment, a is combined with Kin by an XOR operation, the result is encrypted with DES using key K1, and the DES output is combined with Kout by a second XOR operation. These operations take the 64-bit input string a and return a 64-bit string y, defined below:
y←DES(K1,a⊕Kin)⊕Kout
Step 506 defines the value z, which in one embodiment is obtained by interpreting y as a 64-bit integer and reducing it modulo the appropriate power of ten. The equation defining z is shown below:
z←StN64(y) mod 10^d(s).
The process concludes when z is returned. This subroutine is repeated for each “round” of the DES function.
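The structure of this subroutine (whiten the input, apply the block cipher, whiten the output, reduce to digits) can be sketched as follows. The block cipher is passed in as a parameter because DES is not available in Python's standard library; the default stand_in_block is a hypothetical truncated HMAC-SHA256, not DES, and the parameter names are illustrative only. Blocks and whitening keys are represented as 64-bit integers.

```python
import hashlib
import hmac


def stand_in_block(key: bytes, block: int) -> int:
    """NOT DES: a 64-bit keyed function used only to make the sketch runnable."""
    digest = hmac.new(key, block.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def round_function(k1: bytes, k_in: int, k_out: int, i: int, d: int, x: int,
                   out_digits: int, block_encrypt=stand_in_block) -> int:
    """Sketch of F(i, d, x): z = StN64(E(K1, a xor Kin) xor Kout) mod 10^out_digits."""
    if not (0 <= i < 2 ** 8 and 0 <= d < 2 ** 8 and 0 <= x < 2 ** 48):
        raise ValueError("i and d must fit in one byte and x in 48 bits")
    a = (i << 56) | (d << 48) | x             # a = NtS8(i) || NtS8(d) || x
    y = block_encrypt(k1, a ^ k_in) ^ k_out   # whitening before and after
    return y % 10 ** out_digits               # convert bits to decimal digits


z = round_function(b"round-key", 0x0123456789ABCDEF, 0x0F1E2D3C4B5A6978,
                   i=1, d=16, x=12345678, out_digits=8)
print(z)
```

Since the round index i occupies the top byte of a, each round feeds a different input to the block cipher even though the same key is reused.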
Several operational scenarios are now described to further highlight features and advantages of the embodiments described above. The inventions described herein and their multiple embodiments are not limited to applications as discussed in these scenarios. These scenarios are included to provide additional descriptive materials.
The variable DES algorithm offers numerous security advantages over a traditional DES algorithm. The VDES algorithm implements a Feistel network. The i-th round (1≦i≦r) splits the input into a left half of d(0) digits and a right half of d(1) digits if d is odd. It would be unusual for d to be odd, however. The variable nature of the VDES means that split sizes are not maintained across rounds as in traditional Feistel networks. Varying the split sizes does not appear to degrade security.
In order to accommodate enciphering digit sequences rather than bit sequences, the round function outputs digits. The traditional XORs have been replaced by modulo operations of appropriate powers of ten.
The round function is based on the Data Encryption Standard with key whitening (known as DESX), discussed in “How to Protect DES Against Exhaustive Key Search (an Analysis of DESX)” J. Kilian and P. Rogaway, J. Cryptology 14(1):17-35 (2001). The selection of DESX allows the same DES key to be used every round, instead of a different key every round. However, the effect of per-round keys is induced by the inclusion of the round number in the DES input. This reduces key size and permits cost savings by reducing re-keying costs.
Tweakability is provided by specifying the round key K1 as a pseudorandom function (PRF) of the tweak, the number t of digits in the tweak, and the number d of digits in the input, under key K0. (K0 is part of the base key of the cipher.) The role of the pseudorandom function is taken by the DES-based Cipher Block Chaining (CBC) MAC. In computing this CBC-MAC the number of iterations of DES is always exactly two, thereby providing security against splicing attacks. “Birthday” attacks may be possible, indicating that in approximately 2^32 calls to the encryption oracle, an attacker may find two different tweaks which result in the same round key K1 and thus are equivalent from an encryption point of view. However, this does not appear to assist with plaintext recovery, which in the scenario of this application is the goal of the attacker. In addition, 2^32 is approximately four billion, and it is not clear that the number of tweaks in use is this high. The round function converts bits to digits by interpreting the DESX output as an integer, and then takes the remainder upon division by the appropriate power of 10, as discussed above in the context of
The process, 600, begins with the step 602 of identifying operators based on data length. These operators are defined as follows:
d(0)=d div 2
d(1)=d−d(0).
Steps 604 and 606 define the parameters w1 and w2 based on the tweak, and represent them as bit strings. In this example, the number of digits t in the tweak and the number of digits d in the data are each converted to a string. These strings are padded with zeroes to create bit strings w1 and w2 of desired length. In one embodiment, the tweak is a one-byte string and padded with 48 zeroes to create a bit string for w1 that is 56 bits in length. Particularly, in one embodiment, the tweak is first modified to remove leading zeroes. This can be done, for example, using the DtN operation described above, which takes as input an l-digit number (in this case the tweak) and returns the corresponding integer value with leading zeroes removed. Then, the modified tweak is further modified to create the binary string for the desired number of bits, which in this example is 56. This can be done, for example, using the NtSl operation described above (where l=56):
w1=NtS56(DtN(T[0] . . . T[t−1]))
At 606, the number d of ciphertext digits is converted to a fixed-length string, w2. For example, in one embodiment, d is converted to a one-byte string. This can be done, for example, using:
w2=NtS8(d)
where w2 is fixed-length data that is eight bits long, and l=8.
In step 608 a key is defined. For example, in one embodiment the encryption algorithm DES is applied to the 64-bit string w1∥w2 with key, K0, to produce a 64-bit string. The first 56 bits of the latter form the DES round key K1.
K1=DES(K0, w1∥w2)↓56
In step 610, s is defined as the remainder of r divided by two
s←r mod 2.
At step 612, the data set is operated on in two sections, Lr and Rr, and the leading zeros are removed from the ciphertext strings for each section of the dataset. In one embodiment, the dataset is divided in half. The division of the dataset can be such that Lr is C[0] . . . C[d(s)−1], and Rr is C[d(s)] . . . C[d−1], for a ciphertext data set of length d. Removing the leading zeroes can be accomplished with the DtN operation as
Lr←DtN(C[0] . . . C[d(s)−1])
Rr←DtN(C[d(s)] . . . C[d−1]).
In steps 611-618, Li−1 and Ri−1 are determined for a number of rounds r, where i=1, . . . , r.
In step 611, Ri−1 is defined as follows:
Ri−1←Li
Step 613 computes s from the remainder of i divided by two, where
s=1−(i mod 2).
In step 616, x is defined as a 48 bit binary representation of Ri−1. This can use the NtSl operation, where
x=NtS48(Ri−1).
Next, in step 618 Li−1 is calculated as a function. In one embodiment, this is as shown below:
Li−1←(Ri−F(i, d, x)) mod 10^d(s).
The Feistel function, F(i, d, x), can be computed the same as it is in the encryption process such as, for example, as described above. The modulus, 10^d(s), is a power of ten whose exponent tracks the number of digits in the half being computed; this provides the variable length of the present embodiment of the invention and accommodates the ten decimal digits. The mod operation may be adjusted to represent other forms of data such as, for example, the twenty-six letters of the alphabet in one case.
At this point in the operation the value of i is checked in step 621.
i≦r
If the value returned for i is less than or equal to r, then the process increments i and returns to step 611 and recomputes Ri−1, and the subsequent steps in the round. If, on the other hand, the value returned at step 621 is that i is larger than r, the process continues to step 624.
At this point the first half of the output may be computed in step 624. In this step, L0 is converted to a number of d(0) digits to form the first half of the output data according to the formula:
Pout1=P[0] . . . P[d(0)−1]←NtDd(0)(L0)
Next the second part of the output is computed in step 626. In this step, R0 is converted to a number of d(1) digits to form the second half of the output data according to the formula:
Pout2=P[d(0)] . . . P[d−1]←NtDd(1)(R0)
At this point, the process concludes at 628 and returns the value below:
P[0] . . . P[d−1].
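A round-trip sketch helps confirm that the deciphering steps undo the enciphering steps. The sketch below assumes the rounds are undone in reverse order, from i=r down to 1, and again uses a hypothetical HMAC-SHA256 round function (stand_in_f) in place of the DES-based F(i, d, x); neither choice is taken from the specification.

```python
import hashlib
import hmac


def stand_in_f(key: bytes, i: int, d: int, x: int) -> int:
    """Hypothetical round function; NOT the DES-based F of the specification."""
    msg = bytes([i, d]) + x.to_bytes(6, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:8], "big")


def encipher(key: bytes, plaintext: str, rounds: int = 7) -> str:
    d = len(plaintext)
    size = (d // 2, d - d // 2)                    # d(0), d(1)
    left, right = int(plaintext[:size[0]]), int(plaintext[size[0]:])
    for i in range(1, rounds + 1):
        s = 1 - (i % 2)
        left, right = right, (stand_in_f(key, i, d, right) + left) % 10 ** size[s]
    s = rounds % 2
    return str(left).zfill(size[s]) + str(right).zfill(size[1 - s])


def decipher(key: bytes, ciphertext: str, rounds: int = 7) -> str:
    d = len(ciphertext)
    size = (d // 2, d - d // 2)
    s = rounds % 2
    left = int(ciphertext[:size[s]])               # L_r = DtN(C[0] .. C[d(s)-1])
    right = int(ciphertext[size[s]:])              # R_r = DtN(C[d(s)] .. C[d-1])
    for i in range(rounds, 0, -1):                 # assumed order: i = r, ..., 1
        s = 1 - (i % 2)
        prev_right = left                          # R_{i-1} = L_i
        f = stand_in_f(key, i, d, prev_right)
        prev_left = (right - f) % 10 ** size[s]    # L_{i-1} = (R_i - F) mod 10^d(s)
        left, right = prev_left, prev_right
    # P = NtD_{d(0)}(L0) || NtD_{d(1)}(R0)
    return str(left).zfill(size[0]) + str(right).zfill(size[1])


for sample in ("07", "00123", "4111111111111111"):
    assert decipher(b"demo-key", encipher(b"demo-key", sample)) == sample
print("round trip ok")
```

The subtraction recovers Li−1 exactly because Li−1 always has d(s) digits and is therefore smaller than the modulus 10^d(s).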
w1=NtS8(t)∥NtS8(d)∥NtS48(0)
w2=NtS64(DtN(T[0] . . . T[t−1]))
K1=DES(K0, DES(K0, w1)⊕w2)↓56
Also, in the example illustrated in
z←StN64(y).
Likewise, similar changes to these factors are made in the decryption process to mirror the changes in the encryption algorithm. These are as also shown in
A first type of attack against VDES to be considered is the exhaustive key search attack. This attack is impractical due to the 184-bit key size of the cipher. A more important attack beyond the exhaustive key search attack is Patarin's attack. Patarin's attack is discussed in “Security of Random Feistel Schemes with Five or More Rounds” J. Patarin, Advances in Cryptology – CRYPTO '04, Lecture Notes in Computer Science Vol. 3152, M. Franklin ed., Springer-Verlag, 2004. Patarin's attack is ineffective against VDES because it is a distinguishing attack, not a message recovery attack. Recall that the goal of the attacker here is to recover the message, in this case, the account number. A Patarin attack can distinguish between an instance of the cipher and a random permutation, but does not lead to recovery of the target plaintext given the target ciphertext. In addition, a Patarin attack requires a prohibitive amount of computation time, even when the number being enciphered is only two digits long. The time would be significantly increased by the ten to sixteen or more digits used in a typical account number.
A further embodiment of the invention provides a variable advanced encryption standard (VAES) cipher that utilizes AES as part of the encryption and decryption algorithms instead of the DES cipher.
Tweakability can be obtained by specifying the round key K1 as a pseudorandom function of the tweak, the number t of digits in the tweak, and the number d of digits in the input, under key K0. VAES directly applies AES as the pseudorandom function, since the block length is long enough to handle tweaks of sufficient length. “Birthday” attacks are precluded since no two tweaks result in the same common round key. While the key size of 128 bits is smaller than that of VDES, the key size is large enough to preclude exhaustive key search attacks. Furthermore, Patarin's attack remains ineffective. All of these features provide enhanced security.
The key K=K0 comprises a 128-bit AES key. The tweak, T[0] . . . T[t−1], is a t-digit number. The tweak in some embodiments is of any length in the range t=1 to t=33. The VAES algorithm in this embodiment does not use w1 and w2 as was the case with the example VDES described above. Instead a string w is formed, which in this example is a 128-bit string formed by concatenating the following strings: a one-byte representation of the number t of digits in the tweak; a one-byte representation of the number d of digits in the plaintext; and a 112-bit representation of the tweak T[0] . . . T[t−1].
w←NtS8(t)∥NtS8(d)∥NtS112(DtN(T[0] . . . T[t−1]))
The key in this example, K1, is determined by an AES encryption of w using K0 as the key. This produces a 128-bit AES round key K1.
K1=AES(K0, w)
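The layout of the 128-bit string w can be sketched as below. The AES call itself is omitted because Python's standard library does not provide AES; only the construction of w is shown, and the function name build_w and the sample tweak are hypothetical. The sketch also checks the stated tweak range: every 33-digit decimal number is below 2^112, so it fits in the 112-bit field.

```python
def nts(n: int, l: int) -> str:
    """NtS_l: integer n as a binary string of exactly l bits."""
    if not 0 <= n < 2 ** l:
        raise ValueError("n does not fit in l bits")
    return format(n, "b").zfill(l)


def build_w(tweak: str, d: int) -> bytes:
    """w = NtS8(t) || NtS8(d) || NtS112(DtN(T)), returned as 16 bytes."""
    t = len(tweak)
    if not (tweak.isdigit() and 1 <= t <= 33):
        raise ValueError("tweak must be 1 to 33 decimal digits")
    bits = nts(t, 8) + nts(d, 8) + nts(int(tweak), 112)
    return int(bits, 2).to_bytes(16, "big")  # 8 + 8 + 112 = 128 bits


w = build_w("41111112341225", 16)
print(w.hex())

# A 33-digit tweak always fits in 112 bits; a 34-digit one need not.
assert 10 ** 33 - 1 < 2 ** 112 < 10 ** 34
```

The resulting 16-byte value w would then be encrypted under K0 with any AES implementation to obtain K1.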
As with VDES, the data set is operated on in two sections, L0 and R0, and the leading zeros are removed from the plaintext strings for each section of the dataset. In one embodiment, the dataset is divided in half. The division of the dataset can be such that L0 is P[0] to P[d(0)−1], and R0 is P[d(0)] to P[d−1], for a plaintext data set of length d. Removing the leading zeroes can be accomplished with the DtN operation as
L0←DtN(P[0] . . . P[d(0)−1])
R0←DtN(P[d(0)] . . . P[d−1]).
Also, as with VDES, Li−1 and Ri−1 are determined for a number of rounds r, where i=1, . . . , r. However, with this example VAES algorithm, x is set as a 112-bit string representation of Ri−1 as opposed to the 48-bit representation in the VDES example provided above. In an embodiment of the invention, the number of rounds is at least seven. The total cost of the algorithm is then eight AES operations. The maximum allowed value of r in some embodiments is 2^8−1.
The subroutine F(i, d, x) is also similar in this example, except that in this case, y is calculated as an AES encryption of a using key K1, and z is returned as a 128-bit value.
y←AES(K1, a); z←StN128(y)
VAES is simpler and yet offers more security than VDES, although the algorithms are similar in design. VAES is capable of handling longer tweaks and plaintexts than VDES.
Essentially, the decryption process reverses the steps of the encryption. The decryption process for this example VAES algorithm begins with the step of identifying operators based on data length. These operators are defined as follows:
d(0)=d div 2
d(1)=d−d(0).
Then, w is defined as set forth for the encryption process.
w←NtS8(t)∥NtS8(d)∥NtS112(DtN(T[0] . . . T[t−1]))
The key in this example, K1, is determined by an AES encryption of w using K0 as the key. This produces a 128-bit AES round key K1.
K1=AES(K0, w)
A function s is defined as the remainder of r divided by two
s←r mod 2.
At this point the halves of the input may be computed. Lr and Rr are defined as
Lr←DtN(C[0] . . . C[d(s)−1])
Rr←DtN(C[d(s)] . . . C[d−1])
The following are several examples of format-preserving encryption using a general cipher. These illustrate how general ciphers can capture requirements and format-preservation constraints that cannot be captured in the framework of enciphering over a (single) finite set. The first example performs length-preserving encryption of (varying length) credit card numbers. The second example provides a general method for encrypting multi-alphabet strings. The third example shows how to use this multi-alphabet string cipher to perform format-preserving encryption of user records as per the example discussed above. The technique throughout utilizes the encode-then-encipher paradigm. For this section it is assumed that there is access to a general cipher GE=(E,KeySp,TwSp,I,Dom) with TwSp⊂{0, 1}* and with index set and domain sufficient for the examples.
In this example, the inputs are strings over Σ={0, 1, . . . , 9} whose length ℓ can vary, say from 4 to 20. This is because one might want to encrypt varying amounts of the credit card numbers and discretionary data, leaving the rest in place. For example, the BIN or last four might be left in clear text, as might the cardholder's name. The output must be a string over Σ* of the same length as the input. (Note this example ignores the functioning of the Luhn digit.) Tweaks would be some of the (un-encrypted) credit card number digits encoded into bit strings in a natural way. This can be captured by letting Domcc(ℓ)=Σ^ℓ for all ℓ ∈ I={4, . . . , 20}. We seek a general cipher GEcc=(E,KeySp,TwSp,I,Domcc), and use the ability to have a domain with multiple sets in a crucial way to capture the length-preservation requirement. GEcc is constructed using encode-then-encipher with the enciphering via general cipher GE. Let s(ℓ)=10^ℓ for ℓ ∈ I. Thus the indexes are just the number of digits in the credit card number. Next the encoding function Enc and decoding function Dec are described. Enc(ℓ, X), where X is a sequence of ℓ decimal digits, simply views X as the decimal representation of an integer N and returns N. Conversely, Dec(ℓ, N) returns the ℓ-digit decimal representation of N. The final enciphering algorithm is E^ℓ_T,K(X)=Dec(ℓ, E_T,K(Enc(ℓ, X))), where the enciphering on the right-hand side is that of the underlying general cipher GE.
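The encode-then-encipher construction for credit card digits can be sketched as follows. The underlying general cipher is replaced here by a toy keyed shift on the integers modulo 10^ℓ, which is a permutation but offers no real security; it is a hypothetical stand-in chosen only so that the example runs with the standard library. All function names are illustrative.

```python
import hashlib
import hmac


def toy_offset(key: bytes, tweak: bytes, n: int) -> int:
    """Key- and tweak-dependent offset in [0, n). Toy stand-in, NOT secure."""
    digest = hmac.new(key, tweak + n.to_bytes(16, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % n


def toy_encipher(key: bytes, tweak: bytes, n: int, x: int) -> int:
    """Stand-in for the general cipher on Z_n: a keyed shift."""
    return (x + toy_offset(key, tweak, n)) % n


def toy_decipher(key: bytes, tweak: bytes, n: int, y: int) -> int:
    return (y - toy_offset(key, tweak, n)) % n


def enc(l: int, digits: str) -> int:
    """Enc(l, X): view the l-digit string X as an integer."""
    return int(digits)


def dec(l: int, n: int) -> str:
    """Dec(l, N): the l-digit decimal representation of N."""
    return str(n).zfill(l)


def encipher_cc(key: bytes, tweak: bytes, digits: str) -> str:
    l = len(digits)
    if not (digits.isdigit() and 4 <= l <= 20):
        raise ValueError("expected 4 to 20 decimal digits")
    return dec(l, toy_encipher(key, tweak, 10 ** l, enc(l, digits)))


def decipher_cc(key: bytes, tweak: bytes, digits: str) -> str:
    l = len(digits)
    return dec(l, toy_decipher(key, tweak, 10 ** l, enc(l, digits)))


middle = "001234"                      # digits to hide; BIN and last four stay clear
tweak = b"411111" + b"1111"            # hypothetical tweak from the clear digits
ct = encipher_cc(b"demo-key", tweak, middle)
print(ct, decipher_cc(b"demo-key", tweak, ct))
```

The index ℓ selects the modulus 10^ℓ, which is what keeps the ciphertext the same length as the plaintext for every permitted length.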
This example constructs a general cipher for multi-alphabet strings. A multialphabet string is a sequence of digits pk . . . p1 where pi ∈_i for 1≦i≦k and alphabets _k, . . . 1. The index set for the cipher is any sequence of (uniquely encoded descriptions of) alphabets, e.g. _k_k−1 . . . 1 for any alphabets _k for any k>1. To any alphabet _, associate an injective map num_: —→[0 . . . |—|−1]. Its inverse is denoted via char_. Next a general cipher is constructed as GEmas=(˜E,KeySp,TwSp,I,Dommas) that inherits the key space and tweak space of the underlying general cipher GE, has index set I specifying the space of arbitrary string formats (the sequence of alphabets digits are taken from), and Dommas that maps an index to a set Zs(). Next the encode-then-encipher paradigm is used with encoding and decoding functions as given as follows:
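Enc($t$, $P$)
    Parse $t$ as $\Sigma_k, \ldots, \Sigma_1$
    Parse $P$ as $p_k, \ldots, p_1$ where $p_i \in \Sigma_i$
    $x \leftarrow \mathrm{num}_{\Sigma_1}(p_1) + \sum_{i=2}^{k} \mathrm{num}_{\Sigma_i}(p_i) \cdot \prod_{j<i} |\Sigma_j|$
    Return $x$
Dec($t$, $y$)
    Parse $t$ as $\Sigma_k, \ldots, \Sigma_1$
    For $i = k$ to 2 do
        $d \leftarrow \prod_{j<i} |\Sigma_j|$; $c_i \leftarrow \lfloor y/d \rfloor$; $y \leftarrow y - c_i \cdot d$
    Return $\mathrm{char}_{\Sigma_k}(c_k) \,\|\, \cdots \,\|\, \mathrm{char}_{\Sigma_2}(c_2) \,\|\, \mathrm{char}_{\Sigma_1}(y)$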
Given an index $t$ encoding alphabets $\Sigma_k, \ldots, \Sigma_1$, the routines map a $k$-digit multi-alphabet string (from said alphabets) to a numerical value between 0 and $s(t) - 1$, where $s(t) = \prod_{i=1}^{k} |\Sigma_i|$. The domain map $\mathrm{Dom}_{mas}(t)$ outputs $\mathbb{Z}_{s(t)}$. Notice that many values of $t$ will lead to the same $\mathbb{Z}_n$, but this is not a problem for either security or functionality. The final enciphering algorithm is $\tilde{E}_K^{T,t}(X) = \mathrm{Dec}(t, E_K^{T,s(t)}(\mathrm{Enc}(t, X)))$.
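The Enc and Dec routines amount to a mixed-radix conversion, which the following Python sketch illustrates. It assumes that each alphabet is given as a Python string of distinct characters and that num is simply the position of a character in that string; these representation choices and the function names are assumptions made for the example, not part of the construction.

```python
from math import prod

def enc(alphabets, p):
    """Map the string p_k ... p_1 to an integer in [0, s - 1].

    alphabets lists the alphabets in the order Sigma_k, ..., Sigma_1,
    i.e. most significant position first, matching the order of p.
    """
    if len(p) != len(alphabets):
        raise ValueError("string length must match the number of alphabets")
    x = 0
    for sigma, ch in zip(alphabets, p):
        # Horner form of num_1(p_1) + sum_i num_i(p_i) * prod_{j<i} |Sigma_j|
        x = x * len(sigma) + sigma.index(ch)
    return x

def dec(alphabets, y):
    """Inverse of enc: map an integer in [0, s - 1] back to a string."""
    if not 0 <= y < prod(len(sigma) for sigma in alphabets):
        raise ValueError("value out of range for this index")
    chars = []
    for i, sigma in enumerate(alphabets):
        d = prod(len(s) for s in alphabets[i + 1:])  # product of the lower radices
        c, y = divmod(y, d)                          # c = floor(y / d); y = y - c * d
        chars.append(sigma[c])
    return "".join(chars)

UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = "abcdefghijklmnopqrstuvwxyz"
DIGITS = "0123456789"
fmt = [UPPER, LOWER, DIGITS]   # index: one upper case letter, one lower case letter, one digit
```

With this index the domain size is 26 · 26 · 10 = 6760, so any permutation on the integers 0 to 6759 can be wrapped by `enc` and `dec` to encipher strings of this format.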
This example is provided in terms of the example above, where a name, address and credit card number are enciphered. The goal is to have a format-preserving encryption that maps input plaintext (Name, Addr, CC) to output ciphertext (Name*, Addr*, CC*). Name and Name* must consist of two strings of letters, each being an upper case letter followed by lower case letters. Addr and Addr* must be alphanumeric characters, spaces, or commas followed by a valid postal code. CC and CC* must be 8-19 decimal digits followed by a valid Luhn digit. These formats can be handled by interpreting a record as a multi-alphabet string and using $GE_{mas}$. All that is needed is a way to derive an index for a given record. Let $\Sigma_{lc} = \{a, \ldots, z\}$, $\Sigma_{uc} = \{A, \ldots, Z\}$, $\Sigma_{let} = \Sigma_{lc} \cup \Sigma_{uc}$, $\Sigma_{num} = \{0, \ldots, 9\}$, and $\Sigma_{\alpha} = \Sigma_{let} \cup \Sigma_{num} \cup \{,\} \cup \{\text{␣}\}$. Let $\Sigma_{zip}$ be the set of all valid (5-digit US) postal (ZIP) codes. Let $\Sigma^{i}_{cc}$ be the set of all $i$-digit credit card numbers with a valid check digit, that is, $\Sigma^{i}_{cc} = \{ n \,\|\, L(n) \mid n \in \Sigma_{num}^{i-1} \}$, where $L(n)$ is the Luhn check digit of $n$ and $\|$ denotes concatenation. Write $\langle \Sigma \rangle$ to mean an (unambiguous) encoding of $\Sigma$ into a string. Assume an encoding for which the concatenation of several such encodings can also be unambiguously decoded.
The input is the triple of strings (Name, Addr, CC). Produce an index string $t$ as follows. Initially set $t = \varepsilon$. Scan Name from left to right. For the first character, append $\langle \Sigma_{uc} \rangle$ to $t$. For each further non-space character in the first string, append $\langle \Sigma_{let} \rangle$. Append $\langle \{\text{␣}\} \rangle$. Then append $\langle \Sigma_{uc} \rangle$ for the first character of the second string and $\langle \Sigma_{let} \rangle$ for each subsequent character. Then scan Addr, appending $\langle \Sigma_{\alpha} \rangle$ for each character, stopping before the final 5-digit postal code, and append $\langle \Sigma_{zip} \rangle$ there. Finally, append $\langle \Sigma^{c}_{cc} \rangle$, where $c$ is the number of decimal digits in CC. The result is a string encoding the appropriate alphabets of the characters used within the input record. (This example treats the ZIP code and credit card number as individual characters from particular alphabets.) Then encipher (for any desired tweak $T$) via $\tilde{E}_K^{T,t}((\mathrm{Name}, \mathrm{Addr}, \mathrm{CC}))$.
The ciphertext will have the desired format. Note that the ciphertext contains all the information required to reproduce the appropriate index for decryption.
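The index derivation described above can be sketched in Python as follows. The short labels ("uc", "let", "space", "alpha", "zip", "cc" plus a length) are hypothetical stand-ins for the unambiguous alphabet encodings, the input validation is deliberately minimal, and the sample record is invented for illustration.

```python
def luhn_digit(payload):
    """Luhn check digit L(n) for a string of decimal digits n."""
    total = 0
    for pos, ch in enumerate(reversed(payload)):
        v = int(ch)
        if pos % 2 == 0:          # double every second digit, starting at the rightmost
            v *= 2
            if v > 9:
                v -= 9
        total += v
    return (10 - total % 10) % 10

def derive_index(name, addr, cc):
    """Return the index as a list of alphabet labels, one per record position."""
    first, last = name.split(" ")             # Name is two strings of letters
    if not cc.isdigit() or int(cc[-1]) != luhn_digit(cc[:-1]):
        raise ValueError("CC must be decimal digits ending in a valid Luhn digit")
    index = []
    for part_no, part in enumerate((first, last)):
        if part_no:
            index.append("space")             # separator between the two name strings
        index.append("uc")                    # leading upper case letter
        index.extend("let" for _ in part[1:]) # remaining letters
    index.extend("alpha" for _ in addr[:-5])  # address characters before the ZIP code
    index.append("zip")                       # ZIP code treated as a single character
    index.append("cc%d" % len(cc))            # card number treated as a single character
    return index

sample = derive_index("Ann Lee", "1 Main St, 90210", "79927398713")
```

Applying the same scan to a ciphertext record of the same format yields the same list of labels, which is why no separate index needs to be stored for decryption.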
Embodiments of format-preserving Feistel can be analyzed in terms of previously proposed attacks and novel attacks. Patarin's attack on Feistel networks is the only previous attack known to the inventors that applies to Feistel networks with more than 6 rounds. Recall that seven is the preferred minimum number of rounds. This attack attempts to distinguish between a Feistel network and a random permutation by making many queries. The attacker then computes all possible round functions and checks whether the queries and their responses are consistent with any one of the functions, outputting 1 if so. This attack requires an intractable amount of computation, even for small sets. Consider attacking format-preserving Feistel using $r$ rounds on domain $\mathbb{Z}_n$ for $n = ab$. Then the number of possible instances of format-preserving Feistel is $C = (a^b)^{\lfloor r/2 \rfloor} \cdot (b^a)^{r - \lfloor r/2 \rfloor}$. For simplicity (it will not affect the implications significantly) assume $a = b$, in which case $C = a^{ra}$. The probability that a random function from $\mathbb{Z}_n$ to $\mathbb{Z}_n$ (ignoring permutivity constraints for simplicity) matches at least one of these functions on any set of $q$ distinct domain points is at most $C/n^q$. To achieve advantage one, then, the adversary needs to choose $q$ so that $C/n^q = 1$. Rearranging, $a^{ra} = n^q = a^{2q}$, which implies $q = ra/2$. Say $a = b = 10$, which corresponds to using format-preserving Feistel to encrypt 2-digit numbers, and $r = 7$. Then $q = 70/2 = 35$, which is quite small. Fortunately, the running time in this case is about $qC = 35 \cdot 10^{70} \approx 2^{237}$, making the attack practically intractable.
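The arithmetic in this estimate can be checked with a few lines of Python. The sketch below assumes the simplified case a = b used above; the function name and return layout are chosen for illustration only.

```python
from math import log2

def patarin_cost(a, r):
    """Query count and work factor of the exhaustive attack, assuming a = b.

    Returns (q, log2(C), log2(q * C)) where C = a**(r * a) is the number of
    Feistel instances and q = r * a / 2 follows from a**(r*a) = a**(2*q).
    """
    q = r * a / 2
    log2_c = r * a * log2(a)
    return q, log2_c, log2(q) + log2_c

q, log2_c, log2_time = patarin_cost(a=10, r=7)
```

For a = b = 10 and r = 7 this gives q = 35 queries and a running time between 2^237 and 2^238 operations, in line with the estimate above.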
Highly unbalanced Feistel networks are susceptible to highly efficient attacks whose success probability vanishes exponentially as the number of rounds increases. Still, for a small, fixed number of rounds, the attacks could be dangerous. Assume a PRP adversary, $A$, against format-preserving Feistel for some index $t$ such that split($t$) outputs $a, b$. Assume without loss of generality that $a \le b$. Let $T \in \mathrm{TwSp}$. Denote $r(t)$ by $r$ and assume $r$ is even (the attack easily extends to the case that $r$ is odd). Then adversary $A^{O(\cdot,\cdot,\cdot)}$ works as described below:
First, analyze Pr[ARand 1]. Adversary A outputs 1 exactly when D−R0+R′0≡D′ (mod b). Thus Pr[ARand 1]=1
Then, analyze $\Pr[A^{\mathrm{Real}} \Rightarrow 1]$. Let $d = r/2$. This is the number of times a value $R_i$ is assigned in $E$ for $i > 0$ and even. Let $Z_1, \ldots, Z_r$ be the outputs of the round function $F$ for rounds 1 to $r$ (respectively) when evaluating $E_K^{T,t}(L_0, R_0)$ in response to $A$'s first query. Similarly, let $Z'_1, \ldots, Z'_r$ be the outputs of the round function $F$ for rounds 1 to $r$ (respectively) when evaluating $E_K^{T,t}(L'_0, R'_0)$ in response to $A$'s second query.
Consider the situation in which $Z_i = Z'_i$ for all $i > 0$ with $i$ even. This occurs with probability at least $a^{-d}$, because the inputs to each of the relevant $d$ round function applications collide with probability $1/a$. In this case it holds with probability one that $D - R_0 \equiv D' - R'_0 \pmod{b}$ since:
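$D - R_0 \equiv \sum_{i \text{ even}} Z_i \pmod{b}$ and $D' - R'_0 \equiv \sum_{i \text{ even}} Z'_i \pmod{b}$.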
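Therefore $\Pr[A^{\mathrm{Real}} \Rightarrow 1] \ge a^{-d}$. Combining this with the upper bound on $\Pr[A^{\mathrm{Rand}} \Rightarrow 1]$ given above yields a lower bound on the advantage of $A$.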
For certain values of a, b, r this is large. Say r=7 and N=2p for some relatively large prime p. Then a=2, b=p and A's advantage is ⅓−1/(2p−2)−½p.
The above attack can be adapted to mount message recovery attacks. The distinguishing attack establishes the relationship $D - R_0 \equiv D' - R'_0 \pmod{b}$ for distinct messages with high probability. If $R_0$ is unknown, it can be recovered when $D$, $D'$, and $R'_0$ are known. This requires a single known plaintext and its associated ciphertext, which will have the desired collisions with the unknown plaintext with probability $a^{-d}$. From the known plaintext-ciphertext pair one can recover the unknown plaintext portion $R_0$. Then $L_0$ can be guessed (with probability of success $1/a$). The attack succeeds in recovering the full plaintext with probability at least $a^{-d-1}$.
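The collision argument and the recovery of the right half can be illustrated with a small simulation. The sketch below rests on several assumptions that are not taken from the text: an alternating unbalanced Feistel in which odd rounds update the left half modulo a and even rounds update the right half modulo b, round functions modeled as independent random tables, D taken to be the right half of the ciphertext, and both messages chosen at random. All names are hypothetical.

```python
import random

def make_round_functions(rng, a, b, r):
    """Random tables: odd rounds map Z_b -> Z_a, even rounds map Z_a -> Z_b."""
    funcs = []
    for i in range(1, r + 1):
        if i % 2:
            funcs.append([rng.randrange(a) for _ in range(b)])
        else:
            funcs.append([rng.randrange(b) for _ in range(a)])
    return funcs

def encipher(funcs, a, b, left, right):
    """Return (L_r, R_r, outputs of the even rounds) for input (L_0, R_0)."""
    even_z = []
    for i, table in enumerate(funcs, start=1):
        if i % 2:
            left = (left + table[right]) % a
        else:
            z = table[left]               # input lies in Z_a, so it collides w.p. 1/a
            even_z.append(z)
            right = (right + z) % b
    return left, right, even_z

def trial(rng, a, b, r):
    funcs = make_round_functions(rng, a, b, r)
    l0, r0 = rng.randrange(a), rng.randrange(b)
    l1, r1 = l0, r0
    while (l1, r1) == (l0, r0):           # second message must be distinct
        l1, r1 = rng.randrange(a), rng.randrange(b)
    _, d0, z0 = encipher(funcs, a, b, l0, r0)
    _, d1, z1 = encipher(funcs, a, b, l1, r1)
    collided = z0 == z1                   # Z_i = Z'_i for every even i
    relation = (d0 - r0) % b == (d1 - r1) % b
    recovered = (d0 - d1 + r1) % b == r0  # guess R_0 from D, D' and R'_0
    return collided, relation, recovered

def run(a=2, b=101, r=4, trials=5000, seed=1):
    rng = random.Random(seed)
    return [trial(rng, a, b, r) for _ in range(trials)]
```

With a = 2 and r = 4 (so d = 2), the fraction of trials in which the relation holds should be at least about a^{-d} = 1/4, and in every such trial the right half R_0 is recovered exactly.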
As used herein, the articles “a” or “an” when referring to an item are not limited to requiring one and only one of the referenced item, and the various embodiments can include additional ones of the referenced items (or an alternative item) unless the context clearly dictates otherwise. As used herein, the terms “module” and “control logic” are used to describe a given unit of functionality that can be performed in accordance with one or more embodiments of the present invention. As used herein, a module or control logic can be implemented utilizing any form of hardware, circuitry, processing systems, software (including firmware), or a combination thereof. In implementation, the various control logic blocks or modules described herein can be implemented as discrete components or the functions and features described can be shared in part or in total among one or more modules and control logic items. Likewise, although a given item may be described as a module, that item may itself contain various modules to perform desired functionality. As would be apparent to one of ordinary skill in the art after reading this description, the various features and functionality described herein may be implemented in any given application and can be implemented in one or more separate or shared modules or logic in various combinations and permutations.
Where features of the invention are implemented in whole or in part using software, in one embodiment, these elements can be implemented using a computing system capable of carrying out the functionality described with respect thereto. One such example computing system is shown in
Referring now to
Computing system 900 can also include a main memory 908, preferably random access memory (RAM) or other dynamic memory, for storing information and instructions to be executed by processor 904. Main memory 908 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 904. Computing system 900 can likewise include a read only memory (“ROM”) or other static storage device coupled to bus 902 for storing static information and instructions for processor 904.
The computing system 900 can also include information storage mechanism 910, which can include, for example, a media drive 912 and a removable storage interface 920. The media drive 912 can include a drive or other mechanism to support fixed or removable storage media, for example, a hard disk drive, a floppy disk drive, a magnetic tape drive, an optical disk drive, a CD or DVD drive (R or RW), or other removable or fixed media drive. Storage media 918 can include, for example, a hard disk, a floppy disk, magnetic tape, optical disk, a CD or DVD, or other fixed or removable medium 914 that is read by and written to by media drive 912. As these examples illustrate, the storage media 914 can include a computer usable storage medium having stored therein particular computer software or data.
In alternative embodiments, information storage mechanism 910 may include other similar instrumentalities for allowing computer programs or other instructions or data to be loaded into computing system 900. Such instrumentalities can include, for example, a removable storage unit 922 and an interface 920. Examples of such can include a program cartridge and cartridge interface, a removable memory (for example, a flash memory or other removable memory module) and memory slot, and other removable storage units 922 and interfaces 920 that allow software and data to be transferred from the removable storage unit 918 to computing system 900.
Computing system 900 can also include a communications interface 924. Communications interface 924 can be used to allow software and data to be transferred between computing system 900 and external devices. Examples of communications interface 924 can include a modem, a network interface (such as an Ethernet or other NIC card), a communications port (such as, for example, a USB port), a PCMCIA slot and card, etc. Software and data transferred via communications interface 924 are in the form of signals, which can be electronic, electromagnetic, optical or other signals capable of being received by communications interface 924. These signals are provided to communications interface 924 via a channel 928. This channel 928 can carry signals and can be implemented using a wireless medium, wire or cable, fiber optics, or other communications medium. Some examples of a channel can include a phone line, a cellular phone link, an RF link, a network interface, a local or wide area network, and other communications channels.
In this document, the terms “computer program medium” and “computer usable medium” are used to generally refer to storage media such as, for example, memory 908, storage device 914, and a hard disk installed in hard disk drive 912. These and other various forms of computer usable media may be involved in carrying one or more sequences of one or more instructions to processor 904 for execution. Such instructions, generally referred to as “computer program code” (which may be grouped in the form of computer programs or other groupings), when executed, enable the computing system 900 to perform features or functions of the present invention as discussed herein.
In an embodiment where the elements are implemented using software, the software may be stored in a computer program medium and loaded into computing system 900 using removable storage drive 914, hard drive 912 or communications interface 924. The computer program logic (in this example, software instructions or computer program code), when executed by the processor 904, causes the processor 904 to perform the functions of the invention as described herein.
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not of limitation. Likewise, the various diagrams may depict an example architectural or other configuration for the invention, which is done to aid in understanding the features and functionality that can be included in the invention. The invention is not restricted to the illustrated example architectures or configurations, but the desired features can be implemented using a variety of alternative architectures and configurations. Indeed, it will be apparent to one of skill in the art how alternative functional, logical or physical partitioning and configurations can be implemented to implement the desired features of the present invention. Also, a multitude of different constituent module names other than those depicted herein can be applied to the various partitions. Additionally, with regard to flow diagrams, operational descriptions and method claims, the order in which the steps are presented herein shall not mandate that various embodiments be implemented to perform the recited functionality in the same order unless the context dictates otherwise.
Although the invention is described above in terms of various exemplary embodiments and implementations, it should be understood that the various features, aspects and functionality described in one or more of the individual embodiments are not limited in their applicability to the particular embodiment with which they are described, but instead can be applied, alone or in some combination, to one or more of the other embodiments of the invention, whether or not such embodiments are described and whether or not such features are presented as being a part of a described embodiment. Thus the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments.
Terms and phrases used in this document, and variations thereof, unless otherwise expressly stated, should be construed as open ended as opposed to limiting. As examples of the foregoing: the term “including” should be read as meaning “including, without limitation” or the like; the term “example” is used to provide exemplary instances of the item in discussion, not an exhaustive or limiting list thereof; and adjectives such as “conventional,” “traditional,” “normal,” “standard,” “known” and terms of similar meaning should not be construed as limiting the item described to a given time period or to an item available as of a given time, but instead should be read to encompass conventional, traditional, normal, or standard technologies that may be available or known now or at any time in the future. Likewise, where this document refers to technologies that would be apparent or known to one of ordinary skill in the art, such technologies encompass those apparent or known to the skilled artisan now or at any time in the future.
The presence of broadening words and phrases such as “one or more,” “at least,” “but not limited to” or other like phrases in some instances shall not be read to mean that the narrower case is intended or required in instances where such broadening phrases may be absent. The use of the terms “module” and “appliance” or the depiction of a box in a diagram does not imply that the components or functionality described or claimed as part of that item are all configured in a common package. Indeed, any or all of the various components of an item, whether control logic or other components, can be combined in a single package or separately maintained and can further be distributed across multiple locations. Likewise, multiple items can be combined into single packages or locations.
Additionally, the various embodiments set forth herein are described in terms of exemplary block diagrams, flow charts and other illustrations. As will become apparent to one of ordinary skill in the art after reading this document, the illustrated embodiments and their various alternatives can be implemented without confinement to the illustrated examples. For example, block diagrams and their accompanying description should not be construed as mandating a particular architecture or configuration.
This application claims priority from U.S. Provisional Patent Application Ser. No. 61/159,333 filed Mar. 11, 2009 and U.S. Provisional Patent Application Ser. No. 61/073,328 filed Jun. 17, 2008.
Number | Date | Country
---|---|---
61159333 | Mar 2009 | US
61073328 | Jun 2008 | US