Portions of the subject matter of this application were invented under a contract with an agency of the United States Government, under NSF contract No. 0098361.
1. Field of the Invention
The present invention relates to systems using biometric signal processing for authentication in connection with a secure communication protocol.
2. Description of the Related Art
In February 2003, a computer hacker breached the security systems of Visa and MasterCard and accessed 5.6 million valid account numbers, which represents approximately 1% of all 574 million valid account numbers in the United States. Though the accounts were not used fraudulently, a burdensome recall and replacement of valid cards throughout many financial institutions was required. On the Internet, a number of black-market sites sell active credit card account numbers and expiration dates for a modest price. In brick-and-mortar credit card scenarios, photograph identification or signatures are inconsistently checked in normal purchases; hence, fraudulent transactions are commonplace. These situations are just a few which expose the current flaw in traditional transaction protocols, which is mainly a flaw in authentication. Identity theft results in losses of well over a billion dollars a year for credit card issuers, and is even more widespread since the advent of e-commerce on the Internet. The primary reason for the continued success of identity theft is the lack of the ability to prove that an account is used by the genuine, authorized, consumer.
The present invention solves these and other problems by providing a secure embedded system that uses cryptographic and biometric signal processing to provide identity authentication. In one embodiment, the secure embedded system is configured as a wireless pay-point device, called a thumbpod, for brick-and-mortar and/or e-commerce applications. In one embodiment, the thumbpod localizes a sensitive biometric template and does not require transmission of biometric data for authentication. In one embodiment, a key-generation function uses a dynamic key generator and static biometric components. An embedded system design methodology known as hardware/software acceleration transparency is provided to improve performance of the thumbpod. In one embodiment, acceleration transparency is provided in a systematic method to accelerate Java functions in both software and hardware of, for example, an encryption function.
In one embodiment, the thumbpod is designed as a secure embedded device that provides a protocol for wireless pay-point transactions in a secure manner. The protocol uses secure cryptographic primitives as well as biometric authentication techniques. The security protocol used in the thumbpod is based on a protocol that uses the thumbpod as an interface between an authentication server and a user.
In one embodiment, the thumbpod includes a microcontroller, a fingerprint image sensor, signal processing hardware acceleration, cryptographic hardware acceleration, and a memory module enclosed within a form factor similar to an automobile keychain transmitter. The thumbpod provides flexible communication via ports, such as, for example, a port for wireless communication and/or a wired port for fast wire-line communication. The wireless port can be, for example, an infrared port, a radio-frequency port, an inductive coupling port, a capacitive coupling port, a Bluetooth port, a wireless Ethernet port, etc. The wired port can be, for example, a USB port, a firewire port, a serial port, an Ethernet port, etc. The thumbpod can be used for a wide variety of authentication-related transactions, such as, for example, wireless credit card payments, keychain flash memory replacement, universal key functionality (house, car, office), storage of sensitive medical data, IR secure printing, etc.
In one embodiment, a security protocol binds the user to the device through biometrics, combines biometrics and traditional security protocols, protects biometric data by keeping at least a portion of the biometric data in a protected form that does not leave the device, and provides that biometric calculations are provided on the device. In one embodiment, biometric algorithms are provided to fit a relatively constrained environment of embedded devices. In one embodiment, algorithms are provided in fixed point arithmetic. In one embodiment, memory storage optimization and hardware acceleration are provided by converting a least a portion of one or more software algorithms into hardware.
The various features of the present are described with reference to the following figures.
FIGS. 29(a) and (b) shows one embodiment of a software acceleration architecture.
As security is only as strong as the weakest link, a breech in any of the abstraction layers 101-105 can compromise the entire security model. Hence design of the secure embedded system is based on a top-down design flow and security scrutiny at each abstraction level.
The communication port 204 can include a wireless port and/or a wired port to provide flexible communication. In one embodiment, the port 204 includes a wireless port, such as, for example, an infrared port, a radio-frequency port, an inductive coupling port, a capacitive coupling port, a Bluetooth port, a wireless Ethernet port, etc. In one embodiment, the port 204 includes a wired port, such as, for example, a USB port, a firewire port, a serial port, an Ethernet port, a PCMCIA port, a flash memory port, etc.
The thumbpod 200 is configured to be used in connection with a security protocol (as described in connection with
The thumbpod 200 uses biometrics to bind a user to an identification code, such as, for example, an account number, an access code, a password, an the like (hereinafter referred to generically as an account number). At each transaction, the user's biometric data (e.g., fingerprint) is used to digitally sign a transaction as proof of identification. This fingerprint is digitally verified by an authentication server. The protocol used by the thumbpod and the authentication server ensure that sensitive biometric data is not transmitted freely, particularly across wireless or other insecure channels. The protocol described below provides an authentication scheme in which no actual biometric data is transmitted and no biometric data is stored at the server. Rather, biometric information is captured in the thumbpod 200 and used to generate a key K (which is stored at the authentication server) for symmetric-key encryption. This key is used to encrypt challenge and response functions, based on a random number, which are in turn transmitted across the wireless channel.
The protocol 301 is an example of a complex application in which thumbpod 200 uses both cryptographic and signal processing functionality. There are various other protocols for other applications for the thumbpod 200 that share one or both of the common denominators of cryptography and biometric signal processing. Other applications include encryption/decryption and/or verification for audio and video systems.
In the protocol 301, the thumbpod 200 sends an account number to the transaction register 401. The transaction register 401 then sends the account number and data regarding the transaction (e.g., a transaction dollar amount), to the server 310. The transaction register 401 and the server 310 provide mutual authentication through standard protocols, such as, for example, the SET protocol. The server 310 uses the account number to look up the identity of the thumbpod 200 and to obtain a secret key known to the thumbpod 200. The server 310 generates a first authentication vector and encrypts the first authentication vector using the secret key. The encrypted first authentication vector is then sent to the transaction register 401. The transaction register forwards the first authentication vector to the thumbpod 200. The thumbpod 200 decrypts the first authentication vector and verifies the identity of the authentication server 310. The thumbpod also authenticates the user and generates a second authentication vector. The second authentication vector is encrypted using the secret key. The thumbpod 200 returns the authentication vector to the transaction register 401, which forwards the second authentication vector to the authentication server 310. The authentication server 310 decrypts the second authentication vector and verifies the identity of the thumbpod 200. Once the identity of the thumbpod has been verified, the authentication server 310 sends a “transaction complete” message to the transaction register 401. The transaction forwards the transaction complete message to the thumbpod 200, which then increments a transaction counter. In one embodiment, streaming encryption is provided between the thumbpod 200, the transaction register 401, and/or the server 310.
In order to make a transaction at the transaction register 401, the user 303 uses the thumbpod wireless port 204 to initiate communication with the register. Challenge and response functions are negotiated between the user 303 and the server 310, routed through the merchant's register 410 (which cannot interpret the data because it does not posses secret keys known to the thumbpod 200 and the server 310). In the course of the authentication protocol, the user 303 places his/her finger on the fingerprint sensor 202 to provide identity verification. This information is processed within the thumbpod 200 and, if a match is made, cryptographic hash functions and keys are generated using encryption algorithms and the protocol continues to its completion.
In the protocol 301, three items are used for valid authentication transactions: 1) the account number stored in the thumbpod 200, 2) the thumbpod 200 itself (which generates the secret key K), and 3) the correct biometric component (e.g., a finger, a retina, etc.) for live-scan sensing by the sensor 202. These three elements provide a strong tie between the user 310 and the thumbpod 200 account number. In an e-commerce situation, merely having a stolen account number (and expiration date) would be insufficient to make a transaction. Likewise, an account number and a stolen thumbpod 200 are also insufficient. All three components are required to make a valid transaction.
In the protocol 301 a threefold-authentication takes place: 1) the server 310 authenticates the thumbpod 200, 2) the thumbpod 200 authenticates the server 310 (and transaction register 410), and 3) the thumbpod 200 authenticates the user 310. Unlike traditional schemes, the user 303 authenticates the server and the transaction register 401, providing protection against fraudulent or malicious merchants. Hence, the protocol retains the advantages of the current credit card-type protocols, while supplementing the protocols with stronger security, transaction device-to-user binding, and authentication directionality. Other advantages of the protocol 301 include fraud detection as well as authentication at each transaction.
As shown in
The server 310 begins its side of the authentication process by loading the user's secret key K, which is shared only between the server 310 and the thumbpod 200. In one embodiment, the secret key is at least 128 bits. The server 310 also loads a user's counter value SQNAS and an institution authentication parameter AMF. In one embodiment, the counter value SQNAS is at least 48 bits, and the institution authentication parameter AMF is at least a 16 bits. The counter value SQNAS is stored both on the server and on the thumbpod 200 and is used to prevent replay attacks. The server 310 loads and encrypts an operator code OP, producing OPC (which can be optionally pre-stored). In one embodiment, the operator code OP is at least 128 bits. Finally, the server 310 generates a random value RAND and uses K with Rijndael primitives to generate a set of authentication parameters for the specific transaction. In one embodiment, RAND is at least 128 bits. The authentication parameters include:
After the above authentication parameters are generated, the server 310 transmits a subset of the authentication parameters—the authentication vector—to the transaction register, which forwards the vector to the Thumbpod 200. The authentication vector includes:
J As in 3GPP authentication, the authentication between the thumbpod 200 and the server 310 is a mutual authentication based on the shared secret key K. The random session value RAND is coupled with K to provide the two primary challenge/response vectors: MACAS and RESTP. The MACAS vector proves the identity of the server 310 to the thumbpod 200. Only the server 310 with the precise value of K (and the current random session value RAND) will be able to produce the proper MACAS. When the thumbpod 200 verifies this value by comparison with its generated expected value of XMACTP (based on K and RAND) it determines whether the proper key K was used, and hence whether the server 310 is genuine. The same argument holds for the RESTP vector, which is used to verify the identity of the thumbpod 200 to the server by comparison with XRESAS. When both challenge/response values are verified, then mutual authentication is assured. The random number RAND and the sequence number SQNTP/AS are used to prevent replay attacks on previously-used authentication vectors obtained through eavesdropping on the channel. Since the sequence number follows a deterministic pattern (bit increment at each transaction), it is masked by a one-use anonymity key AK as it is transmitted over the channel to prevent smart replay attacks.
At this point, the protocol 301 enters into a biometric authentication portion which differs from 3GPP or other wireless authentication protocols. The thumbpod 200 stores the authentication vector and begins biometric authentication by requesting that the user 303 to provide biometric data (e.g., place his/her finger on the fingerprint sensor 202). The Thumbpod 200 performs imaging, feature extraction, matching, and decision. During imaging, the thumbpod 200 images fingerprint to produce a bitmap of raw data. In one embodiment, the bitmap is at least 128×128 8-bit grayscale. During feature extraction, the thumbpod 200 processes the raw data, enhances the image, and extracts the minutiae types (ridges, bifurcations) and locations of the candidate fingerprint. During the matching process, the thumbpod 200 loads a stored fingerprint template and performs a matching function to produce a match score. During the decision process, the thumbpod 200, using the match score, decides if the candidate fingerprint is a match to the template.
If the algorithm detects an incorrect match, an error vector is transmitted to the server 310 and the protocol 301 is terminated. If the algorithm detects a match, the authentication protocol 301 continues. Using Rijndael in CBC-MAC mode, the shared secret key K is created by hashing the fingerprint template using a pre-stored 128-b key generator value KG according to K=HASHKG (template). (Alternatively, the value of K can also be pre-stored in the embedded device.)
In one embodiment, after loading the received values of RAND and AMF, the thumbpod 200 loads OP and uses the secret key K and Rijndael primitives to generate:
CK (optional): a cipher key to allow for streaming encryption after authentication is performed. In one embodiment CK is at least 128 bits.
IK (optional): an integrity key to allow for optional data integrity and origin authentication of streaming encryption data. In one embodiment IK is at least 128 bits.
If the message authentication code generated by the server (MACAS) is equal to the message authentication code generated by the thumbpod 200 (XMACTP), then the identity of the server 310 is authenticated. If they are unequal, then the user immediately recognizes that either the transaction register or the server is fraudulent.
If the counters of the server 310 and the thumbpod 200 are synchronized, then the process continues. If the counters are not synchronized but the MAC test passes, the system enters into re-synchronization mode to restore synchronization.
If the two authentication tests are passed, the thumbpod 200 sends a response vector RESTP to the transaction register 401, which forwards this vector to the authentication server 310.
To complete the protocol, the authentication server 310 verifies that XRESAS=RESTP. If the values are not equal, it immediately indicates a fraudulent user or fraudulent thumbpod 200, allowing the server 310 to act accordingly. If these values are equal, then the identities of the thumbpod 200 and the user 310 are verified by the server 310. Hence, the protocol 301 provides mutual authentication between the server and the thumbpod 200. After verification of the user's identity, the server 310 increments its local counter variable SQNAS and sends a transaction-complete vector to the transaction register 401. The register 401 then completes the transaction by printing a receipt and forwarding the transaction complete vector to the thumbpod 200. The thumbpod 200 increments its local counter variable SQNTP to conclude the authentication protocol 301.
Four functions require a relatively large amount of computation in the thumbpod 200: 1) authentication vector generation, 2) feature extraction, 3) template matching, and 4) the key generation hash function.
The protocol 301 and the thumbpod 200 can use any robust encryption method. In one embodiment, the cryptographic engine used in the thumbpod 200 is the Rijndael algorithm (e.g., using a 128-b key and 128-b data), otherwise known as the Advanced Encryption Standard (AES). In one embodiment, Rijndael was chosen for security considerations and the absence of any known vulnerabilities to attack. The Rijndael kernel is used in three configurations: ECB, CBC-MAC, and OFB/Counter for optional streaming encryption applications.
In one embodiment, the generation of authentication vectors in the server 310 is shown in
A closer examination of the functions f1-f5 is provided in
The protocol 301 is resistant or immune to the following cryptographic attacks: false register or false authentication server attack, stolen account number authentication attack, stolen account number synchronization attack, multiple synchronization attempts attack, stolen thumbpod attack, timeout attack, and incorrect data format transmission attack.
One aspect of the protocol 301 is the key generation function, which traverses security issues found in prior art biometric systems. A deficiency with biometrics in general is the issue of true identity theft: once a biometric identity (fingerprint, iris scan, etc.) is stolen, it is forever compromised, as a person possesses only a finite number of biometric templates. Although the thumbpod 200 can be housed in a tamper-proof casing, in one embodiment, the biometric template in the thumbpod is stored in a matter that prevents biometric data from being extracted from a stolen thumbpod 200.
In one embodiment, the thumbpod 301 uses a key generation concept whose security relies on both a static component (e.g., fingerprint template) and a dynamic component (a key generator variable), K=HASHKG(template). The shared secret key K is obtained by using a KG as the key for the Rijndael CBC-MAC engine, which operates on the user fingerprint template (5,120 bytes). This is similar, at least in principle, to a split-key security system, where two users possess separate, different keys and both keys are necessary to activate the device in question. Prior art biometric authentication systems merely require a template match in order to allow access, and a stolen template gives a criminal full access to the user's identity.
If the thumbpod 200 is lost or stolen, for precautionary measures, the user 303 would notify his/her financial institutions to request a new KG. After obtaining a new thumbpod 200 and enrolling a new template, a new secret key K would be generated, rendering the old key useless. Hence, in the case that a criminal obtains the user's fingerprint template from a thumbpod 200, the system is not entirely compromised due to the split-key key generation function. Another security benefit of the split-key generation model is that the server 310 never receives a copy of the user's template; it only stores the current secret key K. Due to the one-way property of hash functions, a stolen secret key K would not allow a criminal to re-generate the user's fingerprint template, even with knowledge of the key generator KG. This localization of sensitive data, rather than a widespread distribution of biometric data to each financial institution, allows for both psychological as well as cryptographic security.
Since the thumbpod 200 performs biometric identification, relatively computation-intensive biometric signal processing is typically required for both the feature extraction and matching algorithms. Designing for secure embedded systems results in partitioning which is based not only on communication-computation tradeoffs, but also partitioning which is based on security considerations. For example, though transmitting plaintext raw fingerprint data over the wireless channel would perhaps save energy in the thumbpod 200, it is insecure in that a passive attacker could listen on the channel and steal the fingerprint data. The following section describes the security-based partitioning of the biometric functions used for the protocol 301.
For purposes of explanation, and not by way of limitation, the thumbpod 200 is described in terms of six subsystems: 1) Data collection subsystem, 2) Signal processing subsystem, 3) Matching subsystem, 4) Storage subsystem, 5) Decision subsystem, and 6) Communication subsystem.
In one embodiment, the data collection subsystem includes the sensor 202. In one embodiment, the sensor 202 includes an Authentec AF-2 CMOS imaging sensor. An alternative placement of the sensor is within the merchant's transaction register 401. However, studies have shown the relative ease in which a fingerprint can be stolen from a traditional CMOS sensor. Hence, placing a sensor on the transaction register 401 presents a security risk in that a fingerprint can be easily stolen by a malicious merchant or another consumer. As for the resolution of the CMOS sensor, it is chosen based on consideration of security strength and system cost. In some embodiments, the size of thumbpod 200 limits computational power and energy consumption, thus the collected data from CMOS sensor is sized to be precise enough to obtain a reasonable matching result but small enough to meet a system requirement in such an embedded system.
The raw data collected by the sensor 202 is processed to extract biometric features for identification. In a fingerprint verification system, the features to be extracted are the minutiae type (ridge or bifurcation) and the location of the minutiae via a process is known as feature extraction or minutiae detection. In one embodiment, the thumbpod 200 uses the standard floating-point C NIST detection algorithm.
In one embodiment, the thumbpod 200 uses a fixed-point variation of the well-known standard floating-point NIST detection algorithm. There are several steps in the minutiae extraction algorithm, many of which require significant signal processing. The first step is to generate image quality maps, which include the detection of fingerprint ridge directions, image refinement, and detection of low contrast areas, which are assigned lower quality factors. A binarization of the image is generated, and the detection algorithm scans this binary image of the fingerprint to identify localized pixel patterns that indicate the ending (ridge) or splitting of a ridge (bifurcation). In one embodiment, a fixed-point refinement and table lookup of mathematical functions are used to reduce the computational and energy burdens.
The matching subsystem includes a set of algorithms used to match a pre-stored fingerprint template (or multiple fingerprint templates) with a candidate fingerprint obtained from the sensor. After extracting the minutiae of the fingerprint, two steps are used to estimate the similarity of the input minutiae set and the template minutiae set. The first step is to discover the correspondence of these two minutiae sets. For each minutia, the distance and relative direction to its neighborhood is taken as its local structure. Since this local structure is rotation and translation invariant, it is used to choose the corresponding pair in the input and template minutiae sets. The second step is to align the other minutiae by converting them to a polar coordinate system based on the corresponding pair, then computing how similar the overall minutiae distributions are in the input pattern and template pattern. The total similarity is represented by matching score. For security reasons, the matching algorithm is embedded within the thumbpod 200. Thus, sensitive minutiae data is not required to be transmitted over the channel.
The storage of the fingerprint template is also partitioned onto the thumbpod 200. The template is stored on-device in order to localize the most sensitive information in the entire system—the user's fingerprint information. If the template is distributed to various financial institutions, a breech in only one system would cause a loss of the user's template data. The aforementioned split-key generation function, coupled with the template storage on the thumbpod 200, is used to address this security issue.
The decision subsystem receives the results of the matching algorithm and makes a decision based on a pre-defined correlation score
Since the biometric subsystems are embedded within the thumbpod 200 device, it allows for the communication subsystem to transmit data across an insecure wireless channel. The only unencrypted sensitive data sent over the channel is the initial account information required to begin the authentication protocol. All other transmitted information is either encrypted or irreversible (one-way hash values used for authentication verification).
The aggregate result of this system partitioning allows for two unique system characteristics. First, the protocol describes a biometric authentication system in which no biometric information is transmitted across any medium, wireless or wired. Second, as previously mentioned, the biometric data is stored only in the thumbpod 200 and not in any financial institution server. The localization of sensitive data minimizes the cost of breeches in the entire security context.
Fingerprint Identification
In one embodiment, the algorithm used to extract minutiae of the fingerprint image is originated from NIST Fingerprint Image Software.
The minutiae detection process is based on finding a directional ridge flow map. To get this map, the fingerprint image (e.g., 256×256 pixels) is first divided into a grid of blocks (e.g., 8×8 pixels). For each block, there is a surrounding window (e.g., 24×24 pixels) centered by this block. For each block, the surrounding window is rotated incrementally and a DFT analysis is conducted at each orientation. In one embodiment, the number of orientation is set to 16, creating an increment in angle of 180°/16, i.e. 11.25°. Within an orientation, the pixels along each rotated row of the window are summed together, forming a vector of 24 pixel row sums. The 16 orientations produce 16 vectors of row sums, as shown in
The resonance coefficients produced by convolving each of the 16 row sum vectors with the 4 different discrete waveforms are stored and then analyzed. The dominant ridge flow direction for the block is determined by the orientation with the maximum waveform resonance calculated from Equation (1):
Each pixel is assigned a binary value based on the ridge flow direction associated with the block to which the pixel belongs. Following the binarization 1005, the detection block 1006 scans the binary image of a fingerprint, identifying localized pixel patterns that indicate the ending or bifurcation of a ridge.
After two minutiae sets (e.g., an input fingerprint image and a template fingerprint image, respectively) are extracted, the matching algorithm can be described.
The first step 1401 in the algorithm 1400 is to find out the correspondence of these two minutiae sets. Each minutia, N, can be described by a feature vector: N=(x,y,φ,i), where (x,y) is its coordinate, φ is the local ridge direction and t is the minutia type (ridge ending or bifurcation). However, x,y and φ cannot be directly used for matching because they are dependent on the rotation and translation of the fingerprint. To solve this problem, it is useful to construct a rotation and translation invariant feature:
M=(d1,d2,θ1,θ2,φ1,φ2,n1,n2,t,t1,t2) (2)
where p and q are the numbers of minutiae in the input fingerprint and the template fingerprint, respectively. |Ml(i)−MR(j)|W is the weighted difference of Ml(i) and MT(j). A(W) is the threshold which is related to the weight vector W. Set W=(1,1,8,8,8,8,3,3,⅓,⅓,⅓) and A(W)=55. By searching of sl(i,j), one pair (b1,b2) can be obtained so that
The next step 1402 is to align the other minutiae by converting them to a polar coordinate system based on the corresponding pair (b1,b2). For minutia N, the new polar coordinate is Mp=(r,θ,φ), where
The function diff( ) is the difference between two angles. Based on the aligned minutiae sets, we can compute the matching level of each minutia in the input fingerprint and each one in the template fingerprint:
In Equation 5, diff_total=|Mlp(i)−MTp(j)|W
To avoid one minutia being used more than once for matching, ml(i,j) is set to “0” if there is any k that make ml(i,k)>ml(i,j) or ml(k,j)>ml(i,j). Afterwards, the final matching score can be calculated by:
The algorithm 1400, provides fingerprint verification on thumbpod 200. In one embodiment, the sensor 202 used for fingerprint scanning has relatively small area (13×13 mm2), so the performance is relatively strongly dependent on which part of the finger is captured by sensor. In one embodiment, the thumbpod 200 uses a two-template system to deal with the small sensor area. The fingerprint image sets (templates) used by the thumbpod 200 include 10 fingerprints per finger from 10 different fingers for a total of 100 fingerprint image templates. Each fingerprint is compared with every fingerprint template in pairs, and the two match scores from each pair are ported into a decide engine in order to get the final matching result. A total of 7,200 decisions involved for the matched case and a total of 81,000 decisions are involved for the mismatched case. The size of captured image is 256×256 pixels. In one embodiment, the thumbpod 200 provides a 0.5% FRR (False Rejected Rate) and a 0.01% FAR (False Accepted Rate).
Implementing the fingerprints minutiae detection and matching on an embedded platform such as the thumbpod 200 involves performance, speed, and low power tradeoffs, since the whole process needs to be finished in a relatively short time and the battery lifetime in such devices is limited.
Software optimization aims to reducing the cycle number of processors as well as the power consumption. To get better performance, the first step is to find out the hot-points of the system.
When considering the pattern of a fingerprint, the neighboring blocks tend to have a similar direction. In the example fingerprint map shown in
The first direction data, upper left in
The execution speed as well as the matching error rate is measured when changing ETH from 10M to 35M. The results are shown in
The software optimization reduces the number of DFT and results in significant speedup of the minutiae detection. However, there are still more than 7,000 times of DFT calculations for 256×256 pixels image, even if setting ETH=27M. Therefore, DFT hardware acceleration is useful in addition to the software optimization (
The final specification of the accelerator is decided to deal with only Multiply/Accumulate (MAC) computations for sine and cosine part separately. In the Multiply operation, Canonic Signed Digit (CSD) is used for saving hardware resources. The energy calculation part is not included because it needs square operation of 16 bits data, which requires a general multiplier.
As a result, the execution time of the minutia detection is reduced to about 4 sec and 3 sec for ETH is 27M and 10M, respectively as shown in
Thus in one embodiment of the thumbpod 200, a modified algorithm is implemented. In the modified algorithm, before calculating the real local feature vector, one additional module called “Pre-Checking” is added. For each pair of minutiae, the weighted difference |Ml(i)−MT(j)|W is calculated. In the Pre-Checking module, define W=Wd=(1,1,0,0,0,0,0,0,0,0,0), which means only the distance information is needed in this procedure. If the weighted distance |Ml(i)−MT(j)|W
The computation time after adding the Pre-Checking module and the result degrade depends on the value of the threshold MTH. The relationship between MTH and the performance is shown in
During the regular process of setting flags to the possible multiple-used matching level ml(i, j), one loop with a size of p×q×(p+q) is used, where q and p is the number of minutiae in the input and template fingerprints, respectively. For a sample case, where p=37 and p=39, the instruction cycle number to finish this process is 1.4M (million), which is 38.9% of the entire matching process. The value ml(i,j) is calculated from the local feature difference of the ith minutia in the input fingerprint and the jth minutia in the template fingerprint. For most of the pairs (i,j), the local feature vector is so different that ml(i,j) is 0, which means that it contributes nothing to the overall matching score. Based on this characteristic, the process of marking possible multiple-used ml(i,j) can be optimized. Whenever the ml(i, j) is “0”, all the remaining comparison steps can be skipped and the process can advance straight to the next pair. After the above optimizations, the total cycle number is 1.34M. Hence the execution time is reduced to 26.80 ms, as shown in
Thus, by implementing optimized minutiae detection and matching algorithms, as well as DFT hardware accelerator, execution time for the minutiae detection and matching process can be substantially reduced
Hardware/Software Acceleration Transparency
However, though advantages exist in these domains, Java is slower than its counterpart in C, and much slower than its counterpart in pure hardware. An example of Java's performance drawback can be seen in Table 1, where the 128 bit input, 128 bit key Rijndael function in Electronic Code Book (ECB) is performed. The Java (KVM) and C figures are on a 1 mW/1 MHz Sparc processor. This configuration is used to emulate an embedded environment. The ASIC figures are based on an ASIC configured to implement the algorithm. As can be seen in the table, a hardware solution is five orders of magnitude superior in both performance and energy consumption (as measured in Gb/s per Watt). For streaming encryption applications described in the previous section, pure embedded software solutions are inadequate. Hardware acceleration is used.
In order to incorporate software and hardware acceleration and simultaneously allow for incremental refinement in the design flow process, it is advantageous to use a technique called hardware software acceleration transparency. Hardware/software acceleration transparency is described below in further detail and involves three related items: 1) incremental acceleration, 2) Java function emulation, and 3) interface transparency.
The first principle of acceleration transparency is incremental refinement acceleration. In the example shown in
Hardware/software acceleration transparency also includes Java function emulation, a term used to describe the interface relationship between the Java application and the accelerated function. For example, a Java application wishes to access a Rijndael function via a function call rijndael( ). From the above discussion, the Java application has one of three alternatives to obtain the implementation: 1) a Java function, 2) a C function, or 3) hardware acceleration.
Hardware/software acceleration transparency means that, to the Java application, each of these alternatives is accessed with the same Java function signature. In the pure Java case, this is already apparent: A Java Rijndael function is accessed by the Java application with a simple function call rijndael( ). For C acceleration, interfaces are constructed such that the Java application can access the C Rijndael function with the same function call rijndael( ). For hardware acceleration, HW/SW interfaces to the crypto-processor are designed such that Rijndael functionality is again accessed by the same function call rijndael( ). In this way, from the Java application vantage point, each of these alternatives “looks” exactly the same. To the application, each of the three alternatives takes in the same input, produces the same output, and is accessed by the same Java function and hence functionally is the same, as seen in
Part of the previously mentioned Java function emulation is the concept of interface transparency. This is also illustrated in FIGS. 26A-F. Interface transparency means that to the Java application, all the interfaces in between it and the acceleration implementation are transparent. In other words, the Java application can directly “see” the acceleration implementation (which looks to it like a Java function) regardless of the number of interfaces. Interface transparency essentially raises co-processor control a number of abstraction layers directly to the Java application level.
The use of hardware/software acceleration transparency, allows the designer to build interfaces incrementally. Instead of tearing down the previous interface and starting from scratch at each abstraction level, the next interface incrementally refines the previously constructed interface. Thus, the interface design flow is smooth and continuous. Acceleration transparency allows for system performance modeling at each abstraction level. As each accelerated function is placed into the overall system, the hybrid system can be re-benchmarked and the performance gains ascertained. As the system progresses from software to hardware, the original Java application needs only minor modification. Using acceleration transparency implies that each of the acceleration modules “looks” like the initial Java function in the original application; hence, the original Java application can remain the same (or relatively unchanged) from the beginning functional simulation to the final HW/SW system implementation. Once the interface hierarchy is constructed, a new acceleration module can be appended to the system through the pre-designed interfaces. A system can thus be reconfigured in a systematic way.
The following example shows HW/SW acceleration transparency and gives performance measurements for interface overhead. The simulation environment used for the example includes a cycle-true LEON-Sparc simulator. C code is compiled with the GNU C compiler gcc V3.2 with full optimization (−O2). Java byte-code is interpreted on the KVM embedded virtual machine from the Java2 Micro Edition. Thus, cycle counts for Java are cycles of the target LEON-Sparc which runs KVM that in turn runs the Java program.
The example begins with the aforementioned interface specification of the Rijndael in Java and C. A 128-bit key and 128-bit data block are used in the example.
The interfaces are as follows:
Java: int[ ] rijndael(int[ ] key, int [ ]din)
C: void rijndael(int din[4], int key[4], int dout[4])
A pure Java implementation for Rijndael on top of KVM takes 301,034 cycles, as shown in
A first refinement to the pure Java model is to substitute the pure Java implementation with a native implementation in C. A native method in Java is shown in
The rijndael( ) function of
The next step is to substitute the C implementation with a native hardware implementation of the Rijndael algorithm. A hardware coprocessor is used that completes a 128-bit encryption in 11 clock cycles. This hardware processor is interfaced to the co-processor interface of the Sparc, and programmed as shown in
While the performance gain of moving from Java to native implementation is substantial, it is not completely overhead-free. This overhead is primarily caused by moving data across the hierarchy levels in the model. This overhead can be reduced by treating data-flow and control-flow separately. In any case, the incremental refinement of the model is a major advantage from the design-flow point-of-view.
The design of the thumbpod 200 uses a number of abstraction levels, with each abstraction level based on design decisions and interface construction. The smooth transition from one model to another allows for successive refinement of the system.
The LEON/Sparc core provides two interfaces: a high-speed AMBA bus interface (AHB) and a co-processor interface (CPI). Each interface has specific advantages toward domain-specific co-processors. The CPI offers an instruction- and register-set that is visible from within the Sparc instruction set, and allows a close integration of a domain-specific processor and the Sparc. The AMBA bus uses mapping of a co-processor through the abstraction of a memory interface. The CPI provides two 64-bit data ports and a 10-bit opcode port.
The high speed AMBA bus contains a memory interface and a bridge to the peripheral bus interface (APB). The memory interface includes an interface to a 32 MByte DDR RAM memory. The AMBA peripheral bus (APB) contains the fingerprint processor and two UART blocks. One connection is used to attach a fingerprint sensor 202, while the second one is used to connect an application server. This server is used to download and debug applications, as well as to experiment with the security protocol.
Although the preceding description contains much specificity, this should not be construed as limiting the scope of the invention, but as merely providing illustrations of embodiments thereof. Many other variations are possible within the scope of the present invention. For example, one of ordinary skill in the art will recognize that the thumbpod can be implemented using a variety of virtual machine and/or operating environments, such as, for example, Windows CE, TinyOS, PALM OS, Linux. etc. Although JAVA is described as being used in one or more embodiments, other languages can be used as well, such as, for example, high-level languages, low-level languages, C/C++, lisp, assembly language, etc. Thus, the scope of the invention is limited only by the claims.
The present application claims priority benefit if U.S. Provisional Application No. 60/475,242, filed Jun. 2, 2003, titled “SYSTEM FOR BIOMETRIC SIGNAL PROCESSING WITH HARDWARE AND SOFTWARE ACCELERATION,” the entire contents of which is hereby incorporated by reference.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US04/17545 | 6/2/2004 | WO | 8/7/2006 |
Number | Date | Country | |
---|---|---|---|
60475242 | Jun 2003 | US |