The present invention relates, generally, to digital security devices, and, more particularly, to countermeasures protecting digital security devices against side-channel, fault injection, and timing attacks.
Electronic communication and commerce can be powerful yet dangerous tools. With the widespread availability of network technology, such as the Internet, there is an ever-increasing use of online tools for communication and commerce. Every year more users find it easier or quicker to conduct important transactions, whether in the form of correspondence or commerce, using computers and computer networks.
Furthermore, digital technology is playing an ever-increasing role in identity management, e.g., digital identity cards and passports.
Such digital security technologies may involve any of a large variety of software and hardware techniques, for example, cryptography, anti-virus software, and biometrics. These digital security technologies may be deployed on many types of digital security devices, e.g., smart cards, USB tokens with embedded smart cards, subscriber identity modules (SIM) embedded or installed in mobile devices, Internet of Things (IoT) devices. Indeed, any computerized device entrusted to protect secrets, e.g., private communications, cryptography keys, account numbers, health information, may be viewed as a digital security device.
However, there is always the risk that the security of operations performed by digital security devices is compromised through interception by third parties who do not have the right to partake in the transactions. When malicious third parties obtain access to otherwise private transactions and data, there is risk of economic loss, privacy loss, and even loss of physical safety.
Consider, as an example, cryptography, which is one mechanism employed to avoid intrusion into the privacy of electronic transactions and data. Traditionally, both sender and recipient of a cryptographic message were considered secure. Cryptography's primary use was to transmit an encoded message from the sender to the recipient without fear that an intermediary would be able to decode the message. If an attacker has no access to the sender's or recipient's cryptographic devices, the attacker is limited to using the encoded message itself or possibly an encoded message and a corresponding plaintext message, to discern the cryptographic key used to encode or decode the message. However, if the attacker has access to the cryptographic device, the picture changes dramatically as the attacker can in that case also analyze artifacts, so called side-channel data, such as power consumption, to deduce data manipulated by the cryptographic device.
One mechanism of ensuring that a private key is indeed kept private is to store the private key and any related key material on a secure portable device, e.g., a smart card or a mobile device. A smart card is a small tamper resistant computer often in the form of a credit card sized and shaped package. Smart cards may be used to store cryptographic keys and cryptography engines for performing encryption, decryption, and digital signatures.
In one example, a user may receive an encrypted message and use his or her smart card to decrypt the message by first authenticating to the smart card and then passing the message to the smart card for decryption. If authentication is successful, the smart card may use a cryptographic key stored on the card, and a corresponding cryptography engine, to decrypt the message and provide the decrypted message to the user. Similarly, if a user wishes to cryptographically sign a message, the user may pass the message to the user's smart card, which uses a cryptographic key of the user to digitally sign the message and to provide the signature back to the user or to a third-party recipient.
While cryptography mechanisms are extremely difficult, if not impossible, to break from an algorithmic perspective, the implementation of cryptography mechanisms on the electronics of digital security devices may render them much less secure than an algorithmic analysis would suggest. If an attacker has access to the smart card, the attacker may make repeated observations of, for example, power consumption or electromagnetic emission, during the execution of the cryptographic algorithms and use such ancillary information in attempts to discern the secrets stored on the smart card, specifically secret cryptographic keys stored on the smart card. One such attack is the so-called side-channel attack.
Side-channel attacks make use of the program timing, power consumption and/or the electronic emanation of a device that performs a cryptographic computation. The behavior of the device (timing, power consumption and electronic emanation) varies and depends directly on the program and on the data manipulated in the cryptographic algorithm. An attacker could take advantage of these variations to infer sensitive data leading to the recovery of a private key.
In parallel to the development of side-channel analysis attacks, techniques have been developed to protect against attempts to recover keys, or other sensitive information, from side-channel leakages. These techniques, known as countermeasures, include attempts to hide the operations of the cryptography device from any side-channel data leakage, for example, by masking the data while being manipulated by cryptographic algorithms, by introducing dummy instructions, altering order of instructions, or manipulating the system clock to introduce jitters in any collected side-channel data.
There are several different types of side-channel attacks and, conversely, several different types of countermeasures. With respect to side-channel attacks based on electrical activity of a device, Mark Randolph and William Diehl, Power Side-Channel Attack Analysis: A Review of 20 Years of Study for the Layman, Cryptography 2020, 4, 15; doi:10.3390/cryptography4020015 (incorporated herein by reference) provides a survey of many of the techniques employed as well as discussion of countermeasures.
One form of attack is based on analysis of execution paths through a security algorithm, e.g., a cryptography algorithm. In essence, an attacker monitors, for example, power consumption or execution time through one path with respect to power consumption or execution time of another path of a sensitive routine to determine which path is being executed in response to a particular input data.
Another form of attack, fault-injection attack, seeks to manipulate a device through its behavior when certain faults are induced, for example, by introducing out-of-range supply voltage or clock manipulation. Fault injection may, for example, be used to cause skipping of instructions, incorrect data or instruction fetch, or failure to correctly write data to memory. Success of fault-injection attacks depends on accurate timing of an attack; for example, branch and compare instructions are likely targets to cause execution of a particular branch.
One countermeasure against side-channel and fault-injection attacks is to desynchronize the execution of the respective paths through the routine such that any given pass through a portion of code may not result in the same power consumption or timing signature. Such desynchronization may be performed by introducing dummy routines that intentionally slow down a portion of code, thereby desynchronizing sensitive-routine execution timings to prevent an attacker from performing fault-injection attacks at a precise time or location. Introducing dummy routines may also produce an unpredictable power-usage profile for the execution of a portion of code, thereby making side-channel analysis attacks more difficult.
An alternative countermeasure, uniform branch timing, pads code execution of sensitive routines such that all possible branches through a sensitive portion have uniform execution times, i.e., regardless of which conditions apply, execution time will be the same.
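The uniform-branch-timing idea described above can be sketched as follows. This is a minimal illustration only, not the claimed method; the branch bodies, the fixed time budget, and the busy-wait padding are all illustrative assumptions.

```python
import time

def branch_a(x):
    # Cheap branch: a single addition.
    return x + 1

def branch_b(x):
    # Expensive branch: simulated heavier work.
    s = 0
    for i in range(1000):
        s += i
    return x + s

def uniform_branch(cond, x, budget_s=0.001):
    """Run either branch, then busy-wait so that the total elapsed
    time is approximately budget_s regardless of which branch ran."""
    start = time.perf_counter()
    result = branch_a(x) if cond else branch_b(x)
    # Pad with a busy-wait until the fixed time budget is consumed,
    # masking the timing difference between the two branches.
    while time.perf_counter() - start < budget_s:
        pass
    return result
```

As the following paragraph notes, the cost of this approach is that the padding time is spent on unproductive work.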
Both execution desynchronization and uniform branch timing suffer from performance degradation due to execution of unproductive operations intended to either cause unpredictable execution timing or uniform execution of all code branches.
Furthermore, desynchronization and padding are usually implemented in simple and predictable ways, e.g., empty loops or loops that perform simple arithmetic. Such operations may be easily detected through side-channel analysis (e.g., simple power analysis or differential power analysis). Thus, the padding code may in and of itself render the sensitive routine vulnerable to attack.
It should be noted that the vulnerability to side-channel analysis is not limited to cryptography operations. There are many other operations by computers that must protect sensitive information, for example, passwords. Thus, side-channel analysis is an issue in such scenarios also.
From the foregoing it is apparent that there is a need for an improved method to protect sensitive routines executed by digital security devices against side-channel and fault-injection attacks through desynchronization and uniform branch execution without wasteful execution of useless code that may be vulnerable to side-channel attack.
According to a first aspect, the herein described technology provides a method for enhancing performance while protecting a computerized digital security device against side-channel, fault injection, and timing attacks. The method comprises identifying asynchronous tasks to be performed by the computerized digital security device, placing identified asynchronous tasks in an asynchronous task queue, and executing a first application by nonlinearizing execution of the application by selecting at least one task from the asynchronous task queue, executing the selected at least one task, and removing the selected at least one task from the asynchronous task queue.
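The queue-based nonlinearization described above can be sketched in a few lines. This is a simplified model under stated assumptions: the class and function names are illustrative, and the interleaving policy (one queued task before each application step) stands in for whatever scheduling policy an implementation would actually use.

```python
from collections import deque

class AsyncTaskQueue:
    """Minimal sketch of the asynchronous task queue: tasks are
    enqueued ahead of need and may be run out of order to break up
    the linear execution profile of a sensitive routine."""

    def __init__(self):
        self._queue = deque()

    def add(self, name, fn):
        # Place an identified asynchronous task in the queue.
        self._queue.append((name, fn))

    def run_one(self):
        """Select a task, execute it, and remove it from the queue."""
        if not self._queue:
            return None
        name, fn = self._queue.popleft()
        fn()
        return name

def run_sensitive(steps, queue):
    """Interleave queued asynchronous tasks between the steps of a
    sensitive routine, nonlinearizing its execution."""
    trace = []
    for step in steps:
        executed = queue.run_one()
        if executed is not None:
            trace.append(executed)
        step()
        trace.append(step.__name__)
    return trace
```

In this sketch the useful asynchronous work replaces the dummy padding of conventional desynchronization, so the added time is not wasted.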
The method may further include predicting execution time for each identified asynchronous task, and the step of selecting at least one task from the asynchronous task queue comprises basing the selection of the at least one task from the asynchronous task queue on the predicted execution times of tasks in the asynchronous task queue.
The method may further include identifying an execution location at which to add desynchronization time and determining how much desynchronization time to add. The step of selecting at least one task may then include selecting at least one task such that the predicted execution time of the selected at least one task sums to less than the determined desynchronization time to add, and performing at least one dummy task to equal the difference in execution time between the determined desynchronization time to add and the sum of the predicted execution time of the selected at least one task.
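The selection step above can be sketched as follows. The greedy largest-first strategy is an assumption for illustration only; the method as described requires only that the selected tasks' predicted times sum to no more than the desynchronization budget, with dummy work filling any remainder.

```python
def select_tasks(queue, budget):
    """Pick tasks whose predicted execution times sum to at most
    `budget`. `queue` is a list of (name, predicted_time) pairs.
    Returns (selected_names, remainder), where `remainder` is the
    time left to be filled by dummy tasks."""
    selected, total = [], 0
    # Greedy largest-first fill (an illustrative policy, not the
    # only possible one).
    for name, t in sorted(queue, key=lambda p: -p[1]):
        if total + t <= budget:
            selected.append(name)
            total += t
    return selected, budget - total
```

Whatever remainder is returned would then be consumed by dummy tasks so that the total added time equals the determined desynchronization time exactly.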
The method may further include shuffling execution of the first application by selecting at least one task from the asynchronous task queue and executing the selected at least one task prior to continuing execution of the first application.
The method may further include desynchronizing execution of the first application by modifying execution flow of the first application by selecting at least one task from the asynchronous task queue and executing the selected task prior to continuing execution of the first application.
The method may further include identifying at least a first and a second code branch wherein execution time for the first and second code branch are unequal, and equalizing the execution time for the first and second code branch by selecting at least one task from the asynchronous task queue so that the predicted execution time of the selected at least one task balances the execution time of both code branches.
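The branch-equalization variant above can be sketched as follows, assuming branch timings are known in some common unit; the function name and return convention are illustrative.

```python
def equalize(time_a, time_b, queue):
    """Balance two code branches by scheduling queued asynchronous
    tasks on the faster branch. `queue` is a list of
    (name, predicted_time) pairs. Returns (faster_branch, fill,
    leftover): which branch gets the extra tasks, which tasks to
    run there, and any residual gap still to be padded."""
    gap = abs(time_a - time_b)
    fill, total = [], 0
    for name, t in queue:
        if total + t <= gap:
            fill.append(name)
            total += t
    faster = "A" if time_a < time_b else "B"
    return faster, fill, gap - total
```

With the gap filled by useful queued work, both branches present the same execution time to a timing attacker without resorting to detectable dummy loops.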
The method may further include randomly selecting tasks from the asynchronous task queue.
The method may include adding tasks to the asynchronous task queue that include tasks selected from tasks required by the first application, tasks required by an operating system of the computerized digital security device, and tasks required by applications other than the first application.
The tasks added to the asynchronous task queue may include tasks selected from the group consisting of: computation of a very large prime number, defragmentation of memory, non-volatile memory (NVM) page refresh, NVM erase, NVM write, generation of a small buffer of random numbers using hardware true random number generation, security sensor check, memory integrity verification, code execution flow control, data compression, and data decompression.
In a second aspect, the herein-described technology includes a digital security device having a processor and a memory and that is programmed to perform the above-described method for enhancing performance while protecting a computerized digital security device against side-channel, fault injection, and timing attacks.
In an embodiment, the instructions for nonlinearizing execution of a first application may include instructions to cause the processor to predict execution time for each identified asynchronous task and to base the selection of the at least one task from the asynchronous task queue on the predicted execution times of tasks in the asynchronous task queue.
In an embodiment, the instructions for nonlinearizing execution of a first application may include instructions to cause the processor to identify an execution location at which to add desynchronization time, determine how much desynchronization time to add, select at least one task such that the predicted execution time of the selected at least one task sums to less than the determined desynchronization time to add, and perform at least one dummy task to equal the difference in execution time between the determined desynchronization time to add and the sum of the predicted execution time of the selected at least one task.
In an embodiment, the instructions for nonlinearizing execution of a first application may include instructions to cause the processor to shuffle execution of the first application by selecting at least one task from the asynchronous task queue and executing the selected at least one task prior to continuing execution of the first application.
In an embodiment, the instructions for nonlinearizing execution of a first application may include instructions to cause the processor to desynchronize execution of the first application by modifying execution flow of the first application by selecting at least one task from the asynchronous task queue and executing the selected task prior to continuing execution of the first application.
In an embodiment, the instructions for nonlinearizing execution of a first application include instructions to cause the processor to identify at least a first and a second code branch wherein execution times for the first and second code branch are unequal, and to equalize the execution time for the first and second code branch by selecting at least one task from the asynchronous task queue so that the predicted execution time of the selected at least one task balances the execution time of both code branches.
In an embodiment, the instructions for nonlinearizing execution of a first application include instructions to cause the processor to randomly select tasks from the asynchronous task queue.
In an embodiment, the tasks added to the asynchronous task queue include tasks selected from tasks required by the first application, tasks required by an operating system of the computerized digital security device, and tasks required by applications other than the first application.
In an embodiment, the tasks added to the asynchronous task queue include tasks selected from the group consisting of: computation of a very large prime number, defragmentation of memory, non-volatile memory (NVM) page refresh, NVM erase, NVM write, generation of a small buffer of random numbers using hardware true random number generation, security sensor check, memory integrity verification, code execution flow control, data compression, and data decompression.
In a further aspect, the herein-described technology includes a non-transitory computer memory storing instructions that may cause a processor of a digital security device to perform the above-described method for enhancing performance while protecting a computerized digital security device against side-channel, fault injection, and timing attacks.
In the following detailed description, reference is made to the accompanying drawings that show, by way of illustration, specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It is to be understood that the various embodiments of the invention, although different, are not necessarily mutually exclusive. For example, a particular feature, structure, or characteristic described herein in connection with one embodiment may be implemented within other embodiments without departing from the spirit and scope of the invention. In addition, it is to be understood that the location or arrangement of individual elements within each disclosed embodiment may be modified without departing from the spirit and scope of the invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims, appropriately interpreted, along with the full range of equivalents to which the claims are entitled. In the drawings, like numerals refer to the same or similar functionality throughout the several views.
The following description includes references to various methods executed by a processor of an integrated circuit chip. As is common in the field, there may be phrases herein that indicate these methods or method steps are performed by software instructions or software modules. As a person skilled in the art knows, such descriptions should be taken to mean that a processor, in fact, executes the methods, software instructions, and software modules.
The herein described technology provides a mechanism to protect digital security devices against side-channel attacks and fault injection attacks by desynchronizing execution of sensitive routines in an unpredictable manner by desynchronizing code or padding code through the execution of tasks from a queue of tasks that may be executed asynchronously.
A sensitive operation is an operation that processes or uses a piece of sensitive information that should be protected from being divulged. Examples of sensitive information include private cryptographic keys, account numbers, PIN codes and passwords, biometric information, as well as data transferred in secure memory transfer operations. Cryptography operations are typically sensitive operations. Account access through account number, PIN or password are also sensitive operations as are operations to access or manipulate biometric information.
Herein, an asynchronous task is a task, performed by an electronic device, that can be performed without being synchronized to the execution of another task. An asynchronous task may be any task that can be executed before or independently of the current actions of an active application process. For example, an asynchronous task may be an operating system function that is performed by the operating system periodically or based on a particular condition without being a function of an application being executed. Such operations include memory management functions, for example, defragmentation and non-volatile memory (NVM) page refresh. Asynchronous tasks also include application program operations. For example, for cryptography applications, these application program operations may include computation of large prime numbers and generation of a buffer of true random numbers using hardware true random number generation (TRNG).
In many cases, the digital security devices 103 are used to perform cryptographic services in conjunction with a service provided by a service provider 109 over a network 111, e.g., the Internet. Such cryptographic services include providing cryptographic signature, encryption, decryption, and authentication. Alternatively, the digital security devices are used for other operations that involve sensitive information, for example, account access via personal identification number (PIN), password or biometrics.
To perform sensitive operations, for example, cryptographic operations, the digital security device 103 stores some sensitive information thereon, e.g., cryptographic keys, PINs or passwords.
In classical cryptography, a sender and recipient of a secret message are each in possession of keys that may be employed to encrypt the message and decrypt the message, respectively. The security of the employed cryptographic algorithm relies on confidence that it is mathematically very difficult to decipher the message without the correct key as well as mathematically very difficult to determine the encryption and decryption keys from a message. Thus, if a message is intercepted en route to a recipient, the intercepting party would not be able to infer either the associated plaintext or the keys used to encrypt and decrypt the message.
That security relies on an assumption that the execution of the algorithm itself will not provide information that may be used to determine the sensitive information used in performing the cryptographic operation, e.g., a decryption operation. If the message is intercepted between sender and intended recipient, that is a safe assumption in that the intercepting entity would not have access to the device that is being used to decipher the message.
However, as may be noted by the examples of
When a digital security device 103 may be observed while performing sensitive operations, it is possible to measure various physical characteristics of the digital security device 103 that change during the performance of the sensitive operation. For example, the power consumption, electromagnetic radiation, timing information, and even noise of the digital security device 103 may be recorded and analyzed to determine the sensitive information stored on the digital security device 103. Collectively, such physical characteristics are referred to herein as side-channel data, and the use of such data to determine sensitive information, e.g., a cryptographic key, as side-channel analysis.
In another form of attack, fault-injection attack, a digital security device 103 may be subjected to an operational fault condition, e.g., a source voltage out of range or a clocking error, to cause operational errors that an attacker may exploit to cause execution of operations in a manner that may be used by the attacker to discern secret information stored on the digital security device 103.
The setup of
There are many different types of side-channel attacks. These include, but are not limited to, Simple Power Analysis (SPA), Differential Power Analysis (DPA), Template Attacks (TA), Correlation Power Analysis (CPA), Mutual Information Analysis (MIA), and Test Vector Leakage Assessment (TVLA). Mark Randolph and William Diehl, Power Side-Channel Attack Analysis: A Review of 20 Years of Study for the Layman, Cryptography 2020, 4, 15; doi:10.3390/cryptography4020015. Randolph and Diehl provide a good introduction to the subject of side-channel analysis.
According to an embodiment, the program memory 305 may also contain at least one asynchronous task table 315 for storing information concerning application program tasks that may be performed asynchronously, i.e., to be performed independently of other program flow, and which may be used to disrupt the execution timing of an application to thereby hinder attacks based on analysis of execution timing. The asynchronous task table 315 associated with a particular application 311 is advantageously stored together with the application 311 in the program memory 305. During runtime, the asynchronous task table 315 is used to build a task queue 317 stored in the RAM 302. A task manager function of the operating system 309 uses the task queue 317 to schedule tasks during the execution of the application 311 and other applications executing on the digital security device 103. As discussed in greater detail below, for example, in conjunction with
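The relationship between the stored asynchronous task table 315 and the runtime task queue 317 can be sketched as follows. The field names and example entries are illustrative assumptions; the text does not specify the table's layout.

```python
# Sketch of a per-application asynchronous task table, stored with
# the application in program memory (field names are assumed).
TASK_TABLE = [
    {"id": "T1", "entry": "compute_prime", "predicted_ms": 12},
    {"id": "T2", "entry": "nvm_refresh",   "predicted_ms": 3},
]

def build_task_queue(table):
    """At runtime, build the task queue (held in RAM) from the
    stored table; the task manager schedules from this queue."""
    return [(row["id"], row["predicted_ms"]) for row in table]
```

A task manager function of the operating system would then draw entries from the queue built this way when scheduling asynchronous work during application execution.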
The mechanism illustrated in
In a first step, step 401, tasks are identified as being tasks that may be performed asynchronously. This step may be performed by a code analysis or by a programmer providing comments that flag to a pre-processor that a particular section of code may be performed asynchronously.
Certain tasks may be considered inherently asynchronous, for example, defragmentation of memory. Such tasks may be performed when a particular condition holds. For example, with respect to memory defragmentation, a memory analysis may reveal a threshold level of fragmentation that would make memory defragmentation desirable, and a second threshold level that would make memory defragmentation necessary before execution of the application may proceed.
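The two-threshold condition just described can be sketched as a small policy function; the threshold values and the return labels are illustrative assumptions.

```python
def defrag_decision(fragmentation, soft=0.3, hard=0.7):
    """Two-threshold defragmentation policy: below `soft`, do
    nothing; between `soft` and `hard`, defragmentation is
    desirable and may be queued as an asynchronous task; at or
    above `hard`, it is required before the application proceeds.
    Thresholds are illustrative, expressed as a fraction in [0, 1]."""
    if fragmentation >= hard:
        return "required"
    if fragmentation >= soft:
        return "desirable"
    return "none"
```

Under this policy, a "desirable" result is what makes the task a good candidate for the asynchronous task queue: it must be done eventually, but not at any particular moment.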
A predicted execution time for the task is determined, step 403. In an embodiment, an application developer assigns predicted execution time for the various asynchronous tasks that the developer has identified as component tasks of the application 311. An application developer tool may provide mechanisms for predicting such execution times.
In an embodiment, the predicted execution time is used to assign a weight to the task, step 405. For example, an acceptable range for weights may be 1 through 10. Very quick tasks are given a weight of 1, whereas very complex tasks may be given a weight of 10.
For illustrative purposes, Table 1 provides some example tasks with their associated predicted execution times and weights.
Thus, a relationship is assigned between predicted execution times and weights, e.g., as illustrated in Table 2:
Table 2 is merely one example. In alternative embodiments, the granularity of the association between execution time and weights may be different depending on the implementation.
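One such association between predicted execution time and weight can be sketched as a simple bucketing function. Since the concrete mapping of Table 2 is not reproduced here, the bucket size and units below are assumptions for illustration.

```python
import math

def weight_for(predicted_us, bucket_us=100, max_weight=10):
    """Map a predicted execution time (microseconds, assumed unit)
    to an integer weight in the range 1..max_weight: very quick
    tasks map to 1, very long tasks saturate at max_weight."""
    return min(max_weight, max(1, math.ceil(predicted_us / bucket_us)))
```

Any monotonic mapping with the same 1-to-10 range would serve; the point is only that selection can then reason about coarse weights rather than raw times.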
During the execution 407 of the application, the application forecasts a future need for a task that may be executed asynchronously, step 409, and places such tasks in the task queue 317. The task queue 317 is used by a task scheduler for scheduling tasks to be executed including asynchronous tasks.
If the application has determined that there is a future need for a task that may be executed asynchronously, step 409, that task, e.g., a task T8, is added, step 411, to the asynchronous task queue 317b as illustrated in
During the execution of the application, the need for a desynchronization may be determined, step 413. The reason for such desynchronization is outside the scope of this document. However, as noted, adding delays to the execution of sensitive routines may be used to prevent an attacker from performing side-channel, fault-injection, and timing attacks. Furthermore, a countermeasure against timing attacks is based on balancing the execution time of different branches of a sensitive routine. Thus, countermeasures based on such techniques would determine, from time to time, the need for introducing desynchronization by executing tasks unrelated or tangentially related to the execution of the sensitive routine being protected.
In a first step of adding desynchronization, a determination is made as to how much execution time should be added at the point that such desynchronization is required, step 415. For example, if one of two branches has an execution time that is a certain amount of time less than the other branch, that difference is the desired desynchronization time. Conversely, if a particular point in code is desynchronized by adding a random amount of time, the added desynchronization time may be set to a random number.
Once the requisite desynchronization time has been determined, one or more tasks with a combined predicted execution time matching the requisite desynchronization time are selected from the asynchronous task queue 317, step 419, and these asynchronous tasks are executed.
The selected tasks are then removed from the asynchronous task queue, step 421. The asynchronous task queue 317c updated after the removal of an executed asynchronous task is illustrated in
From time to time the situation may occur that the application requires the execution of a task placed on the asynchronous task queue 317 before it has been selected during a desynchronization effort. In such a case, the task is executed because it is needed by the application, step 423. After such a task has been executed, it is removed from the asynchronous task queue, step 425.
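This early-execution path can be sketched as follows; the queue representation (a list of name/function pairs) and the function name are illustrative assumptions.

```python
def require_task(name, queue, executed_log):
    """If the application needs a queued task before any
    desynchronization step has consumed it, run it immediately
    and remove it from the queue (steps 423 and 425). `queue` is
    a list of (name, fn) pairs. Returns True if the task was
    found and executed."""
    for i, (n, fn) in enumerate(queue):
        if n == name:
            fn()
            executed_log.append(n)
            del queue[i]
            return True
    return False
```

Because the task is removed once run, a later desynchronization pass cannot accidentally execute it a second time.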
Turning now to an example of the hereinabove described technology.
Contrast now the flow of
At time T=3, the first desynchronization block (D1) is executed. However, in contrast to the mechanism of
After desynchronization, the executed tasks are removed. In the course of continued execution of the application, a new task, T4, is forecasted to be required. T4 is, therefore, added to the asynchronous task queue.
While not yet executed through a desynchronization process, task T1 is determined to be required for the execution of the application. It is therefore executed (at time T=19) and removed from the asynchronous task queue. At time T=22, the second desynchronization is performed. This time, the desynchronization time is equivalent to weight W=4. Therefore, a task with a weight equal to four is randomly selected from the asynchronous task queue; in the example, T, with a weight of W=4, is executed and removed from the asynchronous task queue.
As can be observed from the example illustrated in
From the foregoing it will be apparent that an efficient and secure mechanism for improving desynchronization operations on a digital security device is provided.
Although specific embodiments of the invention have been described and illustrated, the invention is not to be limited to the specific forms or arrangements of parts so described and illustrated. The invention is limited only by the claims.
Number | Date | Country | Kind
---|---|---|---
21306724.2 | Dec 2021 | EP | regional

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/EP2022/084358 | 12/5/2022 | WO |