PROTECTING POLYNOMIAL REJECTION THROUGH MASKED COMPRESSION COMPARISON

Information

  • Patent Application
  • 20240126511
  • Publication Number
    20240126511
  • Date Filed
    September 26, 2022
  • Date Published
    April 18, 2024
Abstract
Various embodiments relate to a data processing system comprising instructions embodied in a non-transitory computer readable medium, the instructions for a cryptographic operation using masked compressing of coefficients of a polynomial having ns arithmetic shares for lattice-based cryptography in a processor, the instructions, including: shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1; scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1; shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1; scaling a second to ns shares of the ns arithmetic shares by a value based on the first compression factor δ and the masking scaling factor φ1; converting the ns scaled arithmetic shares to ns Boolean shares; right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2; XORing an output mask λ2 with the shifted first Boolean share to produce ns compressed Boolean shares; and carrying out a cryptographic operation using the ns arithmetic shares when the ns compressed Boolean shares indicates that the coefficients of the polynomial are within boundary values.
Description
TECHNICAL FIELD

Various exemplary embodiments disclosed herein relate generally to efficiently protecting polynomial rejection through masked compressed comparison.


BACKGROUND

Recent significant advances in quantum computing have accelerated the research into post-quantum cryptography schemes: cryptographic algorithms which run on classical computers but are believed to be still secure even when faced with an adversary with access to a quantum computer. This demand is driven by interest from standardization bodies, such as the call for proposals for new public-key cryptography standards by the National Institute of Standards and Technology (NIST). The selection procedure for this new cryptographic standard has started and has further accelerated the research of post-quantum cryptography schemes.


There are various families of problems to instantiate these post-quantum cryptographic approaches. Constructions based on the hardness of lattice problems are considered to be promising candidates to become the next standard. A subset of approaches considered within this family are instantiations of the Learning With Errors (LWE) framework: the Ring-Learning With Errors problem. One of the leading lattice-based signature schemes is Dilithium, which requires operations involving arithmetic with polynomials with integer coefficients. When implemented, the main computationally expensive operations are the arithmetic with polynomials. More precisely, computations are done in a ring Rq = (ℤ/qℤ)[X]/(F): the ring where polynomial coefficients are in ℤ/qℤ and the polynomial arithmetic is performed modulo a polynomial F.


SUMMARY

A summary of various exemplary embodiments is presented below. Some simplifications and omissions may be made in the following summary, which is intended to highlight and introduce some aspects of the various exemplary embodiments, but not to limit the scope of the invention. Detailed descriptions of an exemplary embodiment adequate to allow those of ordinary skill in the art to make and use the inventive concepts will follow in later sections.


Various embodiments relate to a data processing system including instructions embodied in a non-transitory computer readable medium, the instructions for a cryptographic operation using masked compressing of coefficients of a polynomial having ns arithmetic shares for lattice-based cryptography in a processor, the instructions, including: shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1; scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1; shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1; scaling a second to ns shares of the ns arithmetic shares by a value based on the first compression factor δ and the masking scaling factor φ1; converting the ns scaled arithmetic shares to ns Boolean shares; right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2; XORing an output mask λ2 with the shifted first Boolean share to produce ns compressed Boolean shares; and carrying out a cryptographic operation using the ns arithmetic shares when the ns compressed Boolean shares indicates that the coefficients of the polynomial are within boundary values.


Various embodiments are described, further including performing a masked comparison function on the ns compressed Boolean shares configured to indicate that the coefficients of the polynomial are within boundary values.


Various embodiments are described, wherein the compressed polynomial coefficients corresponding to the ns compressed Boolean shares having a value in a valid range of values have a value of 0, and the masked comparison function compares the ns compressed Boolean shares to 0.


Various embodiments are described, wherein shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1 includes calculating a(0)A = a(0)A + λ1 mod q, where a(0)A is the first arithmetic share of the ns arithmetic shares and q is a prime modulus.


Various embodiments are described, wherein scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1 and shifting the scaled first arithmetic share by a value based on a masking scaling factor φ1 includes calculating

$$a^{(0)_A} = \left( \frac{2^{\varphi_1} \cdot \delta}{q} \cdot a^{(0)_A} + 2^{\varphi_1 - 1} \right) \bmod 2^{\varphi_1} \delta.$$






Various embodiments are described, wherein scaling second to ns shares of the ns shares by a value based on the first compression factor δ and the masking scaling factor φ1 includes calculating

$$a^{(i)_A} = \frac{2^{\varphi_1} \cdot \delta}{q} \cdot a^{(i)_A} \bmod 2^{\varphi_1} \delta$$

where a(i)A is the ith arithmetic share of the ns arithmetic shares.


Various embodiments are described, wherein right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2 includes calculating: ā(⋅)B = ā(⋅)B >> (φ1 + φ2), where ā(⋅)B is the ns Boolean shares.


Various embodiments are described, wherein XORing an output mask λ2 to the shifted first Boolean share to produce ns compressed Boolean shares includes calculating: ā(0)B = ā(0)B ⊕ λ2.


Further various embodiments relate to a data processing system including instructions embodied in a non-transitory computer readable medium, the instructions for a cryptographic operation using a masked rejection of a polynomial with coefficients having ns arithmetic shares for lattice-based cryptography in a processor, the instructions, including: generating a ns arithmetic shares for each coefficient of the polynomial; performing a masked compression of each coefficient of the polynomial using the ns arithmetic shares for each coefficient of the polynomial, including: shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1; scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1; shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1; scaling the second to ns shares of the ns arithmetic shares by a value based on the first compression factor δ and the masking scaling factor φ1; converting the ns scaled arithmetic shares to Boolean shares; right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2; and XORing an output mask λ2 to the shifted first Boolean share to produce ns compressed Boolean shares, wherein the compressed ns Boolean shares indicate compressed polynomial coefficients having a predetermined value when the polynomial coefficients are within boundary values; comparing the polynomial coefficients represented by the ns compressed Boolean shares to the predetermined value; and carrying out a cryptographic operation using the ns arithmetic shares when the polynomial coefficients represented by the ns compressed shares are equal to the predetermined value.


Various embodiments are described, wherein the predetermined value is 0.


Various embodiments are described, wherein shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1 includes calculating a(0)A = a(0)A + λ1 mod q, where a(0)A is the first arithmetic share of the ns arithmetic shares and q is a prime modulus, scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1 and shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1 includes calculating

$$a^{(0)_A} = \left( \frac{2^{\varphi_1} \cdot \delta}{q} \cdot a^{(0)_A} + 2^{\varphi_1 - 1} \right) \bmod 2^{\varphi_1} \delta,$$

scaling second to ns shares of the ns arithmetic shares by a value based on the first compression factor δ and the masking scaling factor φ1 includes calculating

$$a^{(i)_A} = \frac{2^{\varphi_1} \cdot \delta}{q} \cdot a^{(i)_A} \bmod 2^{\varphi_1} \delta$$

where a(i)A is the ith arithmetic share of the ns arithmetic shares, right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2 includes calculating: ā(⋅)B = ā(⋅)B >> (φ1 + φ2), where ā(⋅)B is the ns Boolean shares, and XORing an output mask λ2 to the shifted first Boolean share to produce ns compressed Boolean shares includes calculating: ā(0)B = ā(0)B ⊕ λ2.


Further various embodiments relate to a method for a cryptographic operation using masked compressing of coefficients of a polynomial having ns arithmetic shares for lattice-based cryptography, including: shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1; scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1; shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1; scaling a second to ns shares of the ns arithmetic shares by a value based on the first compression factor δ and the masking scaling factor φ1; converting the ns scaled arithmetic shares to ns Boolean shares; right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2; XORing an output mask λ2 to the shifted first Boolean share to produce ns compressed Boolean shares; and carrying out a cryptographic operation using the ns arithmetic shares when the ns compressed Boolean shares indicates that the coefficients of the polynomial are within boundary values.


Various embodiments are described, further including performing a masked comparison function on the ns compressed Boolean shares configured to indicate that the coefficients of the polynomial are within boundary values.


Various embodiments are described, wherein the compressed polynomial coefficients corresponding to the ns compressed Boolean shares having a value in a valid range of values have a value of 0, and the masked comparison function compares the ns compressed Boolean shares to 0.


Various embodiments are described, wherein shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1 includes calculating a(0)A = a(0)A + λ1 mod q, where a(0)A is the first arithmetic share of the ns arithmetic shares and q is a prime modulus.


Various embodiments are described, wherein scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1 and shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1 includes calculating

$$a^{(0)_A} = \left( \frac{2^{\varphi_1} \cdot \delta}{q} \cdot a^{(0)_A} + 2^{\varphi_1 - 1} \right) \bmod 2^{\varphi_1} \delta.$$






Various embodiments are described, wherein scaling second to ns shares of the ns shares by a value based on the first compression factor δ and the masking scaling factor φ1 includes calculating

$$a^{(i)_A} = \frac{2^{\varphi_1} \cdot \delta}{q} \cdot a^{(i)_A} \bmod 2^{\varphi_1} \delta$$

where a(i)A is the ith arithmetic share of the ns arithmetic shares.


Various embodiments are described, wherein right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2 includes calculating: ā(⋅)B = ā(⋅)B >> (φ1 + φ2), where ā(⋅)B is the ns Boolean shares.


Various embodiments are described, wherein XORing an output mask λ2 to the shifted first Boolean share to produce ns compressed Boolean shares includes calculating: ā(0)B = ā(0)B ⊕ λ2.





BRIEF DESCRIPTION OF THE DRAWINGS

In order to better understand various exemplary embodiments, reference is made to the accompanying drawings, wherein:



FIG. 1 illustrates an exemplary hardware diagram for implementing polynomial rejection with masked compressed comparison using the functions MaskedCompress, MaskedComparison, and MaskedReject.





To facilitate understanding, identical reference numerals have been used to designate elements having substantially the same or similar structure and/or substantially the same or similar function.


DETAILED DESCRIPTION

The description and drawings illustrate the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its scope. Furthermore, all examples recited herein are principally intended expressly to be for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor(s) to furthering the art and are to be construed as being without limitation to such specifically recited examples and conditions. Additionally, the term, “or,” as used herein, refers to a non-exclusive or (i.e., and/or), unless otherwise indicated (e.g., “or else” or “or in the alternative”). Also, the various embodiments described herein are not necessarily mutually exclusive, as some embodiments can be combined with one or more other embodiments to form new embodiments.


The signing operation of a digital signature scheme generates a signature for a given message using a secret key. If this secret key were to be leaked, it would invalidate the security properties provided by the scheme. It has been shown that unprotected implementations of post-quantum signature schemes are vulnerable to implementation attacks, e.g., side-channel analysis. In particular, it was demonstrated that the secret key can be extracted from physical measurements of key-dependent parts in the signing operation. For several post-quantum digital signature schemes, the key-dependent operations include the rejection check of polynomials. In the Dilithium protocol in particular, there are two rejection criteria that depend on the sensitive values z and r̃. The check on z prevents a possible leak of secret information when z is made public as part of the signature. The second check on r̃ ensures the correctness of the scheme and, after unmasking this value, simplifies the calculation of the hint h (another part of the signature). Both checks assert that all the coefficients of z and r̃ lie within their respective required bounds. While this rejection check operation is trivial in the unmasked case, a secure implementation of these digital signature schemes requires the integration of dedicated countermeasures for this step.


Masking is a common countermeasure to thwart side-channel analysis and has been utilized for various applications. Besides security, efficiency is also an important aspect when designing a masked algorithm. Important metrics for software implementations of masking are the number of operations and the number of fresh random elements required for the masking scheme.


The first dedicated masking scheme for a lattice-based signature scheme was presented in Gilles Barthe, Sonia Belaïd, Thomas Espitau, Pierre-Alain Fouque, Benjamin Grégoire, Mélissa Rossi, and Mehdi Tibouchi, Masking the GLP lattice-based signature scheme at any order, Advances in Cryptology - EUROCRYPT 2018 - 37th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Tel Aviv, Israel, Apr. 29-May 3, 2018 Proceedings, Part II (Jesper Buus Nielsen and Vincent Rijmen, eds.), Lecture Notes in Computer Science, vol. 10821, Springer, 2018, pp. 354-384 (Barthe). To reject arithmetically masked polynomials, the authors propose to use Boolean-masked bound checks for each coefficient. To this end, they first convert the arithmetic shares to Boolean shares, before using Boolean-masked addition to check multiple bounds. The intermediate result of the rejection check is kept in shares as well, and only unmasked as the final decision when all coefficients of the input polynomial have been processed. Overall, this approach requires a costly conversion in addition to multiple costly Boolean-masked additions. The same basic approach was also used for masking qTesla in Francois Gérard and Mélissa Rossi, An efficient and provable masked implementation of qtesla, Smart Card Research and Advanced Applications - 18th International Conference, CARDIS 2019, Prague, Czech Republic, November 11-13, 2019, Revised (Sonia Belaïd and Tim Güneysu, eds.), Lecture Notes in Computer Science, vol. 11833, Springer, 2019, pp. 74-91 (Gérard). Further, this approach was slightly optimized for masking Dilithium in Vincent Migliore, Benoît Gérard, Mehdi Tibouchi, and Pierre-Alain Fouque, Masking dilithium - efficient implementation and side-channel evaluation, Applied Cryptography and Network Security - 17th International Conference, ACNS 2019, Bogota, Colombia, Jun. 5-7, 2019, Proceedings (Robert H. Deng, Valérie Gauthier-Umaña, Martín Ochoa, and Moti Yung, eds.), Lecture Notes in Computer Science, vol. 11464, Springer, 2019, pp. 344-362 (Migliore). Still, all of these solutions require multiple costly Boolean-masked additions.


The existing solutions all follow the same basic approach of computing a bound check with Boolean-masked additions. An alternative way to perform a masked rejection check is proposed, which reduces the problem to a masked compressed comparison as used for some post-quantum Key Exchange Mechanisms. This enables the use of existing masked compressed comparison algorithms for masked rejection checks in digital signature schemes. Overall, the proposed masked compressed comparison provides a technological improvement on the state-of-the-art by enabling a significantly more efficient implementation of a masked polynomial rejection. It improves both the number of operations and of random elements.


First a generalized masked compression will be described. In contrast to the state-of-the-art, the described masked compression approach may adapt its compression range much more precisely, by introducing additional parameters. Secondly, this generalized masked compression is combined with a compressed comparison algorithm to derive a new masked polynomial rejection check, that is much more efficient than the state-of-the-art. It is further noted that this new rejection check may be instantiated with any masked compressed comparison algorithm, e.g., with future solutions that improve over the current state-of-the-art.


The two building blocks of the masked rejection check will first be described before combining them to implement the masked rejection check. The following notation and functions will be described and defined. A Boolean or arithmetically masked variable x is denoted as x(⋅)B or x(⋅)A respectively, with $\bigoplus_{i=0}^{n_s-1} x^{(i)_B} = x$ or $\sum_{i=0}^{n_s-1} x^{(i)_A} = x \bmod q$ respectively (ns being the number of shares). Additionally, bold letters are used to refer to polynomials. So, for example, xi(j)B is the j-th Boolean share of the i-th coefficient in the polynomial x.


The auxiliary variables and functions used herein are defined as follows.

    • q: Prime modulus. In Dilithium it is equal to $q = 2^{23} - 2^{13} + 1$.
    • ns: The number of Boolean or arithmetic shares used in the sharing of the secret coefficient. Increasing this value will improve the side-channel security, but also lower the performance of the algorithm.
    • ⌊⋅⌉: Rounds to the nearest integer.
    • ⌊⋅⌋: Rounds to the greatest integer that is not more than the input.
    • ⌈⋅⌉: Rounds to the least integer that is greater than or equal to the input.
    • +/−: The function computes arithmetic addition or subtraction of the inputs. When applied to one arithmetically-shared input and one public input only one share has to be included in the addition or subtraction.
    • · (Multiplication): The function computes arithmetic multiplication of the inputs. In an arithmetic masking context, if one of the inputs is a constant or a public value, the multiplication operation is applied on each share of the other input independently.
    • >>: The function computes the bit-wise right shift of the input bitstring. In a Boolean masking context, the >> operation is applied on each share of the left input independently.
    • ¬: The function computes the bit-wise negation of the input bitstring.
    • ⊕: The function computes the bit-wise XOR of the input bitstrings.
    • &: The function computes the bit-wise AND of two inputs. In a Boolean masking context, if one of the inputs is a constant or a public value, the & operation is applied on each share of the other input independently.
    • A2B: This function converts ns arithmetic shares $x^{(\cdot)_A} \in \mathbb{Z}_q^{n_s}$ to ns Boolean shares $x^{(\cdot)_B} \in \mathbb{Z}_{2^{\omega}}^{n_s}$, which encode the same secret value $x \in \mathbb{Z}_q$.
    • SecAND: This is a standard implementation of a masked AND-operation performed on Boolean shares.
    • SecOR: This is a standard implementation of a masked OR-operation performed on Boolean shares.
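The notation above can be made concrete with a minimal Python sketch. It shows how a value might be split into arithmetic or Boolean shares and how the share-wise rules for public addition, public multiplication, and right shift behave. All function names are hypothetical and chosen only for illustration, and the sketch makes no claim of side-channel security.

```python
import secrets
from functools import reduce

Q = 8380417  # Dilithium modulus, q = 2^23 - 2^13 + 1


def arithmetic_share(x, ns, q=Q):
    """Split x into ns arithmetic shares that sum to x mod q."""
    shares = [secrets.randbelow(q) for _ in range(ns - 1)]
    shares.insert(0, (x - sum(shares)) % q)
    return shares


def boolean_share(x, ns, bits):
    """Split x into ns Boolean shares whose XOR equals x."""
    shares = [secrets.randbits(bits) for _ in range(ns - 1)]
    shares.insert(0, reduce(lambda a, b: a ^ b, shares, x))
    return shares


def unmask_arithmetic(shares, q=Q):
    return sum(shares) % q


def unmask_boolean(shares):
    return reduce(lambda a, b: a ^ b, shares, 0)


def add_public(shares, c, q=Q):
    """Adding a public value touches only one share."""
    return [(shares[0] + c) % q] + shares[1:]


def mul_public(shares, c, q=Q):
    """Multiplying by a public value is applied to every share."""
    return [(s * c) % q for s in shares]


def shift_right(shares, k):
    """A right shift of a Boolean sharing is applied to every share."""
    return [s >> k for s in shares]
```

Unmasking is included only to check the sketch; a protected implementation would never recombine the shares of a secret.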


The general purpose of a compression algorithm is to map its inputs to a reduced value space. Depending on the use case, different behavior or additional properties may be required. Here, a function is needed that compresses values above the high boundary and values below the low boundary to non-zero values, while mapping the inputs in between the high and low boundaries to zero. The function MaskedCompress illustrated in pseudocode below is a generalization of Algorithm 13 in Tim Fritzmann, Michiel Van Beirendonck, Debapriya Basu Roy, Patrick Karl, Thomas Schamberger, Ingrid Verbauwhede, and Georg Sigl, Masked accelerators and instruction set extensions for post-quantum cryptography, Cryptology ePrint Archive, Report 2021/479, 2021, https://ia.cr/2021 (Michiel). Instead of using only two compression parameters, the algorithm is modified to receive five parameters as input. This allows the compression to be controlled much more precisely and to be adapted to the different situations that occur in an implementation of Dilithium. See Table 1 for suitable parameter sets.


The MaskedCompress function takes arithmetically shared values a(⋅)A mod q and compression parameters δ, φ1, φ2, λ1, λ2 as inputs and returns a compressed value in Boolean shares. The parameters δ, φ1, φ2, λ1, λ2 provide various bounds used in the compression, where λ1 indicates an input mask, λ2 indicates an output mask, φ1 is a masking scaling factor, δ is a first compression factor, and φ2 is a second compression factor. The output of the MaskedCompress function includes Boolean shared values of the compressed coefficients ā(⋅)B.


Line 1 of the MaskedCompress function shifts the value of the first share a(0)A by the input mask λ1. At line 2 the value of the first share a(0)A is divided by q, scaled by $2^{\varphi_1} \cdot \delta$, and rounded, where δ is a compression factor. A further offset of $2^{\varphi_1 - 1}$ is applied to this first share. Lines 3 to 5 implement a loop over all of the remaining shares, where each share is divided by q and scaled by $2^{\varphi_1} \cdot \delta$. Lines 1 to 5 compress the range of the share values to a smaller range.


Next, at line 6, the compressed arithmetic shares are converted to Boolean shares using the function A2B. Then at line 7, the Boolean shares are bitwise right shifted by φ1 + φ2. This effectively offsets the multiplication by $2^{\varphi_1}$ in lines 2 and 4, and the value φ2 allows for a more general implementation of the compression function. At line 8, the output mask λ2 is XORed to set the compression exactly to the specified boundaries. Then at line 9, the function MaskedCompress returns the compressed value in Boolean shares ā(⋅)B.












Function MaskedCompress(a(·)A, δ, φ1, φ2, λ1, λ2)

Input: 1. An arithmetic sharing a(·)A of a coefficient; and
    2. Parameters for compression δ, φ1, φ2, λ1, λ2.
Output: A Boolean sharing of the compressed coefficient ā(·)B.

1: a(0)A = a(0)A + λ1 mod q
2: $a^{(0)_A} = \left( \frac{2^{\varphi_1} \cdot \delta}{q} \cdot a^{(0)_A} + 2^{\varphi_1 - 1} \right) \bmod 2^{\varphi_1} \delta$
3: for i = 1 to ns − 1 do
4:     $a^{(i)_A} = \frac{2^{\varphi_1} \cdot \delta}{q} \cdot a^{(i)_A} \bmod 2^{\varphi_1} \delta$
5: end for
6: ā(·)B = A2B2φ(a(·)A)
7: ā(·)B = ā(·)B >> (φ1 + φ2)
8: ā(0)B = ā(0)B ⊕ λ2
9: return ā(·)B
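The nine steps of MaskedCompress can be sketched in Python as follows. This is a minimal functional sketch under stated assumptions, not a protected implementation. The helper `a2b_placeholder` is hypothetical: it simply recombines and re-shares the value, so it is not side-channel secure and only stands in for a real masked A2B conversion. The scaling by 2^φ1·δ/q is implemented with integer floor division, which is also an assumption, since the text only states that the value is rounded.

```python
import secrets
from functools import reduce

Q = 8380417  # Dilithium modulus


def a2b_placeholder(shares, modulus):
    """NOT side-channel secure: recombines the arithmetic shares and
    re-shares the value as Boolean shares. Stand-in for a masked A2B."""
    value = sum(shares) % modulus
    bits = modulus.bit_length()
    out = [secrets.randbits(bits) for _ in range(len(shares) - 1)]
    out.insert(0, reduce(lambda a, b: a ^ b, out, value))
    return out


def masked_compress(a, delta, phi1, phi2, lam1, lam2, q=Q):
    """Apply lines 1 to 9 of MaskedCompress to a list of arithmetic shares."""
    a = list(a)
    modulus = (1 << phi1) * delta  # 2^phi1 * delta, also the scaling numerator
    # Line 1: shift the first share by the input mask.
    a[0] = (a[0] + lam1) % q
    # Line 2: scale the first share and add the offset 2^(phi1 - 1).
    # Assumption: the division by q is an integer floor division.
    a[0] = (modulus * a[0] // q + (1 << (phi1 - 1))) % modulus
    # Lines 3 to 5: scale the remaining shares.
    for i in range(1, len(a)):
        a[i] = (modulus * a[i] // q) % modulus
    # Line 6: convert to Boolean shares (placeholder conversion).
    b = a2b_placeholder(a, modulus)
    # Line 7: share-wise right shift by phi1 + phi2.
    b = [s >> (phi1 + phi2) for s in b]
    # Line 8: apply the output mask to the first share only.
    b[0] ^= lam2
    # Line 9: return the Boolean sharing of the compressed coefficient.
    return b
```

With the z-check Level 2 parameters of Table 1 (δ = 65511, φ1 = 13, φ2 = 11, λ1 = 392918, λ2 = 1), coefficients well inside the accepted range recombine to 0 and coefficients well outside recombine to a non-zero value. Because each share is floored separately, inputs extremely close to the boundary can depend on the rounding convention, which this sketch does not attempt to settle.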










Next the masked compressed comparison function MaskedComparison will be described. The masked rejection approach disclosed herein may be instantiated with any masked compressed comparison algorithm. Therefore, the masked rejection function disclosed herein may benefit from new developments regarding this problem. For now, a masked compressed comparison function will be used based on a recent approach by Jean-Sébastien Coron, Francois Gérard, Simon Montoya, and Rina Zeitoun, High-order polynomial comparison and masking lattice-based encryption, Cryptology ePrint Archive, Report 2021/1615, 2021, https://ia.cr/2021/1615 (Coron). Note that the multiplicative approach from Coron cannot be used (without explicit compression) due to the magnitude of the compression required for the rejection check. Therefore, a more generic approach using a dedicated masked compression function is relied upon instead.

The resulting function MaskedComparison is described using pseudocode shown below. The MaskedComparison function receives as inputs an already compressed masked polynomial x(⋅)B, a comparison polynomial x′, and the bit width w of a polynomial coefficient. The output of the MaskedComparison function is 1 if the input polynomials match and is 0 otherwise. Lines 1 to 3 of the MaskedComparison function loop over each coefficient of the input polynomials, XOR the corresponding coefficients, and then negate the value. Then lines 4 through 17 collapse the resulting bits to a single masked comparison bit. As a result, the final bit is set to 1 only if the comparison is true for all coefficients of the polynomial. Otherwise, 0 is returned. Line 4 initializes u(⋅)B, which is used to store the cumulative result of ANDing together the values xi(⋅)B. Lines 5 to 7 implement a loop over the remaining coefficients of the polynomial that uses the SecAND function to AND the coefficient values together. Here the negation function and the SecAND function are used together, which is commonly the way a secure OR function SecOR is implemented. Alternatively, a SecOR function could be used as well.


At line 8 the value m is determined to identify the number of combining iterations that are needed to collapse the bits together, where each iteration halves the number of remaining bits. This is faster than the alternative approach of looping through the bits one at a time. Then lines 9 to 12 initialize v(⋅)B, which is used to store the cumulative result of ANDing together the bits of u(⋅)B. Finally, lines 13 to 17 combine the bits held in v(⋅)B by ANDing one half of the remaining bits with the other half in each iteration until a single bit results.
















Function MaskedComparison(x(·)B, x′, w)

Input: 1. A Boolean sharing x(·)B of a compressed polynomial $x \in \mathbb{Z}_q[X]/(X^n + 1)$;
    2. A polynomial $x' \in \mathbb{Z}_p[X]/(X^n + 1)$; and
    3. Bit width of a polynomial coefficient w = ⌈log2(q)⌉.
Output: 1 if $\sum_{k=0}^{n_s-1} x_i^{(k)_B} = x'_i \bmod 2^{\varphi_1} \delta$ for 0 ≤ i < n, 0 otherwise.

1: for i = 0 to n − 1 do
2:     xi(0)B = ¬xi(0)B ⊕ x′i
3: end for
4: u(·)B = x0(·)B
5: for i = 1 to n − 1 do
6:     u(·)B = SecAND(u(·)B, xi(·)B)
7: end for
8: m = ⌈log2 w⌉
9: $v^{(0)_B} = \neg u^{(0)_B} \lor (2^{2^m} - 2^w)$    ▷ $v^{(\cdot)_B} \in \{0,1\}^{2^m}$
10: for i = 1 to ns − 1 do
11:     v(i)B = u(i)B
12: end for
13: for i = m − 1 to 0 do
14:     $g^{(\cdot)_B} = v^{(\cdot)_B} \gg 2^i$
15:     Refresh(g(·)B)
16:     $v^{(\cdot)_B} = \mathrm{SecAND}(v^{(\cdot)_B}, g^{(\cdot)_B}) \bmod 2^{2^i}$
17: end for
18: return v(·)B
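The comparison can be sketched in Python as shown below. Several points are assumptions made for this illustration rather than details taken from the listing: `sec_and` is an ISW-style masked AND, `refresh` is a simple re-randomisation, all input shares are assumed to be smaller than 2^w, and the sketch negates only once (in the step corresponding to line 2), which is the reading under which the folded bit is 1 exactly when every coefficient matches, as the description states.

```python
import secrets
from functools import reduce


def xor_all(shares):
    return reduce(lambda a, b: a ^ b, shares, 0)


def refresh(shares, bits):
    """Re-randomise a Boolean sharing without changing its value."""
    out = list(shares)
    for i in range(1, len(out)):
        r = secrets.randbits(bits)
        out[0] ^= r
        out[i] ^= r
    return out


def sec_and(x, y, bits):
    """ISW-style masked AND of two Boolean sharings (assumed instantiation)."""
    n = len(x)
    z = [x[i] & y[i] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = secrets.randbits(bits)
            z[i] ^= r
            z[j] ^= (r ^ (x[i] & y[j])) ^ (x[j] & y[i])
    return z


def masked_comparison(x_shares, x_ref, w):
    """x_shares[i] is the Boolean sharing of coefficient i and x_ref[i] the
    public coefficient. Returns a sharing of 1 if all match, else of 0."""
    full = (1 << w) - 1
    eq = []
    for shares, ref in zip(x_shares, x_ref):
        s = list(shares)
        s[0] = (~s[0] & full) ^ ref  # recombines to all-ones iff equal
        eq.append(s)
    u = eq[0]
    for s in eq[1:]:
        u = sec_and(u, s, w)  # all-ones iff every coefficient matched
    m = (w - 1).bit_length()  # ceil(log2(w)) for w >= 1
    v = list(u)
    v[0] |= (1 << (1 << m)) - (1 << w)  # pad bits w .. 2^m - 1 with ones
    for i in range(m - 1, -1, -1):
        half = 1 << i
        g = refresh([s >> half for s in v], half)
        v = [s & ((1 << half) - 1) for s in sec_and(v, g, half)]
    return v  # sharing of a single bit
```

The padding in share 0 works because the other shares have no bits set at or above position w, so the recombined upper bits are all ones and do not affect the AND over all 2^m bits.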









The signing procedure of Dilithium includes several bound checks to ensure that the signature does not leak sensitive information. Two of those checks (on ∥z∥∞ against γ1 − β, and on ∥r̃∥∞ against γ2 − β) are performed on secret values z and r̃ and have to be masked. The values γ1, γ2, β are public parameters of the signature scheme. The general idea is to compress the value space for the coefficients of z and r̃ in such a way that all the valid accepted values get mapped to 0, while all other values get mapped to non-zero values. This way the bound checks are reduced to a comparison with zero.


In general, the particular parameter values depend on the modulus q, the valid accepted coefficient range (−γ, γ), and the security level. An overview of the values for the rejection bound λ and the corresponding parameter values for all possible situations that can occur in an implementation of Dilithium are provided in Table 1.









TABLE 1

Possible values for the compression parameters δ, φ1, φ2, λ1, λ2, depending on the performed check and security level, and corresponding rejection bound λ.

Check      Level     λ               δ        φ1   φ2   λ1        λ2
r̃-check    Level 2   2^17 − 35918    45093    14   10   95061     0
r̃-check    Level 3   2^18 − 452      262341   15   14   785058    1
r̃-check    Level 5   2^18 − 376      262265   10   14   1308820   2
z-check    Level 2   2^17 − 78       65511    13   11   392918    1
z-check    Level 3   2^19 − 196      130993   10   14   524060    0
z-check    Level 5   2^19 − 120      65487    12   13   524104    0
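As a small worked check, the rejection bounds λ in Table 1 can be compared with the public Dilithium parameters. The sketch below assumes the round-3 Dilithium parameter sets (γ1, γ2, β per security level), which are not listed in this document, and assumes that each bound is γ − β. The dictionary and function names are hypothetical.

```python
Q = 8380417  # 2^23 - 2^13 + 1

# Assumed round-3 Dilithium parameters per security level: (gamma1, gamma2, beta).
DILITHIUM = {
    2: (1 << 17, (Q - 1) // 88, 78),
    3: (1 << 19, (Q - 1) // 32, 196),
    5: (1 << 19, (Q - 1) // 32, 120),
}

# Rejection bounds lambda exactly as written in Table 1.
TABLE1_LAMBDA = {
    ("r", 2): (1 << 17) - 35918,
    ("r", 3): (1 << 18) - 452,
    ("r", 5): (1 << 18) - 376,
    ("z", 2): (1 << 17) - 78,
    ("z", 3): (1 << 19) - 196,
    ("z", 5): (1 << 19) - 120,
}


def rejection_bounds(level):
    """Return the bounds gamma1 - beta (z-check) and gamma2 - beta (r-check)."""
    gamma1, gamma2, beta = DILITHIUM[level]
    return {"z": gamma1 - beta, "r": gamma2 - beta}


if __name__ == "__main__":
    for level in (2, 3, 5):
        print(level, rejection_bounds(level))
```

Under these assumptions every λ in Table 1 coincides with the corresponding γ − β value, for example 2^17 − 35918 = 95154 = (q − 1)/88 − 78 for the r̃-check at Level 2.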









The function MaskedReject is described below using pseudocode. MaskedReject uses the functions MaskedCompress and MaskedComparison. The input is an uncompressed, arithmetically shared polynomial x(⋅)A together with a bound λ. The output is 1 if the infinity norm of x is less than the bound λ, and 0 otherwise. At line 1, MaskedReject calculates the coefficient bit width (for Dilithium this is always w=23). At line 2, the function GetValues performs a look-up of the compression parameters depending on λ. Table 1 presents an example set of such parameters, but other parameters may be used depending upon the specific cryptographic function being used. An example procedure to find such parameters is described below. Lines 3 to 5 loop over the coefficients and apply the MaskedCompress function to each of them. Finally, at line 6 the Boolean shared compressed polynomial x(⋅)B is compared to zero using the MaskedComparison function and the output is returned.
















Function MaskedReject (x(·)A, λ)
Input: 1. An arithmetic sharing x(·)A of a polynomial x ∈ ℤq[X]/(X^n + 1); and
       2. Bound λ for the infinity norm of x(·)A.
Output: 1 if ∥x∥∞ < λ, 0 otherwise.
1: w = ⌈log2 q⌉
2: (δ, φ1, φ2, λ1, λ2) = GetValues(λ)
3: for i = 0 to n − 1 do
4:     xi(·)B = MaskedCompress(xi(·)A, δ, φ1, φ2, λ1, λ2)
5: end for
6: return MaskedComparison(x(·)B, 0, w)
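For testing purposes, the input/output behavior that MaskedReject is expected to reproduce can be modeled without any masking. The Python sketch below is such an unmasked reference under stated assumptions: the helper names share, centered, and reject_reference are hypothetical, q is the Dilithium modulus 8380417, and the shares are recombined in the clear, which is exactly what a masked implementation must avoid. It is not the MaskedReject function itself.

```python
import secrets

Q = 8380417  # Dilithium modulus


def share(x, ns):
    """Split x into ns arithmetic shares that sum to x modulo Q."""
    rest = [secrets.randbelow(Q) for _ in range(ns - 1)]
    return [(x - sum(rest)) % Q] + rest


def centered(x):
    """Map x mod Q to its representative in [-(Q-1)/2, (Q-1)/2]."""
    x %= Q
    return x - Q if x > Q // 2 else x


def reject_reference(shared_poly, lam):
    """Unmasked model of the expected output: 1 if every coefficient c
    satisfies |c| < lam, and 0 otherwise. Recombines the shares in the
    clear, so it is only suitable as a test oracle."""
    coeffs = (centered(sum(shares)) for shares in shared_poly)
    return int(all(abs(c) < lam for c in coeffs))


# Line 1 of the pseudocode: ceil(log2 Q), computed with integers only.
w = (Q - 1).bit_length()
```

A masked implementation can then be compared against reject_reference on random and boundary inputs; for the Dilithium modulus the computed width is w = 23, matching the value stated above.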









One example of how to find compression parameters is now described. Note that other methods may be used to determine the compression parameters needed for various cryptographic protocols. Given a modulus q and a rejection bound λ, parameter sets (δ, φ1, φ2, λ1, λ2) for the compression may be found. Note that, depending on the case, there can be multiple working parameter sets. It is usually advisable to implement the one with the smaller parameter values to improve performance.


Starting with δ and φ2, a search is performed to find an instantiation of the function

f(x, δ, φ2) = (⌈δ·x/q⌋ mod δ) >> φ2    (1)







with x ∈ ℤq, that maps an input range of size I = 2·λ − 1 to the same output value. Note that it can be any input interval mapping to any fixed output value; it does not yet have to comply with the desired compression function. To reduce the search space, φ2 may be restricted to [0, log2 q] and δ to

(q/I)·2^φ2 + [−δ1, δ1]

for some small value of δ1. However, as this restricts the search space, it may result in no solution being found. In that case, it might help to expand the search space again.
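The restricted search window for δ can be enumerated with integer arithmetic only. The following Python sketch is a minimal illustration: the function name delta_candidates is hypothetical, the size of the accepted range is taken as I = 2·λ − 1, and rounding the centre (q/I)·2^φ2 to the nearest integer is an assumption of this sketch rather than something the text prescribes.

```python
Q = 8380417  # Dilithium modulus


def delta_candidates(lam, phi2, delta1, q=Q):
    """Candidate values for delta: (q / I) * 2**phi2 + [-delta1, delta1],
    where I = 2*lam - 1 is the number of accepted coefficient values."""
    interval = 2 * lam - 1
    num = q << phi2                                    # q * 2**phi2
    center = (2 * num + interval) // (2 * interval)    # nearest integer to num / interval
    return range(center - delta1, center + delta1 + 1)


# Example: z-check, Level 2 (lambda = 2**17 - 78, phi2 = 11), window of +/- 1.
print(list(delta_candidates(2**17 - 78, 11, 1)))
```

For the six rows of Table 1, the listed δ coincides with the centre of this window when the listed φ2 is used, so δ1 = 0 already suffices for those rows; other bounds may require a wider window, as noted above.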


After fixing δ and φ2, a search for the offsets (λ1, λ2) is performed by finding an instantiation of the function
















f(x, δ, φ2, λ1, λ2) = ((⌈δ·(x + λ1)/q⌋ mod δ) >> φ2) ⊕ λ2    (2)







which compresses the inputs x ∈ ℤq according to the desired rejection function, i.e., the values in the valid range are mapped to 0 and all others to a value ≠ 0. The offset λ2 is found by setting it to the fixed output value of the previous step. The offset λ1 may be found either by iterating over all possibilities or by deriving it from the number of false negatives (non-rejected inputs mapped to a value ≠ 0), i.e., it is set such that this number is corrected to zero.
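The effect of the offsets can be checked on unmasked values. The Python sketch below evaluates the function of equation (2) for a single coefficient, assuming that the quotient δ·(x + λ1)/q is rounded to the nearest integer and that the two trailing operations are a right shift by φ2 and an XOR with λ2. The names compress_reference and accepted are hypothetical, and the parameters are those of the z-check, Level 2 row of Table 1 with λ read as 2^17 − 78.

```python
Q = 8380417  # Dilithium modulus


def compress_reference(x, delta, phi2, lam1, lam2, q=Q):
    """Unmasked evaluation of equation (2) on a single coefficient x."""
    y = (x + lam1) % q
    scaled = (2 * delta * y + q) // (2 * q)   # nearest integer to delta * y / q
    return ((scaled % delta) >> phi2) ^ lam2


# z-check, Level 2 row of Table 1
LAM = 2**17 - 78
PARAMS = dict(delta=65511, phi2=11, lam1=392918, lam2=1)


def accepted(x):
    """True if x is compressed to 0, i.e. treated as within the bound."""
    return compress_reference(x, **PARAMS) == 0


# Coefficients just inside and just outside the bound |x| < LAM.
for x in (LAM - 1, -(LAM - 1), LAM, -LAM):
    print(x, accepted(x))
```

With these parameters, the coefficients ±(λ − 1) are compressed to 0 while ±λ yield a non-zero value, which is the behavior the offsets λ1 and λ2 are chosen to produce.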


The remaining parameter φ1 depends on the number of shares ns and may be found by checking whether the masked compression according to the function MaskedCompress computes the expected result for all inputs. The search space may initially be limited to [⌈log2(ns·q)⌉ − φ2, ⌈log2(ns·q)⌉]. If this does not yield a functional solution, the search space for φ1 may be extended.
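The initial search interval for φ1 is straightforward to compute. The Python sketch below only enumerates the candidate values; the function name phi1_search_range is hypothetical, and the share count ns = 2 in the example is an assumed value. Each candidate would still have to be validated by running MaskedCompress over all inputs as described above.

```python
Q = 8380417  # Dilithium modulus


def phi1_search_range(ns, phi2, q=Q):
    """Initial candidates for phi1:
    [ceil(log2(ns*q)) - phi2, ceil(log2(ns*q))], both ends included."""
    upper = (ns * q - 1).bit_length()   # ceil(log2(ns*q)) for ns*q > 1
    return range(upper - phi2, upper + 1)


# Example with an assumed ns = 2 and phi2 = 11 (z-check, Level 2).
print(list(phi1_search_range(2, 11)))
```

Using the integer bit length avoids floating-point rounding when ns·q is close to a power of two.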


The implementation of the rejection of masked polynomial coefficients described herein provides a technological advantage over the prior art by using a MaskedReject function that requires fewer calculations and the generation of fewer random bits than prior implementations. This allows lattice-based post-quantum cryptography schemes to be implemented in more applications that have limited processing resources.



FIG. 1 illustrates an exemplary hardware diagram 100 for implementing polynomial rejection using masked compressed comparison with the functions MaskedCompress, MaskedComparison, and MaskedReject. As illustrated, the device 100 includes a processor 120, memory 130, user interface 140, network interface 150, and storage 160 interconnected via one or more system buses 110. It will be understood that FIG. 1 constitutes, in some respects, an abstraction and that the actual organization of the components of the device 100 may be more complex than illustrated.


The processor 120 may be any hardware device capable of executing instructions stored in memory 130 or storage 160 or otherwise processing data. As such, the processor may include a microprocessor, microcontroller, graphics processing unit (GPU), field programmable gate array (FPGA), application-specific integrated circuit (ASIC), or other similar devices. The processor may be implemented as a secure processor or may include both a secure processor and unsecure processor.


The memory 130 may include various memories such as, for example L1, L2, or L3 cache or system memory. As such, the memory 130 may include static random-access memory (SRAM), dynamic RAM (DRAM), flash memory, read only memory (ROM), or other similar memory devices.


The user interface 140 may include one or more devices for enabling communication with a user as needed. For example, the user interface 140 may include a display, a touch interface, a mouse, and/or a keyboard for receiving user commands. In some embodiments, the user interface 140 may include a command line interface or graphical user interface that may be presented to a remote terminal via the network interface 150.


The network interface 150 may include one or more devices for enabling communication with other hardware devices. For example, the network interface 150 may include a network interface card (NIC) configured to communicate according to the Ethernet protocol or other communications protocols, including wireless protocols. Additionally, the network interface 150 may implement a TCP/IP stack for communication according to the TCP/IP protocols. Various alternative or additional hardware or configurations for the network interface 150 will be apparent.


The storage 160 may include one or more machine-readable storage media such as read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash-memory devices, or similar storage media. In various embodiments, the storage 160 may store instructions for execution by the processor 120 or data upon which the processor 120 may operate. For example, the storage 160 may store a base operating system 161 for controlling various basic operations of the hardware 100. The storage 160 may also include instructions 162 for implementing polynomial rejection using masked compressed comparison with the functions MaskedCompress, MaskedComparison, and MaskedReject described above.


It will be apparent that various information described as stored in the storage 160 may be additionally or alternatively stored in the memory 130. In this respect, the memory 130 may also be considered to constitute a “storage device” and the storage 160 may be considered a “memory.” Various other arrangements will be apparent. Further, the memory 130 and storage 160 may both be considered to be “non-transitory machine-readable media.” As used herein, the term “non-transitory” will be understood to exclude transitory signals but to include all forms of storage, including both volatile and non-volatile memories.


While the host device 100 is shown as including one of each described component, the various components may be duplicated in various embodiments. For example, the processor 120 may include multiple microprocessors that are configured to independently execute the methods described herein or are configured to perform steps or subroutines of the methods described herein such that the multiple processors cooperate to achieve the functionality described herein. Further, where the device 100 is implemented in a cloud computing system, the various hardware components may belong to separate physical systems. For example, the processor 120 may include a first processor in a first server and a second processor in a second server.


As used herein, the term “non-transitory machine-readable storage medium” will be understood to exclude a transitory propagation signal but to include all forms of volatile and non-volatile memory. When software is implemented on a processor, the combination of software and processor becomes a single specific machine. Although the various embodiments have been described in detail, it should be understood that the invention is capable of other embodiments and its details are capable of modifications in various obvious respects.


Because the data processing implementing the present invention is, for the most part, composed of electronic components and circuits known to those skilled in the art, circuit details will not be explained in any greater extent than that considered necessary as illustrated above, for the understanding and appreciation of the underlying concepts of the present invention and in order not to obfuscate or distract from the teachings of the present invention.


Although the invention is described herein with reference to specific embodiments, various modifications and changes can be made without departing from the scope of the present invention as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present invention. Any benefits, advantages, or solutions to problems that are described herein with regard to specific embodiments are not intended to be construed as a critical, required, or essential feature or element of any or all the claims.


Furthermore, the terms “a” or “an,” as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles.


Unless stated otherwise, terms such as “first” and “second” are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements.


Any combination of specific software running on a processor to implement the embodiments of the invention constitutes a specific dedicated machine.


It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention.

Claims
  • 1. A data processing system comprising instructions embodied in a non-transitory computer readable medium, the instructions for a cryptographic operation using masked compressing of coefficients of a polynomial having ns arithmetic shares for lattice-based cryptography in a processor, the instructions, comprising: shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1; scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1; shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1; scaling a second to ns shares of the ns arithmetic shares by a value based on the first compression factor δ and the masking scaling factor φ1; converting the ns scaled arithmetic shares to ns Boolean shares; right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2; XORing an output mask λ2 with the shifted first Boolean share to produce ns compressed Boolean shares; and carrying out a cryptographic operation using the ns arithmetic shares when the ns compressed Boolean shares indicate that the coefficients of the polynomial are within boundary values.
  • 2. The data processing system of claim 1, further comprising performing a masked comparison function on the ns compressed Boolean shares configured to indicate that the coefficients of the polynomial are within boundary values.
  • 3. The data processing system of claim 2, wherein the compressed polynomial coefficients corresponding to the ns compressed Boolean shares having a value in a valid range of values have a value of 0, and the masked comparison function compares the ns compressed Boolean shares to 0.
  • 4. The data processing system of claim 1, wherein shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1 includes calculating a(0)A=a(0)A+λ1 mod q,
  • 5. The data processing system of claim 4, wherein scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1 and shifting the scaled first arithmetic share by a value based on a masking scaling factor φ1 includes calculating
  • 6. The data processing system of claim 5, wherein scaling second to ns shares of the ns shares by a value based on the first compression factor δ and the masking scaling factor φ1 includes calculating
  • 7. The data processing system of claim 6, wherein right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2 includes calculating: ā(⋅)B=ā(⋅)B>>(φ1+φ2),
  • 8. The data processing system of claim 7, wherein XORing an output mask λ2 to the shifted first Boolean share to produce ns compressed Boolean shares includes calculating: ā(0)B=ā(0)B⊕λ2.
  • 9. A data processing system comprising instructions embodied in a non-transitory computer readable medium, the instructions for a cryptographic operation using a masked rejection of a polynomial with coefficients having ns arithmetic shares for lattice-based cryptography in a processor, the instructions, comprising: generating ns arithmetic shares for each coefficient of the polynomial; performing a masked compression of each coefficient of the polynomial using the ns arithmetic shares for each coefficient of the polynomial, including: shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1; scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1; shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1; scaling the second to ns shares of the ns arithmetic shares by a value based on the first compression factor δ and the masking scaling factor φ1; converting the ns scaled arithmetic shares to Boolean shares; right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2; and XORing an output mask λ2 to the shifted first Boolean share to produce ns compressed Boolean shares, wherein the compressed ns Boolean shares indicate compressed polynomial coefficients having a predetermined value when the polynomial coefficients are within boundary values; comparing the polynomial coefficients represented by the ns compressed Boolean shares to the predetermined value; and carrying out a cryptographic operation using the ns arithmetic shares when the polynomial coefficients represented by the ns compressed shares are equal to the predetermined value.
  • 10. The data processing system of claim 9, wherein the predetermined value is 0.
  • 11. The data processing system of claim 9, wherein shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1 includes calculating a(0)A=a(0)A+λ1 mod q,
  • 12. A method for a cryptographic operation using masked compressing of coefficients of a polynomial having ns arithmetic shares for lattice-based cryptography, comprising: shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1; scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1; shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1; scaling a second to ns shares of the ns arithmetic shares by a value based on the first compression factor δ and the masking scaling factor φ1; converting the ns scaled arithmetic shares to ns Boolean shares; right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2; XORing an output mask λ2 to the shifted first Boolean share to produce ns compressed Boolean shares; and carrying out a cryptographic operation using the ns arithmetic shares when the ns compressed Boolean shares indicate that the coefficients of the polynomial are within boundary values.
  • 13. The method of claim 12, further comprising performing a masked comparison function on the ns compressed Boolean shares configured to indicate that the coefficients of the polynomial are within boundary values.
  • 14. The method of claim 13, wherein the compressed polynomial coefficients corresponding to the ns compressed Boolean shares having a value in a valid range of values have a value of 0, and the masked comparison function compares the ns compressed Boolean shares to 0.
  • 15. The method of claim 12, wherein shifting a first arithmetic share of the ns arithmetic shares by an input mask λ1 includes calculating a(0)A=a(0)A+λ1 mod q,
  • 16. The method of claim 15, wherein scaling the shifted first arithmetic share by a value based on a first compression factor δ and a masking scaling factor φ1 and shifting the scaled first arithmetic share by a value based on the masking scaling factor φ1 includes calculating
  • 17. The method of claim 16, wherein scaling second to ns shares of the ns shares by a value based on the first compression factor δ and the masking scaling factor φ1 includes calculating
  • 18. The method of claim 17, wherein right shifting the ns Boolean shares based upon the masking scaling factor φ1 and a second compression factor φ2 includes calculating: ā(⋅)B=ā(⋅)B>>(φ1+φ2),
  • 19. The method of claim 18, wherein XORing an output mask λ2 to the shifted first Boolean share to produce ns compressed Boolean shares includes calculating: ā(0)B=ā(0)B⊕λ2.