The invention relates to a method and system for modifying a document without changing hash values of the document.
A hash function is any function that can be used to map data of arbitrary size to fixed-size values. The values returned by the hash function are called hash values, hash codes, digests, or simply hashes. Hash functions and their associated hash tables are used in data storage and retrieval applications to reduce the amount of storage required, and also to ensure data integrity and to authenticate messages.
Data integrity is the maintenance of, and the assurance of, data accuracy and consistency and is an aspect that needs to be considered in the design, implementation, and usage of any system that stores, processes, or retrieves data. The overall intent of a method, such as a hash function, for data integrity is to ensure that the data is recorded or stored exactly as intended and, upon later retrieval of the stored data, ensure the retrieved stored data is the same as when the data was originally stored. The hash function used for data integrity aims to prevent unintended (or unintentional) changes to information in the data stored.
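The integrity check described above can be sketched in a few lines. The following is a minimal illustration only, assuming SHA-256 from the Python standard library as the hash function; the byte strings and the helper name digest are hypothetical and are not taken from this document.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stored data and the digest recorded at storage time.
stored = b"vacation-picture-bytes"
recorded_digest = digest(stored)

# On retrieval, the digest is recomputed and compared with the recorded one.
retrieved_ok = b"vacation-picture-bytes"
retrieved_bad = b"vacation-picture-bytez"  # last byte changed

unchanged = digest(retrieved_ok) == recorded_digest
tampered = digest(retrieved_bad) != recorded_digest
```

A matching digest indicates that the retrieved data is the same as the stored data; a differing digest signals a failure of data integrity.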
Any unintended change to the stored data as the result of a storage, retrieval or processing operation, including malicious intent, unauthorized access, unexpected hardware failure, and human error, is a failure of data integrity. These unintended changes can range from the benign, such as a single pixel in an image appearing with a different color than was originally recorded, to the severe, such as the loss of vacation pictures or of data from a business-critical database.
There are several reports about methods for breaking hash functions. For example, Leurent and Peyrin presented a paper “SHA-1 is a Shambles” at the USENIX Security Symposium in 2020. The paper is available at https://eprint.iacr.org/2020/014.pdf (accessed on 15 Nov. 2022). The reports about the methods for breaking hash functions are often presented in the context of the implications for security and consistency of data.
Wang Zeguo et al. “Variational quantum attacks threaten advanced encryption standard based symmetric cryptography” at Science China Information Sciences, Beijing, vol. 65, no. 10, 18 Jul. 2022, https://doi.org/10.1007/s11432-022-3511-5, discloses a variational quantum attack algorithm for classical advanced encryption standard symmetric cryptography. However, this paper is silent about how to generate a required hash value in a hash generator. This paper also does not disclose a comparator for comparing a created hash value with a true hash value.
Winternitz Robert S: “A Secure One-Way Hash Function Built from DES”, 1984 IEEE Symposium on Security and Privacy, Los Alamitos, CA, US, published on 29 Apr. 1984, DOI: 10.1109/SP.1984.10027 describes a secure one-way hash function built from DES. Winternitz aims to address the problem of insecure one-way hash functions built from the Data Encryption Standard (DES). However, this paper is silent about how to modify a variable string in a document comprising the variable string and a fixed string. This paper discloses neither how to construct a Hamiltonian based on the variable string nor how to encode the variable string into a quantum circuit comprising a plurality of qubits. Winternitz is silent about how to generate a required hash value in a hash generator and how to compare a created hash value with a true hash value in a comparator. Winternitz does not disclose how to determine an overlap between the generated hash value and a true hash value and therefore determine the variable string as described by the present document.
A comparison between classical Tensor Networks (TN) and TN-inspired quantum circuits in the context of Machine Learning on highly complex, simulated Large Hadron Collider (LHC) data is described in Araz et al. “Classical versus Quantum: comparing Tensor Network-based Quantum Circuits on LHC data” at https://arxiv.org/abs/2202.10471. However, this paper is silent about how to generate a required hash value in a hash generator. This paper also does not disclose a comparator for comparing a created hash value with a true hash value.
There are, however, cases in which it is desired to modify the stored data from which the hash value was generated but to maintain the originally generated hash value. This has not been possible in the art, but the advent of quantum computing suggests that this may be practical in the future.
The application of quantum computing offers the potential to overcome this challenge using optimization algorithms. Currently we are in the noisy intermediate-scale quantum (NISQ) era, in which real-life quantum computing systems are characterized by a number of restrictions, such as a low number of qubits, low fidelity, and shallow quantum circuits. Under these restrictions, various classical-quantum hybrid algorithms have been proposed, including the variational quantum algorithm (VQA) and the Quantum Approximate Optimization Algorithm (QAOA). The VQA and QAOA quantum-classical hybrid algorithms have been found to have significant advantages in solving combinatorial optimization and Hamiltonian ground state problems.
In a preferred embodiment the present invention is a computer-implemented method to attack hash functions and to generate a hash value from a modified document.
The method and system described in this document enable part of the data in a document, such as a variable string, to be modified in the document but still produce the same hash value when passed to a hash function generator.
The method for modifying the variable string in a document and generating a required hash value will now be outlined. The document comprises both the variable string and a fixed string. The term “string” is used in this context to indicate a sequence of characters in the document. The method comprises constructing a Hamiltonian based on the variable string and then encoding the variable string into a quantum circuit comprising a plurality of qubits. A hash function generator generates a hash value from the fixed string and the output of the quantum circuit, and a comparator compares the created hash values with a true hash value. An overlap is then determined between the generated hash value and the true hash value. On reaching a zero overlap value (or a substantially zero overlap value), the parameters of the quantum circuit are determined for the variable string; otherwise, the parameters of the quantum circuit are optimized.
In one aspect, the step of optimizing the parameters of the quantum circuit uses a classical optimization algorithm, such as, but not limited to, a gradient descent method.
In one aspect, the encoding of the variable string into the quantum circuit is one of encoding into a parameterized quantum circuit or a tensor network.
In one aspect, the encoding of the variable string into the quantum circuit is one of encoding into a parameterized quantum circuit or a tensor network and the constructing of the Hamiltonian comprises creating a graph with a plurality of nodes representing the bits of the variable string.
In one aspect, the graph is a 3-regular graph.
In one aspect, the determining of the overlap is carried out by calculating the Hamming distance between the generated hash value and the true hash value.
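As an illustration of this aspect, the Hamming distance between two hash values of equal length can be computed by counting the bits in which they differ. The sketch below assumes the hash values are available as byte strings; the function name hamming_distance is hypothetical.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length hash values differ."""
    if len(a) != len(b):
        raise ValueError("hash values must have the same length")
    # XOR leaves a 1 in every differing bit position; count those bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


generated = bytes([0xFF, 0x00])
true_value = bytes([0x0F, 0x01])
distance = hamming_distance(generated, true_value)
```

A distance of zero means the generated hash value and the true hash value are identical.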
The method may comprise measuring a superposition of the variable string by a measurement device. The measuring may be implemented by a quantum state tomography individually for the plurality of the qubits.
The quantum state tomography may determine a quantum state of each qubit of the plurality of the qubits.
The variable string may be assigned to a non-orthogonal quantum state of one qubit of the plurality of the qubits.
A system for modifying a variable string in a document and generating a required hash value is also disclosed. The system comprises at least one input/output device for inputting the document, at least one quantum circuit for encoding the variable string, a hash function generator for creating hash values from an output of the at least one quantum circuit with a plurality of qubits, a comparator for comparing the created hash values with a true hash value, and at least one optimization element for adjusting the parameters of the quantum circuit.
The quantum circuit is implemented as one of a quantum annealer or a quantum gate computer and can be implemented in a quantum computer or simulated in a classical computer.
The system may further comprise a measurement device for measuring a superposition of the variable string by a quantum state tomography implemented individually for the plurality of the qubits.
For a more complete understanding of the present invention and the advantages thereof, reference is now made to the following description and the accompanying drawings, in which:
The invention will now be described on the basis of the drawings. It will be understood that the embodiments and aspects of the invention described herein are only examples and do not limit the protective scope of the claims in any way. The invention is defined by the claims and their equivalents. It will be understood that features of one aspect or embodiment of the invention can be combined with a feature of a different aspect or aspects and/or embodiments of the invention.
A graphics processing unit 35 for processing vector calculations and a field programmable gate array (FPGA) 40 for control logic can also be connected to the central processing unit 20. A quantum processor 50 (also termed quantum accelerator) is connected to the classical central processing unit 20. In an alternative embodiment, the quantum processor 50 is emulated on a classical processor.
In one implementation of the computing system 10, the quantum processor 50 is a gate-based quantum processor. Alternatively, the quantum processor 50 could be a quantum annealing processor. It is also possible to use a quantum processor 50 which is a quantum annealing system. The computing system 10 is connected to a computer network 60, such as the Internet. It will be appreciated that the computing system 10 of
The method is illustrated in
In a first step S210, the document 300 with both the fixed string 310 and the variable string 320 is input into the system 100 through one (or more) of the input/output devices 30 and, in step S220, a Hamiltonian is constructed whose ground state corresponds to the variable string 320. The construction of the Hamiltonian is outlined below.
The variable string 320 is encoded in step S220 into an adjustable quantum state by a quantum circuit, which is in this case a parameterized quantum circuit (PQC) 330 (also known as an ansatz). The PQC output 335 of the parameterized quantum circuit 330 is a superposition of values, which is read out in step S230 by a measurement device 340 and passed in step S240 as an input to a hash function generator 350 in the (classical) central processing unit 20. The hash function generator 350 is not limited to particular hash functions. Non-limiting examples of hash functions include folding, division hashing, and multiplicative hashing, as well as other known hash functions and customized hash functions combining several known hash functions.
The hash function generator 350 takes as its input both the set of values from the PQC output 335 and the fixed string 310 and produces in step S250 a set of hash values which can be compared with a true hash value 370 in a comparator 360. The comparator 360 produces a Hamiltonian of the Hamming distances between the true hash value 370 and the set of hash values produced by the hash function generator 350.
The Hamiltonian is forwarded to a classical optimization algorithm 380 in the classical central processing unit 20. The optimization algorithm is used to adjust in step S270 the input parameters of the parameterized quantum circuit 330 to arrange for the output of the hash function generator 350 to have an overlap with the true hash value 370 in step S280. The overlap occurs when the Hamming distance is zero (or very close to zero). At this point, the parameters of the parameterized quantum circuit 330 are known and the output of the hash function generator 350 can be used to create from the variable string 320 a hash value which is the same as the true hash value 370 despite having a different variable string 320.
An example of the implementation of the variational quantum attack algorithm (VQAA) will serve to illustrate this method. The variable string 320 is encoded in the Hamiltonian (step S220).
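The goal of the loop described above, namely finding a different variable string that yields the same hash value, can be illustrated with a purely classical stand-in. The sketch below is not the quantum method: it replaces the parameterized quantum circuit and the optimizer with an exhaustive search over a one-byte variable string, and it assumes division hashing with a hypothetical modulus of 97. All names and values are illustrative assumptions.

```python
from typing import List


def division_hash(data: bytes, modulus: int = 97) -> int:
    """Toy division hash: interpret the bytes as an integer and reduce it."""
    return int.from_bytes(data, "big") % modulus


def hamming_distance(x: int, y: int) -> int:
    """Number of differing bits between two integer hash values."""
    return bin(x ^ y).count("1")


def matching_variable_bytes(fixed: bytes, true_hash: int,
                            modulus: int = 97) -> List[int]:
    """Classical stand-in for the search: try every one-byte variable string."""
    matches = []
    for candidate in range(256):
        document = fixed + bytes([candidate])  # fixed string + variable string
        generated = division_hash(document, modulus)
        if hamming_distance(generated, true_hash) == 0:  # comparator step
            matches.append(candidate)
    return matches


fixed_string = b"PAY TO: "   # hypothetical fixed string
original_variable = 0x41     # hypothetical original variable string
true_hash = division_hash(fixed_string + bytes([original_variable]))
matches = matching_variable_bytes(fixed_string, true_hash)
```

With these assumed values, the search returns the original byte together with one further byte that produces the same hash value. For realistic hash functions and longer variable strings such an exhaustive search is impractical, which is the gap the variational approach is intended to address.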
As noted above, the variational process (i.e., step S270) is started to find the lowest energy of the Hamiltonian. This is done by using each bit in the variable string 320 as a node to construct regular graphs. It is possible, for an 8-node network, to construct an n-regular (where n=1, 2 . . . , 7) graph. In practice, it is chosen that n=3 (although this is not limiting of the invention and other options may be also possible). It will be appreciated that the number of nodes is not limiting of the invention.
The known variable string 320 is encoded in the step S220 into the Hamiltonian ground state and this will now be described. Each of the eight bits of the variable string 320 is used as a node to construct an 8-node 3-regular graph. The value of the i-th node is denoted by V(i), which is the value of the i-th bit. If there is a pair of nodes (i, j) in the graph that are connected, the term w_ij Z_i Z_j is added into the Hamiltonian, where Z is the Pauli-Z operator and i, j ∈ {0, 1, . . . , 7}. The coefficient w_ij is determined by V(i) and V(j): w_ij = +1 if V(i) = V(j), and −1 otherwise. Additionally, the single-qubit terms t_i Z_i are added, such that t_i = 0.5 if V(i) = 1, and −0.5 if V(i) = 0. The resulting 3-regular graph shown in
The cost function E(β) is the expectation value of the Hamiltonian, E(β) = ⟨β|H|β⟩, where |β⟩ is the superposition of the variable string 320. The parameterized quantum circuit 330 is the ansatz shown in
In another implementation, the Hamiltonian is the Hamming distance between the bit strings.
The variational process starts to find the Hamiltonian with the lowest energy. This Hamiltonian with the lowest energy state is expected to contain the corresponding key. The superposition of the variable string 320 is measured in step S240 and the result is forwarded in step S260 to a classical optimization algorithm 380 running on a classical central processing unit 20 to adjust the input parameters of the quantum circuit 330. This variational process (adjusting the input parameters of the quantum circuit 330) continues until a zero overlap with the true hash value 370 is found.
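For the graph-based construction, the encoding of a known 8-bit string can be checked classically by enumerating all 256 bit strings. The sketch below makes several assumptions that are not fixed by the text: the particular 3-regular graph (a ring with four cross-links), the example bit string, the standard convention that bit 0 has Z eigenvalue +1 and bit 1 has eigenvalue −1, and the sign with which the pair terms enter the energy. The pair terms are taken with a negative sign so that the known string becomes the lowest-energy configuration.

```python
from itertools import product


def three_regular_edges(n_nodes: int = 8):
    """Assumed layout: a ring plus links between opposite nodes (degree 3)."""
    half = n_nodes // 2
    ring = {tuple(sorted((i, (i + 1) % n_nodes))) for i in range(n_nodes)}
    chords = {tuple(sorted((i, (i + half) % n_nodes))) for i in range(n_nodes)}
    return sorted(ring | chords)


def coefficients(bits, edges):
    """w_ij = +1 for equal bits, -1 otherwise; t_i = 0.5 for bit 1, else -0.5."""
    w = {(i, j): (1 if bits[i] == bits[j] else -1) for i, j in edges}
    t = [0.5 if b == 1 else -0.5 for b in bits]
    return w, t


def energy(candidate, w, t):
    """Energy of a classical bit string (assumed sign convention, see lead-in)."""
    z = [1 - 2 * b for b in candidate]  # bit 0 -> +1, bit 1 -> -1
    pair = sum(w_ij * z[i] * z[j] for (i, j), w_ij in w.items())
    single = sum(t_i * z_i for t_i, z_i in zip(t, z))
    return -pair + single  # assumption: pair terms enter with a minus sign


target = (1, 0, 1, 1, 0, 0, 1, 0)  # hypothetical known variable string
edges = three_regular_edges(8)
w, t = coefficients(target, edges)

all_strings = list(product((0, 1), repeat=8))
spectrum = sorted({energy(c, w, t) for c in all_strings})
ground_states = [c for c in all_strings if energy(c, w, t) == spectrum[0]]
```

Under these assumptions the known string is the unique lowest-energy configuration, with an energy of −16, and the next energy level lies at −9.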
In one non-limiting implementation, the classical optimization algorithm with the best results is the Gradient Descent (GD) method with a cut-off condition of −9, i.e., the optimization stops when the expectation value of the Hamiltonian falls below −9, which is the first excited energy. The GD is restarted with randomly initialized parameters when the norm of the gradient falls below 0.8. The learning rate is set to 1.08.
The VQAA can be improved by using a better classical optimization algorithm, such as the Adaptive Moment Estimation algorithm (ADAM), a better ansatz (for example, a less sequential ansatz to increase entanglement in the search) and better initial parameters (learning rate, cut-off condition, initial state).
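The optimization schedule described above (learning rate, cut-off and restart threshold) can be sketched on a simplified classical model. The sketch below assumes a product-state ansatz in which each qubit has a single rotation angle, so that the expectation value of Z_i is cos(theta_i); this is not the ansatz of the described circuit. The graph layout, the example bit string, the sign convention of the pair terms, the seed and the step limit are likewise assumptions. Whether a given run reaches the cut-off depends on the random initialization and is not guaranteed.

```python
import math
import random

# Assumed 3-regular graph on 8 nodes: a ring plus links between opposite nodes.
EDGES = [(i, (i + 1) % 8) for i in range(8)] + [(i, i + 4) for i in range(4)]
TARGET = (1, 0, 1, 1, 0, 0, 1, 0)  # hypothetical known variable string


def make_energy(bits):
    """Build E(theta) and its gradient for the assumed product-state ansatz."""
    w = {e: (1 if bits[e[0]] == bits[e[1]] else -1) for e in EDGES}
    t = [0.5 if b else -0.5 for b in bits]

    def energy(theta):
        c = [math.cos(x) for x in theta]  # <Z_i> = cos(theta_i)
        pair = sum(w[e] * c[e[0]] * c[e[1]] for e in EDGES)
        return -pair + sum(ti * ci for ti, ci in zip(t, c))

    def gradient(theta):
        c = [math.cos(x) for x in theta]
        field = list(t)  # dE/dc_k, starting with the single-qubit term
        for i, j in EDGES:
            field[i] -= w[(i, j)] * c[j]
            field[j] -= w[(i, j)] * c[i]
        return [-math.sin(x) * f for x, f in zip(theta, field)]

    return energy, gradient


def gradient_descent(bits, lr=1.08, cutoff=-9.0, restart_norm=0.8,
                     max_steps=2000, seed=0):
    """Gradient descent with cut-off and random restarts; returns best point."""
    rng = random.Random(seed)
    energy, gradient = make_energy(bits)
    theta = [rng.uniform(0.0, math.pi) for _ in bits]
    best_theta, best_energy = list(theta), energy(theta)
    for _ in range(max_steps):
        e = energy(theta)
        if e < best_energy:
            best_theta, best_energy = list(theta), e
        if e < cutoff:  # cut-off condition reached, stop
            break
        g = gradient(theta)
        if math.sqrt(sum(x * x for x in g)) < restart_norm:
            theta = [rng.uniform(0.0, math.pi) for _ in bits]  # restart
            continue
        theta = [x - lr * gx for x, gx in zip(theta, g)]
    return best_theta, best_energy


def decode(theta):
    """Read a bit string from the angles: <Z_i> < 0 is read as bit 1."""
    return tuple(1 if math.cos(x) < 0 else 0 for x in theta)


best_theta, best_energy = gradient_descent(TARGET)
candidate_bits = decode(best_theta)
```

The function returns the lowest-energy parameters seen during the run, so the result is usable even when the step limit is reached before the cut-off.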
In a further aspect, the method can be implemented using a quantum annealer, such as those from D-Wave, as the quantum processor 50. The quantum annealer is used to generate the variational states for the qubits (note: no non-orthogonal basis in this case). The variational parameters are the couplings of the D-Wave Hamiltonian and other annealing parameters (such as, but not limited to annealing schedule, extra magnetic fields).
A further aspect is the use of non-orthogonal qubit states: Current NISQ quantum devices have a limited number of qubits (as noted above) and are therefore only able to handle a small number of qubit variables. For current variational quantum algorithms in gate-based quantum computers, one qubit of the quantum computer is typically assigned to one bit variable of the cost function. The largest gate-based quantum computer as of today, built by IBM, has 433 superconducting qubits. Therefore, with the current approach, it is possible to optimize the cost functions up to 433 bits.
Current variational quantum optimization algorithms are based on e.g., Variational Quantum Eigensolvers (VQE). This approach fits very well into NISQ devices but is very hard to scale up to those cost functions involving many bits. This is because, in the current approach, each bit variable in the cost function corresponds to one qubit in the NISQ device. The NISQ devices have a limited number of qubits, and this limited number limits the applicability to large, realistic cost functions. This is limiting in cybersecurity applications.
One idea to overcome the problem of the limited number of qubits is to modify the assignment between the quantum state of each individual qubit and the corresponding variable in the cost function. The method set out above has the correspondence as follows: |0⟩→0, |1⟩→1. In other words, a measurement in the 0/1 basis provides immediately the value of the bit variable. It is possible to extend the representability of classical discrete variables using different non-orthogonal states of one qubit. In particular, p maximally orthogonal states of one qubit could represent the values of a classical variable q=0, 1, . . . , p−1. The maximally orthogonal states of one qubit correspond to Platonic solids inside of the Bloch sphere of the qubit, as illustrated in
Using this Bloch sphere representation, it is possible to fit much larger optimization problems in variational quantum algorithms in the NISQ devices for cybersecurity attacks. As an example, for a processor of 433 qubits, with 40 states per qubit, it would be possible to optimize cost functions of up to 17,320-bit variables.
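The idea of placing non-orthogonal single-qubit states at the vertices of a Platonic solid can be illustrated for the smallest case, a tetrahedron with p = 4 states. The sketch below is an assumption-laden example rather than the encoding of the described method: the choice of the tetrahedron, the vertex coordinates and the helper names are all illustrative.

```python
import cmath
import math


def bloch_to_state(n):
    """Convert a unit Bloch vector (x, y, z) to amplitudes of |0> and |1>."""
    x, y, z = n
    theta = math.acos(max(-1.0, min(1.0, z)))  # polar angle
    phi = math.atan2(y, x)                     # azimuthal angle
    return (complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2))


def overlap(a, b):
    """Squared magnitude of the inner product of two single-qubit states."""
    inner = a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
    return abs(inner) ** 2


# Vertices of a regular tetrahedron inscribed in the Bloch sphere.
s = 1.0 / math.sqrt(3.0)
tetrahedron = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
states = [bloch_to_state(v) for v in tetrahedron]

# Each state would represent one value q = 0, 1, 2, 3 of a classical variable.
pairwise = [overlap(states[a], states[b])
            for a in range(4) for b in range(a + 1, 4)]
```

Every pair of these four states has the same overlap of one third, so the states are equally separated but not orthogonal. This is why a single measurement in the 0/1 basis no longer identifies the value, and a reconstruction of the state of each qubit is needed instead.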
To implement an improved variational optimization algorithm such as in VQAA using the qubit states as in
Other non-orthogonal encodings can also be used, including polyhedral, discretized qubit angles and continuum optimization.
It will be further appreciated that the quantum variational circuits 330 could be replaced by tensor networks. In this case, the variable string 320 is encoded into the tensor network and the values of the tensors in the tensor network are updated using gradient descent.
The search can also be parallelized by using more than one quantum variational circuit 330 to search for minima.
One further method of accelerating the search for the key is to identify the local minima (rather than trying to find the global minimum) and subsequently test which one(s) of the local minima result in the key.
The foregoing description of the preferred embodiment of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention. The embodiment was chosen and described to explain the principles of the invention and its practical application to enable one skilled in the art to utilize the invention in various embodiments as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims appended hereto and their equivalents. The entirety of each of the aforementioned documents is incorporated by reference herein.
Number | Date | Country | Kind |
---|---|---|---|
23382108.1 | Feb 2023 | EP | regional |
This application is a continuation-in-part of U.S. patent application Ser. No. 18/106,555 filed on Feb. 7, 2023, and claims priority to and benefit of European Patent Application No EP 23 38 21 08 filed on Feb. 7, 2023.
Number | Date | Country | |
---|---|---|---|
Parent | 18106555 | Feb 2023 | US |
Child | 18375714 | US |