Quantum computing may solve certain problems much faster than classical devices. Examples of such problems include integer factoring (via Shor's algorithm), unstructured search problems, and simulation of quantum mechanical systems.
Advances in quantum algorithms that offer speed-up over classical devices have been described at a very high level of abstraction, and practical estimates of the quantum circuits or other resources needed to perform these algorithms have not been provided. To estimate the required resources for a given quantum algorithm, the high-level representation of the quantum algorithm needs to be translated (or compiled) to a low-level set of operations that can be realized using standard gate sets, such as the Clifford+T gate set. In addition, to ensure that the resulting low-level set of operations performs the quantum algorithm within a certain specified tolerance, the errors generated during the translation or compilation of the algorithm need to be managed.
In one example, the present disclosure relates to a method for decomposing a quantum algorithm into quantum circuits. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm until the quantum algorithm is fully decomposed into the quantum circuits, where the automatically performing the step-wise decomposition results in a set of approximation errors and a set of parameters to instantiate at least a subset of the quantum circuits corresponding to the quantum algorithm, such that an overall approximation error caused by the automatically performing the step-wise decomposition is maintained below a specified threshold approximation error.
In another example, the present disclosure relates to a method for decomposing a quantum algorithm into quantum circuits. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits. The method may further include using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.
In yet another example, the present disclosure relates to a computer-readable medium comprising computer executable instructions for a method. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits, where the step-wise decomposition into the subroutines is implemented via a quantum phase estimation (QPE) process. The method may further include using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.
The present disclosure is illustrated by way of example and is not limited by the accompanying figures, in which like references indicate similar elements. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale.
Examples described in this disclosure relate to generating quantum computing circuits by distributing approximation errors in a quantum algorithm. As the field of quantum computing approaches a state where thousands of operations can be carried out on tens and soon even hundreds of qubits, a supporting software stack is required. When compiling algorithms for fault-tolerant quantum computing, some operations must be approximated while keeping the overall error beneath a user or application-defined threshold. As a result, several choices for distributing the error among subroutines emerge; the goal is to choose a good, or even the best, one. To this end, the present disclosure describes an error management module which can be integrated into any quantum software framework.
The job of a quantum program compiler is to translate a high-level description of a given quantum program to hardware-specific machine-level instructions. During the compilation process, one of the requirements may be to optimize as much as possible in order to reduce the overall depth of the resulting circuit to keep the overhead of the required quantum error correction manageable. Optimizations include quantum versions of constant-folding (such as merging consecutive rotation gates, or even additions by constants) and recognition of compute/action/uncompute sections to reduce the number of controlled gates. To allow such optimizations, multiple layers of abstractions may be used instead of compiling directly down to low-level machine instructions, which would make it impossible to recognize, e.g., two consecutive additions by constants. As an example, even canceling a gate followed by its inverse becomes computationally hard, or even impossible once continuous gates have been approximated.
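The constant-folding style of optimization mentioned above can be illustrated with a minimal sketch that merges adjacent rotations about the same axis on the same qubit. The flat `(gate_name, qubit, angle)` circuit representation and the function name `merge_rotations` are assumptions made for this illustration only; they are not part of the described compiler.

```python
def merge_rotations(circuit, tol=1e-12):
    """Merge adjacent same-axis rotations on the same qubit by adding angles.

    `circuit` is a list of (gate_name, qubit, angle) tuples. Only gates that
    are direct neighbours in the list are merged; commuting gates past each
    other is deliberately left out of this sketch.
    """
    merged = []
    for name, qubit, angle in circuit:
        if merged and merged[-1][0] == name and merged[-1][1] == qubit:
            # R(a) followed by R(b) about the same axis equals R(a + b).
            total = merged.pop()[2] + angle
            if abs(total) > tol:
                merged.append((name, qubit, total))
            # Otherwise the pair cancels (a gate followed by its inverse).
        else:
            merged.append((name, qubit, angle))
    return merged


example = [
    ("Rz", 0, 0.3), ("Rz", 0, 0.2),   # merge into Rz(0.5)
    ("Rx", 1, 0.1),
    ("Rz", 0, 0.4), ("Rz", 0, -0.4),  # cancel completely
]
optimized = merge_rotations(example)
```

Such a merge is only possible while the rotations are still represented by their angles; once each rotation has been replaced by an approximating gate sequence, the two sequences no longer cancel or combine in any obvious way.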
To translate an intermediate representation to the next lower level of abstraction, a set of decomposition rules is used, some of which introduce additional errors which can be made arbitrarily small at the cost of an increasing circuit size or depth, which in turn implies a larger overhead when applying quantum error correction. It is therefore of great interest to choose these error tolerances such that the computation succeeds with high probability given the available resources (number and quality of qubits). At each level of abstraction, the compiler introduces additional accuracy parameters which must be chosen such that: (1) the cost to implement subroutines is automatically computed as a function of precision, (2) the overall error lies within the specifications of the algorithm, and (3) the implementation cost is as low as possible while the overall error constraint (2) is satisfied.
One example solution manages the error and cost constraints by expressing the final cost metric, which in one example is the total number of elementary quantum gates used, in terms of the costs of implementing all of its different subroutines. These subroutine costs are first assumed to be parameters, leading to an (in general, non-convex) optimization problem when trying to minimize the overall metric, while still guaranteeing a given overall approximation error. In an example implementation, the optimization problem is solved by using simulated annealing, starting from an initially random assignment of parameters.
While it is not possible to perform error correction over a continuous set of quantum operations (gates), this can be achieved over a discrete gate set such as the aforementioned Clifford+T gate set. As a consequence, certain operations must be approximated using gates from this discrete set. An example is the operation which achieves a rotation around the z-axis,
$$R_z(\theta) = \begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix}.$$
To implement such a gate over Clifford+T, synthesis algorithms can be used. Given the angle $\theta$ of this gate, such a rotation synthesis algorithm will produce a sequence of $O(\log \varepsilon_R^{-1})$ Clifford+T gates which approximate $R_z(\theta)$ up to a given tolerance $\varepsilon_R$. In most error correction protocols, the T-gate is the most expensive operation to realize, as it cannot be executed natively but requires a distillation protocol to distill many noisy magic states into one good state, which can then be used to apply the gate. As a consequence, it may be advantageous to reduce the number of these T-gates as much as possible in order to allow executing a certain quantum computation.
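The logarithmic dependence of the T-count on the synthesis tolerance can be made concrete with a small cost model. The constant `c` and the additive `offset` below are placeholders chosen for illustration; the actual constants depend on the synthesis algorithm used and are not taken from this disclosure.

```python
import math


def t_count_estimate(eps_r, c=3.0, offset=0.0):
    """Illustrative T-count model c * log2(1 / eps_r) + offset for one rotation.

    c and offset are hypothetical constants; only the logarithmic scaling
    in 1 / eps_r reflects the text.
    """
    if not 0.0 < eps_r < 1.0:
        raise ValueError("eps_r must lie strictly between 0 and 1")
    return math.ceil(c * math.log2(1.0 / eps_r) + offset)


coarse = t_count_estimate(2.0 ** -10)   # tolerance of about 1e-3
fine = t_count_estimate(2.0 ** -20)     # tolerance of about 1e-6
```

Under this model, squaring the tolerance (about a thousand times more accurate here) only doubles the estimated T-count, which is why trading accuracy between subroutines can pay off.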
As described, the job of a quantum program compiler is to translate a high-level description of a given quantum program to hardware-specific machine-level instructions. As in classical computing, such compilation frameworks can be implemented in a hardware-agnostic fashion by introducing backend-independent intermediate representations of the quantum code.
During the compilation process, it is useful to optimize as much as possible in order to reduce the overall depth of the resulting circuit to keep the overhead of the required quantum error correction manageable. Optimizations include quantum versions of constant-folding (such as merging consecutive rotation gates, or even additions by constants) and recognition of compute/action/uncompute sections to reduce the number of controlled gates. To allow such optimizations, it may be advantageous to introduce multiple layers of abstractions instead of compiling directly down to low-level machine instructions, which may make it impossible to recognize, e.g., two consecutive additions by constants. This is because even canceling a gate followed by its inverse becomes computationally hard once continuous gates have been approximated.
To translate an intermediate representation to the next lower level of abstraction, a set of decomposition rules is used, some of which introduce additional errors which can be made arbitrarily small at the cost of an increasing circuit size or depth, which in turn implies a larger overhead when applying quantum error correction.
In one example, the time-evolution of a closed quantum system can be described by a unitary operator. As a consequence, each time-step of an example quantum computer can be described by a unitary matrix of dimension $2^n \times 2^n$ (excluding measurement), where $n$ denotes the number of quantum bits (qubits). When decomposing such a quantum operation $U$ into a sequence of lower-level operations $U_M \cdots U_1$, the resulting total error can be estimated from the individual errors $\varepsilon_i$ of the lower-level gates using the following Lemma 1. Given a unitary decomposition of $U$ such that $U = U_M \cdot U_{M-1} \cdots U_1$ and unitaries $V_i$ which approximate the unitary operators $U_i$ such that $\|V_i - U_i\| < \varepsilon_i\ \forall i$, the total error can be bounded as follows:
$$\|U - V_M \cdot V_{M-1} \cdots V_1\| \le \sum_{i=1}^{M} \varepsilon_i.$$
The proof of Lemma 1 is by induction using the triangle inequality and submultiplicativity of ∥⋅∥ with ∥U∥≤1. The base case M=2 can be proven as follows:
$$\|U_2 U_1 - V_2 V_1\| = \|U_2 U_1 - U_2 V_1 + U_2 V_1 - V_2 V_1\| \le \|U_2 (U_1 - V_1)\| + \|(U_2 - V_2) V_1\| \le \varepsilon_1 + \varepsilon_2.$$
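The induction step P(M−1)→P(M) can be shown in a similar fashion:
$$\|U_M \cdots U_1 - V_M \cdots V_1\| \le \|U_M (U_{M-1} \cdots U_1 - V_{M-1} \cdots V_1)\| + \|(U_M - V_M)\, V_{M-1} \cdots V_1\| \le \sum_{i=1}^{M-1} \varepsilon_i + \varepsilon_M = \sum_{i=1}^{M} \varepsilon_i.$$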
Note that this also holds for subunitaries $\tilde{V}_i$, meaning that $\|\tilde{V}_i\| \le 1$. Therefore, in this example, one can safely ignore measurement and the resulting overall error can only be smaller than estimated. In addition, measurements are rare operations and as such, the effect of this approximation on the choice of the individual $\varepsilon_i$ is minor.
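The additive error bound of Lemma 1 can be checked numerically on a simple case. The sketch below restricts itself to z-rotations, which are diagonal, so that products and operator norms can be computed with the standard library alone; the restriction to diagonal gates, the perturbation sizes, and all function names are choices made for this illustration.

```python
import cmath
import random


def rz_diag(theta):
    """Diagonal entries of R_z(theta) = diag(e^{-i theta/2}, e^{+i theta/2})."""
    return (cmath.exp(-0.5j * theta), cmath.exp(0.5j * theta))


def diag_product(gates):
    """Product of diagonal 2x2 matrices, each given by its two diagonal entries."""
    a, b = 1 + 0j, 1 + 0j
    for g in gates:
        a *= g[0]
        b *= g[1]
    return (a, b)


def diag_distance(u, v):
    """Operator norm of the difference of two diagonal matrices."""
    return max(abs(u[0] - v[0]), abs(u[1] - v[1]))


def lemma1_check(angles, angle_errors):
    """Return (actual total error, sum of the individual gate errors)."""
    exact = [rz_diag(t) for t in angles]
    approx = [rz_diag(t + d) for t, d in zip(angles, angle_errors)]
    individual = [diag_distance(u, v) for u, v in zip(exact, approx)]
    total = diag_distance(diag_product(exact), diag_product(approx))
    return total, sum(individual)


rng = random.Random(0)
angles = [rng.uniform(-3.0, 3.0) for _ in range(20)]
angle_errors = [rng.uniform(-1e-3, 1e-3) for _ in range(20)]
total_error, bound = lemma1_check(angles, angle_errors)
```

In this setting the actual error is typically well below the bound, because perturbations of opposite sign partially cancel; the bound is a worst-case guarantee.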
Using only Lemma 1 in the compilation process to automatically optimize the individual $\varepsilon_i$ would make the resulting optimization problem infeasibly large. In addition, the number of parameters to optimize would vary throughout the optimization process since the number of lower-level gates changes when implementing a higher-level operation at a different accuracy, which in turn changes the number of distinct $\varepsilon_i$. To address these two issues, Theorem 4, which generalizes Lemma 1, is introduced. First, a few definitions concerning Theorem 4 are provided.
Definition 1: Let $V_{M(\varepsilon)} \cdots V_1$ be an approximate decomposition of the target unitary $U$ such that $\|U - V_{M(\varepsilon)} \cdots V_1\| \le \varepsilon$. A set of subroutine sets $\mathcal{S}(U, \varepsilon) = \{S_1, \ldots, S_K\}$ is a partitioning of subroutines of $U$ if $\forall i\ \exists!\, k: V_i \in S_k$, and we denote by $S(V)$ the function which returns the subroutine set $S$ such that $V \in S$.
Such a partitioning will be used to assign to each $V_i$ the accuracy $\varepsilon_{S(V_i)}$.
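With these definitions in place, one can now generalize Lemma 1. Theorem 4: Let $\mathcal{S}(U, \varepsilon) = \{S_1, \ldots, S_K\}$ be a cost-respecting partitioning of subroutines for a given decomposition of $U$ w.r.t. the cost measure $C(U, \varepsilon)$ denoting the number of elementary gates required to implement $U$. Then the cost of $U$ can be expressed in terms of the costs of all subroutine sets $S \in \mathcal{S}(U, \varepsilon_U)$ as follows:
$$C(U, \varepsilon) = \sum_{S \in \mathcal{S}(U, \varepsilon_U)} C(S, \varepsilon_S)\, f_S(\varepsilon_U) \quad \text{such that} \quad \sum_{S \in \mathcal{S}(U, \varepsilon_U)} \varepsilon_S\, f_S(\varepsilon_U) \le \varepsilon - \varepsilon_U,$$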
where $f_S(\varepsilon_U)$ gives the number of subroutines in the decomposition of $U$ that are in $S$, given that the decomposition of $U$ would introduce error $\varepsilon_U$ if all subroutines were to be implemented exactly, and $\varepsilon_S$ denotes the error in implementing subroutines that are in $S$.
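In this example, the cost $C(U, \varepsilon)$ can be decomposed into a sum of the costs of all subroutines $V_i$. Furthermore, since $\varepsilon_V = \varepsilon_S\ \forall V \in S$,
$$C(U, \varepsilon) = \sum_i C(V_i, \varepsilon_{V_i}) = \sum_{S \in \mathcal{S}(U, \varepsilon_U)} C(S, \varepsilon_S)\, f_S(\varepsilon_U),$$
and $f_S(\varepsilon_U) := |\{i : V_i \in S\}|\ \forall S \in \mathcal{S}(U, \varepsilon_U)$.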
To prove that the overall error remains bounded by ε, let Ũ denote the unitary which is obtained by applying the decomposition rule for U with accuracy εU, i.e., ∥U−Ũ∥≤εU (where all subroutines are implemented exactly). Furthermore, let V denote the unitary which will ultimately be executed by the quantum computer, i.e., the unitary which is obtained after all decomposition rules and approximations have been applied. By the triangle inequality and Lemma 1,
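$$\|U - V\| \le \|U - \tilde{U}\| + \|\tilde{U} - V\| \le \varepsilon_U + \sum_{S \in \mathcal{S}(U, \varepsilon_U)} \varepsilon_S\, f_S(\varepsilon_U),$$
which is at most $\varepsilon$ whenever the tolerances are chosen such that $\sum_{S} \varepsilon_S\, f_S(\varepsilon_U) \le \varepsilon - \varepsilon_U$.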
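Two extreme choices of the partitioning are:
1. $f_S(\varepsilon) = 1\ \forall S \in \mathcal{S}(U, \varepsilon)$, $|\mathcal{S}(U, \varepsilon)| = \#$gates needed to implement $U$: every gate forms its own subroutine set and receives its own tolerance.
2. $f_S(\varepsilon) = \#$gates needed to implement $U$ $\forall S \in \mathcal{S}(U, \varepsilon)$, $|\mathcal{S}(U, \varepsilon)| = 1$: all gates belong to a single subroutine set and share one tolerance.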
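Therefore, this solves the first issue of Lemma 1: In a practical implementation, the size of the set $\mathcal{S}(U, \varepsilon)$ can be adaptively chosen such that the resulting optimization problem, which is of the form
$$\min_{\varepsilon_U,\, \{\varepsilon_S\}}\ \sum_{S} C(S, \varepsilon_S)\, f_S(\varepsilon_U) \quad \text{such that} \quad \varepsilon_U + \sum_{S} \varepsilon_S\, f_S(\varepsilon_U) \le \varepsilon$$
for a user-defined or application-defined overall tolerance $\varepsilon$, can be solved using a reasonable amount of resources.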
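The bookkeeping behind Theorem 4 can be sketched as follows: each subroutine set contributes its member count times the per-member cost to the total cost, and its member count times its tolerance to the total error. The class name `SubroutineSet`, the set names, the counts, and the two logarithmic cost functions are hypothetical values chosen for this illustration. The counts are also held fixed here, whereas in the text they may depend on the decomposition tolerance.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SubroutineSet:
    name: str
    count: int                       # f_S: number of subroutines of U in S
    eps: float                       # eps_S: tolerance shared by all members
    cost: Callable[[float], float]   # C(S, eps_S): cost of a single member


def evaluate(sets: List[SubroutineSet], eps_u: float) -> Tuple[float, float]:
    """Return (total cost, total error bound) for a given partitioning."""
    total_cost = sum(s.count * s.cost(s.eps) for s in sets)
    total_error = eps_u + sum(s.count * s.eps for s in sets)
    return total_cost, total_error


def is_feasible(sets: List[SubroutineSet], eps_u: float, eps_total: float) -> bool:
    """Check the constraint eps_U + sum_S eps_S * f_S <= eps."""
    return evaluate(sets, eps_u)[1] <= eps_total


# Hypothetical example with two subroutine sets.
sets = [
    SubroutineSet("S1", 1000, 2.0 ** -20, lambda e: 4.0 * math.log2(1.0 / e)),
    SubroutineSet("S2", 10, 2.0 ** -12, lambda e: 50.0 * math.log2(1.0 / e)),
]
cost, error = evaluate(sets, eps_u=2.0 ** -10)
```

An optimizer would vary the `eps` fields (and possibly `eps_u`) to lower `cost` while keeping `is_feasible` true; the number of sets determines the number of free parameters.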
Moreover, the costs of optimization can be reduced by initializing the initial trial parameters εS
The second issue with a direct application of Lemma 1 is the varying number of optimization parameters, which is also resolved by Theorem 4. Of course, one can simply make $\mathcal{S}(U, \varepsilon)$ tremendously large such that most of the corresponding $f_S(\varepsilon)$ are zero. This, however, is a rather inefficient solution which would also be possible when using Lemma 1 directly.
A better approach may be to inspect $\mathcal{S}(U, \varepsilon)$ for different values of $\varepsilon$ and to then choose $A$ auxiliary subroutine sets $S_1^a, \ldots, S_A^a$ such that each additional subroutine $V_k^a$ which appears when changing $\varepsilon$ (but is not a member of any $S$ of the original $\mathcal{S}(U, \varepsilon)$) falls into exactly one of these sets.
The original set $\mathcal{S}(U, \varepsilon)$ can then be extended by these auxiliary sets before running the optimization procedure. Again, the level of granularity of these auxiliary sets and thus the number of such sets $A$ can be tuned according to the resources that are available to solve the resulting optimization problem.
In step 110, inputs related to a quantum algorithm A, overall target error ϵ, and cost metric M may be received. As part of this step, the system may also access a database of available decomposition rules and compiler applying rules 115. Next, in step 120, the system may decompose the quantum algorithm into subroutines, with parameters and corresponding approximation errors. As an example, the quantum algorithm may be decomposed using the Trotter decomposition process. Alternatively, an approach based on truncated Taylor series may also be used for the decomposition process. As an example, as shown in
One example of decomposing a quantum algorithm includes the decomposition of the transverse field Ising model into subroutines. Thus, as shown in
Another example of a schema that tracks the lists of approximation errors when recursively decomposing a top-level quantum algorithm, where the top-level quantum algorithm is a linear combination of unitaries is shown in
Next, in step 130, the system may determine whether the subroutines comprising the quantum algorithm A have been fully decomposed. If not, then the system may continue to iterate until the subroutines have been fully decomposed.
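Steps 120 and 130 can be pictured as a recursive expansion that stops once only primitive operations remain, while recording how often each approximation error enters. The rule table `RULES`, its multiplicities, and the error names below are invented for illustration; in an actual compiler the multiplicities would themselves depend on the chosen tolerances.

```python
from collections import Counter

# Hypothetical rule database: operation -> (name of the approximation error
# introduced by the rule, or None if the rule is exact; children with counts).
RULES = {
    "QPE": ("eps_QPE", {"cU": 8}),
    "cU": ("eps_Trotter", {"cU1": 4, "cU2": 4}),
    "cU1": (None, {"CNOT": 2, "Rz": 1}),
    "cU2": (None, {"Rx": 1}),
}


def decompose(op, multiplicity, leaves, error_terms):
    """Expand `op` until no rule applies, accumulating leaf and error counts."""
    if op not in RULES:
        leaves[op] += multiplicity          # fully decomposed operation
        return
    eps_name, children = RULES[op]
    if eps_name is not None:
        # The rule is applied `multiplicity` times, so its error enters the
        # overall bound with that weight.
        error_terms[eps_name] += multiplicity
    for child, n in children.items():
        decompose(child, multiplicity * n, leaves, error_terms)


leaves, error_terms = Counter(), Counter()
decompose("QPE", 1, leaves, error_terms)
```

The resulting `leaves` play the role of the subroutine counts and `error_terms` gives the weights with which each tolerance appears in the overall error bound, which together define the optimization problem generated in step 140.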
Next, in step 140, the system may generate the optimization problem to achieve the target error ϵ from the computed set of parameters and the approximation errors.
Next, in step 150, the system may solve the optimization problem by minimizing the cost metric M. As part of this step, the system may obtain a heuristic to solve the optimization problem for the specified cost metric M 155. In one example, the heuristic may be simulated annealing. In one example, the optimization problem may be solved in two steps or modes. The first mode may be active whenever the current overall error is larger than the target accuracy ϵ. In this mode, the system may perform annealing until the target accuracy has been reached. At this point, the second mode may become active. In the second mode, the system may perform annealing-based optimization to reduce the circuit cost function. After each such step, the system may switch back to the error-reduction mode if the overall error increased above the target accuracy ϵ. Table 1 below provides a high-level description of an example annealing-based algorithm to solve the optimization problem as part of step 150.
Next, in step 160, the system may instantiate parameters in all subroutines with the solution to the optimization problem. In a preferred embodiment, the subroutines are unitary operations, each of which depends on one parameter or a limited number of parameters and each of which operates on a limited number of qubits. The solution computed in step 150 using heuristic 155 is then a setting of said parameters to specific values which commonly are real numbers in some interval. In another embodiment, the subroutines can involve unitary operations that depend on several parameters and which operate on a growing number of qubits. Examples for such embodiments include, but are not restricted to, reflection operation around states that are modeled parametrically. Other examples for such embodiments include rotations on subspaces, where the rotation angles are parameters. Other examples for such embodiments include single qubit unitary rotations and controlled single qubit unitary rotations.
Finally, in step 170, the parameters instantiated in step 160 may be used to determine concrete unitary operations over the complex numbers. Doing so sets all subroutines to specific unitary operations that no longer depend on parameters and that can then be implemented by quantum computer hardware. The collection of subroutines is then assembled into one program, which is a quantum circuit for the instruction-level representation of the algorithm A. The system may then output at least one quantum circuit to compute the algorithm A with an approximation error of at most ε and execute said circuit on a target quantum computer.
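In this example, the transverse-field Ising model is described by the Hamiltonian
$$\hat{H} = -\sum_{\langle i,j \rangle} J_{ij}\, \sigma_z^i \sigma_z^j - \sum_i \Gamma_i\, \sigma_x^i,$$
where $J_{ij}$ are coupling constants and $\Gamma_i$ denotes the strength of the transverse field at location $i$. $\sigma_x^i$ and $\sigma_z^i$ are the Pauli matrices, i.e.,
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$
acting on the i-th spin.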
The sum over i,j loops over all pairs of sites (i,j) which are connected. In this example, this corresponds to nearest-neighbor sites on a one-dimensional spin chain (with periodic boundary conditions) of length N. Given an approximation $\tilde{\psi}_0$ to the ground state $\psi_0$ of $\hat{H}$, the ground state energy $E_0$ may be determined such that
$$\hat{H} \psi_0 = E_0 \psi_0.$$
In this example, quantum phase estimation (QPE) can be used to achieve this task: If the overlap between $\psi_0$ and $\tilde{\psi}_0$ is large, a successful application of QPE followed by a measurement of the energy register will collapse the state vector onto $\psi_0$ and output $E_0$ with high probability (namely $p = |\langle \tilde{\psi}_0 | \psi_0 \rangle|^2$). There are various ways to implement QPE, but the simplest to analyze is the coherent QPE followed by a measurement of all control qubits.
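The success probability quoted above is the squared magnitude of the overlap between the trial state and the true ground state. A minimal sketch, assuming both states are given as plain lists of complex amplitudes (the function name and the normalization step are choices made for this illustration):

```python
def success_probability(trial, ground):
    """Return |<trial|ground>|^2 for two state vectors of equal length.

    The vectors are normalized internally, so unnormalized inputs are allowed.
    """
    if len(trial) != len(ground):
        raise ValueError("state vectors must have the same dimension")
    norm_t = sum(abs(a) ** 2 for a in trial)
    norm_g = sum(abs(b) ** 2 for b in ground)
    if norm_t == 0 or norm_g == 0:
        raise ValueError("state vectors must be non-zero")
    overlap = sum(complex(a).conjugate() * complex(b) for a, b in zip(trial, ground))
    return abs(overlap) ** 2 / (norm_t * norm_g)


p_equal = success_probability([1, 0], [1, 0])
p_orthogonal = success_probability([1, 0], [0, 1])
p_half = success_probability([1, 1], [1, 0])
```

A trial state with a small overlap therefore requires, on average, more repetitions of the phase estimation before the ground state energy is observed.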
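This procedure requires $16\pi/\varepsilon_{QPE}$ applications of (the controlled version of) the time-evolution operator $U_\delta = \exp(-i\delta\hat{H})$ for a success probability of ½, where $\varepsilon_{QPE}$ denotes the desired accuracy (bit-resolution of the resulting eigenvalues). Using a Trotter decomposition of $U_\delta$, i.e., for large $M$,
$$U_\delta \approx \left( \exp\!\Big( i \frac{\delta}{M} \sum_{\langle i,j \rangle} J_{ij}\, \sigma_z^i \sigma_z^j \Big) \exp\!\Big( i \frac{\delta}{M} \sum_i \Gamma_i\, \sigma_x^i \Big) \right)^{M},$$
allows the global propagator $U_\delta$ to be implemented using a sequence of local operations. These consist of z- and x-rotations in addition to nearest-neighbor CNOT gates to compute the parity (before the z-rotation and again after the z-rotation to uncompute the parity). The rotation angles are
$$\theta_z = -2 \frac{\delta}{M} J_{ij} \quad \text{and} \quad \theta_x = -2 \frac{\delta}{M} \Gamma_i$$
for z- and x-rotations, respectively. The extra factor of two arises from the definitions of the $R_z$ and $R_x$ gates.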
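The local operations of one Trotter step can be listed explicitly. The sketch below assumes a uniform coupling and a uniform field, a periodic chain with at least three sites, and rotation angles of the form $-2(\delta/M)J$ and $-2(\delta/M)\Gamma$; the sign convention, the tuple-based gate format, and the function name are assumptions of this illustration rather than details confirmed by the text.

```python
def trotter_step_gates(n_sites, delta, m_steps, coupling, field):
    """Gate list for one first-order Trotter step on a periodic spin chain.

    Gates are tuples: ("CNOT", control, target), ("Rz", qubit, angle),
    ("Rx", qubit, angle).
    """
    if n_sites < 3:
        raise ValueError("a periodic chain needs at least three sites here")
    theta_z = -2.0 * delta / m_steps * coupling
    theta_x = -2.0 * delta / m_steps * field
    gates = []
    for i in range(n_sites):
        j = (i + 1) % n_sites             # periodic boundary conditions
        gates.append(("CNOT", i, j))      # compute the parity of (i, j) in j
        gates.append(("Rz", j, theta_z))  # rotate conditioned on the parity
        gates.append(("CNOT", i, j))      # uncompute the parity
    for i in range(n_sites):
        gates.append(("Rx", i, theta_x))  # transverse-field term
    return gates


gates = trotter_step_gates(n_sites=4, delta=1.0, m_steps=10, coupling=1.0, field=0.5)
```

For a chain of N sites this yields N z-rotations and N x-rotations per step, which is the number of rotations that must later be synthesized over Clifford+T.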
In order to apply error correction to run the resulting circuit on actual hardware, these rotations can be decomposed into a sequence of Clifford+T gates using rotation synthesis. Such a discrete approximation up to an accuracy of $\varepsilon_R$ features $O(\log \varepsilon_R^{-1})$ T-gates, where even the constants hidden in the $O$-notation have been explicitly determined.
The first compilation step is to resolve the QPE library call. In this example, the cost of QPE applied to a general propagator $U$ is
$$C(\mathrm{QPE}_U, \varepsilon) = \frac{16\pi}{\varepsilon_{QPE}}\, C(cU, \varepsilon_U),$$
where $cU$ denotes the controlled version of the unitary $U$, i.e., $cU := |0\rangle\langle 0| \otimes \mathbb{1} + |1\rangle\langle 1| \otimes U$.
Furthermore, the chosen tolerances must satisfy
$$\varepsilon_{QPE} + \frac{16\pi}{\varepsilon_{QPE}}\, \varepsilon_U \le \varepsilon.$$
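The trade-off at the QPE level can be sketched numerically. The code below assumes that QPE calls the controlled propagator $16\pi/\varepsilon_{QPE}$ times and that the errors add as in Lemma 1, so that the total error is $\varepsilon_{QPE}$ plus the number of calls times $\varepsilon_U$. The function names are hypothetical, and the cost of one controlled propagator is passed in as a plain number.

```python
import math


def qpe_applications(eps_qpe):
    """Number of controlled-propagator applications, 16 * pi / eps_QPE."""
    return 16.0 * math.pi / eps_qpe


def qpe_cost(eps_qpe, cost_cu):
    """Total cost given the cost of one controlled propagator."""
    return qpe_applications(eps_qpe) * cost_cu


def qpe_total_error(eps_qpe, eps_u):
    """Error bound: eps_QPE plus one eps_U per propagator application."""
    return eps_qpe + qpe_applications(eps_qpe) * eps_u


def max_eps_u(eps_total, eps_qpe):
    """Largest propagator tolerance compatible with the overall budget."""
    if not 0.0 < eps_qpe < eps_total:
        raise ValueError("eps_QPE must be positive and below the total budget")
    return (eps_total - eps_qpe) / qpe_applications(eps_qpe)


eps_total = 1e-2
eps_qpe = 5e-3
eps_u = max_eps_u(eps_total, eps_qpe)
calls = qpe_applications(eps_qpe)
```

Spending more of the budget on `eps_qpe` reduces the number of propagator calls but leaves less tolerance for each call, which is exactly the kind of choice the optimization is meant to settle.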
The next step, in this example, is to approximate the propagator using a Trotter decomposition. Depending on the order of the Trotter formula being used, this yields
In the experiments section,
is chosen as an example. Finally, approximating the (controlled) rotations in cU1 and cU2 by employing rotation synthesis,
C(cUi,εU
Collecting all of these terms and using that C(cU1,⋅)=C(cU2,⋅) yields
Next, the implementation details and numerical results of the example error management module are described. While the optimization procedure becomes harder for fine-grained cost and error analysis, the benefits in terms of the cost of the resulting circuit are substantial.
A two-mode annealing procedure for optimization is described, in which two objective functions are reduced as follows: The first mode is active whenever the current overall error is larger than the target accuracy ε. In this case, it performs annealing until the target accuracy has been reached. At this point, the second mode becomes active. It performs annealing-based optimization to reduce the circuit cost function. After each such step, it switches back to the error-reduction subroutine if the overall error increased above ε.
Both annealing-based optimization modes follow the same scheme, which consists of increasing/decreasing a randomly chosen εi by multiplying/dividing it by a random factor f∈(1,1+δ], where δ can be tuned to achieve an acceptance rate of roughly 50%. Then, the new objective function value is determined, followed by either a rejection of the proposed change in εi or an acceptance with probability
$$p_{\mathrm{accept}} = \min(1, e^{-\beta \Delta E}),$$
where $\beta = T^{-1}$ and $T$ denotes the annealing temperature. This means, in particular, that moves which do not increase the energy, i.e., $\Delta E \le 0$, are always accepted. The pseudo-code of this algorithm can be found in Table 2 provided later.
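The two-mode scheme described above can be sketched in a few lines. This is an illustrative implementation under several assumptions that are not taken from the text: the energy difference is measured relative to the current objective value, the inverse temperature `beta` is held constant, the best feasible point seen so far is recorded, and the toy cost and error functions at the end are invented for the example.

```python
import math
import random


def two_mode_anneal(eps, cost_fn, error_fn, eps_target,
                    steps=20000, delta=0.1, beta=50.0, seed=0):
    """Return (cost, tolerances) of the best feasible point found, or None."""
    rng = random.Random(seed)
    eps = list(eps)
    best = None
    for _ in range(steps):
        # Mode 1 reduces the error while infeasible; mode 2 reduces the cost.
        reduce_error = error_fn(eps) > eps_target
        objective = error_fn if reduce_error else cost_fn
        old = objective(eps)
        i = rng.randrange(len(eps))
        factor = 1.0 + delta * (1.0 - rng.random())   # f in (1, 1 + delta]
        previous = eps[i]
        eps[i] = previous * factor if rng.random() < 0.5 else previous / factor
        # Relative change used as Delta E (a scaling choice of this sketch).
        delta_e = (objective(eps) - old) / old
        if delta_e > 0 and rng.random() >= math.exp(-beta * delta_e):
            eps[i] = previous                         # reject the proposed move
        if error_fn(eps) <= eps_target:
            current_cost = cost_fn(eps)
            if best is None or current_cost < best[0]:
                best = (current_cost, list(eps))
    return best


# Toy problem (hypothetical): two subroutine sets with counts and cost weights.
COUNTS = [100, 10]
WEIGHTS = [4.0, 40.0]


def toy_cost(eps):
    return sum(n * w * math.log2(1.0 / e) for n, w, e in zip(COUNTS, WEIGHTS, eps))


def toy_error(eps):
    return sum(n * e for n, e in zip(COUNTS, eps))


result = two_mode_anneal([1e-3, 1e-3], toy_cost, toy_error, eps_target=1e-2)
```

The starting point is deliberately infeasible, so the run begins in the error-reduction mode and only switches to cost reduction once the target accuracy has been reached. In practice `delta` would be tuned toward the acceptance rate mentioned above and `beta` would follow an annealing schedule.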
Using the example of a transverse-field Ising model which was described earlier, the benefits of the error management module are determined by running two experiments. The first aims to assess the difference between a feasible solution, i.e., values εi which produce an overall error that is less than the user-defined tolerance, and an optimized feasible solution. In the first case, the first mode is run only until a feasible solution is obtained and in the latter, both modes are employed, as outlined above.
Finally, the robustness of the optimization procedure is assessed by introducing redundant parameters, i.e., additional rotation gate synthesis tolerances εR
In addition, the time it takes to find an initial feasible solution will grow as well. As an example,
With continued reference to
In conclusion, the present disclosure relates to a method for decomposing a quantum algorithm into quantum circuits. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm until the quantum algorithm is fully decomposed into the quantum circuits, where the automatically performing the step-wise decomposition results in a set of approximation errors and a set of parameters to instantiate at least a subset of the quantum circuits corresponding to the quantum algorithm, such that an overall approximation error caused by the automatically performing the step-wise decomposition is maintained below a specified threshold approximation error.
Each of the quantum circuits may be a gate that can be implemented using a quantum processor. Each of the quantum circuits may be a fault-tolerant logical gate. Each of the quantum circuits may be implemented as a protected operation on encoded quantum data.
The method may further comprise using an optimization problem minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below the specified threshold approximation error. The optimization problem may encode a condition to meet the overall approximation error and a condition to minimize the cost metric associated with the quantum circuits. The optimization problem may be solved using a heuristic method to select parameters of the quantum circuits. The optimization problem may be solved by choosing a random initial assignment of approximation errors and parameters. The solution to the optimization problem may be computed using simulated annealing.
In another example, the present disclosure relates to a method for decomposing a quantum algorithm into quantum circuits. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits. The method may further include using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.
Each of the quantum circuits may be implemented as a protected operation on encoded quantum data.
The minimizing the cost metric may further comprise solving an optimization problem using a heuristic method to select parameters of the quantum circuits. The optimization problem may be solved by choosing a random initial assignment of approximation errors and parameters.
In yet another example, the present disclosure relates to a computer-readable medium comprising computer executable instructions for a method. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits, where the step-wise decomposition into the subroutines is implemented via a quantum phase estimation (QPE) process. The method may further include using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.
The QPE process may implement a time evolution of an Ising model in a transverse field. The QPE process may be applied to a task of evolving a quantum-mechanical system that is initialized in a given state for a specified total duration of time. The total duration of time may be divided into subintervals. The total duration of time may be divided into subintervals using Trotter method or Trotter-Suzuki method. The total duration of time may be divided into subintervals using a Linear Combination of Unitaries (LCU) method. The LCU method may be implemented using state preparation circuits.
It is to be understood that the methods, modules, and components depicted herein are merely exemplary. Alternatively, or in addition, the functionality described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), System-on-a-Chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc. In an abstract, but still definite sense, any arrangement of components to achieve the same functionality is effectively “associated” such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as “associated with” each other such that the desired functionality is achieved, irrespective of architectures or inter-medial components. Likewise, any two components so associated can also be viewed as being “operably connected,” or “coupled,” to each other to achieve the desired functionality.
The functionality associated with some examples described in this disclosure can also include instructions stored in non-transitory media. The term “non-transitory media” as used herein refers to any media storing data and/or instructions that cause a machine to operate in a specific manner. Exemplary non-transitory media include non-volatile media and/or volatile media. Non-volatile media include, for example, a hard disk, a solid-state drive, a magnetic disk or tape, an optical disk or tape, a flash memory, an EPROM, NVRAM, PRAM, or other such media, or networked versions of such media. Volatile media include, for example, dynamic memory such as DRAM, SRAM, a cache, or other such media. Non-transitory media are distinct from, but can be used in conjunction with, transmission media. Transmission media are used for transferring data and/or instructions to or from a machine. Exemplary transmission media include coaxial cables, fiber-optic cables, copper wires, and wireless media, such as radio waves.
Furthermore, those skilled in the art will recognize that boundaries between the functionality of the above described operations are merely illustrative. The functionality of multiple operations may be combined into a single operation, and/or the functionality of a single operation may be distributed in additional operations. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.
Although the disclosure provides specific examples, various modifications and changes can be made without departing from the scope of the disclosure as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present disclosure. Any benefits, advantages, or solutions to problems that are described herein with regard to a specific example are not intended to be construed as a critical, required, or essential feature or element of any or all the claims.
Furthermore, the terms “a” or “an,” as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles.
Unless stated otherwise, terms such as “first” and “second” are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements.
This application claims the benefit of U.S. Provisional Application No. 62/676,519, filed May 25, 2018, titled “GENERATING QUANTUM COMPUTING CIRCUITS BY DISTRIBUTING APPROXIMATION ERRORS IN A QUANTUM ALGORITHM,” the entire contents of which are hereby incorporated herein by reference.
Number | Date | Country
---|---|---
62676519 | May 2018 | US