GENERATING QUANTUM COMPUTING CIRCUITS BY DISTRIBUTING APPROXIMATION ERRORS IN A QUANTUM ALGORITHM

Information

  • Patent Application
  • 20190362270
  • Publication Number
    20190362270
  • Date Filed
    June 29, 2018
  • Date Published
    November 28, 2019
Abstract
Methods for generating quantum computing circuits by distributing approximation errors in a quantum algorithm are described. A method includes decomposing a quantum algorithm into quantum circuits. The method includes using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm until the quantum algorithm is fully decomposed into the quantum circuits, where the automatically performing the step-wise decomposition results in a set of approximation errors and a set of parameters to instantiate at least a subset of the quantum circuits corresponding to the quantum algorithm, such that an overall approximation error caused by the automatically performing the step-wise decomposition is maintained below a specified threshold approximation error.
Description
BACKGROUND

Quantum computing may solve certain problems much faster than classical devices. Examples of such problems include integer factoring (using Shor's algorithm), unstructured search problems, and simulation of quantum mechanical systems.


Advances in quantum algorithms that offer speed-up over classical devices have been described at a very high level of abstraction, and practical estimates of quantum circuits or other resources needed to perform quantum algorithms have not been provided. To estimate the required resources for a given quantum algorithm, the high-level representation of the quantum algorithm needs to be translated (or compiled) to a low-level set of operations that can be realized using standard gate sets, such as the Clifford+T gate set. In addition, to ensure that the resulting low-level set of operations performs the quantum algorithm within a certain specified tolerance, the errors generated during the translation or compilation of the algorithm need to be managed.


SUMMARY

In one example, the present disclosure relates to a method for decomposing a quantum algorithm into quantum circuits. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm until the quantum algorithm is fully decomposed into the quantum circuits, where the automatically performing the step-wise decomposition results in a set of approximation errors and a set of parameters to instantiate at least a subset of the quantum circuits corresponding to the quantum algorithm, such that an overall approximation error caused by the automatically performing the step-wise decomposition is maintained below a specified threshold approximation error.


In another example, the present disclosure relates to a method for decomposing a quantum algorithm into quantum circuits. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits. The method may further include using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.


In yet another example, the present disclosure relates to a computer-readable medium comprising computer executable instructions for a method. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits, where the step-wise decomposition into the subroutines is implemented via a quantum phase estimation (QPE) process. The method may further include using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.


This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Furthermore, the claimed subject matter is not limited to implementations that solve any or all disadvantages noted in any part of this disclosure.





BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure is illustrated by way of example and is not limited by the accompanying figures, in which like references indicate similar elements. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale.



FIG. 1 shows a flow chart of a method for distributing approximation errors as part of generating quantum circuits corresponding to a quantum algorithm in accordance with one example;



FIG. 2 shows a diagram of a decomposition of a quantum algorithm into subroutines S1, S2, . . . Sn with an approximation error of ϵ1, ϵ2, . . . ϵn, respectively in accordance with one example;



FIG. 3 shows a quantum circuit of a quantum phase estimation applied to a time evolution operator $U = e^{-itH}$, where H is the Hamiltonian of the quantum system being simulated, in accordance with one example;



FIG. 4 is an abstract depiction of the compilation process for a quantum phase estimation (QPE) applied to a given unitary U in accordance with one example;



FIG. 5 shows a diagram of an example of a schema that tracks the lists of approximation errors when recursively decomposing a top-level quantum algorithm;



FIG. 6 depicts the costs of the resulting circuit as a function of the desired overall accuracy ε in accordance with one example;



FIG. 7 shows the difference between the circuit costs when using just two such parameters (e.g., setting $\varepsilon_R = \varepsilon_{\mathrm{Trotter}}$) versus using all three parameters in accordance with one example;



FIG. 8 shows an example of the increase in the fraction of the circuit cost as the number of parameters used for optimization grows;



FIG. 9 shows an example of how the runtime for the annealing procedure increases with an increase in the number of redundant parameters used; and



FIG. 10 shows an example system environment for implementing aspects of the components and the methods related to generating quantum computer circuits by distributing approximation errors.





DETAILED DESCRIPTION

Examples described in this disclosure relate to generating quantum computing circuits by distributing approximation errors in a quantum algorithm. As the field of quantum computing approaches a state where thousands of operations can be carried out on tens and soon even hundreds of qubits, a supporting software stack is required. When compiling algorithms for fault-tolerant quantum computing, some operations must be approximated while keeping the overall error beneath a user or application-defined threshold. As a result, several choices for distributing the error among subroutines emerge; the goal is to choose a good, or even the best, one. To this end, the present disclosure describes an error management module which can be integrated into any quantum software framework.


The job of a quantum program compiler is to translate a high-level description of a given quantum program to hardware-specific machine-level instructions. During the compilation process, one of the requirements may be to optimize as much as possible in order to reduce the overall depth of the resulting circuit to keep the overhead of the required quantum error correction manageable. Optimizations include quantum versions of constant-folding (such as merging consecutive rotation gates, or even additions by constants) and recognition of compute/action/uncompute sections to reduce the number of controlled gates. To allow such optimizations, multiple layers of abstractions may be used instead of compiling directly down to low-level machine instructions, which would make it impossible to recognize, e.g., two consecutive additions by constants. As an example, even canceling a gate followed by its inverse becomes computationally hard, or even impossible once continuous gates have been approximated.


To translate an intermediate representation to the next lower level of abstraction, a set of decomposition rules is used, some of which introduce additional errors which can be made arbitrarily small at the cost of an increasing circuit size or depth, which in turn implies a larger overhead when applying quantum error correction. It is therefore of great interest to choose these error tolerances such that the computation succeeds with high probability given the available resources (number and quality of qubits). At each level of abstraction, the compiler introduces additional accuracy parameters which must be chosen such that: (1) the cost to implement subroutines is automatically computed as a function of precision, (2) the overall error lies within the specifications of the algorithm, and (3) the implementation cost is as low as possible while the error constraint (2) is satisfied.


One example solution manages these constraints by expressing the final cost metric, which in one example is the total number of elementary quantum gates used, in terms of the costs of implementing all of its different subroutines. These subroutine costs are first assumed to be parameters, leading to an (in general, non-convex) optimization problem when trying to minimize the overall metric, while still guaranteeing a given overall approximation error. In an example implementation, the optimization problem is solved by using simulated annealing, starting from an initially random assignment of parameters.


While it is not possible to perform error correction over a continuous set of quantum operations (gates), this can be achieved over a discrete gate set such as the aforementioned Clifford+T gate set. As a consequence, certain operations must be approximated using gates from this discrete set. An example is the operation which achieves a rotation around the z-axis,







$$R_z(\theta) = \begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix}.$$





To implement such a gate over Clifford+T, synthesis algorithms can be used. Given the angle θ of this gate, such a rotation synthesis algorithm will produce a sequence of $O(\log \varepsilon_R^{-1})$ Clifford+T gates which approximate $R_z(\theta)$ up to a given tolerance $\varepsilon_R$. In most error correction protocols, the T-gate is the most expensive operation to realize, as it cannot be executed natively but requires a distillation protocol to distill many noisy magic states into one good state, which can then be used to apply the gate. As a consequence, it may be advantageous to reduce the number of these T-gates as much as possible in order to allow executing a certain quantum computation.


As described, the job of a quantum program compiler is to translate a high-level description of a given quantum program to hardware-specific machine-level instructions. As in classical computing, such compilation frameworks can be implemented in a hardware-agnostic fashion by introducing backend-independent intermediate representations of the quantum code.


During the compilation process, it is useful to optimize as much as possible in order to reduce the overall depth of the resulting circuit to keep the overhead of the required quantum error correction manageable. Optimizations include quantum versions of constant-folding (such as merging consecutive rotation gates, or even additions by constants) and recognition of compute/action/uncompute sections to reduce the number of controlled gates. To allow such optimizations, it may be advantageous to introduce multiple layers of abstractions instead of compiling directly down to low-level machine instructions, which may make it impossible to recognize, e.g., two consecutive additions by constants. This is because even canceling a gate followed by its inverse becomes computationally hard once continuous gates have been approximated.


To translate an intermediate representation to the next lower level of abstraction, a set of decomposition rules is used, some of which introduce additional errors which can be made arbitrarily small at the cost of an increasing circuit size or depth, which in turn implies a larger overhead when applying quantum error correction.



FIG. 1 shows a flow chart of a method for distributing approximation errors as part of generating quantum circuits corresponding to a quantum algorithm. In this example, the method is described using a framework for determining the total error from the decomposition of a quantum algorithm into lower-level gates by estimating the individual errors ϵ of the lower-level gates.


In one example, the time-evolution of a closed quantum system can be described by a unitary operator. As a consequence, each time-step of an example quantum computer can be described by a unitary matrix of dimension $2^n \times 2^n$ (excluding measurement), where n denotes the number of quantum bits (qubits). When decomposing such a quantum operation U into a sequence of lower-level operations $U_M \cdots U_1$, the resulting total error can be estimated from the individual errors ε of the lower-level gates using the following Lemma 1. Given a unitary decomposition of U such that $U = U_M \cdot U_{M-1} \cdots U_1$ and unitaries $V_i$ which approximate the unitary operators $U_i$ such that $\|V_i - U_i\| < \varepsilon_i\ \forall i$, the total error can be bounded as follows:









$$\left\| U - V_M \cdots V_1 \right\| \le \sum_{i=1}^{M} \varepsilon_i.$$






The proof of Lemma 1 is by induction using the triangle inequality and submultiplicativity of ∥⋅∥ with ∥U∥≤1. The base case M=2 can be proven as follows:





$$\|U_2 U_1 - V_2 V_1\| = \|U_2 U_1 - U_2 V_1 + U_2 V_1 - V_2 V_1\| \le \|U_2 (U_1 - V_1)\| + \|(U_2 - V_2) V_1\| \le \varepsilon_1 + \varepsilon_2.$$


The induction step P(M−1)→P(M) can be shown in a similar fashion:











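$$\begin{aligned}
\|U_M \cdots U_1 - V_M \cdots V_1\| &= \|U_M \cdots U_1 - U_M \cdots U_2 V_1 + U_M \cdots U_2 V_1 - V_M \cdots V_1\| \\
&\le \|U_M \cdots U_2 (U_1 - V_1)\| + \|(U_M \cdots U_2 - V_M \cdots V_2) V_1\| \\
&\le \varepsilon_1 + \sum_{i=2}^{M} \varepsilon_i = \sum_{i=1}^{M} \varepsilon_i.
\end{aligned}$$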







Note that this also holds for subunitaries $\tilde{V}_i$, meaning that $\|\tilde{V}_i\| \le 1$. Therefore, in this example, one can safely ignore measurement and the resulting overall error can only be smaller than estimated. In addition, measurements are rare operations and as such, the effect of this approximation on the choice of the individual $\varepsilon_i$ is minor.
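The bound of Lemma 1 can be checked numerically on a small case. The following is a minimal sketch, not part of the disclosed method: it builds a sequence of single-qubit z- and x-rotations, perturbs each rotation angle slightly, and compares the spectral-norm error of the product with the sum of the individual errors. The helper names (`op_norm`, `lemma1_check`) and the choice of gates are assumptions made for illustration.

```python
import cmath
import math
import random

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(a, b):
    """Entry-wise difference a - b of two 2x2 matrices."""
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def op_norm(a):
    """Spectral norm (largest singular value) of a 2x2 matrix."""
    t = sum(abs(a[i][j]) ** 2 for i in range(2) for j in range(2))  # trace of A^dagger A
    d = abs(a[0][0] * a[1][1] - a[0][1] * a[1][0]) ** 2             # det of A^dagger A
    return math.sqrt((t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2)

def rz(theta):
    """Rotation around the z-axis, diag(e^{-i theta/2}, e^{i theta/2})."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]

def rx(theta):
    """Rotation around the x-axis."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def product(gates):
    """Return gates[-1] ... gates[0]; gates[0] is applied first."""
    out = [[1, 0], [0, 1]]
    for g in gates:
        out = matmul(g, out)
    return out

def lemma1_check(num_gates=8, max_angle_error=1e-3, seed=0):
    """Return (||U - V_M...V_1||, sum of ||V_i - U_i||) for a random sequence."""
    rng = random.Random(seed)
    exact, approx, eps = [], [], []
    for k in range(num_gates):
        make = rz if k % 2 == 0 else rx
        theta = rng.uniform(0, 2 * math.pi)
        delta = rng.uniform(-max_angle_error, max_angle_error)
        u, v = make(theta), make(theta + delta)
        exact.append(u)
        approx.append(v)
        eps.append(op_norm(sub(v, u)))
    total = op_norm(sub(product(exact), product(approx)))
    return total, sum(eps)

total_error, bound = lemma1_check()
print(total_error <= bound)
```

The sketch uses the exact per-gate errors as the $\varepsilon_i$, so the printed comparison is the inequality of Lemma 1 up to floating-point rounding.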


Using only Lemma 1 in the compilation process to automatically optimize the individual $\varepsilon_i$ would make the resulting optimization problem infeasibly large. In addition, the number of parameters to optimize would vary throughout the optimization process since the number of lower-level gates changes when implementing a higher-level operation at a different accuracy, which in turn changes the number of distinct $\varepsilon_i$. To address these two issues, Theorem 4, which generalizes Lemma 1, is introduced. First, a few definitions concerning Theorem 4 are provided.


Definition 1: Let $V_{M(\varepsilon)} \cdots V_1$ be an approximate decomposition of the target unitary U such that $\|V_{M(\varepsilon)} \cdots V_1 - U\| \le \varepsilon$. A set of subroutine sets $\mathcal{S}(U, \varepsilon) = \{S_1, \dots, S_K\}$ is a partitioning of subroutines of U if $\forall i\ \exists!\, k: V_i \in S_k$, and we denote by S(V) the function which returns the subroutine set S such that $V \in S$.


Such a partitioning will be used to assign to each $V_i$ the accuracy $\varepsilon_{S(V_i)} = \varepsilon_{S_k}$ with which all $V_i \in S_k$ are implemented. In order to decompose the cost of U, however, we also need the notion of a cost-respecting partitioning of subroutines of U and the costs of its subsets.

Definition 2: Let $\mathcal{S}(U, \varepsilon) = \{S_1, \dots, S_K\}$ be a set of subroutine sets. $\mathcal{S}(U, \varepsilon)$ is a cost-respecting partitioning of subroutines of U w.r.t. a given cost measure C(U, ε) if $\forall \varepsilon, i, j, k: (V_i \in S_k \wedge V_j \in S_k \Rightarrow C(V_i, \varepsilon) = C(V_j, \varepsilon))$. The cost of a subroutine set S is then well-defined and given by $C(S, \varepsilon) := C(V, \varepsilon)$ for any $V \in S$.


With these definitions in place, one can now generalize Lemma 1.

Theorem 4: Let $\mathcal{S}(U, \varepsilon) = \{S_1, \dots, S_K\}$ be a cost-respecting partitioning of subroutines for a given decomposition of U w.r.t. the cost measure C(U, ε) denoting the number of elementary gates required to implement U. Then the cost of U can be expressed in terms of the costs of all subroutine sets $S \in \mathcal{S}(U, \varepsilon_U)$ as follows







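$$C(U, \varepsilon) = \sum_{S \in \mathcal{S}(U, \varepsilon_U)} C(S, \varepsilon_S)\, f_S(\varepsilon_U) \quad \text{with} \quad \sum_{S \in \mathcal{S}(U, \varepsilon_U)} \varepsilon_S\, f_S(\varepsilon_U) \le \varepsilon - \varepsilon_U,$$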




where $f_S(\varepsilon_U)$ gives the number of subroutines in the decomposition of U that are in S, given that the decomposition of U would introduce error $\varepsilon_U$ if all subroutines were to be implemented exactly, and $\varepsilon_S$ denotes the error in implementing subroutines that are in S.


In this example, the cost C(U, ε) can be decomposed into a sum of the costs of all subroutines $V_i$. Furthermore, since $\varepsilon_V = \varepsilon_S\ \forall V \in S$,







$$C(U, \varepsilon) = \sum_i C(V_i, \varepsilon_{V_i}) = \sum_i C(V_i, \varepsilon_{S(V_i)}) = \sum_{S \in \mathcal{S}(U, \varepsilon_U)} \sum_{\{i :\, V_i \in S\}} C(S, \varepsilon_S)$$









and $f_S(\varepsilon_U) := |\{i : V_i \in S\}|\ \forall S \in \mathcal{S}(U, \varepsilon_U)$.


To prove that the overall error remains bounded by ε, let $\tilde{U}$ denote the unitary which is obtained by applying the decomposition rule for U with accuracy $\varepsilon_U$, i.e., $\|U - \tilde{U}\| \le \varepsilon_U$ (where all subroutines are implemented exactly). Furthermore, let V denote the unitary which will ultimately be executed by the quantum computer, i.e., the unitary which is obtained after all decomposition rules and approximations have been applied. By the triangle inequality and Lemma 1,





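$$\|U - V\| \le \|U - \tilde{U}\| + \|\tilde{U} - V\| \le \varepsilon_U + \sum_{S \in \mathcal{S}(U, \varepsilon_U)} \varepsilon_S\, f_S(\varepsilon_U) \le \varepsilon.$$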
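The cost and error expressions of Theorem 4 amount to two weighted sums over the subroutine sets. The following is a minimal sketch under simplifying assumptions: the counts $f_S$ are treated as fixed numbers rather than functions of $\varepsilon_U$, and the names `SubroutineSet`, `total_cost`, `total_error`, and `within_budget`, as well as the per-gate cost model in the usage lines, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class SubroutineSet:
    count: int                      # f_S: number of members of S in the decomposition
    cost: Callable[[float], float]  # C(S, eps_S): cost of one member at accuracy eps_S

def total_cost(sets: Dict[str, SubroutineSet], eps: Dict[str, float]) -> float:
    """C(U, eps) = sum over S of C(S, eps_S) * f_S."""
    return sum(s.count * s.cost(eps[name]) for name, s in sets.items())

def total_error(sets: Dict[str, SubroutineSet], eps: Dict[str, float],
                eps_u: float) -> float:
    """Upper bound eps_U + sum over S of eps_S * f_S on the overall error."""
    return eps_u + sum(s.count * eps[name] for name, s in sets.items())

def within_budget(sets, eps, eps_u, eps_total) -> bool:
    """True if the bound from Theorem 4 stays below the overall tolerance."""
    return total_error(sets, eps, eps_u) <= eps_total

# Usage with an assumed cost of 4*log2(1/eps) gates per rotation.
rotation_cost = lambda e: 4 * math.log2(1 / e)
sets = {"rz": SubroutineSet(100, rotation_cost),
        "rx": SubroutineSet(50, rotation_cost)}
eps = {"rz": 1e-5, "rx": 2e-5}
print(total_cost(sets, eps), total_error(sets, eps, eps_u=1e-3))
```

Grouping all rotations of one kind into a single set keeps the number of free parameters at two in this usage example, regardless of how many gates the circuit contains.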



FIG. 4 is an abstract depiction of the compilation process for a quantum phase estimation (QPE) applied to a given unitary U in accordance with one example. In FIG. 4, for example, the left-most cU box gets $\varepsilon_1$ as its error budget. Depending on the implementation details of cU, some of this budget may already be used to decompose cU into its subroutines, even assuming that all subroutines of cU are implemented exactly. The remaining error budget is then distributed among its subroutines, which is exactly the statement of the above theorem. The decomposition of the cost can be performed at different levels of granularity. This translates into having a larger set $\mathcal{S}(U, \varepsilon)$ and more functions $f_S(\varepsilon_U)$ that are equal to 1. The two extreme cases are:


1. $f_S(\varepsilon) = 1\ \forall S \in \mathcal{S}(U, \varepsilon)$, $|\mathcal{S}(U, \varepsilon)|$ = #gates needed to implement U:

    • A different εU for each gate


2. $f_S(\varepsilon)$ = #gates needed to implement U $\forall S \in \mathcal{S}(U, \varepsilon)$, $|\mathcal{S}(U, \varepsilon)| = 1$:

    • The same εØ for all gates


Therefore, this solves the first issue of Lemma 1: In a practical implementation, the size of the set $\mathcal{S}(U, \varepsilon)$ can be adaptively chosen such that the resulting optimization problem, which is of the form







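$$(\varepsilon_{S_1}, \dots, \varepsilon_{S_N}) \in \arg\min\, C_{\mathrm{Program}}(\varepsilon_{S_1}, \dots, \varepsilon_{S_N}) \quad \text{such that} \quad \varepsilon_{\mathrm{Program}}(\varepsilon_{S_1}, \dots, \varepsilon_{S_N}) \le \varepsilon$$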







for a user-defined or application-defined overall tolerance ε, can be solved using a reasonable amount of resources.


Moreover, the costs of optimization can be reduced by initializing the trial parameters $\varepsilon_{S_i}$ to the corresponding solution accuracies of a lower-dimensional optimization problem where $\mathcal{S}(U, \varepsilon)$ had fewer distinct subroutines. This example approach is very similar to multi-grid schemes which are used to solve partial differential equations.
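The warm start described above can be sketched in a few lines. This is an illustrative sketch only, assuming each fine-grained subroutine set results from splitting exactly one coarse set; the function name `refine_parameters` and the set names are hypothetical.

```python
def refine_parameters(coarse_eps, parent_of):
    """Initialize fine-grained accuracies from a coarse solution.

    coarse_eps: {coarse set name: accuracy found by the coarse optimization}
    parent_of:  {fine set name: name of the coarse set it was split from}
    """
    return {fine: coarse_eps[coarse] for fine, coarse in parent_of.items()}

# Usage: one coarse set "rotations" is split into separate z- and x-rotation sets.
coarse_solution = {"rotations": 1e-6}
fine_start = refine_parameters(coarse_solution, {"rz": "rotations", "rx": "rotations"})
print(fine_start)
```

Because every member of a split set inherits the accuracy of its parent, the weighted error sum is unchanged, so a feasible coarse solution yields a feasible starting point for the finer problem.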


The second issue with a direct application of Lemma 1 is the varying number of optimization parameters, which is also resolved by Theorem 4. Of course, one can simply make $\mathcal{S}(U, \varepsilon)$ tremendously large such that most of the corresponding $f_S(\varepsilon)$ are zero. This, however, is a rather inefficient solution which would also be possible when using Lemma 1 directly.


A better approach may be to inspect $\mathcal{S}(U, \varepsilon)$ for different values of ε and to then choose A auxiliary subroutine sets $S_1^a, \dots, S_A^a$ such that each additional subroutine $V_k^a$ which appears when changing ε (but is not a member of any S of the original $\mathcal{S}(U, \varepsilon)$) falls into exactly one of these sets.


The original set $\mathcal{S}(U, \varepsilon)$ can then be extended by these auxiliary sets before running the optimization procedure. Again, the level of granularity of these auxiliary sets and thus the number of such sets A can be tuned according to the resources that are available to solve the resulting optimization problem.


In step 110, inputs related to a quantum algorithm A, overall target error ϵ, and cost metric M may be received. As part of this step, the system may also access a database of available decomposition rules and compiler applying rules 115. Next, in step 120, the system may decompose the quantum algorithm into subroutines, with parameters and corresponding approximation errors. As an example, the quantum algorithm may be decomposed using the Trotter decomposition process. Alternatively, an approach based on truncated Taylor series may also be used for the decomposition process. As an example, as shown in FIG. 2, quantum algorithm A may be decomposed into subroutines S1, S2, . . . Sn with an approximation error of ϵ1, ϵ2, . . . ϵn, respectively. This example schema involves decomposing the top-level algorithm A into subroutines with a list of resulting approximation errors. As further shown in FIG. 2, each of the subroutines S1, S2, . . . Sn may be further decomposed into additional subroutines and concomitant approximation errors.


One example of decomposing a quantum algorithm includes the decomposition of the transverse field Ising model into subroutines. Thus, as shown in FIG. 4, as part of the decomposition step (e.g., step 120 of FIG. 1), the top-level quantum phase estimation (QPE) is decomposed into several applications of a controlled unitary operator (U) and the inverse of a quantum Fourier transform (QFT). Subsequently, the controlled-U blocks are further decomposed into rotations R1, R2 . . . Rn, as shown in FIG. 4.


Another example of a schema that tracks the lists of approximation errors when recursively decomposing a top-level quantum algorithm, where the top-level quantum algorithm is a linear combination of unitaries is shown in FIG. 5. In a first decomposition step, the top-level quantum phase estimation (QPE) is decomposed into several time steps. Each time step is approximated by a subroutine Uδ and in turn each of these is implemented by a state preparation circuit StatePrep. Finally, each state preparation is implemented by a sequence of rotations R(θ1), R(θ2), . . . R(θL).


Next, in step 130, the system may determine whether the subroutines comprising the quantum algorithm A have been fully decomposed. If not, then the system may continue to iterate until the subroutines have been fully decomposed.
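The iteration of steps 120 and 130 can be sketched as a worklist that applies decomposition rules until nothing is left to decompose, while counting how often each approximation error is incurred. This is a minimal sketch, not the disclosed implementation: the rule table `RULES`, its repetition counts, and the function name `decompose` are hypothetical placeholders.

```python
# Hypothetical rule database: name -> (error symbol introduced by the rule,
# list of (child subroutine, number of occurrences inside the parent)).
RULES = {
    "QPE": ("eps_QPE", [("cU", 3)]),
    "cU": ("eps_Trotter", [("Rotation", 40)]),
    "Rotation": ("eps_R", []),  # synthesized directly into elementary gates
}

def decompose(top, rules):
    """Return {error symbol: number of times that error enters the total bound}."""
    counts = {}
    work = [(top, 1)]              # (subroutine name, multiplicity in the program)
    while work:                    # step 130: repeat until fully decomposed
        name, mult = work.pop()
        eps_name, children = rules[name]
        if eps_name:
            counts[eps_name] = counts.get(eps_name, 0) + mult
        for child, repeat in children:
            work.append((child, mult * repeat))
    return counts

print(decompose("QPE", RULES))
```

The resulting multiplicities are the weights of the individual errors in the overall error bound, which is the information needed to set up the optimization problem in step 140.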


Next, in step 140, the system may generate the optimization problem to achieve the target error ϵ from the computed set of parameters and the approximation errors.


Next, in step 150, the system may solve the optimization problem by minimizing the cost metric M. As part of this step, the system may obtain a heuristic to solve the optimization problem for the specified cost metric M 155. In one example, the heuristic may be simulated annealing. In one example, the optimization problem may be solved in two steps or modes. The first mode may be active whenever the current overall error is larger than the target accuracy ϵ. In this mode, the system may perform annealing until the target accuracy has been reached. At this point, the second mode may become active. In the second mode, the system may perform annealing-based optimization to reduce the circuit cost function. After each such step, the system may switch back to the error-reduction mode if the overall error increased above the target accuracy ϵ. Table 1 below provides a high-level description of an example annealing-based algorithm to solve the optimization problem as part of step 150.











TABLE 1









β = 0
ε = (0.1, 0.1, ..., 0.1)
cost = get_cost(ε)
error = get_total_error(ε)
for step in range(num_steps):
    # propose a change to one randomly chosen accuracy parameter
    i = floor(rnd() * len(ε))
    old_ε = copy(ε)
    if rnd() < 0.5:
        ε[i] *= 1 + (1 - rnd()) * δ
    else:
        ε[i] /= 1 + (1 - rnd()) * δ
    if error <= goal_error:
        # reduce cost
        ΔE = get_cost(ε) - cost
    else:
        # reduce error
        ΔE = get_total_error(ε) - error
    p_accept = min(1, e^(-βΔE))
    if rnd() > p_accept:
        # reject the proposal
        ε = old_ε
    else:
        # accept the proposal and update the tracked objective values
        cost = get_cost(ε)
        error = get_total_error(ε)
    β += Δβ










Next, in step 160, the system may instantiate parameters in all subroutines with the solution to the optimization problem. In a preferred embodiment, the subroutines are unitary operations, each of which depends on one parameter or a limited number of parameters and each of which operates on a limited number of qubits. The solution computed in step 150 using heuristic 155 is then a setting of said parameters to specific values which commonly are real numbers in some interval. In another embodiment, the subroutines can involve unitary operations that depend on several parameters and which operate on a growing number of qubits. Examples for such embodiments include, but are not restricted to, reflection operation around states that are modeled parametrically. Other examples for such embodiments include rotations on subspaces, where the rotation angles are parameters. Other examples for such embodiments include single qubit unitary rotations and controlled single qubit unitary rotations.


Finally, in step 170, the parameters instantiated in step 160 are used to determine concrete unitary operations over the complex numbers. Doing so sets all subroutines to specific unitary operations that no longer depend on parameters and which can then be implemented by quantum computer hardware. The collection of subroutines is then assembled into one program, which is a quantum circuit for the instruction-level representation of the algorithm A. The system may then output at least one quantum circuit to compute the algorithm A with an approximation error of at most ε and execute said circuit on a target quantum computer.



FIG. 3 shows a quantum circuit of a quantum phase estimation applied to a time evolution operator $U = e^{-itH}$, where H is the Hamiltonian of the quantum system being simulated. In this example, after the inverse quantum Fourier transform (QFT), a measurement yields the phase which was picked up by the input state. For the ground state $\psi_0$, this is $U\psi_0 = e^{-iHt}\psi_0 = e^{-iE_0 t}\psi_0$, allowing the extraction of the energy $E_0$ of $\psi_0$. As an example, the simulation of a quantum mechanical system called the transverse-field Ising model (TFIM), which is governed by the Hamiltonian shown below, is described.








$$\hat{H} = -\sum_{\langle i,j \rangle} J_{ij}\, \sigma_z^i \sigma_z^j - \sum_i \Gamma_i\, \sigma_x^i,$$




where $J_{ij}$ are coupling constants and $\Gamma_i$ denotes the strength of the transverse field at location i. $\sigma_x^i$ and $\sigma_z^i$ are the Pauli matrices, i.e.,







$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$






acting on the i-th spin.


The sum over $\langle i,j \rangle$ loops over all pairs of sites (i,j) which are connected. In this example, this corresponds to nearest-neighbor sites on a one-dimensional spin chain (with periodic boundary conditions) of length N. Given an approximation $\tilde{\psi}_0$ to the ground state $\psi_0$ of $\hat{H}$, the ground state energy $E_0$ may be determined such that






$$\hat{H}\psi_0 = E_0 \psi_0.$$


In this example, quantum phase estimation (QPE) can be used to achieve this task: If the overlap between $\psi_0$ and $\tilde{\psi}_0$ is large, a successful application of QPE followed by a measurement of the energy register will collapse the state vector onto $\psi_0$ and output $E_0$ with high probability (namely $p = |\langle \tilde{\psi}_0 | \psi_0 \rangle|^2$). There are various ways to implement QPE, but the simplest to analyze is the coherent QPE followed by a measurement of all control qubits. FIG. 3 shows a diagram of an example quantum circuit.


This procedure requires $16\pi/\varepsilon_{\mathrm{QPE}}$ applications of (the controlled version of) the time-evolution operator $U_\delta = \exp(-i\delta\hat{H})$ for a success probability of 1/2, where $\varepsilon_{\mathrm{QPE}}$ denotes the desired accuracy (bit-resolution of the resulting eigenvalues). Using a Trotter decomposition of $U_\delta$, i.e., for large M









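$$\begin{aligned}
U_\delta \approx \left( U^J_{\delta/M}\, U^\Gamma_{\delta/M} \right)^M
&= \left( \exp\!\left( -i \frac{\delta}{M} \sum_i J_{i,i+1}\, \sigma_z^i \sigma_z^{i+1} \right) \exp\!\left( -i \frac{\delta}{M} \sum_i \Gamma_i\, \sigma_x^i \right) \right)^M \\
&= \left( \prod_i \exp\!\left( -i \frac{\delta}{M} J_{i,i+1}\, \sigma_z^i \sigma_z^{i+1} \right) \prod_i \exp\!\left( -i \frac{\delta}{M} \Gamma_i\, \sigma_x^i \right) \right)^M,
\end{aligned}$$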




allows the global propagator $U_\delta$ to be implemented using a sequence of local operations. These consist of z- and x-rotations in addition to nearest-neighbor CNOT gates to compute the parity (before the z-rotation and again after the z-rotation to uncompute the parity). The rotation angles are







$$\theta_z = 2 \frac{\delta}{M} J_{i,i+1} \quad \text{and} \quad \theta_x = -2 \frac{\delta}{M} \Gamma_i$$




for z- and x-rotations, respectively. The extra factor of two arises from the definitions of the Rz and Rx gates.


In order to apply error correction to run the resulting circuit on actual hardware, these rotations can be decomposed into a sequence of Clifford+T gates using rotation synthesis. Such a discrete approximation up to an accuracy of $\varepsilon_R$ features $O(\log \varepsilon_R^{-1})$ T-gates, where even the constants hidden in the $O$ notation were explicitly determined.


The first compilation step is to resolve the QPE library call. In this example, the cost of QPE applied to a general propagator U is








$$C(\mathrm{QPE}_U, \varepsilon_U) = \frac{16\pi}{\varepsilon_{\mathrm{QPE}}}\, C(cU, \varepsilon_U),$$




where cU denotes the controlled version of the unitary U, i.e.,





$$cU := |0\rangle\langle 0| \otimes \mathbb{1} + |1\rangle\langle 1| \otimes U.$$


Furthermore, the chosen tolerances must satisfy








$$\frac{16\pi}{\varepsilon_{\mathrm{QPE}}}\, \varepsilon_U \le \varepsilon - \varepsilon_{\mathrm{QPE}}.$$




The next step, in this example, is to approximate the propagator using a Trotter decomposition. Depending on the order of the Trotter formula being used, this yields








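$$C(cU, \varepsilon_U) = M(\varepsilon_{\mathrm{Trotter}}) \left( C(cU_1, \varepsilon_{U_1}) + C(cU_2, \varepsilon_{U_2}) \right) \quad \text{with} \quad M(\varepsilon_{\mathrm{Trotter}}) \left( \varepsilon_{U_1} + \varepsilon_{U_2} \right) \le \varepsilon_U - \varepsilon_{\mathrm{Trotter}}.$$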






In the experiments section,







$$M(\varepsilon_{\mathrm{Trotter}}) \propto \frac{1}{\varepsilon_{\mathrm{Trotter}}}$$





is chosen as an example. Finally, approximating the (controlled) rotations in $cU_1$ and $cU_2$ by employing rotation synthesis,






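$$C(cU_i, \varepsilon_{U_i}) = 2N \cdot 4 \log \varepsilon_R^{-1} \quad \text{with} \quad 2N \varepsilon_R \le \varepsilon_{U_i} \ \text{for } i \in \{1, 2\}.$$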


Collecting all of these terms and using that $C(cU_1, \cdot) = C(cU_2, \cdot)$ yields







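$$C(\mathrm{QPE}_U, \varepsilon) = \frac{16\pi}{\varepsilon_{\mathrm{QPE}}}\, M(\varepsilon_{\mathrm{Trotter}}) \cdot 2 \cdot 2N \cdot 4 \log \varepsilon_R^{-1} \quad \text{with} \quad \varepsilon_{\mathrm{QPE}} + \frac{16\pi}{\varepsilon_{\mathrm{QPE}}} \left( 2 M(\varepsilon_{\mathrm{Trotter}}) \cdot 2N \varepsilon_R + \varepsilon_{\mathrm{Trotter}} \right) \le \varepsilon.$$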
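The collected cost and error expressions for the transverse-field Ising example can be evaluated directly for a given choice of the three accuracy parameters. The following is a minimal sketch: the function names, the base-2 logarithm, and the default step model `default_steps` (M equal to the ceiling of 1/eps_trotter) are assumptions made for illustration, and any other model for M can be passed in through the `steps` argument.

```python
import math

def default_steps(eps_trotter):
    """Assumed Trotter step count; a placeholder model, not taken from the text."""
    return math.ceil(1.0 / eps_trotter)

def qpe_cost(eps_qpe, eps_trotter, eps_r, n_sites, steps=default_steps):
    """Gate count (16*pi/eps_QPE) * M * 2 * 2N * 4*log(1/eps_R)."""
    m = steps(eps_trotter)
    return (16 * math.pi / eps_qpe) * m * 2 * (2 * n_sites) * 4 * math.log2(1 / eps_r)

def qpe_error(eps_qpe, eps_trotter, eps_r, n_sites, steps=default_steps):
    """Error bound eps_QPE + (16*pi/eps_QPE) * (2*M*2N*eps_R + eps_Trotter)."""
    m = steps(eps_trotter)
    return eps_qpe + (16 * math.pi / eps_qpe) * (2 * m * 2 * n_sites * eps_r + eps_trotter)

def is_feasible(eps_qpe, eps_trotter, eps_r, n_sites, eps_total, steps=default_steps):
    """True if the error bound stays within the overall tolerance."""
    return qpe_error(eps_qpe, eps_trotter, eps_r, n_sites, steps) <= eps_total

# Usage: evaluate one candidate assignment for a chain of 10 sites.
print(qpe_cost(1e-2, 1e-5, 1e-12, n_sites=10))
print(is_feasible(1e-2, 1e-5, 1e-12, n_sites=10, eps_total=0.1))
```

Keeping the cost and the error as two separate functions of the same parameter vector matches the form of the constrained optimization problem that the annealing procedure operates on.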












Next, the implementation details and numerical results of the example error management module are described. While the optimization procedure becomes harder for fine-grained cost and error analysis, the benefits in terms of the cost of the resulting circuit are substantial.


A two-mode annealing procedure for optimization is described, in which two objective functions are reduced as follows: The first mode is active whenever the current overall error is larger than the target accuracy ε. In this case, it performs annealing until the target accuracy has been reached. At this point, the second mode becomes active. It performs annealing-based optimization to reduce the circuit cost function. After each such step, it switches back to the error-reduction subroutine if the overall error increased above ε.


Both annealing-based optimization modes follow the same scheme, which consists of increasing/decreasing a randomly chosen εi by multiplying/dividing it by a random factor f∈(1,1+δ], where δ can be tuned to achieve an acceptance rate of roughly 50%. Then, the new objective function value is determined, followed by either a rejection of the proposed change in εi or an acceptance with probability






$$p_{\mathrm{accept}} = \min\left(1, e^{-\beta \Delta E}\right),$$

where $\beta = T^{-1}$ and T denotes the annealing temperature. This means, in particular, that moves which do not increase the energy, i.e., $\Delta E \le 0$, are always accepted. The pseudo-code of this algorithm can be found in Table 2 provided later.
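As a non-limiting illustration, the move proposal and the acceptance rule described above can be isolated in two short C++ functions. The names `accept_probability` and `propose_move` are hypothetical, and the sketch draws the index and the direction of the move with standard-library distributions rather than reproducing the exact arithmetic of Table 2.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Metropolis acceptance probability p = min(1, exp(-beta * dE)).
double accept_probability(double dE, double beta) {
    assert(beta >= 0.0);
    if (dE <= 0.0) return 1.0;  // moves that do not increase the energy are always accepted
    return std::min(1.0, std::exp(-beta * dE));
}

// Proposes a move: one randomly chosen tolerance is multiplied or divided
// by a random factor f in (1, 1 + delta]. Returns the index that was changed.
template <class Rng>
std::size_t propose_move(std::vector<double>& eps, double delta, Rng& rng) {
    assert(!eps.empty() && delta > 0.0);
    std::uniform_int_distribution<std::size_t> pick(0, eps.size() - 1);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const std::size_t i = pick(rng);
    // unit(rng) lies in [0, 1), so 1 - unit(rng) lies in (0, 1].
    const double f = 1.0 + delta * (1.0 - unit(rng));
    if (unit(rng) < 0.5)
        eps[i] *= f;  // increase the tolerance
    else
        eps[i] /= f;  // decrease the tolerance
    return i;
}
```

A caller would evaluate the active objective before and after `propose_move`, draw a uniform random number, and restore the previous tolerances when that number exceeds `accept_probability`.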


Using the example of a transverse-field Ising model which was described earlier, the benefits of the error management module are determined by running two experiments. The first aims to assess the difference between a feasible solution, i.e., values εi which produce an overall error that is less than the user-defined tolerance, and an optimized feasible solution. In the first case, the first mode is run only until a feasible solution is obtained and in the latter, both modes are employed, as outlined above.



FIG. 6 shows numerical results for the optimization problem resulting from the transverse-field Ising model example. FIG. 6 shows the relationship between the circuit cost (along the y-axis) for implementing the quantum algorithm and the target accuracy ϵ. As shown, the circuit cost goes up with increasing target accuracy. One curve shows the circuit cost prior to optimization and the second curve shows the circuit cost after optimization. Improving the first encountered feasible solution by further optimization allows the reduction of the cost metric M by almost a factor of two (see inset in FIG. 6). By optimizing using additional parameters, the cost metric M can be reduced by several orders of magnitude.



FIG. 7 shows an example of the improvement of the first encountered feasible solution shown in FIG. 6. In this example, the circuit costs from performing two-variable optimization versus three-variable optimization are shown.


Finally, the robustness of the optimization procedure is measured by introducing redundant parameters, i.e., additional rotation gate synthesis tolerances $\varepsilon_{R_i}$, where the optimal choice would be $\varepsilon_{R_i} = \varepsilon_{R_j}$ for all i, j. However, because the resulting optimization problem features more parameters, it is harder to solve, and the final circuit cost is expected to be higher.


In addition, the time it takes to find an initial feasible solution will grow as well. As an example, FIGS. 8 and 9 show results which indicate that this approach is scalable to hundreds of variables if the goal is to find a feasible solution. However, as the number of parameters grows, it becomes increasingly harder to simultaneously optimize for the cost of the circuit. This could be observed, e.g., with 100 additional (redundant) parameters, where further optimization of the feasible solution reduced the cost from $1.65908 \cdot 10^{12}$ to $1.10752 \cdot 10^{12}$, which is far from the almost 2× improvement which was observed for smaller systems in FIGS. 6 and 7.
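For reference, the improvement implied by the two cost values quoted above can be worked out directly. The arithmetic below uses only those two numbers:

```latex
\frac{1.65908 \cdot 10^{12}}{1.10752 \cdot 10^{12}} \approx 1.498,
\qquad
1 - \frac{1.10752 \cdot 10^{12}}{1.65908 \cdot 10^{12}} \approx 0.332 .
```

That is, with 100 redundant parameters the further optimization lowers the cost by about one third, corresponding to an improvement factor of about 1.5 rather than the factor of almost 2 seen for the smaller systems.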



FIG. 8 shows an example of the increase in the fraction of the circuit cost as the number of parameters used for optimization grows. The fraction of the circuit cost is the circuit cost with a number of redundant parameters divided by the cost achieved with no redundant parameters. In this example, the annealing time was chosen to be 10 throughout, and the annealing procedure was run with 1000 different random number generator seeds, reporting the best result out of the 1000 runs. As illustrated in FIG. 8, the problem becomes harder to optimize as more parameters are added.



FIG. 9 shows an example of how the runtime for the annealing procedure increases with an increase in the number of redundant parameters used. In this example, the annealing time was chosen to be 10 throughout, and the annealing procedure was run with 1000 different random number generator seeds, reporting the best result out of the 1000 runs. The scaling of the runtime in FIG. 9 can be explained since new updates are proposed by selecting $i \in [0, \ldots, N-1]$ uniformly at random (followed by either increasing or decreasing $\varepsilon_i$). Due to this random walk over $i \in [0, \ldots, N-1]$, the overall runtime is also expected to behave like the expected runtime of a random walk and, therefore, to be in $\mathcal{O}(N^2)$.



FIG. 10 shows an example system environment for implementing aspects of the technology and the methods described in the present disclosure. System environment includes a quantum computing system 1010 and a classical computing system 1030, which is coupled to a remote computer 1050. Quantum computing system 1010 consumes the quantum circuits generated using the classical computing system 1030. Quantum computing system 1010 may include a quantum processor 1012 and measuring/monitoring devices 1014. In one example, quantum processor 1012 and measuring/monitoring devices 1014 may be configured to operate in a cryogenic environment (e.g., between 4 Kelvin and 77 Kelvin) such that quantum processor may perform superconducting operations. Quantum processor 1012 may execute quantum circuits that are compiled by classical computing system 1030. The compiled quantum circuits may be communicated to quantum processor 1012 via bus 1006.


With continued reference to FIG. 10, classical computing system 1030 may include communication interface(s) 1032, classical processor 1034, and memory 1036. Memory 1036 may include a compiler unit 1038, including libraries and other programs or code to compile a high-level description of a quantum algorithm into quantum circuits. Compiler unit 1038 may further include programs or code to execute the steps described with respect to FIG. 1. Thus, compiler unit 1038 may include programs or code that when executed by classical processor 1034 may perform the various methods described with respect to the present disclosure. In one example, the code shown in Table 2 below may be stored in memory 1036 either as part of compiler unit 1038 or separately. The high-level description of the quantum algorithm may be stored in memory 1036 or in memory 1052, which is associated with remote computer 1050.









TABLE 2







/* Finds approximations to the solution(s) of the constrained
   optimization problem min J(x) such that E(x) <= E0 using
   simulated annealing. The coded example is three-dimensional
   (i.e., x \in R^3). */
#include <iostream>
#include <cstdlib>
#include <cmath>
#include <limits>
#include <vector>
#include <random>
using namespace std;
int main(int argc, char *argv[]){
  double beta_0 = 0.; // initial inverse temperature
  double M = 0; // ~ number of steps
  double goal_E = 0.01; // desired accuracy of the overall algorithm
  if (argc > 1)
    goal_E = atof(argv[1]); // first argument is desired accuracy (if provided)
  unsigned num_var = 3;
  if (argc > 2)
    num_var = atoi(argv[2]); // second argument is the number of tolerances (at least 3 here)
  if (argc > 3)
    M = atof(argv[3]); // third argument sets M
  // EXAMPLE: Transverse-field Ising Model
  vector<long double> eps(num_var, 1.e-4); // initial values
  double N = 10.; // ~ number of spins in TFIM
  // functions returning the number of gates and the error for given epsilon
  // (eps[0]: QPE, eps[1]: Trotter, eps[2..]: rotation synthesis tolerances)
  auto J = [&](){
    double loge = 0.;
    for (unsigned i = 2; i < eps.size(); ++i)
      loge += log2(1/eps[i]);
    loge /= eps.size()-2; // average over the rotation synthesis tolerances
    return 4*M_PI/eps[0] * 2*N/sqrt(eps[1])*2*N*4*loge;
  };
  auto E = [&](){
    double e = 0.;
    for (unsigned i = 2; i < eps.size(); ++i)
      e += eps[i];
    e /= eps.size()-2; // average over the rotation synthesis tolerances
    return eps[0] + 4*M_PI/eps[0]*(2*N/sqrt(eps[1])*2*N*e+eps[1]);
  };
  long accept = 0, total = 0;
  vector<long double> best_eps; // keep track of the best values
  double lowest_J = numeric_limits<double>::infinity(), best_E = 0., best_beta = 0.; // same here
  for (unsigned seed = 0; seed < ((M > 1)?1000:1) ; ++seed){
    // RUN annealing:
    double beta = beta_0; // inverse temperature for annealing
    bool error = true; // if true, annealing reduces error; if false it reduces gate count
    unsigned k = 0; // helper variable keeping track of the number of mode changes
    auto f_dec = -0.08;
    auto f_inc = +0.08;
    std::mt19937 mt(seed);
    std::uniform_real_distribution<double> dist(0., 1.);
    auto rng = [&](){ return dist(mt); };
    unsigned r = 0;
    while (E() > goal_E || beta < 10.){
      auto oldE = E();
      double current_J = J();
      auto P = rng();
      auto Q = rng();
      auto old_eps = eps;
      // propose a move: rescale one randomly chosen tolerance by up to 8%
      eps[int(eps.size() * P)] *= 1 + ((Q < 0.5)?f_dec:f_inc) * (1-rng());
      if (!error){
        // gate-count mode: Metropolis test on the change in J
        double dJ = J() - current_J;
        total++;
        if (dJ > 0 && exp(-dJ * beta) < rng())
          eps = old_eps;
        else
          accept++;
      }
      else{
        // error mode: Metropolis test on the change in E, relative to the goal
        auto dE = (E() - oldE) / goal_E;
        double dJ = J() - current_J;
        if (dE > 0 && exp(-dE * beta) < rng())
          eps = old_eps;
      }
      // switch mode depending on the current error
      if (E() > goal_E && !error)
        error = true; // --> algorithm will try to reduce error
      if (E() <= goal_E && error){
        error = false; // --> algorithm will try to reduce gate count
        /*beta /= M / ++k; // update annealing parameter
        if (beta < beta_0) // annealing processes, iteratively reducing errors followed
          beta = beta_0; // by reducing gates*/
      }
      if (J() <= lowest_J && E() <= goal_E){ // keep track of best values
        lowest_J = J();
        best_E = E();
        best_eps = eps;
        best_beta = beta;
      }
      // With the default M = 0, 1./M is +infinity (IEEE arithmetic), so only
      // non-increasing moves are accepted and the loop stops at the first feasible point.
      beta += 1./M;
    }
  }
  // output best values (same reporting as in the second listing below)
  cout << "\rBEST: " << lowest_J << " : err = " << best_E << " eps[] = {";
  for (auto e : best_eps)
    cout << e << " ";
  cout << "}\nat beta = " << best_beta << "\n";
  cout << "Acceptance = " << accept/(double)total*100 << "%.\n";
}


/* Finds approximations to the solution(s) of the constrained optimization
   problem min J(x) s.t. E(x) <= E0 using thermal annealing. The coded
   example is three-dimensional (i.e., x \in R^3). */
#include <iostream>
#include <cstdlib>
#include <cmath>
#include <limits>
#include <vector>
#include <random>
using namespace std;
int main(int argc, char *argv[]){
  double beta_0 = 0.; // initial inverse temperature
  double M = 0; // ~ number of steps
  double goal_E = 0.01; // desired accuracy of the overall algorithm
  if (argc > 1)
    goal_E = atof(argv[1]); // first argument is desired accuracy (if provided)
  unsigned num_var = 3;
  if (argc > 2)
    num_var = atoi(argv[2]); // second argument is the number of variables
  if (argc > 3)
    M = atof(argv[3]); // third argument sets M
  // EXAMPLE: Transverse-field Ising Model:
  vector<long double> eps(num_var, 1.e-4); // initial values
  double N = 10.; // ~ number of spins in TFIM
  // functions returning the number of gates and the error for given epsilon
  // (eps[0]: QPE, eps[1]: Trotter, last entry: rotation synthesis tolerance)
  auto J = [&](){ return 16*M_PI/eps[0] * 2*N/sqrt(eps[1])*2*N*4*log2(1/eps[eps.size()-1]); };
  auto E = [&](){ return eps[0] + 16*M_PI/eps[0]*(2*N/sqrt(eps[1])*2*N*eps[eps.size()-1]+eps[1]); };
  long accept = 0, total = 0;
  vector<long double> best_eps; // keep track of the best values
  double lowest_J = numeric_limits<double>::infinity(), best_E = 0., best_beta = 0.; // same here
  for (unsigned seed = 0; seed < ((M > 1)?1000:1) ; ++seed){
    // RUN annealing:
    double beta = beta_0; // inverse temperature for annealing
    bool error = true; // if true, annealing reduces error; if false it reduces gate count
    unsigned k = 0; // helper variable keeping track of the number of mode changes
    auto f_dec = -0.08;
    auto f_inc = +0.08;
    std::mt19937 mt(seed);
    std::uniform_real_distribution<double> dist(0., 1.);
    auto rng = [&](){ return dist(mt); };
    unsigned r = 0;
    while (E() > goal_E || beta < 10.){
      auto oldE = E();
      double current_J = J();
      auto P = rng();
      auto Q = rng();
      auto old_eps = eps;
      // propose a move: rescale one randomly chosen tolerance by up to 8%
      eps[int(eps.size() * P)] *= 1 + ((Q < 0.5)?f_dec:f_inc) * (1-rng());
      if (!error){
        // gate-count mode: Metropolis test on the change in J
        double dJ = J() - current_J;
        total++;
        if (dJ > 0 && exp(-dJ * beta) < rng())
          eps = old_eps;
        else
          accept++;
      }
      else{
        // error mode: Metropolis test on the change in E, relative to the goal
        auto dE = (E() - oldE) / goal_E;
        double dJ = J() - current_J;
        if (dE > 0 && exp(-dE * beta) < rng())
          eps = old_eps;
      }
      // switch mode depending on the current error
      if (E() > goal_E && !error)
        error = true; // --> algorithm will try to reduce error
      if (E() <= goal_E && error){
        error = false; // --> algorithm will try to reduce gate count
        /*beta /= M / ++k; // update annealing parameter: we perform multiple
        if (beta < beta_0) // annealing processes, iteratively reducing errors followed
          beta = beta_0; // by reducing gates*/
      }
      if (J() <= lowest_J && E() <= goal_E){ // keep track of best values
        lowest_J = J();
        best_E = E();
        best_eps = eps;
        best_beta = beta;
      }
      // With the default M = 0, 1./M is +infinity (IEEE arithmetic), so only
      // non-increasing moves are accepted and the loop stops at the first feasible point.
      beta += 1./M;
    }
  }
  // output best values
  cout << "\rBEST: " << lowest_J << " : err = " << best_E << " eps[] = {";
  for (auto e : best_eps)
    cout << e << " ";
  cout << "}\nat beta = " << best_beta << "\n";
  cout << "Acceptance = " << accept/(double)total*100 << "%.\n";
}









In conclusion, the present disclosure relates to a method for decomposing a quantum algorithm into quantum circuits. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm until the quantum algorithm is fully decomposed into the quantum circuits, where the automatically performing the step-wise decomposition results in a set of approximation errors and a set of parameters to instantiate at least a subset of the quantum circuits corresponding to the quantum algorithm, such that an overall approximation error caused by the automatically performing the step-wise decomposition is maintained below a specified threshold approximation error.


Each of the quantum circuits may be a gate that can be implemented using a quantum processor. Each of the quantum circuits may be a fault-tolerant logical gate. Each of the quantum circuits may be implemented as a protected operation on encoded quantum data.


The method may further comprise using an optimization problem minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below the specified threshold approximation error. The optimization problem may encode a condition to meet the overall approximation error and a condition to minimize the cost metric associated with the quantum circuits. The optimization problem may be solved using a heuristic method to select parameters of the quantum circuits. The optimization problem may be solved by choosing a random initial assignment of approximation errors and parameters. The solution to the optimization problem may be computed using simulated annealing.


In another example, the present disclosure relates to a method for decomposing a quantum algorithm into quantum circuits. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits. The method may further include using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.


Each of the quantum circuits may be implemented as a protected operation on encoded quantum data.


The minimizing the cost metric may further comprise solving an optimization problem using a heuristic method to select parameters of the quantum circuits. The optimization problem may be solved by choosing a random initial assignment of approximation errors and parameters.


In yet another example, the present disclosure relates to a computer-readable medium comprising computer executable instructions for a method. The method may include using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits, where the step-wise decomposition into the subroutines is implemented via a quantum phase estimation (QPE) process. The method may further include using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.


The QPE process may implement a time evolution of an Ising model in a transverse field. The QPE process may be applied to a task of evolving a quantum-mechanical system that is initialized in a given state for a specified total duration of time. The total duration of time may be divided into subintervals. The total duration of time may be divided into subintervals using Trotter method or Trotter-Suzuki method. The total duration of time may be divided into subintervals using a Linear Combination of Unitaries (LCU) method. The LCU method may be implemented using state preparation circuits.


It is to be understood that the methods, modules, and components depicted herein are merely exemplary. Alternatively, or in addition, the functionality described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), System-on-a-Chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc. In an abstract, but still definite sense, any arrangement of components to achieve the same functionality is effectively “associated” such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as “associated with” each other such that the desired functionality is achieved, irrespective of architectures or intermedial components. Likewise, any two components so associated can also be viewed as being “operably connected,” or “coupled,” to each other to achieve the desired functionality.


The functionality associated with some examples described in this disclosure can also include instructions stored in non-transitory media. The term “non-transitory media” as used herein refers to any media storing data and/or instructions that cause a machine to operate in a specific manner. Exemplary non-transitory media include non-volatile media and/or volatile media. Non-volatile media include, for example, a hard disk, a solid-state drive, a magnetic disk or tape, an optical disk or tape, a flash memory, an EPROM, NVRAM, PRAM, or other such media, or networked versions of such media. Volatile media include, for example, dynamic memory, such as DRAM, SRAM, a cache, or other such media. Non-transitory media are distinct from, but can be used in conjunction with, transmission media. Transmission media are used for transferring data and/or instructions to or from a machine. Exemplary transmission media include coaxial cables, fiber-optic cables, copper wires, and wireless media, such as radio waves.


Furthermore, those skilled in the art will recognize that boundaries between the functionality of the above described operations are merely illustrative. The functionality of multiple operations may be combined into a single operation, and/or the functionality of a single operation may be distributed in additional operations. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.


Although the disclosure provides specific examples, various modifications and changes can be made without departing from the scope of the disclosure as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present disclosure. Any benefits, advantages, or solutions to problems that are described herein with regard to a specific example are not intended to be construed as a critical, required, or essential feature or element of any or all the claims.


Furthermore, the terms “a” or “an,” as used herein, are defined as one or more than one. Also, the use of introductory phrases such as “at least one” and “one or more” in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an.” The same holds true for the use of definite articles.


Unless stated otherwise, terms such as “first” and “second” are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements.

Claims
  • 1. A method for decomposing a quantum algorithm into quantum circuits, the method comprising: using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm until the quantum algorithm is fully decomposed into the quantum circuits, wherein the automatically performing the step-wise decomposition results in a set of approximation errors and a set of parameters to instantiate at least a subset of the quantum circuits corresponding to the quantum algorithm, such that an overall approximation error caused by the automatically performing the step-wise decomposition is maintained below a specified threshold approximation error.
  • 2. The method of claim 1, wherein each of the quantum circuits is a gate that can be implemented using a quantum processor.
  • 3. The method of claim 1, wherein each of the quantum circuits is a fault-tolerant logical gate.
  • 4. The method of claim 1, wherein each of the quantum circuits is implemented as a protected operation on encoded quantum data.
  • 5. The method of claim 1 further comprising using an optimization problem minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below the specified threshold approximation error.
  • 6. The method of claim 5, wherein the optimization problem encodes a condition to meet the overall approximation error and a condition to minimize the cost metric associated with the quantum circuits.
  • 7. The method of claim 6, wherein the optimization problem is solved using a heuristic method to select parameters of the quantum circuits.
  • 8. The method of claim 7, wherein the optimization problem is solved by choosing a random initial assignment of approximation errors and parameters.
  • 9. The method of claim 8, wherein a solution to the optimization problem is computed using simulated annealing.
  • 10. A method for decomposing a quantum algorithm into quantum circuits, the method comprising: using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits; and using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.
  • 11. The method of claim 10, wherein each of the quantum circuits is implemented as a protected operation on encoded quantum data.
  • 12. The method of claim 10, wherein the minimizing the cost metric further comprises solving an optimization problem using a heuristic method to select parameters of the quantum circuits.
  • 13. The method of claim 12, wherein the optimization problem is solved by choosing a random initial assignment of approximation errors and parameters.
  • 14. A computer-readable medium comprising computer executable instructions for a method comprising: using at least one processor, automatically performing a step-wise decomposition of the quantum algorithm and distributing an overall approximation error caused by the automatically performing the step-wise decomposition into subroutines until the quantum algorithm is fully decomposed into the quantum circuits, wherein the step-wise decomposition into the subroutines is implemented via a quantum phase estimation (QPE) process; and using the at least one processor, minimizing a cost metric associated with implementing the quantum circuits while maintaining the overall approximation error below a specified threshold approximation error.
  • 15. The computer-readable medium of claim 14, wherein the QPE process implements a time evolution of an Ising model in a transverse field.
  • 16. The computer-readable medium of claim 14, wherein the QPE process is applied to a task of evolving a quantum-mechanical system that is initialized in a given state for a specified total duration of time.
  • 17. The computer-readable medium of claim 16, wherein the total duration of time is divided into subintervals.
  • 18. The computer-readable medium of claim 17, wherein the total duration of time is divided into subintervals using Trotter method or Trotter-Suzuki method.
  • 19. The computer-readable medium of claim 17, wherein the total duration of time is divided into subintervals using a Linear Combination of Unitaries (LCU) method.
  • 20. The computer-readable medium of claim 19, wherein the LCU method is implemented using state preparation circuits.
CROSS-REFERENCE TO A RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application No. 62/676,519, filed May 25, 2018, titled “GENERATING QUANTUM COMPUTING CIRCUITS BY DISTRIBUTING APPROXIMATION ERRORS IN A QUANTUM ALGORITHM,” the entire contents of which are hereby incorporated herein by reference.

Provisional Applications (1)
Number Date Country
62676519 May 2018 US