The disclosure relates to quantum differential encoders and, in particular, though not exclusively, to methods and systems for solving a (non)linear set of differential equations using a quantum computer, and to a computer program product enabling a data processing system comprising a quantum computer to solve a set of (non)linear differential equations.
The calculus of infinitesimal changes has formed the paradigm for modelling phenomena in various areas of science. Written as a system of differential equations (DEs) of one or more variables, the solution typically describes the behaviour of a function in space and time, given specified initial and boundary conditions. The range of applications for differential equations spans all disciplines. In physics they encompass mechanics, electrodynamics, fluid dynamics, thermodynamics, and quantum theory. In chemistry, DEs describe chemical reactions and molecular dynamics. In biology they describe ecology and competition in ecosystems. DEs are also a workhorse of epidemiology. In economics and finance they describe optimal pricing, hedging, and the determination of optimal investment strategies.
For the cases that cannot be treated analytically, finding solutions to differential equations requires advanced numerical methods. These can be categorized into local and global methods. Local classical computational methods rely on discretization of the space of variables, often with a fine grid being required to represent a solution accurately. Moreover, as derivatives are approximated using numerical differentiation techniques (finite differencing and Runge-Kutta methods), small grid steps are required to reproduce the results qualitatively. This leads to a large computational cost as the number of points M (degrees of freedom) grows. Moreover, for increasing dimensionality of the problem (equal to the number of variables v), lattice-based calculations lead to a ‘blow-up’ of the required number of points as ~M^v, a phenomenon known as ‘the curse of dimensionality’. Global (spectral) numerical methods for solving differential equations rely on representing the solution in terms of a suitable basis set. This recasts the problem to finding optimal coefficients for the polynomial (e.g. Fourier or Chebyshev) approximation of the sought function. While in some cases spectral methods may find optimal solutions more efficiently, for finding general functions the same problem of dimensionality applies, as the required basis set size can grow rapidly for complex solutions, and differentiation still requires numerical approximations. Finally, unlike linear systems of algebraic equations, nonlinear differential equations can be stiff, with solutions being unstable for certain methods. Such systems are difficult to solve due to large changes of the solution within a narrow interval of parameters. Further challenging problems correspond to systems with highly oscillatory solutions and discontinuities. Yet again, these require fine grids and large basis sets, together with meticulous numerical differentiation techniques.
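By way of illustration (not forming part of the disclosed method), a minimal forward-Euler finite-difference integrator shows how the local truncation error of grid-based methods forces small step sizes; all names are illustrative, and the test equation df/dx = −2f with f(0) = 1 has the exact solution f(x) = exp(−2x).

```python
# Hypothetical sketch: forward-Euler (finite-difference) integration of
# df/dx = -2 f on [0, 1], illustrating that accuracy demands a fine grid.
import math

def euler_solve(h, x_max=1.0, f0=1.0):
    """March f' = -2 f forward on a uniform grid of step h."""
    n = round(x_max / h)
    f = f0
    for _ in range(n):
        f += h * (-2.0 * f)   # first-order finite-difference update
    return f

exact = math.exp(-2.0)
for h in (0.1, 0.01, 0.001):
    print(f"h={h:<6} error={abs(euler_solve(h) - exact):.2e}")
```

The error shrinks only linearly with the step h, so high accuracy requires many grid points; in v dimensions the point count then grows as ~M^v, as discussed above.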
A prominent example of a field where complex differential equations are widespread is fluid dynamics. Fluid dynamics studies the flow of Newtonian fluids, with the corresponding Navier-Stokes differential equations derived from energy and momentum conservation laws. Being highly nonlinear, they are rarely solvable with analytical methods and are therefore treated within the discipline of computational fluid dynamics. Solving these equations is vital to many industrial applications, including propulsion and aircraft construction (aerospace industry), engine design (automotive industry), submarine design (naval industry), oil wells (drilling industry), blood flow in capillary tubes (health industry), and weather forecasting and climate modelling, among many others. As such simulations take a solid share of computational time on large-scale classical computers, efficient solving of systems of differential equations is an outstanding open problem.
Quantum computers change the way we process information. For certain problems they offer drastic computational speed-ups, ranging from a quadratic acceleration of searching unstructured data to exponential improvements for factoring the large numbers used in encryption applications. Using qubits and coherent superpositions of binary strings, quantum machines utilize quantum interference effects to amplify the correct solution, reaching it in fewer steps than any classical computer can. Quantum computers are well-suited for chemistry applications, as they are naturally suitable for the simulation of certain quantum processes. At the same time, quantum computers are not directly suited for all computational problems, and can be seen as specialized machines (akin to GPUs) that need to be tailored to the problem at hand. Designing these machines and their operational schedule is crucial for solving problems on quantum computers faster than any available classical method. This remains true for tasks and applications in differential calculus.
To date, few studies have asked whether quantum processors are capable of solving systems of differential equations. Most of this work builds on the known quantum algorithm for solving linear systems of equations, the HHL algorithm [HHL 2009], and its improvements. To be precise, a quantum version of the problem is solved, where the variable is sought as a quantum state |x⟩ = Ã^(−1)|b⟩ for the matrix Ã, and the vector of constant terms needs to be loaded as a quantum state |b⟩. This is named amplitude encoding, where a function can be represented as a quantum state |u⟩ = Σ_{k=1}^{2^N} u_k|k⟩, with the function values u_k encoded in the amplitudes of the 2^N basis states |k⟩ of an N-qubit register.
Using a finite differencing scheme (Euler's method), several studies reduced DEs to algebraic equations, utilizing a similar approach. Nonlinear equations then also need to be linearized, increasing the overhead in the number of equations. A direct treatment of nonlinear differential equations by a quantum algorithm remains an open challenge.
In general, compressing 2^N-dimensional data into an N-qubit register is beneficial due to exponential memory savings. However, as highlighted in the review Ref. [Biamonte 2018], several problems arise. First, creating an exponentially compressed state from a vector of constants is a serious problem that requires sophisticated techniques like quantum random access memory (QRAM) and may require exponentially many gate operations for preparing a general state, resulting in an exponential scaling of the algorithmic runtime. For ODEs, this appears at the level of preparing an arbitrary initial condition. Second, reading out the result is complicated, as sampling from a quantum state requires exponentially many samples in the number of qubits for a given numerical accuracy. Finally, the required depth of HHL-like protocols for industrially relevant problems is huge, even for fault-tolerant quantum devices. Taking the simulation of electromagnetic wave scattering as an example, and considering the cost of implementing oracles, a circuit that simulates an industrially relevant challenge was estimated to require 10^8 qubits and 10^29 gates, Ref. [Scherer 2017]. Even with extremely fast gates this runtime is on the cosmic scale.
Current quantum devices are prone to noise and are not suited for large-depth quantum circuits. However, the Hilbert space of these devices increases exponentially with the number of qubits, providing an advantage over classical methods for certain problems. Quantum processors with about ~100 qubits may offer computational power inaccessible to classical computers. This corresponds to Noisy Intermediate-Scale Quantum (NISQ) processors, special-purpose devices that need to be co-designed with a problem in mind. A particularly useful approach in this setting is to use variational quantum algorithms (VQAs). Initially proposed for chemistry under the name of the variational quantum eigensolver (VQE), this approach queries a quantum computer to prepare low-energy states on a quantum device, while guiding the optimization loop using a classical computer. This strategy has made it possible to perform quantum calculations with relatively noisy devices, allowing for numerous advances unmatched by current large-depth protocols. This has drawn attention to generic VQAs, which find applications in many areas, including data science and quantum simulation.
One variational approach which aims to solve differential equations on a quantum computer is described in Ref. [Lubasch 2020]. That proposed approach also allows nonlinear terms to be implemented, using ‘quantum nonlinear processing units’, albeit at the expense of adding complex ancilla-based operations with a K-fold increase of the system size (K is the order of the non-linearity in the differential equation). Importantly, the approach is based on the amplitude encoding of the function |u⟩ = Σ_{k=1}^{2^N} u_k|k⟩, and therefore inherits the state-preparation and read-out issues discussed above.
Another work relevant in the current context is Ref. [Gaitan 2020], where numerical approaches to fluid dynamics problems are considered. The proposed approach relies on finite-difference methods with grid-based calculations. The workflow of the algorithm represents a classical CFD algorithm with a single step that is delegated to quantum hardware. The latter corresponds to function averaging as performed in Kacewicz's quantum ODE algorithm, with the core subroutine being the quantum amplitude estimation algorithm (QAEA). QAEA is a Grover-search-type algorithm that provides a quadratic quantum speed-up once the desired function is prepared as a quantum state and two quantum Fourier transforms (direct and inverse) are performed. Since the function averaging is based on an amplitude-encoded state, and assumes that a black-box oracle for this operation is available, it is difficult to estimate the algorithm's complexity. Unfortunately, the input problem (the QRAM requirement) also arises here.
Hence, from the above, it follows that there is a need in the art for improved methods and systems for solving different types of (non-)linear differential equations using quantum computers. In particular, there is a need in the art to improve the approximation of trial function derivatives; to efficiently encode the solution in a quantum representation that can be easily loaded, processed and read out; and to provide a framework that is compatible with near-term quantum hardware with limited circuit depth, while remaining extensible to fault-tolerant hardware.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Functions described in this disclosure may be implemented as an algorithm executed by a microprocessor of a computer. Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied, e.g., stored, thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber, cable, RF, etc., or any suitable combination of the foregoing. Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java™, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor, in particular a microprocessor or central processing unit (CPU), of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer, other programmable data processing apparatus, or other devices create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. Additionally, the instructions may be executed by any type of processors, including but not limited to one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It is an objective of the embodiments in this disclosure to reduce or eliminate at least part of the drawbacks known in the prior art. In particular, it is an object of the embodiments in this application to efficiently encode the differential equation trial solution in a quantum representation that is easily loaded, processed and read out. A further object of the embodiments is to compute analytical derivatives of trial solutions, instead of numerical approximations. It is a yet further object of the embodiments to provide a framework that is compatible with the limited coherence time and available quantum circuit gate depth of near-term quantum hardware. It is also an object of the embodiments to make use of the increased fault tolerance of projected future quantum hardware. It is also an object of the embodiments to allow for general solutions to general sets of differential equations, including non-linear components and components representing complex oscillatory and/or non-trivial behavior. It is also an object of the embodiments to allow for solving parametric differential equations where the parameters or terms of the equation are not known to high accuracy, while additional data points (measurements) are available to perform partial regression and to mitigate such uncertainty. It is also an object of the embodiments to allow for solving parametric differential equations where it is not known whether all terms of the equation accurately describe the data with a finite coefficient, and where a trade-off is required to ensure interpretability.
In an aspect, the invention may relate to a method for solving one or more differential equations, DEs, e.g. a set of (non)linear differential equations, using a data processing system comprising a classical computer system and a quantum computer system. The method may comprise: receiving or determining, by the classical computer system, a formulation of quantum circuits representing the one or more DEs, the quantum circuits being parameterized by variables x of the one or more DEs and including one or more function circuits for determining one or more trial function values ƒ(x_j) around one or more points x_j and one or more derivative function circuits for determining one or more trial derivative values around the one or more points x_j; executing, by the quantum computer system, the quantum circuits for a set of points x_j in the variable space X of the one or more DEs; receiving, by the classical computer system, in response to the execution of the quantum circuits, quantum hardware measurement data; and, determining, by the classical computer system, on the basis of the quantum hardware measurement data and a loss function, if the quantum hardware measurement data forms a solution to the one or more DEs.
In a further aspect, the invention may relate to a method for solving one or more differential equations, DEs, using a data processing system comprising a classical computer system and a quantum computer system, wherein the method may comprise: receiving or determining, by the classical computer system, a formulation of quantum circuits representing a trial function for the one or more DEs, the trial function being associated with one or more variables and a variable space, the quantum circuits including one or more function circuits for determining one or more values of the trial function around one or more points in the variable space, one or more derivative function circuits for determining one or more values of a derivative of the trial function around the one or more points and one or more quantum variational circuits associated with one or more optimization parameters; executing, by the classical computer system, the quantum circuits for a set of points in the variable space of the trial function, wherein the execution of the quantum circuits includes: translating the quantum circuit into control signals for controlling quantum elements of the quantum computer system and for readout of the quantum elements to obtain hardware measurement data and controlling the quantum computer system based on the control signals; receiving, by the classical computer system, in response to the execution of the quantum circuits, the hardware measurement data and processing the hardware measurement data into one or more trial functions and one or more derivatives of the one or more trial functions; determining, by the classical computer system, on the basis of the one or more trial functions, the one or more derivatives of the one or more trial functions and a loss function, a score indicating how well the one or more measured trial functions satisfy the one or more DEs.
In an embodiment, the method may include optimizing the loss function, the optimization including adjusting the one or more optimization parameters and repeating the execution of the quantum circuits, the processing of the hardware measurement data and the determination of a score, until the score meets a predetermined optimization condition.
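The optimization loop above can be illustrated with a purely classical mock (not forming part of the claimed method): the quantum circuit expectation is stood in for by a small parameterized model with an analytic derivative, and a classical optimizer adjusts the parameters against a loss built from the DE residual and a boundary term, here for dƒ/dx + 2ƒ = 0 with ƒ(0) = 1. All names are illustrative assumptions, not the disclosed interface.

```python
# Hypothetical classical stand-in for the hybrid optimization loop.
def trial(x, theta):        # stand-in for the measured trial function f(x)
    return sum(t * x ** k for k, t in enumerate(theta))

def trial_deriv(x, theta):  # analytic derivative of the trial model
    return sum(k * t * x ** (k - 1) for k, t in enumerate(theta) if k > 0)

def loss(theta, xs):
    # mean-squared DE residual + boundary-pinning term (f(0) = 1)
    res = sum((trial_deriv(x, theta) + 2.0 * trial(x, theta)) ** 2 for x in xs)
    return res / len(xs) + (trial(0.0, theta) - 1.0) ** 2

def grad(theta, xs, eps=1e-6):  # central-difference gradient of the loss
    g = []
    for i in range(len(theta)):
        up = theta[:]; up[i] += eps
        dn = theta[:]; dn[i] -= eps
        g.append((loss(up, xs) - loss(dn, xs)) / (2.0 * eps))
    return g

xs = [i / 10.0 for i in range(11)]   # collocation points in the variable space
theta = [0.0] * 5                    # adjustable optimization parameters
for _ in range(10000):               # classical outer optimization loop
    theta = [t - 0.01 * g for t, g in zip(theta, grad(theta, xs))]
print(f"final loss: {loss(theta, xs):.4f}")
```

In the claimed method, `trial` and `trial_deriv` would instead be evaluated by executing the function circuits and derivative function circuits on the quantum computer system.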
In an embodiment, the one or more derivative function circuits may be obtained by analytic differentiation of the one or more function circuits with respect to the one or more variables.
Thus, the embodiments in this application allow solving differential equations, including nonlinear differential equations, of a general form using a quantum computer in a way that is substantially different from the schemes known in the prior art. For a given set of nonlinear differential equations, quantum circuits are constructed on the basis of so-called differentiable quantum circuits (DQCs). These quantum circuits may be executed on the quantum computer, and a cost function, e.g. a Hermitian operator such as a Hamiltonian, may be used to measure observables that form an approximation of the solution to the set of nonlinear differential equations; these observables are used in a classical optimization algorithm. A loss function may be used to determine if the approximation is sufficiently close to the solution of the set of nonlinear differential equations.
In an embodiment, the determining if the quantum hardware measurement data forms a solution to the one or more DEs may be further based on one or more boundary conditions associated with the one or more DEs.
In an embodiment, if the one or more DEs represents one or more parameterized DEs, the determining if the quantum hardware measurement data forms a solution to the one or more DEs may be further based on one or more boundary conditions associated with the one or more DEs and one or more data points, for example measured data points.
In a further embodiment, the determining if the quantum hardware measurement data forms a solution to the one or more DEs may be based on regularization data representing initial estimated values for parameters used in the one or more parameterized DEs.
In an embodiment, the one or more DEs may include one or more parameterized DEs, and, wherein the determining if the quantum hardware measurement data forms a solution to the one or more DEs is further based on one or more boundary conditions associated with the one or more parameterized DEs and one or more data points, for example measured data points, and, optionally, regularization data representing initial estimated values for one or more parameters used in the one or more parameterized DEs.
In an embodiment, the one or more DEs may include one or more parameterized DEs, wherein a right-hand side, RHS, term of the one or more parameterized DEs defines a parameterized linear combination of functionals; and, wherein the determining if the quantum hardware measurement data forms a solution to the one or more DEs is further based on one or more boundary conditions associated with the one or more parameterized DEs and one or more data points, for example measured data points, and, optionally, regularization data representing initial estimated values for one or more parameters used in the one or more parameterized DEs.
In an embodiment, the parameterized linear combination of functionals may be defined as a vector inner product α·F, wherein α defines a parameter in the form of a vector of coefficients and F is a vector of functionals on ƒ and x,
and, optionally, the loss function including a regularization loss term based on the coefficients, preferably the norm of the coefficients.
In other words, the right-hand side, RHS, of this equation may be any linear combination of a set of library terms that are expected to be relevant to solving the differential equation. Then, during the loss function optimization, both the quantum circuit parametrization θ and the function coefficients α may be optimized.
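The composition of such a regularized loss can be sketched as follows; the weights, names, and the L1 choice for the coefficient norm are illustrative assumptions, not claimed values.

```python
# Hypothetical sketch of a combined loss for a parameterized DE whose RHS
# is a linear combination alpha . F of library terms: DE residual term,
# boundary term, data-regression term, and a norm penalty on alpha that
# pushes irrelevant library coefficients to zero (interpretability).
def total_loss(residuals, boundary_gap, data_gaps, alpha, lam=0.1):
    de_term = sum(r * r for r in residuals) / len(residuals)
    boundary_term = boundary_gap ** 2
    data_term = sum(d * d for d in data_gaps) / max(len(data_gaps), 1)
    reg_term = lam * sum(abs(a) for a in alpha)   # ||alpha||_1 penalty
    return de_term + boundary_term + data_term + reg_term

# two residual samples, satisfied boundary, one data point, three coefficients
print(total_loss([0.1, -0.1], 0.0, [0.05], [1.0, 0.0, 0.2]))
```

During optimization, both the circuit parameters (entering through the residuals) and the coefficient vector `alpha` would be adjusted to minimize this single quantity.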
In an embodiment, executing the quantum circuits may include: translating each of the quantum circuits into a sequence of pulses; and, applying the sequence of pulses to the qubits of the quantum computer.
In an embodiment, receiving hardware measurement data may include: applying a read-out pulse signal to qubits of the quantum computer system and measuring quantum hardware measurement data.
In an embodiment, the quantum circuits may be parameterized based on at least an adjustable optimization parameter, wherein if it is determined that the quantum hardware measurement data do not form a solution, then adjusting the at least one adjustable optimization parameter to form adjusted quantum circuits and determining further quantum hardware measurement data based on the adjusted quantum circuits. Further, if the one or more DEs include one or more parameterized DEs, also at least one of the one or more parameters of the one or more parameterized DEs respectively may be adjusted.
In an embodiment, the function circuit may comprise a quantum feature map circuit for encoding a functional dependence, preferably a nonlinear dependence, on one or more variables x of the DEs into quantum wave amplitudes of the qubits of the quantum computer system.
In an embodiment, the quantum feature map circuits may be differentiable into a sum of modified quantum circuits.
In an embodiment, each of the one or more derivative function circuits comprise at least one of the modified quantum circuits.
In an embodiment, the one or more function circuits and/or the derivative circuits may further comprise a quantum variational ansatz circuit, which may be parameterized by an adjustable optimization parameter.
In an embodiment, the one or more derivative function circuits may represent analytical derivatives of the one or more function circuits in the variables x.
In an embodiment, the quantum circuits may represent one or more solutions to the set of DEs according to the functional F[{d^n ƒ/dx^n}_n, {ƒ^m(x)}_m] = 0, wherein the functional F is determined by the system described by the (non)linear DEs. For example, for the logistic equation dƒ/dx − ƒ(1 − ƒ) = 0, the functional reads F[ƒ', ƒ] = ƒ' − ƒ + ƒ².
Hence, the embodiments in this application use quantum feature map encoding to overcome the complexity of amplitude encoding that is used in the prior art for preparing the solution at the boundary; and they use automatic differentiation of the quantum feature map circuit, which allows representing analytical function derivatives without the imprecision error characteristic of numerical differentiation (finite differencing).
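One mechanism behind such analytic circuit derivatives is the parameter-shift rule, illustrated below for a single qubit where ƒ(x) = ⟨0| RY(x)† Z RY(x) |0⟩ = cos(x): the exact derivative equals [ƒ(x + π/2) − ƒ(x − π/2)]/2, with no finite-differencing step-size error. This is a hedged one-qubit sketch with illustrative names, not the full feature-map construction.

```python
# Parameter-shift sketch: exact derivative of a circuit expectation value.
import math

def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def expect_z(x):
    # |psi> = RY(x)|0>; <Z> = |psi_0|^2 - |psi_1|^2, which equals cos(x)
    u = ry(x)
    a, b = u[0][0], u[1][0]
    return a * a - b * b

def shift_derivative(x):
    # exact derivative via two shifted circuit evaluations
    return 0.5 * (expect_z(x + math.pi / 2) - expect_z(x - math.pi / 2))

x = 0.7
print(expect_z(x), math.cos(x))           # f(x) = cos(x)
print(shift_derivative(x), -math.sin(x))  # df/dx = -sin(x), exactly
```

The derivative circuit is thus itself a (sum of) quantum circuit(s), consistent with the modified-circuit embodiments above.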
In an embodiment, the quantum hardware measurement data may include measured trial function values and measured trial differential values and the loss function may be used to determine if the measured trial function values and the measured differential values form a solution to the set of differential equations.
In an embodiment, the quantum hardware measurement data may be measured as expectation values of a Hermitian cost operator, preferably a Hamiltonian operator.
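As a concrete illustration of reading out a trial function value as the expectation of a Hermitian cost operator, the sketch below computes the total magnetization Σ_i Z_i on a two-qubit state vector by direct summation over basis states; the state-vector representation and function names are illustrative assumptions.

```python
# Hypothetical sketch: expectation value of the cost operator Z_0 + Z_1
# over a 2-qubit state vector with amplitudes ordered |00>, |01>, |10>, |11>.
def expect_total_z(state):
    n = 2
    val = 0.0
    for idx, amp in enumerate(state):
        prob = abs(amp) ** 2
        # each qubit contributes +1 for bit 0 and -1 for bit 1
        zsum = sum(1 if (idx >> (n - 1 - q)) & 1 == 0 else -1 for q in range(n))
        val += prob * zsum
    return val

print(expect_total_z([0.5, 0.5, 0.5, 0.5]))  # |+>|+>  ->  0.0
print(expect_total_z([1, 0, 0, 0]))          # |00>    ->  2.0
```

On hardware, the same quantity would be estimated by repeated measurement in the Z basis rather than from a state vector.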
In an embodiment, the quantum circuits may comprise a quantum feature map circuit, which encodes a (non-)linear function dependence on differential equation variable(s) x.
In an embodiment, the quantum feature map circuit may comprise a product-type feature map, a Chebyshev feature map, a Chebyshev sparse feature map, a Chebyshev tower feature map, a Fourier-type feature map, an evolution-enhanced feature map, an amplitude-encoding feature map, or any other feature map.
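The Chebyshev-style feature maps can be illustrated with the single-qubit identity ⟨Z⟩ = cos(angle) after RY(angle)|0⟩: encoding x via angles 2j·arccos(x) on qubit j makes each qubit's readout a Chebyshev polynomial T_{2j}(x) of growing degree. This is a hedged sketch of the encoding idea with illustrative names, not the claimed circuit.

```python
# Sketch: Chebyshev-tower-style angles give Chebyshev polynomial readouts.
import math

def chebyshev_T(n, x):
    return math.cos(n * math.acos(x))   # T_n(x) for |x| <= 1

def feature_expectation(j, x):
    angle = 2 * j * math.acos(x)        # feature-map rotation on qubit j
    return math.cos(angle)              # <Z> after RY(angle)|0>

x = 0.3
for j in range(1, 4):
    print(j, feature_expectation(j, x), chebyshev_T(2 * j, x))
```

Combining such readouts across qubits yields a rich polynomial basis for the trial function without numerical differentiation.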
In an embodiment, the quantum circuits may comprise a variational quantum circuit, which can be used to variationally optimize quantum amplitudes such that the cost function is minimized.
In an embodiment, the variational quantum circuit may include a hardware-efficient ansatz or an alternating-blocks ansatz.
In an embodiment, the cost function may represent the differential equation's function value at a point xj.
In an embodiment, the cost function may be comprised of single-qubit operators, general Ising Hamiltonian operators, or general Pauli-string operators as encountered in chemistry, which may or may not have variationally optimizable coefficients.
In an embodiment, the loss function may be based on mean-squared error, mean-absolute error, Kullback-Leibler (KL) divergence, Jensen-Shannon divergence, or other error or divergence metrics.
In an embodiment, the boundary may be handled by a boundary-pinning strategy, a floating boundary strategy, or an optimized boundary handling.
In an embodiment, regularization prior information may be applied to help convergence and reduce the risk of getting stuck in local minima.
In an embodiment, non-regularization (real) information, data points, physical measurements, or any other a-priori known data may be used to solve systems of parametric differential equations where parameters of the equation are not exactly known or are known only to a low degree of confidence. Those data points may be considered in the total loss function in order to perform partial regression on the data while still conforming to the (set of) differential equations.
The invention may relate to a quantum-classical hybrid system for solving one or more of the following types of differential equations: ordinary differential equations (ODEs), partial differential equations (PDEs), stochastic differential equations (SDEs), higher-order DEs, higher-degree DEs, single- or multi-dimensional DEs, one or more/sets of DEs, elliptic/hyperbolic/parabolic PDEs, initial value and/or boundary value problems, including Dirichlet and/or Neumann boundary conditions, Cauchy-type, as well as Robin and mixed boundary types, control-type problems, including but not limited to Hamilton-Jacobi-Bellman and Eikonal boundary value problems, delay differential equations, differential-algebraic equations (DAEs), real- and complex-valued functions, and non-linear stochastic differential equations, for example but not limited to those of Ito and Stratonovich form. The invention may further reside in one or more of the following embodiments.
In an embodiment, the variational quantum circuit may include a quantum gate sequence which can be used to variationally optimize quantum amplitudes such that the loss function L is minimized and one or more accurate solutions to one or more differential equations can variationally be found.
In an embodiment, the variational quantum circuit includes a hardware-efficient ansatz, where layers of parameterized quantum gate rotations are followed by layers of Controlled-NOT operations in alternating fashion, such that the logical circuit requirements map natively to the quantum hardware information carrier connectivity.
In an embodiment, the variational quantum circuit includes an alternating-blocks ansatz, preferably such that the circuit consists of gate blocks of width M/2, with M larger than 1 and smaller than the number of quantum information carriers. The blocks are placed in a checkerboard pattern and repeated several times.
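A minimal sketch of the hardware-efficient ansatz structure described above, assuming alternating layers of single-qubit Ry rotations and a nearest-neighbour CNOT ladder; the gate choice and layer ordering are illustrative assumptions, not the patent's exact circuit.

```python
import numpy as np

I2 = np.eye(2)

def ry(theta):
    # single-qubit Ry rotation gate
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def kron_all(mats):
    # tensor product over the whole register
    out = np.array([[1.0 + 0j]])
    for m in mats:
        out = np.kron(out, m)
    return out

def cnot_on(n, ctrl, tgt):
    # full-register CNOT built from control-qubit projectors
    P0 = np.array([[1, 0], [0, 0]], dtype=complex)
    P1 = np.array([[0, 0], [0, 1]], dtype=complex)
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    a = [P0 if q == ctrl else I2 for q in range(n)]
    b = [P1 if q == ctrl else (X if q == tgt else I2) for q in range(n)]
    return kron_all(a) + kron_all(b)

def hea_circuit(thetas, n_qubits, depth):
    """Hardware-efficient ansatz: rotation layer, then CNOT ladder, repeated."""
    U = np.eye(2 ** n_qubits, dtype=complex)
    for d in range(depth):
        U = kron_all([ry(t) for t in thetas[d]]) @ U
        for q in range(n_qubits - 1):
            U = cnot_on(n_qubits, q, q + 1) @ U
    return U

U = hea_circuit(np.random.default_rng(0).uniform(0, 2 * np.pi, (2, 3)), 3, 2)
assert np.allclose(U @ U.conj().T, np.eye(8))   # the ansatz is unitary
```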
In an embodiment, the loss function may define an instruction for a classical computer on how to collect quantum measurement data and convert it to a single quantity ('loss') which quantifies the quality of the solution to the (set of) (non)linear differential equations. Preferably, the loss function comprises a differential loss term and a boundary term.
In an embodiment, the distance definition of the loss is quantified by means of a Mean Squared Error (MSE) defined by (a−b)2 for single-valued parameters a and b.
In an embodiment, the distance definition of the loss is quantified by means of a Mean Absolute Error (MAE) defined by |a−b| for single-valued parameters a and b.
In an embodiment, the distance definition of the loss is quantified by means of the Kullback-Leibler divergence, defined by the expectation of the logarithmic difference between the considered quantities.
In an embodiment, the distance definition of the loss is quantified by means of a Jensen-Shannon divergence.
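The four distance measures named in the preceding embodiments can be sketched as follows; these are illustrative textbook definitions applied to residual vectors or distributions, not the patent's exact loss construction.

```python
import numpy as np

def mse(a, b):
    # Mean Squared Error: mean of (a - b)^2
    return np.mean((a - b) ** 2)

def mae(a, b):
    # Mean Absolute Error: mean of |a - b|
    return np.mean(np.abs(a - b))

def kl_divergence(p, q, eps=1e-12):
    # Kullback-Leibler divergence between (normalized) distributions
    p, q = np.asarray(p, float) + eps, np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return np.sum(p * np.log(p / q))

def js_divergence(p, q):
    # Jensen-Shannon divergence: symmetrized, bounded variant of KL
    p, q = np.asarray(p, float), np.asarray(q, float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

a, b = np.array([1.0, 2.0]), np.array([1.0, 4.0])
assert mse(a, b) == 2.0 and mae(a, b) == 1.0
assert js_divergence(a, a) < 1e-9   # zero for identical distributions
```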
In an embodiment, the boundary conditions to the set of one or more differential equations may be included as an additional consideration in the method. Preferably, these boundary conditions determine the specific solution of the differential equations.
In an embodiment, the boundary conditions may be included directly in the loss function as an additional term.
In an embodiment, the boundary conditions may be included in the cost operator C with a fixed additional term, presenting an equivalent treatment of boundary and derivative terms, both being encoded in the eigenspectrum of the cost operator.
In an embodiment, the boundary conditions are included via iteratively shifting the estimated trial solution based on the boundary or initial point, at each iteration of the classical computer's variational optimization sequence.
In an embodiment, the boundary conditions may be included via a classical shift of the solution defined by the gradient descent procedure on par with variational angles' optimization. Preferably this constitutes another variational parameter to be passed to the classical optimizer.
In an embodiment, the variational optimizer is prevented from getting trapped in local minima, by adding a regularization procedure. Preferably, this includes one or more of the following strategies: feeding in prior information about the potential solution; biasing the solution into a specific form; searching for a solution in a region close to the known boundary values of the differential equation(s).
In an embodiment, the regularization term is linearly decreasing in weight contribution to the loss function.
In an embodiment, the regularization term may be smoothly dropped at pre-defined training stages, which may in an embodiment correspond to a reverse sigmoid optimization schedule.
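The two regularization-weight schedules mentioned above can be sketched as follows; the midpoint and steepness parameters of the reverse sigmoid are illustrative assumptions, since the patent does not fix them.

```python
import numpy as np

def linear_weight(step, total_steps, w0=1.0):
    # linearly decreasing regularization weight, clipped at zero
    return w0 * max(0.0, 1.0 - step / total_steps)

def reverse_sigmoid_weight(step, midpoint, steepness=0.1, w0=1.0):
    # smooth drop of the regularization weight around a chosen training stage
    return w0 / (1.0 + np.exp(steepness * (step - midpoint)))

# the weight starts near w0, halves at the midpoint, and vanishes later
ws = [reverse_sigmoid_weight(s, midpoint=100) for s in (0, 100, 300)]
assert ws[0] > 0.9 and abs(ws[1] - 0.5) < 1e-9 and ws[2] < 1e-6
```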
In an embodiment, the differential equation may be an Ordinary Differential Equation (ODE).
In an embodiment, the differential equation may be a Partial Differential Equation (PDE), where preferably several quantum feature map circuits are concatenated or interleaved.
In an embodiment, the differential equation may be a parameterized differential equation, where one or more of the elements in the equation are (hyper)parameters rather than functions (ƒ(x), g(x, y)) or dependent/dimensional variables (x, y, z, t). To solve such equations for a specific instance, additional data is required and used as boundary conditions on the parameters, and regularization in the form of initial suggested values for the parameters may be given.
In an embodiment, the differential equation may be a higher-order DE, where one or more functions are differentiated more than once.
In an embodiment, the differential equation may be of higher degree or non-linear, where one or more functions are raised to a power other than zero or one, or appear as arguments to one or more non-linear functions.
In an embodiment, the differential equation may depend on multiple unique variables x, y, z . . . , also known as dimensions.
In an embodiment, the differential equation considered may comprise a set of more than one differential equation.
In an embodiment, the differential equation may be an elliptic, hyperbolic or parabolic PDE.
In an embodiment, the differential equation considered describes an initial value and/or boundary value problem, including but not limited to Dirichlet-type.
In an embodiment, the differential equation may have one or more boundary conditions of one of {Dirichlet, Neumann, Cauchy, Robin, mixed}-type.
In an embodiment, the differential equation may describe a control-type problem, including but not limited to Hamilton-Jacobi Bellman equation, and Eikonal boundary value problems.
In an embodiment, the differential equation may constitute a (set of) delay differential equations.
In an embodiment, the differential equation may describe differential-algebraic equations (DAEs).
In an embodiment, the differential equation(s) include one or more (non-) linear stochastic differential equations, including but not limited to those of Ito and Stratonovich form.
Embodiments in this application may be implemented based on noisy quantum hardware with finite logical gate error and finite coherence times.
In an embodiment, the method may be based on noisy quantum hardware wherein the subroutines of the algorithm may be executed by multiple quantum devices operating in parallel and/or in series, routing measurement data to one classical computer which computes the loss function value each iteration.
In an embodiment, instead of measuring a cost function for each part in the loss function, one may rely on overlap estimations of left-hand-side and right-hand-side of the differential equations in functional form, wherein the quantum hardware quantum information overlap may be considered as a functional overlap.
In an embodiment, the quantum computer system may be implemented as a hardware quantum computer.
In an embodiment, the quantum computer system may comprise qubit-based quantum hardware, where the quantum information carriers are embodied by qubits or quantum bits.
In an embodiment, the quantum hardware may include a continuous-variable system, such that information carriers are defined by continuous quantum variables.
In an embodiment, the quantum computer system may be implemented as a software program for simulating a quantum computer system comprising quantum processing elements, for example qubits. Hence, in this embodiment, the software program may be a classical software program that runs on a classical computer so that quantum algorithms associated with the embodiments in this application can be developed, executed and tested on a classical computer without requiring access to a hardware implementation of the quantum processor system.
In an embodiment, the quantum circuit may include a plurality of different quantum sub-circuits. In an embodiment, the one or more quantum sub-circuit may include one or more quantum feature circuits, each quantum feature circuit being configured to map a variable of the differential equation to a Hilbert space that is associated with the qubits of the quantum computer.
In an embodiment, the one or more quantum sub-circuit may include one or more variational quantum circuits associated with one or more variational parameters for training the quantum circuit to approximate a solution to the differential equation.
In an embodiment, the one or more quantum sub-circuit may include one or more initialization quantum circuits configured to initialize at least part of the variational parameters of the quantum circuit based on one or more initialization parameters.
In an embodiment, at least part of the variational parameters of the one or more variational quantum circuits may be initialized based on initialization values that are classically computed based on a classically simulable version of the quantum circuit that can be simulated on a classical computer.
In an embodiment, the classical simulation may include: computing expectation values of the output of the classically simulable quantum circuit, the expectation values defining a function ƒ(x), where x denotes the vector of variables; determining the dependent-variable dependence of ƒ(x) as a function of initialization parameters θini; determining optimal values for the initialization parameters θini based on fitting ƒ(x) to a desired (estimate of the) solution; and initializing the quantum circuit based on the optimal values for the initialization parameters θini, while keeping the other variational parameters of the classically simulable quantum circuit fixed.
Thereafter, the quantum circuit may be variationally optimized based on the variational parameters until a predetermined convergence of the output of the quantum circuit with the actual solution is determined.
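The initialization-by-fitting step above can be sketched as follows. The classically simulable circuit is replaced here by an illustrative separable model f(x; θ) = Σ_j sin(θ_j + j·arccos(x)), which is an assumption, not the patent's exact reduced circuit; its parameters are fitted to a desired trial solution by accept-if-better random search.

```python
import numpy as np

def model(x, theta):
    # stand-in for the expectation value of a classically simulable circuit
    j = np.arange(1, len(theta) + 1)
    return np.sin(theta[:, None] + j[:, None] * np.arccos(x)[None, :]).sum(0)

def fit_init_params(xs, target, n_params=4, iters=300, seed=0):
    """Fit initialization parameters so the model matches a desired solution."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0, 2 * np.pi, n_params)
    start = best = np.mean((model(xs, theta) - target) ** 2)
    for _ in range(iters):
        cand = theta + rng.normal(0, 0.1, n_params)
        loss = np.mean((model(xs, cand) - target) ** 2)
        if loss < best:            # accept only candidates that improve the fit
            theta, best = cand, loss
    return theta, start, best

xs = np.linspace(-0.9, 0.9, 30)
target = np.sin(2 * np.arccos(xs))   # a target reachable by the model
theta_ini, start_loss, fit_loss = fit_init_params(xs, target)
assert fit_loss <= start_loss        # fitting never worsens the loss
```

The fitted θini would then seed the variational circuit before the quantum optimization loop begins.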
In an embodiment, the one or more quantum feature maps, the one or more variational quantum circuits, and the cost function may be constructed so that the circuit operators exhibit a certain symmetry, such as real-value preserving features. In that case, circuit differentiation can be performed in N+1 evaluations instead of 2N parameter-shift evaluations, speeding up the calculation of function derivatives by an (asymptotic) factor of two.
In an aspect, the invention may relate to a process of initializing a quantum circuit representing a function or an equation. The quantum circuit may include one or more quantum sub-circuits, the one or more sub-circuits including one or more quantum feature maps, one or more variational quantum circuits, and one or more initialization unitary quantum circuits. In an embodiment, a classically simulable quantum circuit associated with the quantum circuit may be determined. To that end, in an embodiment, a relation between at least part of the variational parameters of the one or more variational quantum circuits may be defined so that the quantum circuit defines a classically simulable quantum circuit.
Expectation values at the output of the classically simulable quantum circuit may be determined by simulating the classically simulable quantum circuit using a classical computer and a trial function may be computed based on the expectation values.
Based on the trial function, values for the variational parameters of the one or more initialization quantum circuits may be determined. The values for the variational parameters may be optimized by fitting the trial function to a desired function. The optimized initialization values may then be used to initialize the one or more variational quantum circuits of the quantum circuit.
Thereafter, the variational parameters of the quantum circuit may be variationally optimized until the output of the quantum computer converges towards a solution of the equation.
The above-referred initialization method may be used in quantum learning methods that are executed on a hybrid computer system comprising a classical computer system and a quantum computer as for example described with reference to
Hence, in a further aspect, the invention may relate to a quantum learning method based on a quantum circuit, wherein the quantum circuit may include a plurality of different quantum sub-circuits, including one or more quantum feature circuits, each quantum feature circuit being configured to map a variable of the differential equation to a Hilbert space that is associated with the qubits of the quantum computer, and one or more variational quantum circuits associated with one or more variational parameters for training the quantum circuit to approximate a solution. The method may include initializing at least part of the variational parameters of the one or more variational quantum circuits based on initialization values that are classically computed based on a classically simulable version of the quantum circuit that is simulated on a classical computer; and variationally optimizing the quantum circuit based on the variational parameters until a predetermined convergence of the output of the quantum circuit with the solution is determined. The invention may also relate to a system that is configured to execute such a quantum learning method.
In a further aspect, the invention may relate to a system for solving a set of (non)linear differential equations, DEs, using a hybrid data processing system comprising a classical computer system and a quantum computer system, the system comprising: a memory device including computer-executable instructions and a processor connected to the memory device, the processor being configured to perform executable operations comprising: instructing the classical computer system to receive or determine a formulation of quantum circuits representing the set of DEs, the quantum circuits being parameterized by variables x of the DEs and including one or more function circuits for determining one or more trial function values f(xj) around one or more points xj and one or more derivative function circuits for determining one or more trial derivative values around the one or more points xj; instructing the quantum computer system to execute the quantum circuits for a set of points xj in the variable space x of the DEs; instructing the classical computer system, in response to the execution of the quantum circuits, to determine quantum hardware measurement data; and instructing the classical computer system to determine, on the basis of the quantum hardware measurement data and a loss function, if the quantum hardware measurement data forms a solution to the set of (non)linear DEs.
The invention may also relate to a computer program or suite of computer programs comprising at least one software code portion or a computer program product storing at least one software code portion, the software code portion, when run on a computer system, being configured for executing any of the method steps described above.
The invention may further relate to a non-transitory computer-readable storage medium storing at least one software code portion, the software code portion, when executed or processed by a computer, is configured to perform any of the method steps as described above.
The invention will be further illustrated with reference to the attached drawings, which schematically will show embodiments according to the invention. It will be understood that the invention is not in any way restricted to these specific embodiments.
The system may further comprise a (purely classical information) input 112 and a (purely classical information) output 114. The data processor systems may be configured to solve nonlinear differential equations using the quantum computer. Input data may include information related to the differential problem to be solved. This information may include the differential equations, boundary conditions, information for constructing quantum circuits that can be executed on the quantum computer, and information about an optimization process that needs to be executed to compute the solutions to the nonlinear differential equations. The input data may be used by the system to construct quantum circuits, in particular differentiable quantum circuits, and to classically calculate values, e.g. sequences of pulses, which may be used to initialize and control qubit operations according to the quantum circuit. To that end, the classical computer may include a differentiable quantum circuit (DQC) generator 107. Similarly, output data may include ground state and/or excited state energies of the quantum system, correlator operator expectation values, optimization convergence results, optimized quantum circuit parameters and hyperparameters, and other classical data.
Each of the one or more quantum processors may comprise a set of controllable quantum processing elements, e.g. a set of controllable two-level systems referred to as qubits. The two levels are |0⟩ and |1⟩ and the wave function of an N-qubit quantum processor may be regarded as a superposition of the 2^N basis states. The embodiments in this application, however, are not limited to qubits but may include any multi-level quantum processing elements, e.g. qutrits, that are suitable for performing quantum computation. Examples of such quantum processors include noisy intermediate-scale quantum (NISQ) computing devices and fault-tolerant quantum computing (FTQC) devices.
The quantum processor may be configured to execute a quantum algorithm in accordance with the gate operations of a quantum circuit. The quantum processor may be implemented as a gate-based qubit quantum device, which allows initialization of the qubits into an initial state, interactions between the qubits by sequentially applying quantum gates between different qubits and subsequent measurement of the qubits' states. To that end, the input devices may be configured to configure the quantum processor in an initial state and to control gates that realize interactions between the qubits. Similarly, the output devices may include readout circuitry for readout of the qubits which may be used to determine a measure of the energy associated with the expectation value of the Hamiltonian of the system taken over the prepared state.
In some embodiments, the first data processor system may be implemented as a software program for simulating a quantum computer system 104 comprising a quantum processor system 108. Hence, in that case, the software program may be a classical software program that runs a classical computer 106 so that quantum algorithms can be developed, executed and tested on a classical computer without requiring access to a hardware implementation of the quantum processor system.
The embodiments in this application aim to solve nonlinear differential equations of a general form using a quantum computer in a way that is substantially different from the schemes known in the prior art. For a given set of nonlinear differential equations, quantum circuits are constructed on the basis of so-called differentiable quantum circuits (DQCs). These quantum circuits may be executed on the quantum computer, and a cost function, e.g. a Hermitian operator such as a Hamiltonian, may be used to measure observables; the measured observables, which form an approximation of the solution to the set of nonlinear differential equations, are used in a classical optimization algorithm. A loss function may be used to determine if the approximation of the solution is sufficiently close to the solution to the set of nonlinear differential equations.
An exemplary workflow of the optimization loop is provided in
Thereafter, the optimization scheme may be started by determining approximate solutions for the set of differential equations at a set of points xj. To that end, for each point xj the quantum circuit is executed on the quantum computer. As shown in the figure, the quantum circuit may include function circuits 126 and derivative function circuits 128. A function circuit may be used to evaluate a function ƒ around point xj and a derivative function circuit may be used to evaluate the derivative of the function dƒ/dx around point xj. The execution of each quantum circuit will result in a measured value, an observable of the quantum state of the quantum computer, representing an approximation of a function value or a derivative value at a particular point xj. Then, the loss function and the measured values may be used to determine if the measured values form a sufficiently accurate approximation of the solution to the set of differential equations. If this is not the case, in a further step 130, a classical optimization scheme may be used to update the optimization parameter θ of the variational ansatz circuit. Thereafter, approximate solutions are determined for each xj by executing the quantum circuits based on the updated optimization parameter θ.
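The hybrid optimization loop described above can be sketched structurally as follows. The "circuits" here are stand-in classical functions (an assumption for illustration); on hardware they would be quantum-expectation estimates, but the loop logic — evaluate function and derivative circuits on a grid, form the loss with a boundary term, update θ classically — is the same.

```python
import numpy as np

def f_circuit(x, theta):
    # stand-in for the function circuit's measured expectation value
    return np.sin(theta * x)

def df_circuit(x, theta):
    # stand-in for the derivative function circuit
    return theta * np.cos(theta * x)

def loss(theta, xs):
    # toy ODE: df/dx = 2*cos(2x) with boundary condition f(0) = 0
    residual = df_circuit(xs, theta) - 2 * np.cos(2 * xs)
    boundary = f_circuit(0.0, theta) - 0.0
    return np.mean(residual ** 2) + boundary ** 2

xs = np.linspace(0, 1, 20)
theta = 1.0
for _ in range(500):
    # classical update step (finite-difference gradient descent on theta)
    g = (loss(theta + 1e-5, xs) - loss(theta - 1e-5, xs)) / 2e-5
    theta -= 0.05 * g

assert abs(theta - 2.0) < 0.05   # converges to the true parameter theta = 2
```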
It is submitted that the differential equations described with reference to the scheme of
Thus, the quantum circuit used for encoding the value of the function at a specific value of the variable x=xj comprises a feature map Ûφ that encodes the x-dependence into the quantum computer, followed by a variational ansatz Ûθ, and observable-based readout for the set of cost operators. The measurement result is classically post-processed to provide a quantum function representation ƒ(x) as a sum of expectation values, where the coefficients αl can be optimized in a quantum-classical hybrid loop as e.g. described with reference to
Thus, as can be seen from
A quantum feature map represents a latent space encoding that, unlike amplitude encoding, does not require access to each amplitude and is controlled by classical gate parameters. The quantum feature map maps the real parameter x to the corresponding variable value. Next, a variational quantum circuit Ûθ, parametrized by a vector θ that can be adjusted in a quantum-classical optimization loop, is used. The resulting state |ƒφ,θ(x)⟩=ÛθÛφ(x)|∅⟩ for optimal angles contains the x-dependent amplitudes sculptured to represent the sought function. Finally, the real-valued function can be read out as an expectation value of a predefined Hermitian cost operator Ĉ, such that the function reads:
ƒ(x)=⟨ƒφ,θ(x)|Ĉ|ƒφ,θ(x)⟩  (1)
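A minimal single-qubit sketch of Eq. (1): encode x with a feature map Ry(arccos x) (an illustrative choice), apply a variational rotation Ry(θ), and read out f(x) = ⟨ψ|Ĉ|ψ⟩ for the cost operator Ĉ = Z (Pauli-Z), all with plain linear algebra.

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)   # cost operator C = Pauli-Z

def ry(phi):
    c, s = np.cos(phi / 2), np.sin(phi / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def f(x, theta):
    # |psi> = U_theta U_phi(x) |0>, then f(x) = <psi| Z |psi>
    psi = ry(theta) @ ry(np.arccos(x)) @ np.array([1, 0], dtype=complex)
    return np.real(psi.conj() @ Z @ psi)

# for theta = 0, <Z> = cos(arccos(x)) = x: the readout reproduces the variable
xs = np.linspace(-0.99, 0.99, 7)
assert np.allclose([f(x, 0.0) for x in xs], xs)
```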
The optimization process based on the variational circuit parameter θ and the loss function may be regarded as a quantum machine learning process wherein the quantum circuit comprising the variational quantum circuit may define a plurality of mutually interacting qubits that can be adjusted variationally by (at least one) variational circuit parameter θ. This way, the plurality of mutually interacting qubits may define a parametrized quantum circuit that can be trained based on the variational circuit parameter θ, training data, and a loss function to approximate a function ƒ(x) for a certain value of the variable x.
The differentiation of quantum feature map circuits may be defined by the following expression: dÛφ(x)/dx=ΣjÛdφ,j(x), which allows the derivative of the feature map to be represented as a sum of modified circuits Ûdφ,j. This way, function derivatives may be represented using a product derivative rule. Thus, in the case of a quantum feature map generated by strings of Pauli matrices or any involutory matrix, the parameter shift rule may be used such that a function derivative may be expressed as a sum of expectations:
with |ƒdφ,j,θ±(x)⟩ defined through the parameter shift, and the index j running through the individual quantum operations used in the feature map encoding. Applying the parameter shift rule once again, a second-order derivative d2u(x)/dx2 may be obtained, with four shifted terms for each generator.
Importantly, to perform quantum circuit differentiation, the automatic differentiation (AD) technique may be used. AD allows the exact analytical formula for the function derivative to be represented using a set of simple computational rules, as opposed to numerical differentiation. Since automatic differentiation provides an analytical derivative of the circuit at any point of the variable x, the scheme does not suffer from the accumulated error of approximating derivatives. Notably, all known prior art schemes for quantum ODE solvers involve numerical differentiation using Euler's method and finite difference schemes that suffer from approximation error, and often require a fine discretization grid. The embodiments in this application alleviate this problem.
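The parameter shift rule for a Pauli-generated rotation can be checked numerically. For f(φ) = ⟨0|Ry(φ)†ZRy(φ)|0⟩ = cos(φ), the shift rule gives the exact derivative df/dφ = [f(φ+π/2) − f(φ−π/2)]/2, with no finite-difference truncation error; this single-qubit example is illustrative of the general rule stated above.

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)

def ry(phi):
    c, s = np.cos(phi / 2), np.sin(phi / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def expval(phi):
    # <0| Ry(phi)^dagger Z Ry(phi) |0> = cos(phi)
    psi = ry(phi) @ np.array([1, 0], dtype=complex)
    return np.real(psi.conj() @ Z @ psi)

def param_shift_derivative(phi):
    # exact derivative via two shifted circuit evaluations
    return 0.5 * (expval(phi + np.pi / 2) - expval(phi - np.pi / 2))

# agrees with the analytic derivative -sin(phi) to machine precision
for phi in (0.1, 0.7, 2.3):
    assert abs(param_shift_derivative(phi) - (-np.sin(phi))) < 1e-12
```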
One of the aims of the invention is to define the conditions for the quantum circuit to represent the solution of differential equations, generally written as F[{dnƒ/dxn}n, {ƒm(x)}m]=0, where the functional F[⋅] is provided by the problem. This demands that derivatives and nonlinear functions give a net zero contribution. Hence, solving the differential equations may be written as an optimization problem using a loss function ℒθ[dxƒ, ƒ, x]. This corresponds to minimization of the functional F evaluated at a chosen set of points. Once the optimal variational parameters are found, the solution from Eq. (1) can be produced as a function. Hence, the embodiments in this application:
For the latter point, it differs from amplitude encoding |u⟩ used in HHL and related methods, where extracting the full solution from the amplitudes is exponentially costly and requires tomographic measurements.
One of the aims of the invention is to construct circuits that can work for quantum processors with limited computational power, meaning with the gate depth (the number of operations to be performed in series) being limited. The gate depth largely determines the training procedure relied upon in the classical optimization loop. To alleviate the reduced-depth problem, it is also possible to exploit parallel training strategies for the quantum circuit and quantum state encoding, coming closer to the ideal quantum operation regime.
Below, quantum circuits are described that may be used to build differentiable circuits representing solutions of differential equations. The quantum circuits include quantum feature maps and their derivatives; variational quantum circuits (ansatz); cost functions that define trial functions; and loss functions that are used in the optimization loop. Additionally, boundary handling techniques, regularization schemes, and a complete optimization schedule are described.
A quantum feature map is a unitary circuit Ûφ(x) that is parametrized by the variable x and a typically nonlinear function φ(x). Acting on the state, it realizes a map x ↦ Ûφ(x)|∅⟩ such that the x-dependence is translated into quantum state amplitudes. This is also referred to as a latent space mapping. Different ways of feature map encoding exist. Below, various examples are described, including a Chebyshev quantum feature map that allows highly nonlinear functions to be approximated. The procedure of feature map differentiation, an important step in constructing quantum circuits for solutions of differential equations, is also described.
In a first embodiment, a product feature map may be used that uses qubit rotations.
Preferably, the product feature map has a nonlinear dependence on the encoded variable x. In the simplest case, this may correspond to a single layer of rotations. Such a product feature map may be described by the following expression:
where N′≤N is the number of qubits used by the quantum computer for the encoding. Further,
is a Pauli rotation operator for Pauli matrices P̂α,j = X̂j, Ŷj or Ẑj (α = x, y, z, respectively) acting at qubit j with phase φ. As rotations on different qubits j are considered, the symbol ⊗j denotes the tensor product. This type of feature map circuit is also used in Quantum Circuit Learning. The next step is to assign a nonlinear function for the rotation. In an embodiment, the nonlinear function may be selected as φ(x)=arcsin x and α=y, such that only real amplitudes are generated. The unitary operator of Eq. (5) may then be rewritten as:
leading to amplitudes that depend on the encoded variable as cos[(arcsin x)/2] and sin[(arcsin x)/2]. Acting on the initial state |∅⟩, this feature map may encode the variable as an N-th degree polynomial formed by {1, x, √(1−x²)} and their products [QCL]. The redundancy from many qubits thus forms a basis set for function fitting [S-S]. It is noted that while the basis set can be continuously improved by adding more rotations, this choice of basis is well-suited for linear and quadratic functions, but often lacks expressive power.
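The amplitude structure stated above can be verified directly: a single-qubit Ry(arcsin x) rotation acting on |0⟩ produces the real amplitudes cos[(arcsin x)/2] and sin[(arcsin x)/2], so an N-qubit product map yields polynomials in {1, x, √(1−x²)}. A sketch:

```python
import numpy as np

def ry(phi):
    # real-amplitude Ry rotation
    c, s = np.cos(phi / 2), np.sin(phi / 2)
    return np.array([[c, -s], [s, c]])

x = 0.6
psi = ry(np.arcsin(x)) @ np.array([1.0, 0.0])

# amplitudes match cos[(arcsin x)/2] and sin[(arcsin x)/2]
assert np.allclose(psi, [np.cos(np.arcsin(x) / 2), np.sin(np.arcsin(x) / 2)])

# the <Z> readout |psi_0|^2 - |psi_1|^2 equals cos(arcsin x) = sqrt(1 - x^2)
assert abs((psi[0] ** 2 - psi[1] ** 2) - np.sqrt(1 - x ** 2)) < 1e-12
```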
The product feature map can be generalized to several layers of rotations ℓ = 1, 2, . . . , L, various nonlinear functions, and specific subsets of qubits N, written as:
Below, an example is provided of how the quantum feature map can be differentiated, e.g. the example in Eq. (4) wherein α=y rotations and a full layer are considered. The derivative of the unitary operator generated by any involutory matrix (a length-1 Pauli string in this case) can be written as:
where Euler's formula can be used to rewrite the derivative into the form of a sum of unitaries, in which the x-dependent rotations are shifted one by one. Next, the formula may be generalized to the expectation value of any observable Ĉ for the encoded state, following the steps of the standard parameter shift rule. This reads:
where the + and − terms are the sums of shifted unitaries:
The corresponding derivative quantum circuits (DQC) are shown in
In another embodiment, a nonlinear quantum feature map may be used which may be referred to as the Chebyshev feature map. Belonging to the product feature map family, this feature map drastically changes the basis set for function representation. As a building block a single qubit rotation {circumflex over (R)}y,j(φ[x]) may be used, but with nonlinearity introduced as φ(x)=2n arccos x, n=0, 1, 2, . . . , such that the encoding circuit reads:
Here it is considered that the coefficient n[j] may in general depend on the qubit position. The seemingly small change of a factor-two multiplication goes a surprisingly long way. Namely, expanding the rotation using Euler's formula gives:
The resulting decomposition of equation (11) corresponds to a unitary operation with matrix elements defined through degree-n Chebyshev polynomials of the first and second kind, denoted as Tn(x) and Un(x). Formally, Chebyshev polynomials represent a solution of the Chebyshev differential equation:
and wherein A, B are some constants. Chebyshev polynomials of the first kind for low degrees can be written explicitly as T0(x)=1, T1(x)=x, T2(x)=2x²−1, T3(x)=4x³−3x, and higher degrees can be deduced using the recursion relation Tn+1(x)=2xTn(x)−Tn−1(x). (14)
Similarly, second-kind Chebyshev polynomials can be written as U0(x)=1, U1(x)=2x, and Un+1(x)=2xUn(x)−Un−1(x). The crucial properties of Chebyshev polynomials are their chaining properties, nesting properties, and simple differentiation rules. The chaining properties for polynomials of the first and second kind read 2Tm(x)Tn(x)=Tm+n(x)+T|m−n|(x) and Um(x)Un(x)=Σk=0..n Um−n+2k(x), respectively. Derivatives can be obtained as dTn(x)/dx=nUn−1(x). Nesting corresponds to the relation Tn(Tm(x))≡Tnm(x). Polynomials of different kinds can be converted as Un(x)=2Σj≤n Tj(x) with j running over odd values for odd n, and Un(x)=2Σj≤n Tj(x)−1 with j running over even values for even n. Finally, it is noted that Chebyshev polynomials are oscillating functions when defined on the interval x ∈ (−1,1), and their derivatives diverge at the boundaries of this interval.
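The quoted identities can be checked numerically via the trigonometric definitions Tn(x)=cos(n·arccos x) and Un(x)=sin((n+1)·arccos x)/sin(arccos x); a short sketch:

```python
import numpy as np

def T(n, x):
    # Chebyshev polynomial of the first kind
    return np.cos(n * np.arccos(x))

def U(n, x):
    # Chebyshev polynomial of the second kind
    th = np.arccos(x)
    return np.sin((n + 1) * th) / np.sin(th)

x = np.linspace(-0.95, 0.95, 50)

# recursion: T_{n+1} = 2x T_n - T_{n-1}
assert np.allclose(T(4, x), 2 * x * T(3, x) - T(2, x))

# chaining: 2 T_m T_n = T_{m+n} + T_{|m-n|}
assert np.allclose(2 * T(2, x) * T(3, x), T(5, x) + T(1, x))

# derivative rule dT_n/dx = n U_{n-1}, checked by central differences
h = 1e-6
dT3 = (T(3, x + h) - T(3, x - h)) / (2 * h)
assert np.allclose(dT3, 3 * U(2, x), atol=1e-5)
```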
The power of the described representation can be inferred from approximation theory, which states that any smooth function can be approximated as ƒ(x)=Σn=0..∞ AnTn(x), |x|≤1. Chebyshev polynomials form an optimal set of basis functions in the sense of the uniform L∞ norm. This is why they are at the foundation of spectral algorithms for solving ODEs, and also give an edge in quantum simulation.
In examples described below, two types of Chebyshev quantum feature maps are considered. The first version corresponds to a sparse Chebyshev feature map defined as:
where the encoded degree is homogeneous and equal to one. Here the chaining properties of Tn(x) and Un(x) should be kept in mind: once states with Chebyshev polynomials as pre-factors are created, the basis set grows further by concatenating elements. In the following, the 'sparse' distinction is dropped and Eq. (15) is simply referred to as the Chebyshev feature map. The second version corresponds to a Chebyshev tower feature map, which may be defined as:
where the encoded degree grows with the qubit number, creating a tower-like structure of polynomials with increasing degree n=j. Again, as polynomials chain together and morph between the two kinds and their degrees, the basis set is largely enriched. This is the choice that is exploited when large expressibility is needed without increasing system size and number of rotations. Eq. (16) allows the representation of generic functions, and can be improved further by using layers of rotations as in Eq. (6).
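A minimal single-qubit sketch of the tower construction, assuming (as in the DQC literature, since Eq. (16) is not reproduced in the text) that qubit j receives a rotation by angle 2j·arccos(x): the |0⟩ amplitude of RY(2n arccos x)|0⟩ is then exactly Tn(x)=cos(n arccos x).

```python
# Single-qubit illustration of Chebyshev encoding via rotation angles
# (assumed gate choice: R_y(2 n arccos x); not the disclosure's exact circuit).
import numpy as np

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def chebyshev_T(n, x):
    return np.cos(n * np.arccos(x))   # T_n(x) for |x| <= 1

x = 0.3
for n in range(1, 5):
    state = ry(2 * n * np.arccos(x)) @ np.array([1.0, 0.0])
    assert np.isclose(state[0], chebyshev_T(n, x))   # amplitude equals T_n(x)
```

For N qubits the tensor product of such rotations carries the full tower T1(x),...,TN(x) as amplitude pre-factors, which the chaining relations then enrich further.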
Product feature maps may induce nonlinear mappings between variable(s) x and quantum states described by tensor products of separate single-qubit wavefunctions. These states are limited to the subspace of product states. To utilize the power of the entire Hilbert space of the system, approaching the amplitude encoding case, distinct amplitudes need to be populated independently, including the subspace of entangled states. To make the described feature maps even more expressive, it is suggested to enhance the product feature maps (and specifically the layered Chebyshev map) with additional entangling layers represented by Hamiltonian evolution. Namely, after the set of single-qubit rotations another unitary exp(−iĤτ) may be considered, which acts for time τ and is generated by the Hamiltonian Ĥ. The sketch of the circuit is shown in
One important choice is when Ĥ corresponds to a hard problem from the NP-hard complexity class, as proposed. Then, using two layers of rotations plus evolution, the embedding becomes difficult to simulate classically, but can be implemented as unitary evolution on a quantum computer. The evolution-enhanced feature map can also be seen through the prism of recently proposed Fourier feature maps, which are a class of quantum feature maps based on the evolution exp(−iĤdata,Jx), applied for the qubits in J. The Fourier map allows functions to be encoded as Fourier series defined by the differences of the eigenvalues of Ĥdata. The evolution-enhanced feature map then joins the Chebyshev and Fourier basis sets, encoded in the full Hilbert space for complex Ĥ.
In another embodiment, a variable may be encoded using a feature map by transforming it into the canonical amplitude encoding form. This relates x, written in binary form, to a computational basis state in binary representation. The corresponding feature map Ûφ(x) to encode the binary variable x reads
where {xj} denote binary values for the parameter x in j-th digit. The differentiation of the amplitude-encoding feature map then relies on the product rule for N rotations, and also includes the binary derivative of the variable from the product rule.
In a further embodiment, a variable may be converted into the decimal representation as xint=Σjxj·2j. For the reverse procedure, each binary digit xj can be identified from the remainders of repeated division of xint by powers of two. Ûφ(x) thus can be rewritten as a function of xint, and circuits with this feature map can be differentiated with respect to x=xint.
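The binary/decimal conversion used by the amplitude-encoding map is elementary; a short sketch (helper names are illustrative, not from the disclosure):

```python
# Binary <-> decimal conversion: x_int = sum_j x_j * 2^j, digits recovered
# by repeated division by two (least-significant digit first).
def to_bits(x_int, n_qubits):
    # j-th binary digit of x_int
    return [(x_int >> j) & 1 for j in range(n_qubits)]

def from_bits(bits):
    return sum(b * 2**j for j, b in enumerate(bits))

x_int = 13                       # binary 1101
bits = to_bits(x_int, 4)
assert bits == [1, 0, 1, 1]      # x_0=1, x_1=0, x_2=1, x_3=1
assert from_bits(bits) == 13     # round trip recovers x_int
```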
Amplitude-encoding feature maps offer a powerful technique when dealing with functions of discrete variables and functions encoded as quantum wavefunctions (rather than expectation value). They can give an advantage in terms of compressing data to a quantum register.
To construct the solution of the differential equation as a quantum circuit, the latent space basis functions need to be manipulated to bring both the derivatives and the function to the required form. This is achieved through the variational circuit Ûθ, typically referred to as a quantum ansatz. Below various detailed architectures are described.
In an embodiment, a variational circuit Ûθ comprising one or more layers of parametrized rotations may be selected. These layers may be followed by layers of CNOT operations. This is known as a hardware-efficient ansatz (HEA), which was proposed for variational quantum eigensolver (VQE) schemes for chemistry applications. The structure of a HEA quantum circuit corresponds to concatenated layers of single-qubit rotations and global entangling layers acting on all N qubits or at least a large part thereof.
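A minimal statevector sketch of one HEA layer, assuming RY rotations followed by a chain of CNOTs (the disclosure leaves the exact gate set open); pure NumPy, no quantum SDK required:

```python
# One hardware-efficient ansatz layer on an N-qubit statevector (sketch).
import numpy as np

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def apply_1q(state, gate, qubit, n):
    # contract a 2x2 gate into the given qubit axis of the state tensor
    psi = state.reshape([2] * n)
    psi = np.moveaxis(np.tensordot(gate, np.moveaxis(psi, qubit, 0), axes=1), 0, qubit)
    return psi.reshape(-1)

def apply_cnot(state, ctrl, targ, n):
    # flip the target qubit on the control = |1> slice
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[ctrl] = 1
    axis = targ - 1 if targ > ctrl else targ
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=axis).copy()
    return psi.reshape(-1)

def hea_layer(state, thetas, n):
    for q in range(n):
        state = apply_1q(state, ry(thetas[q]), q, n)
    for q in range(n - 1):            # entangling chain of CNOTs
        state = apply_cnot(state, q, q + 1, n)
    return state

n = 3
state = np.zeros(2**n); state[0] = 1.0
state = hea_layer(state, [0.3, 1.1, -0.7], n)
assert np.isclose(np.linalg.norm(state), 1.0)   # all operations are unitary
```

Depth-d circuits are obtained by composing `hea_layer` d times with independent angle sets.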
In a further embodiment, an alternating blocks ansatz (ABA) may be used, where instead of global entangling layers separate subblocks are used, interleaved into a checkerboard form as shown in
It is noted that the choice of ansatz may be sensitive to the choice of cost function operator, as dictated by the symmetry. Namely, as a consequence of the cost function choice, non-commuting generators Ĝj for the variational ansatz need to be selected such that [Ĝj, Ĉ]≠0, where a generator is defined by rewriting a general unitary as Ûj=exp(−iθjĜj). Here, a generator refers to a quantum operator that acts on one or more of the qubits in a controlled fashion. It is quite a generic concept and covers basically any circuit, gate etc. This ensures that the solution space can be spanned. Also, symmetries may be taken into account, reducing the Hilbert space for the search. In many cases generators can be chosen such that only real amplitudes are generated. An adaptive strategy or a genetic search may also be used.
Further, to search for the optimal circuit parameters, a stochastic gradient descent scheme may be used, and specifically its adaptive version represented by Adam. For this, one or more gradients of the variational circuit ∇θ may be measured using the automatic differentiation approach. Choosing an ansatz parametrized by single-qubit rotations allows the application of the parameter shift rule, while overlap measurement opens up options for more general strategies.
To read out information a Hermitian operator may be selected to measure an observable. In general, different choices are available. In an embodiment, a Hermitian operator may comprise the magnetization of a single qubit j, {circumflex over (Z)}j. This suits functions with range bounded to [−1,1]. In another embodiment, a Hermitian operator comprising the total magnetization of the system Ĉ=Σj{circumflex over (Z)}j (with equal weights) may be selected.
In further embodiments, the cost may be selected as a quantum Hamiltonian that has a provably complex spectrum and, for instance, belongs to the ergodic phase. Such a Hermitian operator may comprise an Ising Hamiltonian with additional transverse and longitudinal magnetic fields,
where the Ising couplings Jj,j+1 and hjz,x can be inhomogeneous. For cost functions with several non-commuting groups of observables the Hamiltonian averaging procedure may be used, where term-by-term measurement may be performed. Instead of nearest-neighbour Hamiltonians, in an embodiment, spin-glass type cost functions may be used, which may have the form
These are known to include NP-hard problem instances, and allow for high expressibility of the circuit describing the DEs solution. Finally, a generic cost function may comprise a large set of Pauli strings, similar to, for instance, quantum chemistry.
Together with measuring individual cost functions, cost functions represented by a classically weighted sum of observables may also be considered. Such a cost function may have the form
where αl∈R are weighting coefficients, and Ĉl are cost functions that can be chosen from the pool of operators described above. The coefficients may be tunable, such that the gradient descent (represented by Adam in our case) can adjust the cost to have optimal form. This procedure further improves the strength of the hybrid quantum-classical workflow.
To solve the system of differential equation a means needs to be provided that allows to measure (e.g. in terms of a distance) how well the differentiable quantum circuit matches the conditions to be the solution of the problem being considered. The classical optimiser then updates the parameters to reduce this distance. This distance corresponds to the difference between the differential equation and zero evaluated at a set of points, as well as matched initial and boundary conditions. This can be reformulated as an optimization problem for a loss function of derivatives and functions evaluated at the grid of points.
A loss function parametrized by variational angles θ in the following example form may be used
ℒθ[dxƒ,ƒ,x]=ℒθ(diff)[dxƒ,ƒ,x]+ℒθ(boundary)[ƒ,x], (20)
where the loss contribution from matching the differentials, ℒθ(diff), and the loss contribution from satisfying the boundary conditions, ℒθ(boundary), are split. The differential loss is defined as
with L (a, b) being a function describing how the distance between the two arguments a and b is being measured. The loss may be estimated on a grid of M points, and is normalized by the grid size. Functional F corresponds to the differential equation written in the form F[dxu,u,x]=0. It can be evaluated by combining values of ƒ and dxƒ at the training grid points. The functional includes information about all differential equations when dealing with the system, such that contributions from all equations are accounted for. The boundary loss contribution reads
ℒθ(boundary)[ƒ,x]=ηL(ƒθ(x0),u0), (22)
which includes the distance between the function value at the boundary x0 and given boundary value u0. It is noted that x0 can be an initial point or a set of boundary points. A boundary pinning coefficient η may be used to control the weight of the boundary term in the optimization procedure. In particular, larger η>1 may be used to ensure the boundary is prioritized and represented to higher precision.
Different choices of the loss defined by three distance definitions L may be used. In an embodiment, a loss type corresponding to the mean square error (MSE) may be used:
L(a,b)=(a−b)2. (23)
While being simple, MSE performs sufficiently well in numerical simulations. In a further embodiment, a mean absolute error (MAE) may also be used as a loss defined with distance L(a, b)=|a−b|.
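The grid-based differential loss with the MSE distance of Eq. (23) can be sketched for a classical trial function standing in for the DQC-evaluated ƒ and dxƒ. The example problem F[dxƒ,ƒ,x]=dƒ/dx+ƒ=0 is a hypothetical choice for illustration only:

```python
# Sketch of the differential loss on a training grid, using the MSE distance.
import numpy as np

def mse(a, b):
    return (a - b) ** 2                 # L(a, b), Eq. (23)

def differential_loss(f, dfdx, grid):
    residual = dfdx(grid) + f(grid)     # F[d_x f, f, x] for the toy problem
    return np.mean(mse(residual, 0.0))  # normalized by the grid size M

grid = np.linspace(0.0, 1.0, 20)
exact = lambda x: np.exp(-x)            # true solution of df/dx + f = 0
loss_exact = differential_loss(exact, lambda x: -np.exp(-x), grid)
loss_wrong = differential_loss(np.cos, lambda x: -np.sin(x), grid)
assert loss_exact < 1e-30               # exact solution zeroes the loss
assert loss_wrong > loss_exact          # a wrong trial scores worse
```

In the quantum workflow the values ƒ(xi) and dxƒ(xi) come from circuit expectation measurements rather than closed-form callables.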
In further embodiments, several more complex metrics may be used, including the variants of Kullback-Leibler (KL) divergence and Jensen-Shannon divergence. These are well known loss functions that are routinely used in statistical modelling.
The choice of loss function dictates how the optimizer perceives the distance between vectors and therefore affects the convergence. MSE places a greater emphasis on larger distances and smaller weight on small distances, strongly discouraging terms with large L. Both MAE and KL do not place such an emphasis and may have slower convergence. However, once close to the optimal solution they can achieve higher accuracy than MSE. KL has an additional incentive for keeping the magnitude of the first argument low, which works well for the differential loss term, as one wants to match the differential equation to zero.
When constructing a quantum circuit that satisfies the differential equations, together with matched derivatives, it needs to be ensured that an initial value or boundary value problem is solved. Generally this corresponds to setting the function value at the required initial point or at the collection of boundary points, thus resembling the quantum circuit learning task [QCL]. At the same time, there are several ways in which the DQC-based function ƒθ(x) can be constructed, leading to varying performance and specific pros/cons when solving particular problems.
Information about the boundary can be included as part of the loss function as defined by Eq. (20). For a MSE loss function type the boundary part Eq. (22) can be written in the form
ℒθ(boundary)[ƒ,x]=η(ƒθ(x0)−u0)2, (24)
where x0 represents the set of boundary points (or an initial point), and u0 is a vector of boundary values, and η is a pinning coefficient as described above.
In an embodiment, information about the boundary may be included in the expectation of the cost function. This may be referred to as pinned boundary handling. This corresponds to simply choosing a cost operator Ĉ, and representing the solution in the form
ƒ(x)=⟨ƒφ,θ(x)|Ĉ|ƒφ,θ(x)⟩. (25)
The initial value u0 is then matched via the boundary term in the loss function. The strength of the pinned boundary handling is in the equivalent treatment of boundary and derivative terms, both being encoded in the eigenspectrum of Ĉ. At the same time, it requires adjusting the boundary value starting from the one represented by the initial θinit, typically generated randomly. This can be adjusted by shifting ƒ(x) by a constant-times-identity term added to the cost operator, Ĉ=α0𝟙+Σj=1MαjĈj, where α0 is set such that for θinit˜random[0,2π] the function ƒφ,θ
In a further embodiment, a boundary handler may correspond to iteratively shifting the estimated solution based on the boundary or initial point. For this method the boundary information does not require a separate boundary loss term, nor is it encoded in the expectation of the cost function. Instead it is set iteratively within the parametrisation of the function. This method does lead to additional terms in the function and in the calculated derivatives, and so information about the boundary is contained within ℒθ(diff)[dxƒ, ƒ, x] itself. The function may be parameterized as
ƒ(x)=ƒb+⟨ƒφ,θ(x)|Ĉ|ƒφ,θ(x)⟩, (26)
with ƒb∈R being a parameter adjusted after each iteration step as
ƒb=u0−⟨ƒφ,θ(x0)|Ĉ|ƒφ,θ(x0)⟩. (27)
This effectively allows the solver to find a function ⟨ƒφ,θ(x)|Ĉ|ƒφ,θ(x)⟩ which solves the differential equation shifted to any position, which is then shifted to the desired initial condition as shown in Eq. (26). This method of boundary handling guarantees exact matching to the given initial values and does not require a separate boundary term in the loss function, thus the derivative loss term does not have to compete with the boundary loss. Furthermore, as the cost function may be allowed to match the solution shifted by any amount, this simplifies the choice of optimal angles and removes the dependence on the initial θinit. However, this method does require an exact initial value, which can be an issue in specific situations.
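The floating-boundary shift of Eqs. (26)-(27) is a purely classical post-processing step; a toy callable stands in for the circuit expectation ⟨ƒφ,θ(x)|Ĉ|ƒφ,θ(x)⟩ in the sketch below:

```python
# Floating boundary handling: shift the raw expectation so that the boundary
# value is matched exactly (sketch; 'raw' is a stand-in for the circuit).
import numpy as np

def floating_boundary(raw_expectation, x0, u0):
    f_b = u0 - raw_expectation(x0)               # Eq. (27), updated each iteration
    return lambda x: f_b + raw_expectation(x)    # Eq. (26)

raw = lambda x: np.sin(2 * x)        # placeholder circuit expectation
f = floating_boundary(raw, x0=0.0, u0=1.0)
assert np.isclose(f(0.0), 1.0)       # boundary matched exactly by construction
```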
In a further embodiment, a boundary handling technique may be used that relies on the classical shift of the solution, but defined by the gradient descent procedure on par with the variational angle optimization. This removes the need to include boundary information in the cost expectation, but the information still needs to be included in the loss function, whether via a boundary loss term or regularisation. Thus, a solution in the form
ƒ(x)=ƒc+⟨ƒφ,θ(x)|Ĉ|ƒφ,θ(x)⟩, (28)
is sought, where ƒc∈R is a variational parameter optimized alongside the quantum ansatz angles and updated accordingly via the classical optimiser; therefore, the gradients for ƒc have to be calculated additionally when using this boundary handler. A strength of the described method is that, due to the classical shift, even if the random initial angles start such that ƒφ,θ
Given that the aim is to find a variational spectral representation of the differential equations solution using large basis sets, the optimization procedure benefits from having a good initial guess, or “pre-trained” DQCs. This can be achieved by introducing the regularization procedure, also helping avoid getting the optimizer trapped in local minima.
The variants of the regularization procedure include: 1) feeding-in prior information about the potential solution; 2) biasing the DQC-based solution into a specific form; 3) searching for the solution in the region close to the boundary values, and feeding-in points from the first training into the next sessions. The inputs for procedures 1) and 2) include regularization points for the variable(s) {xreg}r=1R, together with corresponding function values {ureg}r=1R for R points. Similarly, regularization based on derivative values may be considered. A simple strategy may be employed wherein an additional contribution to the loss function comes from the regularization points, ℒθ(reg)[ƒ,x]. This loss is defined such that the DQC-based function matches the regularisation values at the corresponding grid points. This has a form analogous to the boundary loss contribution. Using the MSE loss as an example, the regularization contribution reads
where ζ(nj) is introduced as an iteration step-dependent regularization weight, thus defining an optimization schedule. In general, higher emphasis on the regularisation-based training may be required at initial stages, which shall diminish to zero at higher iteration numbers. This allows prior information to be used at first, setting a rough solution or preferred function behaviour, followed by precise derivative loss optimisation at later training stages.
One possible choice of an optimization schedule corresponds to a linearly decreasing regularization weight, ζ(nj)=1−nj/niter, where nj is the current iteration number and niter the maximum iteration number. This strategy works for small learning rates and a large number of iterations, such that the optimizer has sufficient "time" to adjust to the constantly changing loss landscape. Another choice corresponds to the reverse sigmoid optimization schedule, where a smooth drop of the regularization weight is performed at a pre-defined training stage. This schedule may be parameterized as
where ndrop denotes the iteration step number at which regularisation weight drops, and δj assigns the transition rate. This allows the DQC to initially focus almost entirely on the regularization optimization, later switching the focus on the gradient optimization.
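Both schedules can be sketched as below. The linear form is as stated in the text; the reverse-sigmoid expression is an assumed parametrization consistent with the description (a smooth drop around ndrop with transition rate δ), since the display equation is not reproduced here:

```python
# Regularization weight schedules zeta(n_j) (linear form from the text;
# reverse-sigmoid form is an assumed parametrization).
import numpy as np

def linear_schedule(n_j, n_iter):
    return 1.0 - n_j / n_iter

def reverse_sigmoid_schedule(n_j, n_drop, delta):
    return 1.0 / (1.0 + np.exp((n_j - n_drop) / delta))

n_iter, n_drop, delta = 600, 150, 10.0
assert np.isclose(linear_schedule(0, n_iter), 1.0)          # full weight at start
assert np.isclose(linear_schedule(n_iter, n_iter), 0.0)     # zero at the end
assert reverse_sigmoid_schedule(0, n_drop, delta) > 0.999   # focus on regularization
assert reverse_sigmoid_schedule(2 * n_drop, n_drop, delta) < 1e-3  # then switch off
```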
Finally, using the elements and strategies described above, a workflow for constructing the differential equation solver based on derivative quantum circuits is described.
The derivative quantum circuits may be determined 611 and their expectation value dC(x, θ)/dx is estimated 610 for the specified cost function, at point xj. Repeating the procedure 606 for all xj in {X}, function values and derivative values may be collected, and the loss function is composed for the entire grid and system of equations (forming required polynomials and cross-terms by classical post-processing) as shown in 612. The regularization points are also added, forcing the solution to take specific values at these points. The goal of the loss function is to assign a “score” to how well the potential solution (parametrized by the variational angles θ) satisfies the differential equation, matching derivative terms and the function polynomial to minimize the loss.
With the aim to increase the score (and decrease the loss function), the gradient of the loss function 612 with respect to the variational parameters θ is also computed. Using the gradient descent procedure (or in principle any other classical optimization procedure 614), the variational angles may be updated from iteration nj into the next one nj+1 in step 616: θ(nj+1)=θ(nj)−α∇θℒθ(nj), where α is the learning rate.
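The classical update loop of steps 612-616 can be sketched as follows. A toy quadratic loss stands in for the DQC loss, plain gradient descent stands in for Adam, and the finite-difference gradient stands in for the parameter-shift rule:

```python
# Sketch of the classical optimization loop over variational angles theta.
import numpy as np

def loss(theta):
    return np.sum((theta - 1.0) ** 2)   # toy stand-in for the DQC loss

def grad(theta, eps=1e-6):
    # central finite differences; on hardware this would be replaced by
    # parameter-shift gradient measurements
    g = np.zeros_like(theta)
    for k in range(theta.size):
        e = np.zeros_like(theta); e[k] = eps
        g[k] = (loss(theta + e) - loss(theta - e)) / (2 * eps)
    return g

theta = np.random.default_rng(0).uniform(0, 2 * np.pi, size=4)
alpha = 0.1                              # learning rate
for n_j in range(200):                   # theta^(n_j+1) = theta^(n_j) - alpha * grad
    theta = theta - alpha * grad(theta)
assert loss(theta) < 1e-8                # converged to the minimizer
```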
Once the classical loop is exited, the solution is chosen as the circuit with angles θopt that minimize the loss. Finally, the full solution is extracted by sampling the cost function for the optimal angles, ⟨uφ,θ(x)|Ĉ|uφ,θ(x)⟩. Notably, this can be done for any point x, as the DQC constructs a solution valid also beyond (and between) the points at which the loss is originally evaluated.
Hereunder it is described how the algorithm performs in practice. For this, a differential equation with a known analytical solution may be selected and compared to the one obtained by the derivative quantum circuit. A single ODE for the initial value problem is chosen, which reads
where λ and κ are equation parameters, and u0 sets the value of the function u at x=0. Eq. (31) has a solution in the form of a damped oscillating function,
u(x)=exp(−κλx)cos(λx)+const, (32)
where the constant const is determined by the initial condition. While the problem is fairly simple, being a single ODE, reproducing the damped oscillating solution requires a rich basis that needs to include both oscillatory and increasing/decreasing functions. As λ and κ grow, the function starts to oscillate and decay rapidly, and the solution becomes harder to express.
To show how the proposed method works, derivative quantum circuits are used to solve Eq. (31) using optimization of differentiable quantum feature maps. Specifically, two cases with parameters λ=8 and λ=20 are selected, with fixed κ=0.1, u0=1. Note that the parameters are chosen to make the DQC construction challenging, with λ=20 being a complex case as the resulting solution is highly nonlinear and oscillatory. An optimization grid of 20 points may be considered, starting from x=0, with a maximal time of 0.9 (dimensionless units are used). To find the solution a quantum register with N=6 qubits may be used, and the cost function is chosen as the total magnetization in the Z direction, Ĉ=Σj=1N{circumflex over (Z)}j. For the variational circuit, a standard hardware-efficient ansatz described in this application may be chosen, setting the depth to d=5. To search for the optimal angles θopt, adaptive stochastic gradient descent is performed using Adam with automatic differentiation enabled by analytical derivatives. Specifically, the workflow was coded using a well-known package, which allows fast and efficient implementation. A full quantum state simulator is used in the noiseless setting. In this example the floating boundary handling is used.
The circuit-based solution was searched using three different feature maps described in this application, and their performance was compared. These correspond to the product feature map, the sparse version of the Chebyshev feature map, and the tower Chebyshev feature map. Results for these feature maps were labelled as Prod, Cheb, and ChebT, respectively. To assess the performance several metrics may be used. The first metric is the full loss (denoted as LF). It refers to the loss calculated from the differential equations and any boundary or regularisation terms, ℒθ[dxƒ, ƒ, x]=ℒθ(diff)[dxƒ, ƒ, x]+ℒθ(boundary)[ƒ, x]+ℒθ(reg)[ƒ, x]. The second metric corresponds to the differential loss LD, being the part of the full loss excluding the regularization contribution. Finally, the third metric is the quality of solution (LQ). The quality of solution is the distance of the current DQC-based solution from the known true solution. This is calculated by evaluating the DQC-based solution and the true solution at a set of points and using the MSE loss type, being equal to LQ=(1/M)Σj=1M[ƒ(xj)−u(xj)]2. Quality of solution gives a useful way to compare how two different training setups perform, especially if they are trained to solve the same differential equations.
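The LQ metric is a plain MSE against the analytical solution; a short sketch using the damped-oscillation solution of Eq. (32), with toy callables standing in for converged and unconverged circuits:

```python
# Quality-of-solution metric L_Q = (1/M) sum_j [f(x_j) - u(x_j)]^2.
import numpy as np

def quality_of_solution(f_dqc, u_true, points):
    return np.mean((f_dqc(points) - u_true(points)) ** 2)

lam, kappa = 8.0, 0.1
u_true = lambda x: np.exp(-kappa * lam * x) * np.cos(lam * x)
f_good = lambda x: u_true(x) + 1e-3 * np.sin(x)   # near-converged stand-in
f_bad = lambda x: np.ones_like(x)                 # unconverged stand-in

xs = np.linspace(0.0, 0.9, 50)
assert quality_of_solution(f_good, u_true, xs) < quality_of_solution(f_bad, u_true, xs)
```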
For λ=8 it is observed that both Chebyshev feature maps converge closely to the true solution. The more expressible Chebyshev tower feature map takes longer to converge but reaches a solution closer to the true solution. The less powerful product feature map fails to converge with the loss quickly plateauing.
For λ=20 the true solution is more oscillatory and has stronger damping, making it harder to represent. The product feature map still fails to converge, but now the simpler Chebyshev feature map also fails to converge. The full loss for both cases plateaus quickly. The more expressible ChebT feature map continues to perform well. This confirms that choosing a feature map expressible enough for the problem is important, and simulations with more qubits offer a way to increase the power drastically; however, more expressible feature maps can come with increased training time.
Performing the benchmarking for evolution-enhanced feature maps, it is observed that adding an evolution operator creates a large pool of fitting functions and helps to represent highly oscillatory functions, as implemented in quantum circuit learning. At the same time, DQCs in this case are more difficult to train, as keeping track of the gradients for oscillatory functions requires a finer training grid and has slower convergence. First, it is envisaged that τ[nj] can be set as a flowing parameter that increases from zero as the iteration number nj grows (similar to the annealing procedure in adiabatic quantum computing). This will ensure easy trainability at the start, followed by adjusting the θ's for a larger fitting function set providing extra precision. Another approach is considering τ as a variational parameter, initiated at some small value, where gradient descent is performed using numerical differentiation or by measuring wavefunction overlaps with the SWAP test.
Finally, the effect of ansatz depth for the variational circuit is compared. Here, d=3, 6, 12, 24, λ=20 and the Chebyshev tower feature map are used, but the rest of the training setup remains the same as previously considered. It is observed that for lower depths the solver is slower to converge and does not reach as high an accuracy as it does for higher depths. As the depth increases, more layers of parametrized gates are included in the variational ansatz and so the number of variational angle parameters increases. This causes an increase in the number of gate operations needed in each iteration and in how many parameters the classical optimiser needs to update, raising the time taken per iteration. As the depth of the ansatz continues to increase, eventually the problem of barren plateaus could be encountered. Then vanishing gradients would cause the solver to struggle to improve the parameters; however, at d=24, with over 400 variational parameters, these problems were not observed.
Building on the single ODE example, systems of differential equations may be considered, taking two strongly coupled differential equations as an example. These equations may describe the evolution of competing modes u1(x) and u2(x) as a function of variable x, which in this case corresponds to time. The associated rate equations read
where λ1,2 are coupling parameters and u1,0, u2,0 are initial conditions. The larger |λ1| is in comparison to |λ2|, the more strongly coupled the two equations will be. This can be intuitively seen by considering that larger |λ1| leads to a larger contribution of u2 than u1 to the equation for the du1/dx derivative, and vice versa.
To move from considering one differential equation to a system of equations we need to encode multiple functions using quantum circuits as described with reference to the embodiments in this application. For this specific example a simple parallel encoding may be selected, where separate cost functions and ansatzes are considered. In this case each function has a separate set of parameters to be optimised. The loss is changed accordingly to include information on separate contributions from two coupled differential equation that are optimised simultaneously. We encode each function using the differentiable feature map combined with individual variational ansatz parametrized by the set of angles θ1 and θ2 and before deciding on the boundary evaluation type we have
ƒ1(x)=⟨ƒφ,θ1(x)|Ĉ(1)|ƒφ,θ1(x)⟩,
ƒ2(x)=⟨ƒφ,θ2(x)|Ĉ(2)|ƒφ,θ2(x)⟩,
where Ĉ(1,2) are in principle different cost functions for each equation. For the loss function the sum of the MSE losses for the first and second differential equation is considered. This loss is written as ℒθ[dxƒ, ƒ, x]=Σi=1ML(F1[dxƒ, ƒ, xi], 0)+Σi=1ML(F2[dxƒ, ƒ, xi], 0), with F1 and F2 as written in Eqs. (33)-(34), and L depending on the loss choice as detailed in the Methods section. There can be additional boundary loss terms depending on the boundary evaluation method chosen. If present, they contribute to the loss function as a sum of the boundary terms for u1 and u2. Note that, as the loss is a sum of individual contributions coupled together, both are minimised simultaneously with equal weight. Due to the coupling between the two equations this could lead to competition between the two loss terms (a parameter update which would cause a loss decrease for one DE's loss may lead to an increase in the other). This may increase the chance of converging to a local minimum rather than the global minimum, which can be mitigated if the loss contributions are weighted in some way.
We define the problem setting parameters to λ1=5, λ2=3 and the initial conditions to u1,0=0.5, u2,0=0. We set up the training scheme using a six-qubit register, a cost choice of total magnetization in the Z direction for both u1 and u2, a hardware-efficient variational ansatz with depth d=5, the Adam optimizer with learning rate 0.02, and the Chebyshev tower feature map. We test the performance for the three boundary evaluation types: pinned boundary, floating boundary, and optimised boundary.
The pinned and optimised boundary handlers perform similarly, slowly converging to the analytical solution u1,2 (x) within 250 iterations. The two approaches have similar convergence in terms of the full loss, but differ in terms of quality of solution. When using floating boundary type a function close to the true solution is obtained (with LQ value of approximately 10−4). This difference in convergence rate is a result of boundary information having an impact on the loss, demanding the matching for pinned and optimized boundary handlers, whereas the floating boundary automatically matches the initial condition and no loss boundary term is needed. The consequence of competing terms in the loss function can be seen in the early oscillations of the full loss as shown in the figure.
An area where solvers for complex differential equations are much needed is fluid dynamics. In this case several outstanding models are hard to tackle due to their nonlinear nature and possible discontinuous solutions in the form of shock waves. Examples include Burgers' equation and the Navier-Stokes equations. We concentrate on the latter and show how one can approach it with the DQC solver.
The Navier-Stokes equations describe the flow of incompressible fluids. This highly nonlinear set of partial differential equations is used to model fluids, magnetoplasma, turbulence etc. It is heavily used in the aerospace industry and in weather forecasting. It can be derived from general principles. Namely, we consider fluid motion that obeys Newton's law and simply track the fluid mass passing through an (infinitesimal) volume. The general form of the Navier-Stokes equations can be presented as:
where (vx, vy, vz) are instantaneous velocities in the (x, y, z) directions, τ is the stress tensor, ƒ the body force per unit mass acting on the fluid element, ρ the density, and p the pressure. Here V is a velocity field.
The building blocks of the Navier-Stokes equations described above are the continuity equation for the density ρ, subject to the energy conservation and momentum conservation rules. Finally, as energy conservation is used, the equations can be rewritten in terms of one of the thermodynamic state functions, being temperature, pressure, enthalpy etc.
While general by themselves, the Navier-Stokes equations are usually solved in relevant limiting cases. Specifically, this can correspond to space reduction (2D, quasi-1D, 1D), isotropic/anisotropic media properties, and fluid properties (viscous or inviscid flow).
where Eq. (40) corresponds to the continuity equation, Eq (41) describes the energy conservation, and Eq. (42) stems from momentum conservation. A(x) corresponds to the spatial shape of the nozzle, and essentially is a potential as a function of the lateral coordinate x. γ describes the ratio of specific heat capacities, and is equal to 1.4 for the relevant case of air flow. Here nondimensional variables were used.
The problem with nozzle shape was set as follows:
A(x)=1+4.95(2x−1)2,0≤x≤3, (43)
and specify boundary conditions as
ρ(x=0,t)=1,T(x=0,t)=1,V(x=0)=0.1, (44)
for concreteness solving the initial value problem. The initial conditions also need to be specified if solving the dynamical problem. These may be chosen as follows:
ρ(x,t=0)=1−0.944x, (45)
T(x,t=0)=1−0.694x, (46)
V(x,t=0)=(0.1+3.27x)T(x,t=0)1/2. (47)
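The nozzle profile and initial conditions, Eqs. (43)-(47), are straightforward to evaluate on a spatial grid; the sketch below transcribes them directly:

```python
# Nozzle shape and initial conditions of the quasi-1D Navier-Stokes example.
import numpy as np

A = lambda x: 1.0 + 4.95 * (2.0 * x - 1.0) ** 2        # Eq. (43), nozzle shape
rho0 = lambda x: 1.0 - 0.944 * x                        # Eq. (45)
T0 = lambda x: 1.0 - 0.694 * x                          # Eq. (46)
V0 = lambda x: (0.1 + 3.27 * x) * T0(x) ** 0.5          # Eq. (47)

# boundary values at x = 0 match Eq. (44)
assert np.isclose(rho0(0.0), 1.0) and np.isclose(T0(0.0), 1.0)
assert np.isclose(V0(0.0), 0.1)
assert np.isclose(A(0.5), 1.0)   # minimum area (nozzle throat) at x = 0.5
```

The throat at x=0.5 is the region excluded from the second-stage training grid described below, where the steady-state equations become singular.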
First, the stationary state problem represented by Eqs. (40)-(42) with the conditions above is considered, and the time derivatives are equated to zero. Interestingly, when trying to solve the system for the steady state solution using various classical methods, implemented in Mathematica's NDSolve or Julia's software package, the calculations do not converge. The system proves to be challenging as it is stiff. Solutions may become unstable depending on the initial values, and specifically the input velocity. To understand the problem, it is instructive to rewrite the system of stationary Navier-Stokes equations in the form:
It is immediately observed that each function on the RHS diverges at the point x such that T(x) = V(x)². This leads to singular behaviour and breaks classical solvers, including those with stiff handling. At the same time, the full dynamical solution and its t→∞ extrapolation are possible. Graphs 1008 and 1010 of
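As a rough illustration of where the singular point lies, one can locate T(x) = V(x)² for the initial profiles of Eqs. (46)-(47); for those profiles the condition reduces to (0.1 + 3.27x)² = 1, i.e. x = 0.9/3.27 ≈ 0.275. A quick numerical check (illustrative, not from the original):

```python
import numpy as np

# Initial profiles, Eqs. (46)-(47)
T = lambda x: 1.0 - 0.694 * x
V = lambda x: (0.1 + 3.27 * x) * np.sqrt(T(x))

# Root of g(x) = T(x) - V(x)^2 marks the singular point of the RHS
g = lambda x: T(x) - V(x) ** 2
lo, hi = 0.0, 0.5            # g changes sign on this interval
for _ in range(60):          # simple bisection
    mid = 0.5 * (lo + hi)
    if g(lo) * g(mid) > 0:
        lo = mid
    else:
        hi = mid
x_sing = 0.5 * (lo + hi)     # ~0.275
```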
To construct the solution, a two-stage optimization approach may be employed, where the solution is first obtained at smaller x and later generalized up to the end of the nozzle. For the circuit, a six-qubit quantum register may be considered with a Chebyshev quantum feature map, keeping in mind that x ∈ [0, 1). The cost function may be chosen as the total magnetization Ĉ = Σ_{j=1}^{N} Ẑ_j. Because three functions {ρ(x), T(x), V(x)} are considered, three equations contribute to the loss function in a combined manner. Floating boundary handling may be used, where each curve is adjusted according to its starting value, chosen as ρ(0)=1, T(0)=1, V(0)=0.1. The variational ansatz is taken in the standard hardware-efficient form with depth d=6. Adam is used as a classical optimizer, and the learning rate is set to 0.01.
We proceed to search for the full solution that includes the divergent nozzle part for x>0.5 at the second training stage. At this stage we choose a grid of 40 points, where two regions, (0, 0.4) and (0.6, 0.9), with 20 points each are used. As discussed before, the key problem of the convergent-divergent nozzle in the subsonic-supersonic transition case is the divergence around the middle of the nozzle. This causes a major problem for classical solvers, which are unable to find the steady-state solution. The divergent contribution from this region also impacts the loss function and complicates training. However, by excluding the region around the nozzle throat, (0.4, 0.6), from the training, this problem is alleviated. We proceed with the same ansatz and boundary handling, and feed the variational angles from stage 1 as initial parameters corresponding to a sub-optimal solution. We employ weak regularization to ensure that the required subsonic-supersonic solution is selected. Namely, we use 20 points in the (0, 0.4) region, benefiting from the previously found solution (in general we have access to as many points as we need), and also add 5 points in the x∈(0.6, 0.9) region, representing a weak bias towards the desired solution. The training is performed for niter=600 iterations, with a learning rate of α=0.005 and a regularization switch-off function ζ(j) set to smoothly remove the regularization weight around j=150. The full solution is shown in graph 1104 of
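The switch-off function ζ(j) is not given here in closed form; a smooth sigmoidal decay centred at epoch j = 150 is one plausible choice (the sigmoidal form and its width are illustrative assumptions):

```python
import numpy as np

def zeta(j, j0=150, width=20.0):
    """Epoch-dependent regularization weight: ~1 for j << j0, ~0 for j >> j0.
    The sigmoidal form and the width are illustrative assumptions."""
    return 1.0 / (1.0 + np.exp((j - j0) / width))

# Schematic per-epoch loss at stage 2:
#   loss(j) = loss_de(j) + zeta(j) * loss_regularization(j)
```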
The embodiments described in this application define a general framework for solving general (systems of) differential equations using differentiable quantum circuits on gate-based quantum hardware. The method makes use of quantum feature map circuits to encode function values into a latent space, which allows considering spectral decompositions of the trial solutions to differential equations. The differentiable quantum circuits can accurately represent non-linear solutions using few qubits, by exploiting a number of spectral basis functions that is exponential in the number of qubits.
The embodiments show how analytical circuit differentiation can be used to represent function derivatives appearing in differential equations, and how loss functions can be constructed which aim to improve the prepared trial solution. The embodiments open up a new way of solving general, complex, (non-)linear systems; as an example, solutions to the Navier-Stokes equations were presented, but the same method can be applied across all disciplines where differential equations arise. Further, it is well known that the analytical gradients used here are much less susceptible to noise than numerical gradients evaluated on variational quantum circuits. Noisy black-box optimization algorithms for NISQ applications would therefore perform well in this method.
where the unitary operators correspond to shifted feature map circuits. Importantly, Eq. (A1) represents a linear combination of unitaries (LCU) acting on the initial state. While an LCU is a non-unitary operation in itself, its action can be simulated efficiently using an ancillary register and controlled operations.
First, the general implementation of LCU operations is shown. To implement the action of a sum of unitaries on the initial state, Ŝ|ψ0⟩ := Σ_l c_l Û_l|ψ0⟩, for real coefficients c_l > 0, an additional register with NA = ⌈log2(L)⌉ qubits is needed, together with an operator {circumflex over (V)} acting on the ancillas as {circumflex over (V)}|0⟩^{⊗NA} = (1/√c) Σ_l √(c_l)|l⟩, where c = Σ_l c_l,
where |⊥⟩ is an orthogonal state onto which the system is projected when the LCU implementation fails. Performing a measurement of the NA ancilla qubits, the system collapses to the normalized LCU state Ŝ|ψ0⟩ once all-zero bits are read out on the ancillary register. The probability of implementing the Ŝ|ψ0⟩ operation depends on the sum of expansion coefficients as p=1/c².
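The construction can be checked with a small statevector simulation (an illustrative NumPy sketch, not the hardware implementation; the ancilla register is not simulated explicitly, and the postselected system state is computed directly):

```python
import numpy as np

def lcu_apply(coeffs, unitaries, psi0):
    """Simulate the LCU postselection outcome for S = sum_l c_l U_l.

    After preparing the ancillas with V, applying select(U) and V^dagger,
    the (unnormalized) system state flagged by all-zero ancillas is
    (1/c) sum_l c_l U_l |psi0>, with c = sum_l c_l. Returns the normalized
    state and the postselection probability p = ||S|psi0>||^2 / c^2,
    which reduces to 1/c^2 when S|psi0> is itself normalized.
    """
    c = float(np.sum(coeffs))
    proj = sum((cl / c) * (U @ psi0) for cl, U in zip(coeffs, unitaries))
    p = float(np.vdot(proj, proj).real)
    return proj / np.sqrt(p), p

# Example: S = 0.5*I + 0.5*X acting on |0> yields (|0> + |1>)/sqrt(2)
I = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
psi, p = lcu_apply([0.5, 0.5], [I, X], np.array([1.0, 0.0]))
```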
Using the general LCU framework, known to provide exponential speed-ups for quantum simulation and matrix inversion, the function evaluated at a grid of points can be written as:
where the states of the ancillary register 1204 flag the application of the feature map at a specific point. Similarly, the derivative evaluated at the grid of points reads as follows:
with a slightly expanded register to accommodate the sum of unitaries representing the circuit derivative. This procedure can be generalized to higher-order derivatives. Importantly, exponentially many points can be encoded, as the number of states in {|i⟩} grows as 2^{NA}.
where l is a superindex for pairs of degrees spanning over the set. The next goal is to encode the information about the structure of F[⋅] in a quantum register. For a linear function, simply the unitary feature map is applied. For functions in higher powers, QNPUs need to be used that multiply states in a unitary fashion. Finally, monomials of the variable serve as weights. Using the same steps as before, we arrive at
implemented with an ancillary register 1204 (which may be different from the derivative part). Finally, the training needs to be performed using the two-part cost function ℒ = ℒdiff + ℒbound. First, functions in the latent space may be matched by maximizing their overlap as
ℒdiff,θ = (1 − |⟨dψ∀,θ|F∀,θ⟩|²), (A7)
and
ℒbound,θ = (u0 − ⟨ψ∀,θ(x=0)|Ĉ|ψ∀,θ(x=0)⟩)². (A8)
keeping in mind the θ parametrization. This overlap maximization can be achieved by using the SWAP test protocol, depicted schematically in 1220, 1222, 1224, 1225, 1226. By matching the two parts at all points simultaneously, the solution to the differential equation system can be found.
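The SWAP test estimates the squared overlap from the probability of reading the ancilla in |0⟩: P(0) = (1 + |⟨ψ|φ⟩|²)/2. A statevector-level sketch of the resulting loss term of Eq. (A7) (illustrative, not the circuit implementation):

```python
import numpy as np

def swap_test_p0(psi, phi):
    """Ancilla |0> probability in the SWAP test: (1 + |<psi|phi>|^2) / 2."""
    return 0.5 * (1.0 + abs(np.vdot(psi, phi)) ** 2)

def loss_diff(dpsi, F):
    """Eq. (A7)-style loss, 1 - |<dpsi|F>|^2, recovered from SWAP statistics."""
    overlap_sq = 2.0 * swap_test_p0(dpsi, F) - 1.0
    return 1.0 - overlap_sq

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0]) / np.sqrt(2.0)   # |<a|b>|^2 = 0.5
```

For identical states the loss vanishes, driving the trial derivative towards the functional F during training.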
Suggestions for the values of the parameters in the terms of equation 1320 may be provided as approximate values (rather than exact values). Initial suggested parameter values may be provided as regularization data for use during the training process. An example of a loss contribution that would effectively enable this is:
where ζ may be some epoch-dependent function similar to Eq. 30. The parameters 1320 are initialized at αi(0), and any deviation is initially penalized with this loss term in order to first push the solver towards solutions close to the original estimated DE form. In the final epochs of optimization, this loss contribution may be turned off completely, such that the DE hyperparameters α are fully variationally optimized to fit the observations.
The goal of solving the differential equation in such a setting is then achieved by also relying on experimental data/measurement points that are already known 1322. These can be just a few points, but they should be known with non-negligible certainty. In an embodiment, these data points may be added as parameter boundary conditions in a regression strategy wherein differential-equation matching, boundary-condition matching, and prior-knowledge data matching are ensured simultaneously. In this way, a generalized solution to the equation can be found, while simultaneously fine-tuning the values of the equation parameters to higher accuracy given the data. Hence, these embodiments enable parametrized differential equation solving using tunable variational equation parameters on an equal footing with circuit parameters.
where the dot indicates a vector inner product, wherein a parameter may be represented as a vector of coefficients {right arrow over (α)} and F is a vector of functionals of ƒ and x. In other words, the right-hand side (RHS) of this equation can be any linear combination of a set of library terms that are expected to be relevant to solving the differential equation. During the loss function optimization, both the function parametrization θ and the function coefficients {right arrow over (α)} are optimized, just like in the case of
In
Also, an additional loss term for weight-regularization purposes may be introduced:
ℒreg = ∥{right arrow over (α)}∥.
This loss term ensures that non-important or unlikely terms are suppressed in the optimization over {right arrow over (α)}, in some sense increasing sparsity in the structure. This is useful, as most processes in nature can be described with relatively few terms, and the challenge is often rather which of those terms should appear, and with what strength.
A trial model 1708 may be used to narrow down the generality, wherein the trial model may indicate that at least one of the terms, in this example the double derivative with respect to time, utt, certainly exists in the model. Further, a library of functionals φ(u) may be used, wherein the aim of the scheme is to learn their coefficients, a zero coefficient implying absence of the corresponding term. The loss function may be differentiated with respect to both θ 1710 and the coefficients {right arrow over (α)} 1711, so that both the function shape and the model fit can be optimized. Graph 1712 illustrates how DQC manages to find the correct parameters for the example of a damped harmonic oscillator within approximately 150 iterations of the algorithm. The final results are the coefficients of the trial model 1708 shown in 1714. Only the linear term and the first derivative with respect to time have non-zero coefficients, and those coefficients accurately reproduce the values used to generate the data in this case. This shows that one can effectively solve differential equations based on partial knowledge of the underlying model, combined with data/observations of the physical process, using DQC.
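The underlying regression step can be illustrated with a purely classical surrogate: generate data from a damped oscillator utt = −c·ut − k·u, build a library of candidate terms [u, ut], and fit the coefficients of utt by least squares. (An illustrative NumPy sketch; the parameter values c = 0.2, k = 1 are assumptions, not those of graph 1712.)

```python
import numpy as np

# Data from u'' = -0.2 u' - u: exact solution e^{-0.1 t} cos(w t), w = sqrt(0.99)
t = np.linspace(0.0, 10.0, 2001)
w = np.sqrt(1.0 - 0.1 ** 2)
u = np.exp(-0.1 * t) * np.cos(w * t)

# Finite-difference derivatives on interior points
dt = t[1] - t[0]
ut = (u[2:] - u[:-2]) / (2 * dt)
utt = (u[2:] - 2 * u[1:-1] + u[:-2]) / dt ** 2

# Library of candidate terms and least-squares fit: utt ~ a1*u + a2*ut
lib = np.column_stack([u[1:-1], ut])
coef, *_ = np.linalg.lstsq(lib, utt, rcond=None)
# coef ~ [-1.0, -0.2]: only the linear term and first derivative survive
```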
In an embodiment, non-regularization (real) information, such as data points, physical measurements, or any other a-priori known data, may be used to solve systems of parametric differential equations whose parameters are not exactly known, or are known only to a low degree of confidence. These data points may be included in the total loss function in order to perform partial regression on the data while still conforming to the (set of) differential equations. Such a loss term contribution may, as an example, have the following form:
where ui are the known function values at the data points xi. The importance of these data can be tuned with η, which is epoch-independent; the term therefore behaves similarly to a boundary loss term.
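Such a data-matching contribution can be written down directly; a minimal sketch (illustrative names, with η a tunable weight):

```python
import numpy as np

def loss_data(f, xs, us, eta=1.0):
    """Epoch-independent data term: eta * sum_i (f(x_i) - u_i)^2."""
    xs, us = np.asarray(xs), np.asarray(us)
    return eta * float(np.sum((f(xs) - us) ** 2))

# Example: a trial function matching two of three known data points
f = lambda x: x ** 2
mismatch = loss_data(f, [0.0, 1.0, 2.0], [0.0, 1.0, 3.0])  # (4 - 3)^2 = 1.0
```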
The computation of a trial function ƒ 1426 may be summarized by equation 1424 indicating that the function is the expectation value of a predefined Hermitian cost operator Ĉ as described with reference to Eq. (1). The function may depend on one or more vectors {right arrow over (x)} defining the variable(s) of the differential equation, a vector {right arrow over (α)} defining a classical optimization parameter and one or more variational parameters {right arrow over (θ)}1, {right arrow over (θ)}2. In some embodiments, a quantum feature map may also depend on a variational parameter {right arrow over (θ)}FM. Further, in an embodiment, one or more initialization unitaries ini(1)({right arrow over (θ)}ini(1)), ini(2)({right arrow over (θ)}ini(2)) 1404, 1416 may be used to initialize variational parameters of the quantum circuit.
The line break 1414 signifies that any of the components shown may or may not be repeated, or that similar components can be introduced as required by the application. Qubits may be initialized in the |0⟩ state 1402, and after application of the qubit operations represented by quantum circuits 1404-1416, outputs may be measured in the X, Y or Z basis 1418 in order to compute expectation values of operators 1422, which sum up to a function expression 1420.
As described with reference to
Therefore, in an embodiment, instead of random values, a predetermined set of angles may be used for initialization of the variational parameters. In particular, one or more initialization quantum circuits, i.e. initialization unitaries, may be computed using a classical computer. At least some of the variational parameters {right arrow over (θ)}ini(1), {right arrow over (θ)}ini(2) may be determined based on one or more initialization quantum circuits that are based on a classically simulable quantum circuit, i.e. a quantum circuit that can be classically simulated on a conventional computer. Here, Clifford gates are the elements of the Clifford group, which is generated by the Hadamard, CNOT, and S gates. Clifford-gate quantum circuits, i.e. quantum circuits that consist only of Clifford gates, can be efficiently simulated with a classical computer. Other examples of classically simulable quantum circuits are quantum circuits that can be described in terms of single-qubit operations, such as single-qubit rotations, or matchgate circuits.
If, for some setting of the variational parameters, the overall circuit can be described by a classically simulable quantum circuit, e.g. a Clifford-gate quantum circuit, the entire quantum circuit can be evaluated efficiently on a classical computer. Hence, in an embodiment, the feature map quantum circuits FM1, FM2, . . . , the variational quantum circuits 1, 2, . . . and the initialization quantum circuits init1, init2, . . . may be designed in a parameterized way such that they can be set deterministically to a classically simulable quantum circuit. In that case, the variational angles may be computed efficiently on a classical computer. For example, by choosing the quantum circuit to have the form Û = Ûp({right arrow over (θ)}1)†Ûp({right arrow over (θ)}2) and setting a relation between the variational parameters, for example {right arrow over (θ)}1={right arrow over (θ)}2, the quantum circuit represents the identity operation, which is a Clifford operation and thus classically simulable.
Thus, by efficiently computing the non-trivial x-dependence of the output ƒ=ƒ(x) for a particular variational parameter setting of the quantum circuit, with some (free) variational parameters that do not take the quantum circuit out of the space of classically simulable quantum circuits, e.g. the Clifford-gate space, ƒ can be fitted to a dataset in an efficient way. In particular, if the number of basis functions and tunable parameters of the classically simulable quantum circuit is only polynomial, then the dataset can be fitted in at most polynomial time to a satisfactory accuracy. After computing the values of the optimal variational parameters using a classical computer, the circuit can be initialized based on those values. Subsequently, all variational parameters can be optimized further by taking the quantum circuit out of the Clifford-gate space.
Then, at least some of the variational parameters may be set such that the overall quantum circuit becomes a classically simulable quantum circuit. For example, if some of the variational parameters are initially set such that {right arrow over (θ)}1={right arrow over (θ)}2 and {right arrow over (θ)}3={right arrow over (θ)}4, those circuits reduce to the identity operator I. This makes the quantum circuit classically simulable, because effectively there is no entanglement between the qubits. In this way, each expectation value can be computed separately. For qubit n, the expectation value ƒn(x)=⟨Ĉn⟩ can be expressed as 1620, which is simply (as a function of x) a constant plus a cos(n arccos(x)) term with some coefficient, all multiplied by an αn coefficient in front of the Z operator. That means that when all ƒn are summed together, the whole expression becomes ƒ(x) 1622, defining a sum over all Chebyshev polynomials up to order n, with a constant β0 and with coefficients βn being some trivial function of the parameters θini(i,j) and αn.
The Chebyshev polynomial function of degree n, 1622, can then be fitted to the function or data of interest, e.g. a certain approximation of a function ƒ(x), with minimal classical overhead (providing a scaling of O(n)). After finding the optimal settings for the variational parameters θini(i,j) and αn, the full quantum circuit may be initialized based on those values on a real quantum computer. Then, the hybrid quantum-classical variational feedback loop of
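The classical pre-fit can be sketched with NumPy's Chebyshev utilities: fit the β coefficients of f(x) = β0 + Σn βn Tn(x) to target data, then use them to set the initial variational angles. (Illustrative sketch; the target function is an arbitrary example, and the mapping from βn back to circuit angles is omitted.)

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Target data to initialize against (illustrative):
# y = 0.5*T0(x) + 2.0*T1(x) - 1.5*T2(x), with T2(x) = 2x^2 - 1
x = np.linspace(-1.0, 1.0, 101)
y = 0.5 + 2.0 * x - 1.5 * (2 * x ** 2 - 1)

beta = C.chebfit(x, y, deg=2)   # recovers the Chebyshev coefficients
y_fit = C.chebval(x, beta)      # reconstructed initialization target
```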
As already described with reference to
First, a function may be defined as a circuit measurement of some operator Ĉ (equation B1):
It may be assumed that part of the circuit is made using feature maps of the form
where Ĝ is a so-called generator of the feature map. Formally differentiating this expression yields the function derivative in terms of the commutator of the generator with the (effective) cost function (equation B2):
The generator Ĝ may be diagonalized, so that it can be written as a sum of projectors P̂j with eigenvalues λj (equation B3):
Combining equations B2 and B3 provides the following expression for the derivative without loss of generality (equation B4):
A more compact notation of this expression can be provided (equation B5):
with the realization that all terms are real (since the function value is real, the derivative must be, too). There are S terms for the S unique gaps in the generator spectrum, and for each gap there are two contributions, a real and an imaginary one. The relationship between a shift in the function evaluation itself and all its higher-order derivatives is shown by the following expression (equation B6):
One can easily show that this recovers the basic parameter-shift rule (equation B7) when the two shifts are equal in magnitude and opposite in sign:
For general shifts, this symmetry disappears and twice as many measurements would be needed. Below it is shown that for some feature maps and circuits used in DQC, symmetries can be used to substantially reduce the number of circuit evaluations. Typically, feature maps represent a tensor product of rotations (parallel sequences) parameterized by the same variable x. One example corresponds to the product feature map, defined as (equation B8):
where N is the number of qubits used for the encoding.
is a Pauli rotation operator for Pauli matrices {circumflex over (P)}α,j={circumflex over (X)}j, Ŷj, or {circumflex over (Z)}j (α=x, y, z), acting on qubit j with phase φ. For brevity, let us assume α=z, as the other cases can be treated in the same way with additional Hadamard and phase gates before and after the feature map.
Equation B8 can be written in the form of
where the unitary operator is generated by the following generator (equation B9):
that corresponds to a sum of Pauli operators. Being the total effective magnetization, the generator Ĝz in Equation B9 has a spectrum with N+1 distinct eigenvalues Λ={−N, −N+2, . . . , N−2, N}. Some eigenvalues are multiply degenerate, and the number of unique positive gaps is S=|Γ|=N. As the feature map can also be differentiated by applying the parameter-shift rule individually to each rotation, the derivative can be measured using at most 2N circuit evaluations.
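The spectral structure is easy to verify numerically using the diagonal representation of the total magnetization Σj Ẑj (an illustrative NumPy check):

```python
import numpy as np

def magnetization_spectrum(N):
    """Distinct eigenvalues of G = sum_j Z_j on N qubits, and the unique
    positive gaps between them."""
    # Each computational basis state |b> is an eigenstate with eigenvalue
    # N - 2 * (number of 1-bits in b)
    diag = np.array([N - 2 * bin(b).count("1") for b in range(2 ** N)])
    eigs = np.unique(diag)                               # {-N, -N+2, ..., N}
    gaps = np.unique(np.abs(eigs[:, None] - eigs[None, :]))
    return eigs, gaps[gaps > 0]

eigs, gaps = magnetization_spectrum(4)
# 5 distinct eigenvalues (N+1) and 4 unique positive gaps (S = N)
```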
First, it is noted that the eigenstates of Ĝz coincide with computational basis states and have all-real amplitudes. Next, an input state |ψ0⟩ with real amplitudes may be considered, together with ansätze {circumflex over (V)}θ, Ŵθ that preserve this structure, such that |Ψθ⟩ remains real. Finally, cost functions with real-valued matrix elements are considered, which can be formed by strings of Pauli X and Z operators, as well as strings with an even number of Pauli Y operators. Given the real structure of the (variational) states and the readout, the matrix elements in equation B7 are real for any set of shifts, and the difference considered in equation B7 can be expressed as follows:
wherein the imaginary parts of the matrix elements remain zero. As the number of unknown coefficients is reduced by this symmetry, it is sufficient to perform measurements at N+1 shifts (which may include the original function evaluation), reducing the budget of analytic derivative measurement from the 2N evaluations required by the single-qubit parameter-shift rule. This halves the measurement time for function derivatives and speeds up the solution of systems of differential equations.
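As a concrete check of the basic parameter-shift rule of equation B7, consider a single-qubit example (illustrative NumPy sketch): for f(φ) = ⟨0|R̂y(φ)† Ẑ R̂y(φ)|0⟩ = cos φ, two shifted evaluations reproduce the analytic derivative −sin φ exactly.

```python
import numpy as np

def Ry(phi):
    """Single-qubit Y rotation, generated by Y/2."""
    return np.array([[np.cos(phi / 2), -np.sin(phi / 2)],
                     [np.sin(phi / 2),  np.cos(phi / 2)]])

Z = np.diag([1.0, -1.0])
psi0 = np.array([1.0, 0.0])

def f(phi):
    """Expectation value <0|Ry(phi)^dag Z Ry(phi)|0> = cos(phi)."""
    psi = Ry(phi) @ psi0
    return float(np.real(np.vdot(psi, Z @ psi)))

def dfdphi_shift(phi):
    """Parameter-shift rule: exact derivative from two shifted evaluations."""
    return 0.5 * (f(phi + np.pi / 2) - f(phi - np.pi / 2))
```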
The specific choices made so far can thus give reductions in the number of required measurements. Similarly, one can also imagine situations where only the imaginary parts are non-zero, or where the matrix elements are complex-valued but have a specific structure connecting the real and imaginary parts (as in the Kramers-Kronig relations familiar to physicists).
The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Number | Date | Country | Kind |
---|---|---|---|
20207672.5 | Nov 2020 | EP | regional |
21167715.8 | Apr 2021 | EP | regional |
21197576.8 | Sep 2021 | EP | regional |
21205728.5 | Oct 2021 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/081737 | 11/15/2021 | WO |