SYSTEMS AND METHODS FOR COMPRESSED SENSING MEASUREMENT OF LONG-RANGE CORRELATED NOISE

Information

  • Patent Application
  • 20230058207
  • Publication Number
    20230058207
  • Date Filed
    May 13, 2022
    2 years ago
  • Date Published
    February 23, 2023
    a year ago
  • CPC
    • G06N10/70
  • International Classifications
    • G06N10/70
Abstract
A method for detecting a two-qubit correlated dephasing error includes accessing a signal of a quantum system, where the quantum system includes a plurality of qubits. Every qubit has a nonzero rate of dephasing and some qubits have a nonzero rate of correlated dephasing. The signal further includes information about a matrix that includes diagonal elements and off-diagonal elements. The off-diagonal elements of the matrix are 2 s-sparse. The method further includes performing randomized measurements of the off-diagonal elements of the matrix and recovering the matrix based on a direct measurement of the diagonal elements of the matrix.
Description
TECHNICAL FIELD

The present disclosure relates generally to the field of noisy intermediate-scale quantum information (NISQ) processors. More specifically, an aspect of the present disclosure provides detecting two-qubit correlated dephasing errors in NISQ devices.


BACKGROUND

The development of noisy intermediate-scale quantum information processors (NISQ devices) has the potential to advance many areas of computational science. An important problem is the characterization of noise processes in these devices, in order to improve their performance (via calibration and error correction), and to ensure correct interpretation of the results.


Accordingly, there is interest in improving measurements that have the mathematical properties needed for compressed sensing, and can be implemented efficiently on a quantum device.


SUMMARY

An aspect of the present disclosure provides computer-implemented method for detecting a two-qubit correlated dephasing error. The method includes accessing a signal of a quantum system, where the quantum system includes a plurality of qubits. Every qubit has a nonzero rate of dephasing and some qubits have a nonzero rate of correlated dephasing. The signal includes information about a matrix that includes diagonal elements and off-diagonal elements. The off-diagonal elements of the matrix are 2 s-sparse. The method further includes performing randomized measurements of the off-diagonal elements of the matrix and recovering the matrix based on a direct measurement of the diagonal elements of the matrix.


In accordance with aspects of the disclosure, the method may further include estimating a two-qubit correlated dephasing error based on the recovered matrix.


In an aspect of the present disclosure, performing the randomized measurements may include preparing entangled Greenberger-Horne-Zeilinger states (GHZ states) on random subsets of the plurality of qubits.


In another aspect of the present disclosure, the randomized measurements may be based on noise spectroscopy and quantum sensing.


In yet another aspect of the present disclosure, the recovered matrix may include a restricted isometry property.


In accordance with further aspects of the present disclosure, the method may further include estimating a vector of decay rates that are dependent on the matrix.


In an aspect of the present disclosure, the recovered matrix may be further based on the estimated decay rate.


In another aspect of the present disclosure, the method may further include estimating a relaxation time of the quantum system based on the estimated decay rate.


In yet another aspect of the present disclosure, the method may further include estimating a decoherence time of the quantum system based on the estimated decay rate.


In yet another aspect of the present disclosure, the method may further include detecting a long-range correlated dephasing error based on the recovered matrix.


An aspect of the present disclosure provides a computer-implemented method for detecting a two-qubit correlated dephasing error, the method including: accessing a signal of a quantum system, the signal including a plurality of qubits; dephasing entangled states of the plurality of qubits based on performing Ramsey spectroscopy using entangled states of random subsets of qubits of the plurality of qubits; measuring a linear function of a correlation matrix, where the correlation matrix corresponds to correlated Markovian dephasing between pairs of qubits of the plurality of qubits; generating a first vector and a second vector; and estimating a decay rate based on the first vector and the second vector. A plurality of elements of the first vector and of the second vector are randomly chosen. Every qubit of the plurality of qubits has a nonzero rate of dephasing and some qubits have a nonzero rate of correlated dephasing.


In accordance with further aspects of the present disclosure, the method may further include generating a restricted isometry property-based recovery matrix. The recovery matrix may be further based on the estimated decay rate.


In another aspect of the present disclosure, the method may further include detecting a long-range correlated dephasing error based on the recovery matrix.


In yet another aspect of the present disclosure, the method may further include estimating a relaxation time of the quantum system based on the estimated decay rate.


In a further aspect of the present disclosure, the method may further include estimating a decoherence time of the quantum system based on the estimated decay rate.


An aspect of the present disclosure provides a system for detecting a two-qubit correlated dephasing error, where the system includes: a processor; and a memory, including instructions stored thereon, which when executed by the processor, cause the system to: access a signal of a quantum system, the signal includes a plurality of qubits; and generate a matrix C in Rn×n, wherein its off-diagonal part is 2 s sparse. The matrix C is based on the accessed signal. Every qubit has a nonzero rate of dephasing and some qubits have a nonzero rate of correlated dephasing.


In yet a further aspect of the present disclosure, the instructions, when executed by the processor, may further cause the system to perform randomized measurements based on: estimating (b−a)TC(b−a) for any a≠b in {0,1}n; choosing a sequence of random vectors r(1), . . . , r(m)in {1,0,−1}n; and estimating ΦC=(r(1){circumflex over ( )}TCr(1)), . . . , (r(m){circumflex over ( )}TCr(m)), where Φ:Rn×n→Rm.


In yet a further aspect of the present disclosure, the instructions, when executed by the processor, may further cause the system to estimate a decay rate Γab based on any a≠b in {0,1}n, and estimate a decoherence time of the quantum system based on the decay rate Γab.


In an aspect of the present disclosure, the instructions, when executed by the processor, may further cause the system to recover matrix C based on measuring a plurality of diagonal elements of matrix C directly and determining an off-diagonal element of Matrix C based on using custom-character1-regularized least-squares regression.


In an aspect of the present disclosure, the instructions, when executed by the processor, may further cause the system to detect a long-range correlated dephasing error based on the recovered matrix C.


Further details and aspects of exemplary embodiments of the present disclosure are described in more detail below with reference to the appended figures.





BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the present disclosure are utilized, and the accompanying drawings of which:



FIG. 1 is a diagram of an exemplary noisy intermediate-scale quantum information (NISQ) processor, in accordance with examples of the present disclosure;



FIG. 2 is a high level flow diagram of the computer-implemented method for detecting a two-qubit correlated dephasing error, in accordance with examples of the present disclosure;



FIG. 3 is a correlation graph of a noise model of the quantum system of FIG. 1, in accordance with examples of the present disclosure;



FIG. 4 illustrates a flow diagram of the computer-implemented method 600 for determining a two-qubit correlated dephasing error for the quantum system 100, such as the quantum system of FIG. 1, in accordance with examples of the present disclosure;



FIG. 5 is a graph that illustrates single qubit Ramsey spectroscopy, in accordance with examples of the present disclosure;



FIG. 6 is a correlation matrix C corresponding to the correlation graph of FIG. 3, in accordance with examples of the present disclosure;



FIG. 7A is a table that illustrates sample complexity of different methods for reconstructing the correlation matrix C, in accordance with examples of the present disclosure;



FIG. 7B is a table that illustrates sample complexity of different methods for reconstructing the correlation matrix C, in accordance with examples of the present disclosure;



FIGS. 8A-8C are graphs that illustrate scaling of the reconstruction error under various circumstances, in accordance with examples of the present disclosure;



FIG. 9A is a graph that illustrates trajectories of a random walk used in estimating an evolution time, in accordance with examples of the present disclosure;



FIG. 9B is a graph that illustrates the accuracy of a final estimate of the evolution time, in accordance with examples of the present disclosure;



FIGS. 10A-10C are graphs that illustrate an exponential decay of {tilde over (P)}ab(t) when there are state-preparation-and-measurement (SPAM) errors for a 3-qubit Greenberger-Horne-Zeilinger (GHZ) state, in accordance with examples of the present disclosure; and



FIG. 11 illustrates a flow diagram of the computer-implemented method for determining a two-qubit correlated dephasing error for the quantum system, such as the quantum system of FIG. 1, in accordance with examples of the present disclosure.





DETAILED DESCRIPTION

The present disclosure relates generally to the field of noisy intermediate-scale quantum information (NISQ) processors. More specifically, an aspect of the present disclosure provides detecting two-qubit correlated dephasing errors in NISQ devices.


Embodiments of the present disclosure are described in detail with reference to the drawings wherein like reference numerals identify similar or identical elements.


Although the present disclosure will be described in terms of specific examples, it will be readily apparent to those skilled in this art that various modifications, rearrangements, and substitutions may be made without departing from the spirit of the present disclosure. The scope of the present disclosure is defined by the claims appended hereto.


For purposes of promoting an understanding of the principles of the present disclosure, reference will now be made to exemplary embodiments illustrated in the drawings, and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of the present disclosure is thereby intended. Any alterations and further modifications of the novel features illustrated herein, and any additional applications of the principles of the present disclosure as illustrated herein, which would occur to one skilled in the relevant art and having possession of this disclosure, are to be considered within the scope of the present disclosure.


The notation x∝y means that x and y are proportional, i.e., x=cy where c is a “universal” constant. A universal constant is a quantity whose value is fixed once and for all, and does not depend on any other variable. Polylogarithmic factors are written in a compact way as follows: logcn=(log n)c. Big-O notation, such as O(n3), denotes an asymptotic upper bound. Matrices are denoted by capital letters. For a vector v, ∥v∥p denotes the custom-characterp norm (in cases where p=1,2, ∞). For a matrix M, ∥M∥F=(Σjk|Mjk|2)1/2 denotes the Frobenius norm (i.e., the Schatten 2-norm, or the custom-character2 norm of the vector containing the entries of M), ∥M∥ denotes the operator norm (i.e., the Schatten ∞-norm), ∥M∥tr denotes the trace norm (i.e., the Schatten 1-norm, or the nuclear norm), and ∥Mcustom-characterjk|Mjk| denotes the custom-character norm of the vector containing the entries of M. Given an n×n matrix M, let uvec(M)=(Mij)i<j denote the vector (of dimension n(n−1)/2) containing the entries in the upper-triangular part of M, excluding the diagonal. Statistical estimators are written with a hat superscript, e.g., {circumflex over (Γ)} is a random variable that represents an estimator for some unknown quantity Γ. For a random variable {circumflex over (Γ)}, ∥{circumflex over (Γ)}∥ψ2 denotes the sub-Gaussian norm, and ∥{circumflex over (Γ)}∥ψ1 denotes the subexponential norm.


The development of noisy intermediate-scale quantum information processors (NISQ devices) has the potential to advance many areas of computational science. An important problem is the characterization of noise processes in these devices, in order to improve their performance (via calibration and error correction), and to ensure correct interpretation of the results. One challenge is to characterize all of the noise processes that are likely to occur in practice, using some experimental procedure that is efficient and can scale up to large numbers of qubits.


Compressed sensing offers one approach to solving this problem, where one uses specially-designed measurements (and classical post-processing) to learn an unknown signal that has some prescribed structure. For example, the unknown signal can be a sparse vector or a low-rank matrix, the measurements can consist of random projections sampled from various distributions, and the classical post-processing can consist of solving a convex optimization problem (e.g., minimizing the custom-character1 or trace norm), using efficient algorithms. This approach has been used in several previous works on quantum state and process tomography, and estimation of Hamiltonians and Lindbladians.


From a theoretical perspective, one of the main challenges is to design measurements that have the mathematical properties needed for compressed sensing, and can be implemented efficiently on a quantum device. There has been a substantial amount of work in this area, which can be broadly classified into two approaches: “sparsity-based” and “low-rank” compressed sensing. For compressed sensing of “low-rank” objects (e.g., low-rank density matrices and quantum processes), there seems to be a few natural choices for measurements, including random Pauli measurements, and fidelities with random Clifford operations. For compressed sensing of “sparse” objects (e.g., sparse Hamiltonians, Lindbladians, or Pauli channels), however, the situation is more complicated, as a larger number of different measurement schemes and classical post-processing methods have been proposed, and the optimal type of measurement seems to depend on the situation at hand. This complicated state of affairs can be explained in part because “sparsity” occurs in a wider variety of situations than “low-rankness.”


Long-range correlated errors can severely impact the performance of NISQ (noisy intermediate-scale quantum) devices, and fault-tolerant quantum computation. Characterizing these errors is important for improving the performance of these devices, via calibration and error correction, and to ensure correct interpretation of the results. A compressed sensing method for detecting two-qubit correlated dephasing errors may be used for characterizing these errors, assuming only that the correlations are sparse (i.e., at most s pairs of qubits have correlated errors, where s<<n(n−1)/2, and n is the total number of qubits). In particular, the disclosed method can detect long-range correlations between any two qubits in the quantum system (i.e., the correlations are not restricted to be geometrically local).


The disclosed method is highly scalable, because the disclosed method requires as few as m∝s log n measurement settings (where ∝ denotes proportionality), and efficient classical post-processing based on convex optimization. In addition, when m∝s log4n, the disclosed method is highly robust to noise, and has sample complexity O(max(n, s)2 log4(n)), which can be compared to conventional methods that have sample complexity O(n3). Thus, the disclosed method is advantageous when the correlations are sufficiently sparse, that is, when s≤O(n3/2/log2n). The disclosed method also performs well in numerical simulations on small quantum system sizes, and has some resistance to state-preparation-and-measurement (SPAM) errors. A feature of the disclosed method is a new type of compressed sensing measurement, which works by preparing entangled Greenberger-Horne-Zeilinger states (GHZ states) on random subsets of qubits, and measuring their decay rates with high precision.


Characterizing errors when building large-scale quantum devices is needed for calibrating, correcting the errors, and interpreting the result of the quantum devices correctly. For example, quantum simulations have only a limited simulation time. Metrology and quantum sensing may lead to reduced sensitivity, while error correction and fault tolerance require increased overhead.


Thus, there is a need to find methods for characterizing all the likely noise processes in quantum systems that are efficient and scalable.


Referring to FIG. 1, a diagram of an example quantum system 100, is shown. The quantum system 100 may include, for example, a NISQ (noisy intermediate-scale quantum) device (i.e., a NISQ processor). The term “noisy” refers to the fact that quantum processors are very sensitive to the environment and may lose their quantum state due to quantum decoherence (i.e., dephasing). The quantum system 100 includes qubits 102 which are linked by a coupler 104. The qubits are the quantum version of classic binary bits realized with a two-state quantum-mechanical system. The coupler 104, for example, may be implemented by connecting the two qubits to an intermediate electrical coupling circuit. The circuit might be a fixed element, such as a capacitor, or a controllable element. It is contemplated that the quantum system 100 may include any suitable number of qubits 102 and/or couplers 104.


The theory of “sparsity-based” quantum compressed sensing may be applied to a physical problem that is relevant to the development of quantum system 100 (e.g., a NISQ device), such as detecting long-range correlated dephasing errors. A simple model of correlated dephasing, is described by the following Markovian master equation:








d

ρ

dt

=




(
ρ
)

=




j
,

k
=
1


n




c
jk

(



Z
k


ρ


Z
j


-


1
2



{



Z
j



Z
k


,
ρ

}



)

.







Here the quantum system 100 consists of n qubits, and Zj and Zk are Pauli σz operators that act on the j'th and k'th qubits, respectively. The noise is then completely described by the correlation matrix C=(cjk)∈custom-charactern×n. (See FIGS. 3 and 6.) Generalizations of this model with complex coefficients cjk, and an additional environment-induced Hamiltonian are considered.


Here, the diagonal elements cjj show the rates at which single qubits dephase, and the off-diagonal elements cjk show the rates at which pairs of qubits undergo correlated dephasing. Typically, the diagonal of C will be nonzero, while the off-diagonal part may be dense or sparse, depending on the degree of connectivity between the qubits and their environment.


This master equation describes a number of physically plausible scenarios, such as spin-½ particles coupling to a shared bosonic bath. It has also been studied as an example of how collective decoherence can affect physical implementations of quantum computation and quantum sensing.


This model of correlated dephasing is quite different from other models of crosstalk that are based on quantum circuits or Pauli channels. The disclosed model describes crosstalk that arises from the physical coupling of the qubits to their shared environment. The disclosed model has a different character from crosstalk that arises from performing imperfect two-qubit gates, or correlated errors that result when the physical noise processes are symmetrized by applying quantum operations such as Pauli twirls.


The disclosed model of correlated dephasing can be learned efficiently when the off-diagonal part of the correlation matrix C is sparse. It is assumed that C has at most s<<n(n−1)/2 nonzero entries above the diagonal, but these entries may be distributed arbitrarily; in particular, long-range correlated errors are allowed. (Note that physical constraints imply that C is positive semidefinite, hence C=C)


The fact that the disclosed model enables long-range correlations is quite powerful, and is relevant to a number of interesting settings in quantum information science. These settings include experimental NISQ devices, which are often engineered to have long-range interactions, in order to perform quantum computations more efficiently; the execution of quantum circuits on distributed quantum computers, where long-range correlations are generated when qubits are moved or teleported from one location to another; and quantum sensor arrays, where a detection event at an arbitrary location (j,k) in the array may be registered as a pairwise correlation between a qubit that is coupled to row j and a qubit that is coupled to column k in the array.


Referring to FIG. 2 a high level flow diagram of a computer-implemented method 200 for detecting a two-qubit correlated dephasing error is shown. A quantum system 100 such as a NISQ device is observed under noisy evolution 606 (FIG. 4). Random measurements 210 (e.g., Ramsey spectroscopy, FIG. 5) are made and a sparse correlation matrix C 400 (FIG. 6) is characterized based on the random measurements 210. The correlation matrix C 400 corresponds to correlated Markovian dephasing between pairs of qubits.


Referring to FIG. 3 an illustration of a correlation graph of a noise model 300 of the quantum system 100 of FIG. 1 is shown. The qubits 302 experience correlated Markovian dephasing. Non-zero cij are shown, indicating correlated noise affecting the pairs of qubits connected by the lines 304.



FIG. 4 illustrates a computer-implemented method 600 for determining a two-qubit correlated dephasing error for the quantum system 100, such as the quantum system of FIG. 1. Initially, a general form of Ramsey spectroscopy using entangled states on multiple qubits is described. A simple method for directly measuring the correlation matrix C 400, by performing spectroscopy on every pair of qubits is described. This simple method serves as a baseline for measuring the performance of the disclosed compressed sensing method. It is contemplated that the disclosed methods may be performed by a processor and memory. The memory may include instructions thereon, which are executed by the processor.


Initially, a signal is accessed from a quantum system 100. The quantum system 100 contains a plurality of qubits, every qubit has a nonzero rate of dephasing, and some pairs of qubits have a nonzero rate of correlated dephasing. At step 602, vectors a and b whose elements are randomly chosen from {0,1} are generated. At step 604, the operation Uab prepares the state












1

2




(



"\[LeftBracketingBar]"

a





+



"\[LeftBracketingBar]"

b




)

.




At step bub the quantum system 100 evolves under dephasing noise for time t, represented by Et. At step 608, Uab is applied and a computational basis measurement is performed. The probability of obtaining the outcome 0, Pab, decays exponentially (towards ½) as t increases. At step 610, matrix C 400 (FIG. 6) may be recovered by measuring the decay rate Γab for various a's and b's.


Here it is assumed that the entries in the matrix C 400 are real (i.e., with imaginary part equal to zero). This holds true in a number of cases, for instance, when the qubits are coupled to a bath at a high temperature. When C is complex, it can be handled using a generalization of the disclosed method. Note that physical constraints imply that C is positive semidefinite; hence C=CT in the real case, and C=C in the complex case.


In aspects, the entangled states may undergo dephasing. The following procedure enables the measurement of certain linear functions of the correlation matrix C 400. This procedure is very general, and includes single- and two-qubit Ramsey spectroscopy as special cases. Consider an n-qubit state of the form



















"\[LeftBracketingBar]"


ψ
ab




=


1

2




(



"\[LeftBracketingBar]"

a






+



"\[LeftBracketingBar]"

b




)




(


2

)



n



,




(

Eq
.

2.1

)







where a, b∈{0,1}n, a≠b, |acustom-character=|a1, a2, . . . , ancustom-character and |bcustom-character=|b1, b2, . . . , bncustom-character. By choosing a and b appropriately, one can make |ψabcustom-character be a single-qubit |+custom-character state, a two-qubit Bell state, or a many-qubit Greenberger-Horne-Zeilinger state (GHZ state) (while the other qubits are in a tensor product of |0custom-character and |1custom-character states).


For example, the state |ψabcustom-character may be prepared, the state may be allowed to evolve for time t under the Lindbladian (Eq. 1.1). ρ(t) is the resulting density matrix. As can be seen in Eq. (1.2), the coherences (that is, the off-diagonal elements |acustom-charactercustom-characterb|) of ρ(t) decay as exp(−Γabt), where the decay rate Γab custom-character is defined so that custom-character(|acustom-charactercustom-characterb|)=−Γab|acustom-charactercustom-characterb|. This decay rate can be estimated by allowing the quantum system 100 to evolve for a suitable amount of time t, and then measuring in the basis












1

2




(



"\[LeftBracketingBar]"

a





+



"\[LeftBracketingBar]"

b




)

.




The decay rate Γab shows a certain linear function of the correlation matrix C, which can be written explicitly as follows. Let αi=(−1)ai and βi=(−1)bi denote the expectation values of Zi corresponding to the states |acustom-character and |bcustom-character, respectively. In addition, define the vectors α=(α1, . . . , αn) and β=(β1, . . . , βn).










Γ
ab

=

-



ij



c
ij

[



α
i



β
j


-


1
2



α
i



α
j


-


1
2



β
i



β
j



]







(

Eq
.

2.2

)













=

2


r
T


Cr


,




(

Eq
.

2.3

)







Recall the definition of r, r=b−a, (Eq. 2.4) note that







r
=


α
-
β

2


,




and the fact that C=CT is used to symmetrize the equation.


Single-qubit Ramsey spectroscopy (FIG. 4) is a special case of this procedure, where a=(0, 0, . . . , 0) and b=(0, . . . , 0,1,0, . . . , 0) (where the 1 appears in the j'th position). Then |ψabcustom-character is a |+custom-character state on the j'th qubit, and Γab=2cjj tells the rate of dephasing on the j'th qubit.


Two-qubit generalized spectroscopy is another special case, where a=(0, 0, . . . , 0) and b=(0, . . . , 0,1,0, . . . , 0,1,0, . . . , 0) (where the l's appear in the j'th and k'th positions). Then |ψabcustom-character is a maximally-entangled state on qubits j and k, and Γab=2(cjj+cjk+ckj+ckk) provides information about the rate of correlated dephasing on qubits j and k.


An advantage of the disclosed methods is that the methods enable the characterization of errors. The characterized errors may be used to improve quantum systems (and devices) to reduce errors. For example, errors may be mitigated by applying a gate opposite to the characterized error. In another example, there are many quantum systems 100 that include a master clock, that is configured to track the phase of all of the qubits, so if dephasing occurs, the system loses the ability to track the phase of the qubits. The disclosed methods enable mitigation of the dephasing error based on the characterized error.



FIG. 5 is a graph that illustrates single qubit Ramsey spectroscopy. Ramsey spectroscopy is a form of particle interferometry that uses the phenomenon of magnetic resonance to measure transition frequencies of particles. The plot 502 shows the overlap P+, which decays exponentially (towards ½) with decay rate Γ, as a function of time t. The inset block diagram 504 illustrates the Ramsey protocol, where a superposition of qubit states is prepared with the first Hadamard gate H, the quantum system 100 undergoes dephasing (represented by the noise channel εt) for time t, and the second H followed by a measurement in the computational basis measures the overlap P+.


In aspects, the decay rate Γab may be estimated by:


1. Choosing some evolution time t≥0 such that







1
2




Γ
ab


t


2.




This can be done in various ways, for instance, by starting with an initial guess t=τ0 and performing binary search, using ˜|log(Γabτ0)| trials of the experiment.


2. Repeating the following experiment Ntrials times: (Ntrials may be set using equation (2.11) below)


(a) Preparing the state















"\[LeftBracketingBar]"


ψ
ab




=


1

2




(



"\[LeftBracketingBar]"

a






+



"\[LeftBracketingBar]"

b




)

,




allow the state to evolve for time t, then measure in the basis












1

2




(



"\[LeftBracketingBar]"

a





+



"\[LeftBracketingBar]"

b




)

.




Letting N+ and N be the number of
















1

2




(



"\[LeftBracketingBar]"

a





+



"\[LeftBracketingBar]"

b




)






and






1

2




(



"\[LeftBracketingBar]"

a





+



"\[LeftBracketingBar]"

b




)




outcomes, respectively. Note that the probabilities of these outcomes are given by







P
+

=



1
2



(

1
+

e


-

Γ
ab



t



)



and



P
-


=


1
2




(

1
-

e


-

Γ
ab



t



)

.







3. Defining:









Δ
=




N
+

-

N
-



N

t

rials



.





(

Eq
.

2.5








Note that Δ is an unbiased estimator for P+−P, that is, custom-character(Δ)=P+−P=e−Γabt. This motivates the definition of an estimator for Γab:











Γ
ˆ


a

b


=


-

1
t





ln

(
Δ
)

.






(

Eq
.

2.6

)







Next, some bounds on the accuracy of the estimator {circumflex over (Γ)}ab may be set. To do this, the notion of a sub-Gaussian random variable is introduced (e.g., a random variable whose moments and tail probabilities behave like those of a Gaussian distribution). A real-valued random variable X is sub-Gaussian if there exists a real number K2 such that, for all p≥1, (custom-character(|X|p))1/p≤K2√{square root over (p)}. (Eq. 2.7)


The sub-Gaussian norm of X, denoted ∥X∥ψ2, is defined to be the smallest choice of K2 in (2.7), i.e.,












X



ψ
2


=


sup

p

1







p


-
1

/
2


(

𝔼

(




"\[LeftBracketingBar]"

X


"\[RightBracketingBar]"


p

)

)


1
/
p


.






(

Eq
.

2.8

)







In addition, it is known that X is sub-Gaussian if and only if there exists a real number K1 such that, for all t≥0, Pr[|X|>t]≤exp(1−t2/K12). (Eq. 2.9)


The smallest choice of K1 in (Eq. 2.9) is equivalent to the sub-Gaussian norm∥X∥ψ2, in the following sense: there is a universal constant c such that, for all sub-Gaussian random variables X, the smallest choice of K1 in (Eq. 2.9) differs from ∥X∥ψ2 by at most a factor of c.


{circumflex over (Γ)}ab−Γab is a sub-Gaussian random variable, whose sub-Gaussian norm is bounded by
















Γ
ˆ


a

b


-

Γ

a

b






ψ
2





C
0




Γ

a

b




N
trials





,




(

Eq
.

2.1

)







where C0 is a universal constant (e.g., 1/√{square root over (Ntrials)}) In particular, the accuracy of {circumflex over (Γ)}ab can be controlled by setting N trials appropriately: for any δ>0 and ∈>0, where











N
trials




2

δ
2




ln

(

2
ϵ

)



,




(

Eq
.

2.11

)







then {circumflex over (Γ)}ab satisfies the following error bound: with probability at least 1−∈, |{circumflex over (Γ)}ab−Γab|≤2 δe2Γab. (Eq. 2.12)


This shows that the error in {circumflex over (Γ)}ab is at most a small fraction of the true value of Γab, independent of the magnitude of Γab.


In aspects, this may derive an error bound that involves {circumflex over (Γ)}ab rather than Γab, and may be computed from the observed value of {circumflex over (Γ)}ab. To show such an error bound, the triangle inequality may be used to write |{circumflex over (Γ)}ab−Γab|≤2δe2(|{circumflex over (Γ)}ab|+{circumflex over (Γ)}ab−Γab|), and divide by (1−2δe2) to obtain:












"\[LeftBracketingBar]"




Γ
ˆ


a

b


-

Γ

a

b





"\[RightBracketingBar]"






2

δ


e
2



1
-

2

δ


e
2









"\[LeftBracketingBar]"



Γ
ˆ


a

b




"\[RightBracketingBar]"


.






(

Eq
.

2.13

)







It remains to prove (Eq. 2.10) and (Eq. 2.12). First use Hoeffding's inequality to show that Δ is close to its expectation value: Pr[|Δ−(P+−P)|≥δ]≤2 exp(−Ntrialsδ2/2). (Eq. 2.14)


When |Δ−(P+−P)|≤δ, the error in {circumflex over (Γ)}ab may be bounded as follows: (using the fact that P+−P=e−Γabt≥e−2 and







t


1

2


Γ

a

b





)















-


Γ
ˆ


a

b







1
t



ln

(


P
+

-

P
-

+
δ

)












1
t

[


ln

(


P
+

-

P
-


)

+

δ


e
2



]












-

Γ

a

b



+

2


Γ

a

b



δ


e
2




,







(

Eq
.

2.15

)
















-


Γ
ˆ


a

b







1
t



ln

(


P
+

-

P
-

-
δ

)












1
t

[


ln

(


P
+

-

P
-


)

-

δ


e
2



]











-

Γ

a

b



-

2


Γ

a

b



δ



e
2

.










(

Eq
.

2.16

)







Thus:






Pr[|{circumflex over (Γ)}ab−Γab|≥2δe2Γab]≤2 exp(−Ntrialsδ2/2).  (Eq. 2.17)


(Eq. 2.17) implies (Eq. 2.10) and (Eq. 2.12).


Referring to FIG. 6, matrix C 400 corresponding to the correlation graph in FIG. 3 is shown. The diagonal elements correspond to single qubit dephasing 402. The off-diagonal elements indicate correlated dephasing noise 404. In aspects, a direct estimation of the correlation matrix C 400 may be performed. The correlation matrix C 400 may be estimated directly, by performing single-qubit spectroscopy to measure the diagonal elements cjj, and by performing two-qubit spectroscopy to measure the off-diagonal elements cjk. This method may be used as a baseline to measure the performance of the disclosed compressed sensing method.


For simplicity, the case where C is real is considered. Since C is positive definite, this implies that cjj≥0 and cjk=ckj.


First, the diagonal elements cjj, for j=1, n, are estimated as follows:


1. Let a=(0, 0, . . . , 0) and b=(0, . . . , 0,1,0, . . . , 0) (where the 1 appears in the j'th position).


2. Construct an estimate {circumflex over (Γ)}ab of the decay rate Γab=2cjj. Define gj={circumflex over (Γ)}ab/2.


To write this in a compact form, define diag(C)=(c11, . . . , cnn), (Eq. 2.18)


that is, the diagonal of C. Next, define g=(g1, . . . , gn), (Eq. 2.19)


This is viewed as an estimator for diag(C).


Next, the off-diagonal elements cjk, for 1≤j<k≤n, are estimated as follows:


1. Let a=(0, 0, . . . , 0) and b=(0, . . . , 0, 1, 0, . . . , 0, 1, 0, . . . , 0) (where the 1's appear in the j'th and k'th positions).


2. Construct an estimate {circumflex over (Γ)}ab of the decay rate Γab=2(cjj+2cjk+ckk) and define







h

j

k


=



1
4




Γ
ˆ


a

b



-


1
2



g
j


-


1
2




g
k

.







To write this in a compact form, define C′ to be the matrix C 400 with the diagonal entries replaced by zeros,










C

j

k



=

{




c

j

k






ifj

k

,





0



ifj
=

k
.










(

Eq
.

2.2

)







The “off-diagonal portion” of matrix C is 2 s sparse. An estimator for Ĉ′ is constructed, as follows:











C
ˆ


j

k



=

{




h
jk





ifj

<
k

,






h
kj





ifj

>
k

,





0



ifj

=

k
.










(

Eq
.

2.21

)







The accuracy of these estimators may be analyzed as follows. Initially, two parameters δ1 and δ2 are chosen. Suppose that, during the measurement of the diagonal elements cjj, the decay rates Γab are estimated with accuracy ∥{circumflex over (Γ)}ab−Γabψ2≤δ1Γab, (Eq. 2.22) and during the measurement of the off-diagonal elements cjk, the decay rates Γab are estimated with accuracy ∥−Γabψ2≤δ2Γab. (Γab (Eq. 2.23).


Bounds of the form (Eq. 2.22) and (Eq. 2.23) can be obtained from equation (Eq. 2.10), by setting Ntrials˜1/δ12 and Ntrials˜1/δ22, respectively. Counting those trials of the experiment that are used to choose the evolution time t may be neglected, because the number of those trials grows only logarithmically with Γab. δ1 and δ2 may be different, because in many experimental scenarios, measurements of cjj take less time than measurements of cjk. Using (Eq. 2.22) and (Eq. 2.23), it can be shown that bounds on the accuracy of gj and hjk are:















g
j

-

c
jj





ψ
2





δ
1



c
jj



,




(

Eq
.

2.24

)

















h

j

k


-

c

j

k






ψ
2






δ
2



c

j

k



+


1
2



(


δ
1

+

δ
2


)




(


c
jj

+

c

k

k



)

.







(

Eq
.

2.25

)







These bounds on the sub-Gaussian norm imply bounds on the moments, such as custom-character|X|≤∥X∥ψ2 and custom-character(X2)≤2∥X∥ψ22.


The results to bound the accuracy of the estimators g and Ĉ′ may be followed. For g, using the custom-character1 and custom-character2 norms, the bounds are:






custom-character(∥g−diag(C)∥1)≤δ1∥diag(C)∥1,  (Eq. 2.26)






custom-character(∥g−diag(C)∥22)≤2δ12∥diag(C)∥22.  (Eq. 2.27)


For Ĉ′, using the custom-character1 vector norm and the Frobenius matrix norm, the bounds are:











𝔼

(






C
^



-

C







1


)



2





j
<
k



(



δ
2



c
jk


+


1
2



(


δ
1

+

δ
2


)



(


c
jj

+

c
kk


)



)




=




δ
2






j

k



c
jk



+


1
2



(


δ
1

+

δ
2


)






j

k



(


c
jj

+

c
kk


)







(

n
-
1

)



(


δ
1

+

δ
2


)








diag

(
C
)





1



+


δ
2






C






1




,








(

Eq
.

2.28

)













𝔼

(






C
^



-

C





F
2

)



2





j
<
k




2
[



δ
2



c
jk


+


1
2



(


δ
1

+

δ
2


)



(


c
jj

+

c
kk


)



]

2





2





j
<
k



6
[



δ
2
2



c
jk
2


+


1
4




(


δ
1

+

δ
2


)

2



c
jj
2


+


1
4




(


δ
1

+

δ
2


)

2



c
kk
2



]






6


δ
2
2






C




F
2


+


3
2




(


δ
1

+

δ
2


)

2






j

k



(


c
jj
2

+

c
kk
2


)






3


(

n
-
1

)




(


δ
1

+

δ
2


)

2







diag

(
C
)



2
2


+

6


δ
2
2







C




F
2

.









(

Eq
.

2.29

)







Note that, in the second step of (Eq. 2.29), the fact that for any real numbers a1, a2, a3, (a1+a2+a3)2≤3(a12+a22+a32) was used.


These are bounds on the expected error of g and {tilde over (C)}′. One can then use Markov's inequality to prove bounds that hold with high probability. For instance, using (Eq. 2.27), with probability at least 1−η, results in














g
-

diag

(
C
)




2




1

η




2



δ
1






diag

(
C
)



2



,




(

Eq
.

2.3

)







Using (Eq. 2.29), with probability at least 1−η, results in:















C
ˆ



-

C





F





1

η


[


3


(

n
-
1

)




(


δ
1

+

δ
2


)

2






diag

(
C
)



2
2


+

6


δ
2
2





C






]


1
/
2






1

η


[



3




n
-
1




(


δ
1

+

δ
2


)






diag

(
C
)



2


+


6



δ
2






C




F



]

.





(

Eq
.

2.31

)







In fact, one can prove tighter bounds on the failure probability, by replacing Markov's inequality with a sharp concentration bound for sub-Gaussian random variables.


The bounds (Eq. 2.30) and (Eq. 2.31) give a rough sense of how well this estimator performs. In particular, these bounds show that the error in estimating C′ depends on the magnitude of diag(C), as well as the magnitude of C′. This is due to the fact that the procedure for estimating the off-diagonal matrix element cjk also involves the diagonal matrix elements cjj and cjk. These bounds may be used as a baseline to understand the performance of the disclosed compressed sensing estimators.



FIGS. 7A and 7B are tables illustrating sample complexity of different methods for reconstructing a correlation matrix C 400 (FIG. 6), of size n×n, with 2 s nonzero elements off the diagonal. In aspects, the naive method is to measure each element of C separately. “RIP-based” and “RIPless” refer to different analytical bounds on the accuracy of the reconstruction of matrix C 400 (FIG. 6). The asterisk (*) indicates that the results using the RIPless bound hold under a technical assumption that the diagonal of C does not have any unusually large elements. The different methods are parameterized in such a way that they reconstruct the diagonal of matrix C 400 (FIG. 6) up to an additive error of size δ∥diag(C)∥2, and they reconstruct the off-diagonal part of C up to an additive error of size δ∥C∥F. Each method makes use of single-qubit spectroscopy (with n experimental configurations or “measurement settings”), as well as multi-qubit spectroscopy (with m measurement settings, where m varies between m∝n2 and m∝s log n). The total sample complexity, shown in the tables of FIGS. 7A and 7B, include both single-qubit and multi-qubit spectroscopy. For the compressed sensing method, the number of measurement settings can be as low as m∝s log n, using the RIPless bound, but the best sample complexity is achieved when m∝s log4n, using the RIP-based bound. This sample complexity, O(max(n, s)2 log4(n)), compares favorably with the sample complexity of the naive method, which is O(n3). Thus, Compressed sensing has an advantage over the naive method whenever s≤O(n3/2/log2n).


The performance of the disclosed compressed sensing method may be compared to the naive method where one measures each element of C independently. A rigorous comparison may be made based on the error bounds for each of these methods. These can be summarized as follows:





Naive:∥Ĉ−C∥F≤O(√{square root over (n)}(δ12)∥diag(C)∥22∥C′∥F),  (Eq. 1.4)





RIP-based:∥W(opt)−C∥F≤O(√{square root over (n)}δ1∥diag(C)∥22(√{square root over (n)}∥diag(C)∥2+√{square root over (2s)}∥C′∥F)),  (Eq. 1.5)





RIPless:∥W(opt)−C∥F≤O(s log5/2(n)[δ1∥diag(C)∥√{square root over (sn log)}(n)+δ2√{square root over (n)}∥diag(C)∥22√{square root over (2s)}∥C′∥F]).  (Eq. 1.6)


Here Ĉ is the estimate of C using the naive method, and W(opt) is the estimate of C using compressed sensing. diag(C) denotes the diagonal of C, and C′ denotes the off-diagonal part of C. m∝s log4n is set for the RIP-based bound, and m∝s log n for the RIPless bound. (2.30) may be used to bound the error in estimating diag(C), and (Eq. 2.31), (Eq. 5.48) and (Eq. 5.39) to bound the error in estimating C′. Finally, δ1 and δ2 are parameters that control the accuracy of the single-qubit and multi-qubit spectroscopy procedures. Using the disclosed method rigorous bounds on the sample complexity may be proven that are required for each method to reconstruct C with comparable accuracy.


For the compressed sensing method, the number of measurement settings can be as low as m∝s log n, using the RIPless bound, but the best sample complexity is achieved when m∝s log4n, using the RIP-based bound. This sample complexity, O(max(n, s)2 log4(n)), compares favorably with the sample complexity of the naive method, which is O(n3). Thus, compressed sensing has an advantage over the naive method whenever s≤O(n3/2/log2n).


The estimation of the decay rate Γab=2rTCr may be performed with high precision (i.e., with small multiplicative error), by allowing the quantum system 100 to evolve for a time t˜1/Γab. Here the time t is chosen by an adaptive procedure that starts with an initial guess τ0 and converges (provably) after˜|log(Γab τ0)| steps (see FIG. 3). To achieve high precision, the time evolution operator custom-character is treated exactly.


The above-noted feature is helpful in experimental setups where measurements are time-consuming and the values of Γab span several orders of magnitude, thus making it difficult to determine the appropriate evolution time and accurately measure the decay rates using conventional methods. This feature can also be used as a standalone technique to estimate the relaxation (T1) and decoherence (T2) times of any quantum system 100.


Since the disclosed method relies on estimating exponential decay rates, it can be made at least partially robust to state-preparation-and-measurement (SPAM) errors (see FIGS. 9A and 9B). This follows some of the same approaches used in randomized benchmarking and gate set tomography. The estimated decay rate may be used to estimate a relaxation time of the quantum system 100. In aspects, a decoherence time of the quantum system 100 may be estimated based on the estimated decay rate.


The disclosed method enables a unitary evolution according to some Hamiltonian HS, and enables the matrix C 400 (FIG. 6) to have complex entries cjk. A similar approach can be used to estimate the Hamiltonian HS, as well as the imaginary part of cjk. The disclosed technology measures correlated dephasing and not Pauli errors. In aspects, the measurement may include Ramsey spectroscopy. In aspects the reconstruction algorithm may include, for example, convex optimization over a continuous domain.


The disclosed method assumes that the noise is sparse, meaning that the noise is supported on a subset S of the domain, where S is small, but unknown.


It is contemplated that the disclosed method for SPAM-robust estimation of sparse noise models, may be used to characterize other kinds of correlated noise processes in many-body quantum systems.


The disclosed technology includes an efficient method for learning the off-diagonal part of the correlation matrix C, under the assumption that it is sparse, i.e., the part of C that lies above the diagonal has at most s nonzero elements, where s<<n(n−1)/2. (Since C is Hermitian, it is sufficient to learn the part that lies above the diagonal; this then determines the part that lies below the diagonal.)


For simplicity, first consider the special case where the entries in the matrix C are real (i.e., with zero imaginary part), which occurs in a number of physical situations (for example, when the system is coupled to a bath at a high temperature.


The disclosed method consists of two steps: first, single-qubit Ramsey spectroscopy is performed in order to learn the diagonal elements of C; second, techniques from compressed sensing (e.g., random measurements, and custom-character1-minimization) are applied in order to recover the off-diagonal elements of C. In aspects, a long-range correlated dephasing error may be detected based on the recovered matrix.


In aspects, random measurements of the correlation matrix may be performed. Initially, each of the diagonal elements cjj, for j=1, . . . , n, are estimated using single-qubit Ramsey spectroscopy, as illustrated in FIG. 5. Let g=(g1, . . . , gn)∈custom-charactern be the output of this procedure. g may be viewed as an estimate of a “sensing operator” that returns the diagonal elements of the matrix C 400,






g≈diag(C)=(c11,c22, . . . ,cnn).  (Eq. 3.1)


(Note that cjj≥0, since C is positive semidefinite.)


In order to estimate the off-diagonal part of C, a compressed sensing technique is used, which involves a certain type of generalized Ramsey measurement with random GHZ-type states, see FIG. 4. First, a parameter m is chosen, which can be roughly m˜s log n or m˜s log4n, which controls the number of different measurements. The particular choice of m may be motivated by a theoretical recovery guarantee. Now, for j=1, . . . , m, perform the following procedure:


1. Choose vectors a, b∈{0,1}n uniformly at random. As in equation (2.4), define






r=b−a.  (Eq. 3.2)


2. Prepare the state















"\[LeftBracketingBar]"


ψ
ab




=


1

2




(



"\[LeftBracketingBar]"

a






+



"\[LeftBracketingBar]"

b




)

.




This is a GHZ state on a subset of the qubits, with some bit flips. It can be created by preparing those qubits i where ai=bi in the state |aicustom-character, preparing those qubits i where ai≠bi in a GHZ state, and applying a Pauli X operator on those qubits i where ai>bi. (This requires a quantum circuit of depth ┌log2(n)┐+2.)


3. Construct an estimate {circumflex over (Γ)}ab of the decay rate Γab=2rTCr. Define hj={circumflex over (Γ)}ab.


Let h=(h1, . . . , hm)∈custom-characterm be the output of the above procedure. Again, h may be viewed as an estimate of a “sensing operator” Φ:custom-charactern×ncustom-characterm that is applied to the matrix C 400 (FIG. 6),






h≈Φ(C)=[Φj(C)]j=1, . . . ,m  (Eq. 3.3)





Φj(C)=2(r(j))TCr(j)  (Eq. 3.4)


where r(1), r(2), . . . , r(m)∈{1,0,−1}n are independent random vectors chosen from the same distribution as r (described above). Note that Φj(C)≥0, since C is positive semidefinite. The factor of 2 is chosen to ensure that Φ has a certain isotropy property, as described herein.


In aspects, the correlation matrix C∈custom-charactern×n may be reconstructed. C is positive semidefinite, due to physical constraints, and its off-diagonal part is sparse, with at most s<<n(n−1)/2 nonzero elements above the diagonal. In general, this sparsity constraint leads to an optimization problem that is computationally intractable. However, in this particular case, this problem can be solved using a strategy from compressed sensing: given g≈diag(C)∈custom-charactern and h≈Φ(C) ∈custom-characterm, matrix C 400 is recovered by solving a convex optimization problem, where the custom-character1 (vector) norm of the off-diagonal part of the matrix is minimized. This strategy succeeds when m≥c0s log n, and is highly robust to noise when m≥c0s log4n, where c0 is some universal constant.


Initially, the case where the measurements are noiseless, i.e., g=diag(C) and h=Φ(C) is considered. The following convex optimization problem is solved:











Find


W






n
×
n









that


minimizes







i

j





"\[LeftBracketingBar]"


W
ij



"\[RightBracketingBar]"





,




(

Eq
.

3.5

)









such


that
:











diag

(
W
)

=
g

,




(

Eq
.

3.6

)














Φ

(
W
)

=
h

,




(

Eq
.

3.7

)












W

0.




(

Eq
.

3.8

)







Here, Wcustom-character0 means that W is positive semidefinite, which implies that W=WT. As a sanity check, note that W=C is a feasible solution to this problem. (Recall that C is positive semidefinite.)


In phase retrieval, one wishes to estimate an unknown vector x from measurements of the form |rTx|2. In aspects, a PhaseLift algorithm works by “lifting” the unknown vector x to a matrix X=xxT, so that the problem becomes one of learning a rank-1 matrix X from measurements of the form rTXr; then one solves a convex relaxation of this problem. It is contemplated that in cases where the unknown vector x is sparse (“compressive phase retrieval”), variants of the PhaseLift algorithm (as well as other approaches) can also be used.


In the disclosed method, the unknown matrix C is almost always full-rank (because every qubit has a nonzero rate of dephasing). Physical constraints imply that C is positive semidefinite, so it can be factored as C=BBT, which is superficially similar to X=xxT; however, an important difference is that B is a square matrix, whereas x is a vector.


In aspects, the matrix C 400 may be reconstructed from noisy measurements. In the case where the measurements of g and h are noisy, the above convex optimization problem is modified, by relaxing the constraints (Eq. 3.6) and (Eq. 3.7). This leads to some technical complications, due to the fact that two variables are being reconstructed that have different characteristics: the diagonal part of C (which is not sparse), and the off-diagonal part of C (which is sparse).


To deal with these issues, two different ways of performing this reconstruction, when the measurements are noisy are: (1) simultaneous reconstruction of both parts of C, and (2) sequential reconstruction of the diagonal part of C, followed by the off-diagonal part of C. The former approach is arguably more natural, but the latter approach enables more rigorous analysis of the accuracy of the reconstruction.


Assuming the following bounds on the custom-character2 norms of the noise terms (u and v), that is,






g=diag(C)+u,∥u∥2≤∈1,  (Eq. 3.9)






h=Φ(C)+v,∥v∥2≤∈2.  (Eq. 3.10)


Simultaneous reconstruction of the diagonal and off-diagonal parts of C: the constraints (Eq. 3.6) and (Eq. 3.7) are relaxed in the simplest possible way, by replacing them with:





∥diag(W)−g∥2≤∈1,  (Eq. 3.11)





∥Φ(W)−h∥2≤∈2.  (Eq. 3.12)


This leads to a convex optimization problem that attempts to reconstruct both the diagonal part of C, which is not necessarily sparse, and the off-diagonal part of C, which is assumed to be sparse. (Note that W=C is a feasible solution to this problem.)


The reconstruction algorithm involves two different estimators (an custom-character1-regularized estimator for the off-diagonal part of C, and a least-squares estimator for the diagonal part of C). These two estimators are coupled together (through the constraint on Φ(W), and the positivity constraint Wcustom-character0).


Therefore, this method can behave quite differently, depending on whether the dominant source of error is g or h. When g is the dominant source of error, this method will behave like a least-squares estimator, whose accuracy scales according to the dimension n; when h is the dominant source of error, this method will behave like an custom-character1-regularized estimator, whose accuracy scales according to the sparsity s (neglecting log factors). From a theoretical point of view, this makes it more difficult to prove recovery guarantees for this method.


Sequential reconstruction of the diagonal part of C, followed by the off-diagonal part of C: In practice, one is often interested in the regime where g is known with high precision, and h is the dominant source of error. This is because measurements of g are relatively easy to perform, because they only require single-qubit state preparations and measurements; whereas measurements of h are more costly, because they require the preparation and measurement of entangled states on many qubits. So measurements of g can often be performed more quickly, and measurements on different qubits can be performed simultaneously in parallel; hence one can repeat the measurements more times, to obtain more accurate estimates of g.


In this regime, it is natural to try to recover the diagonal part of C directly from g, and then use custom-character1-minimization to recover only the off-diagonal part of C. This leads to a convex optimization problem which is arguably less natural, but it makes it easier to prove rigorous guarantees on the accuracy of the reconstruction of C. Starting from the convex optimization problem ((Eq. 3.5)-(Eq. 3.8)) for the noiseless case, and the last two constraints are relaxed to get:











Find


W






n
×
n









that


minimizes







i

j





"\[LeftBracketingBar]"


W
ij



"\[RightBracketingBar]"





,




(

Eq
.

3.13

)









such


that
:











diag

(
W
)

=
g

,




(

Eq
.

3.14

)

















Φ

(
W
)

-
h



2




ϵ
2

+


ϵ

1






mn
,








(

Eq
.

3.15

)














d

(

W
,

K
+


)



ϵ
1


,


and


W

=


W
T

.






(

Eq
.

3.16

)







Here K+={W′∈custom-charactern×n|W′custom-character0} denotes the (real) positive semidefinite cone, and










d

(

W
,

K
+


)

=


min


W




K
+







W
-

W





F






(

Eq
.

3.17

)







to be the minimum distance from W to a point W′ in K+, measured in Frobenius norm; note that this is a convex function. While this convex optimization problem looks complicated, it follows from a simple underlying idea: since the diagonal elements of W are fixed by the constraint (Eq. 3.14), this is simply an custom-character1-regularized estimator for the sparse, off-diagonal part of W.


The attentive reader will notice two potential concerns with this approach. First, in general, C will not be a feasible solution to this convex optimization problem, since the diagonal elements of C will not satisfy (Eq. 3.14). However, C lies close to a feasible solution. To see this, let {tilde over (C)} be the matrix whose off-diagonal elements agree with C, and whose diagonal elements agree with g. Then C is within distance ∈1 of {tilde over (C)} (in Frobenius norm), and {tilde over (C)} is a feasible solution. To see this, check that {tilde over (C)} satisfies the constraints (Eq. 3.14), (Eq. 3.15) and (Eq. 3.16), since:

















Φ

(

C
~

)

-

Φ

(
C
)




2

=





Φ

(

diag

(

g
-

diag

(
C
)


)

)



2











(

m





g
-

diag

(
C
)




1
2


)


1
/
2












ϵ

1






mn
.










(
3.18
)







(Here, {tilde over (C)} is written in a compact form, {tilde over (C)}=C+diag (g−diag(C)), where the diag(·) notation has the following meaning: for a matrix M, diag (M) is the vector containing the entries Mjj that lie along the diagonal of M; and for a vector v=(v1, . . . , vn), diag(v) is the diagonal matrix with v1, . . . , vn along the diagonal.)


Second, the optimal solution W may violate the positivity constraint (Eq. 3.8), making it un-physical/non-physical. (Similar issues can arise when performing quantum state and process tomography.) However, W can be easily corrected to get a physically-admissible solution. This follows because equation (3.16) shows that W lies within distance ∈1 of a physically-admissible solution W′custom-character0, and this solution W′ can be obtained by truncating the negative eigenvalues of W.


Finally, there are different ways of relaxing the positivity constraint (Eq. 3.8), and (Eq. 3.16) is not the strongest possible choice. For instance, a stronger constraint could be used than (Eq. 3.16), such as: d′(W,K+)≤∈1, where d′(W,K+)=minW′∈K+∥diag (W−W′)∥2. However, the constraint (Eq. 3.16) may be simpler to implement using numerical linear algebra software.


Since these are convex optimization problems, they can be solved efficiently (both in theory and in practice), for instance by using interior point algorithms. Nonetheless, some care is needed to ensure that these algorithms can scale up to solve very large instances of these problems. In particular, enforcing the positivity constraint (Eq. 3.8), and its relaxed version (Eq. 3.16), can be computationally expensive.


In aspects, the positivity constraint may be omitted. The matrix C 400 can be reconstructed by solving the convex optimization problem (Eq. 3.13)-(Eq. 3.16). This analysis holds even without the positivity constraint (Eq. 3.16). In aspects, it is easy to check that the positivity constraint is not used, and indeed, most of the theory of compressed sensing applies to all sparse signals, not just positive ones, although positivity can be helpful in certain situations.


This observation has a practical consequence: by omitting the positivity constraint (Eq. 3.16), one can make the convex optimization problem simpler, and thus easier to solve in practice (e.g., by using second-order cone programming, rather than semidefinite programming). One can then take the resulting solution, and project it onto the positive semidefinite cone, as is sometimes done in quantum state tomography, without increasing the error (in Frobenius norm). This technique is useful for scaling up the disclosed method to extremely large numbers of qubits.


Next, setting the Error Parameters ∈1 and ∈2 is described. To set the parameters ∈1 and ∈2 in equations (3.9) and (3.10): First, consider ∈1, which bounds u, the error in g. Note that, when gj is estimated using the above noted procedure, large-deviation bounds on uj are obtained. In particular, for some δ1>0, ∥ujψ2≤δ1cjj, (Eq. 3.19)


where ∥·∥ψ2 is the sub-Gaussian norm. This bound can be obtained from equation (2.10), by setting Ntrials˜1/δ12. The trials of the experiment that are used to choose the evolution time t may be neglected in the count, because the number of those trials grows only logarithmically with Γab.


This implies that ∥u∥22 is a subexponential random variable, whose subexponential norm is at most 2δ12∥diag(C)∥22. This implies that ∥u∥2 is bounded with high probability: for any τ≥1, Pr[∥u∥2≥τδ1∥diag(C)∥2]≤e·exp(−τ2/2c), (Eq. 3.20)


where c>0 is some universal constant. τ may be chosen to be a sufficiently large constant, so that the failure probability is small. (Note that in some cases, one can prove stronger bounds, by taking advantage of the fact that the coordinates of u are independent, and using a Bernstein-type inequality. This bound is stronger when the diagonal elements of C satisfy∥diag(C)∥<<∥diag(C)∥2, i.e., when u has many independent coordinates with similar magnitudes.)


The above bound does not immediately tell how to set ∈1, because the bound depends on diag(C), which is not known exactly. Instead, a bound that depends on g is derived, which is known explicitly, and can be used to set ∈1. To do this, assume that δ1 is sufficiently small so that τδ1<1/4. With high probability:















u


2





τδ
1






diag

(
C
)



2












τδ
1

(




g


2

+



u


2


)













τδ
1


1
-

τδ
1







g


2

=:

ϵ
1



,







(
3.21
)







Using the triangle inequality, and some algebra shows how to set ∈1 so that equation (3.9) holds.


1 may also be bounded in terms of diag(C), as follows:













ϵ
1






τδ
1


1
-

τδ
1





(





diag

(
C
)



2


+



u


2



)













τδ
1


1
-

τδ
1





(

1
+

τδ
1


)






diag

(
C
)



2











2


τδ
1







diag

(
C
)



2

.









(

Eq
.

3.22

)









    • In aspects, this bound may be used to analyze the accuracy of the estimate of matrix C 400. ∈2 bounds v, which is the error in h. The same approach as above may be used. When hj is estimated a bound on the sub-Gaussian norm of vj is obtained: for some δ2>0,








vjψ2≤δ2Φj(C).  (Eq. 3.23)


(This bound can be obtained from equation (2.10), by setting Ntrials˜1/δ22. δ2 to may be different from δ1, because the measurements used to estimate hj are more costly than the measurements used to estimate gj, hence one may prefer to use different values for Ntrials in each case.)


This implies that ∥v∥22 is a subexponential random variable, hence ∥v∥2 is bounded with high probability: for any τ≥1,






Pr[∥v∥2≥τδ2∥Φ(C)∥2]≤e·exp(−τ2/2c),  (Eq. 3.24)


where c>0 is some universal constant, choose τ to be a sufficiently large constant, so that the failure probability is small. (Note that, for typical choices of Φ(·), it is expected that ∥Φ(C)∥<<∥Φ(C)∥2. This implies that v has many independent coordinates with similar magnitudes. When this occurs, one can prove a stronger bound using a Bernstein-type inequality.


The above bound does not immediately show how to set ∈2, because the bound depends on Φ(C), which is not known exactly. Instead, derive a bound that depends on h, which is known explicitly, and can be used to set ∈2. To do this, it is assumed that δ2 is sufficiently small so that τδ2<1/4. With high probability:















v


2





τδ
2






Φ

(
C
)



2













τδ
2

(




h


2

+



v


2


)














τδ
2


1
-

τδ
2







h


2

=:


ϵ
2

.










(
3.25
)







This indicates how to set ∈2 so that equation (3.10) holds. Finally, ∈2 can also be bounded in terms of ∥Ccustom-character1, as follows:













ϵ
2






τδ
2


1
-

τδ
2





(





Φ

(
C
)



2

+



v


2


)














τδ
2


1
-

τδ
2





(

1
+

τδ
2


)






Φ

(
C
)



2













2


τδ
2






Φ

(
C
)



2













2


τδ
2



m






Φ

(
C
)

















4


τδ
2



m





C





1

.











(
3.26
)







In aspects, this bound may be used when analyzing the accuracy of the estimate of C.


Referring to FIGS. 8A-8C, graphs that illustrating scaling of the reconstruction error ∥W(opt)−C∥ under various circumstances are shown. Here W(opt) denotes the estimate of C obtained via compressed sensing. The solid lines are the average errors over 100 random instances of the problem, and the shaded region is their 95% confidence interval obtained by bootstrapping. FIG. 8A illustrates the reconstruction error as a function of the number of measurement settings m (assuming noiseless measurements) for various values of sparsity s (and n=64 qubits). The errors go through a phase transition whose location mc scales linearly with s. FIG. 8B illustrates the reconstruction error as a function of the number of measurement settings m (assuming noiseless measurements) for various numbers of qubits n (and sparsity s=12). The phase transition point mc scales logarithmically with n. FIG. 8C illustrates the reconstruction error as a function of the number of measurement settings m, for different values of added noise strength σ, with fixed parameters (n, s)=(64,12). The inset shows that the recovery errors (when m>mc, i.e., after the transition point) scale linearly with σ, as expected.


Numerical simulations are used to test the performance of the disclosed method on realistic system sizes, with different levels of sparsity, and when the data contain statistical fluctuations due to finite sample sizes. As shown in FIGS. 8A-8C, the disclosed method performs well overall.


The protocol for randomly chosen C matrices is numerically simulated. In these examples it is assumed that the diagonal elements of C are known, that is, ∈1=0 in Eq. (3.11). The convex optimization problem given by equations (3.5)-(3.7) is solved using, for example, CVXPY, a convex optimization package for Python®.


The case of noiseless measurements, corresponding to ∈2=0 in Eq. (3.12) is investigated. In FIG. 8A the recovery error is shown as a function of the number of measurements, m, for a fixed number of qubits, n, and various choices of the off-diagonal sparsity, s. The sharp transition in the recovery error as a function of m is evident. Moreover, as shown in the inset of FIG. 8A, the transition point mc, which is defined as the point where ∥C−W(opt) drops below 0.25, scales linearly with s, consistent with the analytical results. In FIG. 8B s is fixed, n is varied, and the recovery error as a function of m is studied. A phase transition may be observed as m increases. In this case, mc scales polynomially with log(n) as suggested in the inset of FIG. 8B.


The effect of noisy measurements on the recovery error is investigated. Random C matrices may be generated, with a fixed number of qubits n and sparsity s. Noise is simulated by adding a random vector e, whose entries are independent Gaussian random variables with mean 0 and standard deviation σ, to measurement vector h. Eq. (3.7) may be replaced in the previous convex program with Eq. (3.12) and choose ∈2=√{square root over (m)}σ. The scaling of the reconstruction error ∥W(opt)−C∥ as a function of σ is shown in FIG. 8C. The recovery error after the phase transition point scales linearly with σ, consistent with the proposed analytical bounds.


The convex optimization problem (Eq. 3.13)-(Eq. 3.16) is used to prove rigorous recovery guarantees that show that the optimal solution W(opt) is close to the true correlation matrix C 400 (FIG. 6), provided that m≥c0s log n (and with better robustness to noise, when m≥c0s log4n). Here, m is the dimension of the measurement vector h, s is the sparsity (the number of nonzero elements) in the off-diagonal part of the matrix C, and c0 is some universal constant.


In aspects, two different results are proven: a non-universal recovery guarantee, using the “RIPless” framework, as well as a universal recovery guarantee, using RIP-based techniques. Here, RIP refers to the “restricted isometry property,” a fundamental proof technique in compressed sensing. There are different advantages to the RIPless and RIP-based bounds: the RIPless bounds require slightly fewer measurements, while the RIP-based bounds are more robust when the measurements are noisy.


In aspects, two variants of the problem (Eq. 3.13)-(Eq. 3.16) are introduced: constrained custom-character1-minimization and the LASSO (“least absolute shrinkage and selection operator”). Generally speaking, recovery guarantees that hold for one of these problems can be adapted to the other one, with minor modifications.


To simplify the problem, start with the convex optimization problem (Eq. 3.13)-(Eq. 3.16). |Remove the positivity constraint d(W,K+)≤∈1. Change the objective function to sum over all i<j rather than all i≠j; since W is symmetric, this merely changes the objective function by a factor of 2. Finally, shift the variable W by subtracting away diag(g), so that its diagonal elements are all zero. Similarly, shift the measurement vector h to get h′=h−Φ(diag(g)). (Eq. 5.1)


This provides an equivalent problem:











Find


W






n
×
n









that


minimizes







i
<
j





"\[LeftBracketingBar]"


W
ij



"\[RightBracketingBar]"





,




(

Eq
.

5.2

)









such


that
:











diag

(
W
)

=
0

,




(

Eq
.

5.3

)
















Φ

(
W
)

-


h




2






ϵ
2

+


ϵ
1




mn
,









(

Eq
.

5.4

)












W
=


W
T

.





(

Eq
.

5.5

)







An operation diag(·) that has two meanings: given an n×n matrix M, diag(M) returns an n-dimensional vector containing the diagonal elements of M; and given an n-dimensional vector v, diag(v) returns an n×n matrix that contains v along the diagonal, and zeros off the diagonal.


Define C′ to be the off-diagonal part of the correlation matrix C, that is, C′ is the matrix whose off-diagonal elements match those of C, and whose diagonal elements are zero.






C′=C−diag(diag(C)).  (Eq. 5.6)


h′ may be viewed as a measurement of C′, with additive error z, h′=Φ(C′)+z. (Eq. 5.7)


The solution W(opt) is an accurate estimate of C′. The error term z may be written in the form:












z
=


h


-

Φ

(

C


)








=

h
-

Φ

(
C
)

-

Φ

(

diag

(

g
-

diag

(
C
)


)

)









=

v
-

Φ

(

diag

(
u
)

)



,







(
5.8
)







where u and v are the noise terms in (Eq. 3.9) and (Eq. 3.10). next, bound z using (Eq. 3.10) and (Eq. 3.18), ∥z∥2≤∈2+∈1√{square root over (mn)}. (Eq. 5.9)


It will be convenient to write






D={W∈
custom-character
n×n
|W
T
=W,diag(W)=0}  (Eq. 5.10)


to denote the subspace of symmetric matrices whose diagonal elements are all 0. Let uvec: D→custom-charactern(n−1)/2 denote the linear operator that returns the upper-triangular part of W, uvec: Wcustom-character(Wij)i<j. (Eq. 5.11)


Write ΦD: D→custom-characterm to denote the measurement operator Φ restricted to act on the subspace D. The problem (Eq. 5.2)-(Eq. 5.5) may be rewritten in a more concise form:











Find






W



D






that


minimizes






uvec

(
W
)



1



,




(

Eq
.

5.12

)









such


that
:















Φ
D

(
W
)

-

h





2




ϵ
2

+


ϵ
1




mn
.








(

Eq
.

5.13

)







In aspects, a variant of the problem includes the LASSO:


Find W∈D that minimizes










1
2







Φ
D

(
W
)

-


h




2
2


+
λ





uvec

(
W
)





1

.






(

Eq
.

5.14

)







This can be viewed as a Lagrangian relaxation of the previous problem ((Eq. 5.12)-(Eq. 5.13)), or as an custom-character1-regularized least-squares problem. In addition, the convex optimization problems that were described earlier can also be relaxed into a LASSO-like form, in a similar way. The choice of the regularization parameter λ requires some care.


In aspects, the Regularization Parameter λ may be set by the following procedure. In general, the regularization parameter λ controls the relative strength of the two parts of the objective function in. When the noise in the measurement of h′ is strong, then λ must be set large enough to ensure that the custom-character1 regularization term still has the desired effect. However, if A is too large, it strongly biases the solution W(opt), making it less accurate.


An approach to setting λ, may be to have a goal that is to ensure that the solution W(opt) converges to (a sparse approximation of) the true correlation matrix C. To do this, λ must be set large enough to satisfy two constraints, which involve the noise in the measurement of h′ (see equation (4.1) and the equation below (4.2)). When these constraints are satisfied, the error in the solution W(opt) is bounded by equation (4.2). (Note that this error bound grows with 2L, hence one should choose the smallest value of λ that satisfies the above constraints.)


In order to set 2L, first, precise statements of the two constraints on λ are provided:





∥uvec(ΦDz)∥≤λ,  (Eq. 5.15)





∥uvec(ΦD,Tc(I−P)z∥≤λ.  (Eq. 5.16)


Here, z is the noise term in the measurement of h′ in equation (5.7); ΦD:custom-characterm→D is the adjoint of the measurement operator ΦD; T⊂{(j,j′)|1≤j<j′≤n} is the support of (a sparse approximation of) the true correlation matrix C; ΦD,T is the sub-matrix of ΦD that contains those columns of ΦD whose indices belong to the set T; P is the projection onto the range of ΦD,T; Tc is the complement of the set T; and ΦD,Tc is the sub-matrix of ΦD that contains those columns of ΦD whose indices belong to the set Tc.


In order to set λ, compute the quantities in equations (5.15) and (5.16), and to do this, some bounds on the noise z are needed. Two ways of obtaining such bounds include:


One straightforward way is as follows. Equation (5.8) can be used to write






z=v−Φ(diag(u)),  (Eq. 5.17)


where u and v are the noise terms in the measurements of g and h, respectively. Bounds on u and v (see equations (3.21) and (3.25)) imply bounds on z, via an elementary calculation. However, one can get better bounds on z by using a more sophisticated approach, starting with bounds on the sub-Gaussian norms of uj and vj, such as equations (3.19) and (3.23). Assume that g and h are measured using the procedures described previously. Then equations (3.19) and (3.23) provide bounds on the sub-Gaussian norms of uj and vj:





uj104 2≤δ1cjj,  (Eq. 5.18)





vj104 2≤δ2Φj(C).  (Eq. 5.19)


Using these bounds, set λ so that it satisfies equations (5.15) and (5.16) with high probability. More precisely, set










λ
:=


(




ϵ
1
′″

·
4



mn


+

ϵ
2
′″


)


4


m



(

1
+

2


)





ln

(
n
)

/

c






,




(

Eq
.

5.2

)














where



ϵ
1
′″


:=



δ
1


1
-

ϵ
1








g






,


ϵ
1


:=

2




ln

(
n
)

/

c
0





δ
1



,




(

Eq
.

5.21

)














ϵ
2
′″

:=



δ
2


1
-

ϵ
2








h






,


ϵ
2


:=

2




ln

(
m
)

/

c
0






δ
2

.







(

Eq
.

5.22

)







Here assume that δ1 and δ2 are sufficiently small (for instance, δ1≲1/√{square root over (ln(n))} δ2≲1/√{square root over (ln(n))} to ensure that ∈″1<½ and ∈″2<½. Also, here c′ and c0 are universal constants (which are defined in the proof below).


Now let the correlation matrix C and measurement operator ΦD be fixed, and note that the noise term z is stochastic. With high probability (over the random realization of z), equations (5.15) and (5.16) are be satisfied; here the failure probability is at most (e/n3)+(e/m3)+(2e/n1+2√{square root over (2)}).


Finally, the following simple upper bounds on ∈′″1, ∈′″2 and λ (these follow from the definitions of ∈′″1 and ∈′″2, the definitions of g and h, and the definition of Φ(·)):





∈′″1≤3δ1∥diag(C)∥28,  (Eq. 5.23)





∈′″2≤6δ2∥Ccustom-character1,  (Eq. 5.24)





λ≤O(√{square root over (mln(n))}(δ1∥diag(C)∥√{square root over (mn)}+δ2∥Ccustom-character1)).  (Eq. 5.25)


These bounds may be used to analyze the accuracy of the estimate of matrix C 400 (FIG. 6). Next, isotropy and incoherence of the measurement operator is described. The rows of the measurement operator ΦD have two properties, isotropy and incoherence, which play a fundamental role in compressed sensing. Let Q be the matrix (of size m by n(n−1)/2) that represents the action of ΦD (using the fact that the subspace D is isomorphic to custom-charactern(n−1)/2); that is, Q and ΦD are related by the equation: ΦD(C)=Q·uvec(C),∀C∈D. (Eq. 5.26)


The rows of Q are chosen independently at random, and each row has the form q=4uvec(rrT)∈custom-charactern(n−1)/2, (Eq. 5.27)


where r is sampled from the distribution described in (Eq. 3.2). q is centered if it has mean custom-character(q)=0, and q is isotropic if its covariance matrix is the identity:






custom-character(qqT)=I.  (Eq. 5.28)


It is straightforward to check that q is centered and isotropic (up to a normalization











factor


of


2

)

,


since
:


{






𝔼
[


r
i



r
j


]

=
0




i
<
j







𝔼
[


r
i



r
j



r
k



r
l


]

=
0





i
<

j


and






k


<

l


and



{

i
,
j

}





{

k
,
l

}



=








𝔼
[


r
i



r
j



r
k



r
l


]

=
0





i
<

j


and


k

<

l


and





"\[LeftBracketingBar]"



{

i
,
j

}





{

k
,
l

}




"\[RightBracketingBar]"




=
1







𝔼
[


r
i



r
j



r
k



r
l


]

=

1
4






i
<

j


and


k

<

l


and


i


=


k


and


j


=
l





.







(

Eq
.

5.29

)







(Note that in the last line of Eq. 5.29, a case with i=l and j=k, as the requirements of i<j and k<l leads to a contradiction.)


In addition, q is incoherent with parameter μ>0 if, with probability 1, all of its coordinates are small: ∥q∥2≤μ. (Eq. 5.30)


In order for this to be useful for compressed sensing, one needs μ to be small, at most polylogarithmic in the dimension of q. For example, q is incoherent with parameter μ=16.


A non-universal recovery guarantee, using the “RIPless” framework is described herein, which in turn relies on the isotropy and incoherence properties described above.


Let C∈custom-charactern×n be a correlation matrix, and let C″∈D be its off-diagonal part (see equation (5.6)). C′ is approximately s-sparse, i.e., there exists a matrix C(s)∈D that has at most s nonzero entries above the diagonal, and that approximates C′ in the (vector) custom-character1 norm, up to an error of size ηs. This can be written compactly as:





∥uvec(C′−C(s))∥1≤ηs,  (Eq. 5.31)


where uvec(·) was defined in equation (5.11). (Recall that both C′ and C(s) are symmetric, with all zeros on the diagonal. Hence it suffices to consider those matrix entries that lie above the diagonal.)


Choose the measurement operator Φ at random (see equation (3.4)). Assume that m (the dimension of h) satisfies the bound:









m






c
˜

0

(

1
+
β

)

·
4


s


log





n

(

n
-
1

)

2

.






(

Eq
.

5.32

)







Here, {tilde over (c)}0 is a universal constant, and β>0 is a parameter that can be chosen freely by the experimenter. Note that m scales linearly with the sparsity s, but only logarithmically with the dimension n of the matrix C. This scaling is close to optimal, in an information-theoretic sense. This is one way of quantifying the advantage of compressed sensing, when compared with measurements that do not exploit the sparsity of C.


Measure g≈diag(C) and h≈Φ(C), and calculate h′ (see equation (5.1)). Let δ1 and δ2 quantify the noise in the measurements of g and h, as described in equations (5.18) and (5.19). Next, solve the LASSO problem, setting the regularization parameter λ according to (Eq. 5.20). Let W(opt) be the solution of this optimization problem.


Now the following recovery guarantee is noted, which shows that W(opt) provides a good approximation to C′, in both the Frobenius (custom-character2) norm, and the vector custom-character1 norm. Theorem 1: For any correlation matrix C satisfying the sparsity condition (5.31), with probability at least






1
-


1

2


n

(

n
-
1

)


-

6


e

-
β







(over the random choice of the measurement operator Φ, with m set according to (Eq. 5.32)), the solution W(opt) is close to C′, with an error that is bounded by:















W

(
opt
)


-

C





F




2




c

(

1
+
α

)

[



η
s


s


+

λ


s



]



,




(

Eq
.

5.33

)


















W

(
opt
)


-

C







1




2



c

(

1
+
α

)

[


η
s

+
λs

]



,




(

Eq
.

5.34

)







where c is a universal constant, and







α



log

3
/
2


(


n

(

n
-
1

)

2

)


.




In these bounds, the first term upper-bounds the error that results from approximating C′ by a sparse matrix, and the second term upper-bounds the error due to noise in the measurements of g and h.


To make the second term more transparent, the second term can be combined with the bound on λ from equation (5.25):





λ≤O(√{square root over (m ln(n))}(δ1∥diag(C)∥√{square root over (mn)}+δ2∥Ccustom-character1)),  (Eq. 5.35)


where δ1 and δ2 quantify the noise in the measurements of g and h, as described earlier. Also, it is useful to consider the special case where C′ is exactly s-sparse, so ηs=0, and where as few measurement settings as possible are used, by setting m∝s log n. In this case:





W(opt)−C′∥F≤O(log3/2(n)λ√{square root over (s)})  (Eq. 5.36)





O(√{square root over (s)} log3/2(n)√{square root over (m log(n))}·[δ1∥diag(C)∥√{square root over (mn)}+δ2∥Ccustom-character1])  ((Eq. 5.37)





O(s log5/2(n)[δ1∥diag(C)∥√{square root over (sn log(n))}+δ2∥Ccustom-character1])  (Eq. 5.38)





O(s log5/2(n)[δ1∥diag(C)∥√{square root over (sn log(n))}+δ2√{square root over (n)}∥diag(C)∥22√{square root over (2s)}∥C′∥F]).  (Eq. 5.39)


This can be compared with the error bound (Eq. 2.31) for the naive method, and the RIP-based error bound (Eq. 5.48) for compressed sensing. Generally speaking, compressed sensing has an advantage over the naive method when s is small, and the RIPless bound is useful in the regime between m∝s log n and m∝s log4n, where the RIP-based bound does not apply. (When m is in the regime m∝s log4n or larger, the RIP-based bound applies, and provides better results than the RIPless bound.)


Next, a universal recovery guarantee, using an older approach based on the restricted isometry property (RIP) is described. This also relies on the isotropy and incoherence properties shown above.


First, the number of measurement settings is set as: m≥c0s log4n, (5.40) where s is the sparsity parameter, and co is some universal constant. (Note that m is slightly larger, by a poly(log n) factor, compared to the RIPless case.) Also, recall that D is the subspace of symmetric matrices whose diagonal elements are all 0 (see equation (5.10)), and ΦD is the measurement operator restricted to act on this subspace.


In aspects, with high probability (over the random choice of ΦD), the normalized measurement operator ΦD/√{square root over (m)} satisfies the RIP (for sparsity level 2s). To see this, recall the isotropy and incoherence properties shown above. These properties imply that the measurement operator ΦD is sampling at random from a “bounded orthonormal system.” Such operators are known to satisfy the RIP, via a highly nontrivial proof.


From this point onwards, let the measurement operator ΦD be fixed. ΦD is capable of reconstructing the off-diagonal parts of all sparse matrices C, i.e., ΦD can perform “universal” recovery.


As described above, let C∈custom-charactern×n be a correlation matrix, and let C′∈D be its off-diagonal part (see equation (5.6)). Assume that C′ is approximately s-sparse, i.e., there exists a matrix C(s)∈D that has at most s nonzero entries above the diagonal, and that approximates C′ in the (vector) custom-character1 norm, up to an error of size ηs. This can be written compactly as: ∥uvec(C′−C(s))∥1≤ηs, (Eq. 5.41)


where uvec(·) was defined in equation (5.11). (Recall that both C′ and C(s) are symmetric, with all zeros on the diagonal. Hence it suffices to consider those matrix entries that lie above the diagonal.)


Measure g≈diag(C) and h≈Φ(C), and calculate h′ (see equation (5.1)). Assume that the noise in the measurements of g and h is bounded by δ1 and δ2, as described in equations (3.19) and (3.23). Then solve the custom-character1-minimization problem in (5.12) and (5.13), setting the parameters ∈1 and ∈2 according to equations (3.21) and (3.25). Let W(opt) be the solution of this problem.


The following recovery guarantee, which shows that W(opt) provides a good approximation to C′, in the Frobenius (custom-character2) norm, and in the custom-character1 vector norm. There is one subtle point: the convex optimization problem in equations (5.12) and (5.13) uses the unnormalized measurement operator ΦD, while the error bounds apply to the normalized measurement operator ΦD/√{square root over (m)}. Hence, the noise is smaller by a factor of Vi in these error bounds.)


Theorem 2: With high probability (over the choice of the measurement operator Φ, with m set according to (Eq. 5.40)), for all correlation matrices C (satisfying the sparsity condition (Eq. 5.41)), the solution W(opt) satisfies the following error bounds:















W

(
opt
)


-

C





F





c
1




η
s


s



+


c
2

(



ϵ
2


m


+


ϵ
1



n



)



,




(

Eq
.

5.42

)


















W

(
opt
)


-

C







1






c
1



η
s


+


c
2



s



(



ϵ
2


m


+


ϵ
1



n



)




,




(

Eq
.

5.43

)







where c1 and c2 are universal constants.


In these bounds, the first term upper-bounds the error that results from approximating C′ by a sparse matrix, and the second term upper-bounds the error due to noise in the measurements of g and h.


In order to apply these bounds, one needs to know the values of ∈1 and ∈2. These can be obtained from equations (3.22) and (3.26):





1≤O1∥diag(C)∥2),  (Eq. 5.44)





2≤O2√{square root over (m)}∥Ccustom-character1).  (Eq. 5.45)


Also, it is useful to consider the special case where C′ is exactly s-sparse, so ηs=0, and where as few measurement settings as possible are used, by setting m∝s log4n. In this case:





W(opt)−C′∥F  (Eq. 5.46)





O(√{square root over (n)}δ1∥diag(C)∥22∥C∥custom-character1)  (Eq. 5.47)





O(√{square root over (n)}δ1∥diag(C)∥22(√{square root over (n)}∥diag(C)∥2+√{square root over (2s)}∥C′∥F)).  (Eq. 5.48)


This can be compared with the error bound (Eq. 2.31) for the naive method, and the RIPless error bound (Eq. 5.39) for compressed sensing. Generally speaking, compressed sensing has an advantage over the naive method when s is small, and the RIP-based bound has better scaling (as a function of n, s, diag(C) and C′) than the RIPless bound, although it requires m to be slightly larger.


The performance of the disclosed compressed sensing method, for a typical measurement scenario is described below. Both the accuracy of the method, and the experimental resources required to implement it are considered. The asymptotic scaling of the disclosed method is described, and compared to the naive method, direct estimation of the correlation matrix.


Overall, the disclosed compressed sensing method has asymptotically better sample complexity, whenever the off-diagonal part of the correlation matrix C is sufficiently sparse. In particular, for a system of n qubits, the disclosed method is advantageous whenever the number of correlated pairs of qubits, s, is at most O(n3/2) (ignoring log factors). These results are summarized in FIGS. 8A-8C.


Let C′ be the off-diagonal part of the correlation matrix C 400, that is, C′ is the matrix whose off-diagonal elements match those of C, and whose diagonal elements are zero. C′ has at most s nonzero elements above the diagonal (and, by symmetry, at most s nonzero elements below the diagonal). The goal is to estimate both C′ and diag(C).


Compressed sensing allows the possibility of adjusting the number of measurement settings, m, over a range from m∝s log n to m∝n2. (Note that m∝s log n is just slightly above the information-theoretic lower bound, while m∝n2 is the number of measurement settings used by the naive method.) Compressed sensing works across this whole range, but the error bounds vary depending on m. There are two cases: (1) For m∝s log4n or larger, both the RIP-based and RIPless error bounds are available, and the RIP-based error bounds are asymptotically stronger. (2) For m between m∝s log n and m∝s log4n, only the RIPless error bound is available.


To make a fair comparison between compressed sensing and the naive method, the accuracy of these methods needs to be quantified in a consistent way. This is a nontrivial task, because the error bounds for the different methods have different dependencies on the parameters n, s, diag(C) and C′; this can be seen by comparing equation (2.31), and Theorems 1 and 2 as described herein.


A simple way of quantifying the accuracy of all of these methods may be used: given some δ>0, each method is required to return an estimate Ĉ′ that satisfies





Ĉ′−C′∥F≤δ∥C∥F.  (Eq. 6.1)


The Frobenius matrix norm may be used, which is equivalent to the vector custom-character2 norm. C (rather than C′) is written on the right hand side of the inequality, in order to allow the recovery error to depend on both the diagonal and the off-diagonal elements of C.


In addition, each method returns an estimate g of diag(C) that satisfies





g−diag(C)∥2≤δ∥diag(C)∥2.  (Eq. 6.2)


For both compressed sensing as well as the naive method, g is obtained in the same way, by performing single-qubit spectroscopy as in (2.19), and the error in g satisfies the same bound (Eq. 2.30).


The cost of implementing each method using real experiments should be accounted for. This cost depends on a number of factors. One factor is the total number of experiments that have to be performed, often called the sample complexity. This is the number of measurement settings, times the number of repetitions of the experiment with each measurement setting. Another factor is the difficulty of performing a single run of the experiment. This involves both the difficulty of preparing entangled states (random n-qubit GHZ states for the compressed sensing method, and 2-qubit Bell states for the naive method), and the length of time that one has to wait in order to observe dephasing.


An advanced quantum information processor is considered, where n-qubit GHZ states are fairly easy to prepare (using O(logn)-depth quantum circuits), and dephasing occurs at low rates, so that the main cost of running each experiment is the amount of time needed to observe dephasing. In this scenario, it is reasonable to use the sample complexity as a rough measure of the total cost of implementing the compressed sensing method, as well as the naive method.


In aspects, the sample complexity for three methods of interest may be calculated: (1) the naive method with m∝n2, (2) compressed sensing with m∝s log4n, and (3) compressed sensing with m∝s log n. This method (2) outperforms the naive method whenever s≤O(n3/2/log2n), and method (3) outperforms the naive method whenever s≤O(n2/3/log2n).


In addition, for each of these methods, the number of samples where single-qubit spectroscopy is performed, and the number of samples where multi-qubit spectroscopy is performed. (Recall that all of these methods use single-qubit spectroscopy to estimate the diagonal of C, and then use multi-qubit spectroscopy to estimate the off-diagonal part of C.) Both of these numbers can be important: multi-qubit spectroscopy is more expensive to implement on essentially all experimental platforms, and requires more samples when s is large; but it is possible for single-qubit spectroscopy to dominate the overall sample complexity, when s is small.


A naive method with m∝n2 is presented. Two parameters, δ1 and δ2, to quantify the accuracy of the measurements are used, as in equations (2.24) and (2.25). An estimate of C′ is provided, whose error is bounded by equation (2.31): with probability at least 1−η,















C
ˆ



-

C





F
2





1
η

[


3


(

n
-
1

)




(


δ
1

+

δ
2


)

2







diag



(
C
)




2
2


+

6


δ
2
2






C




F
2



]

.





(

Eq
.

6.3

)







For simplicity, η is set to be some universal constant, for example, η=0.001. Now, given any δ>0:



















C
^



-

C





F
2






δ
2







diag



(
C
)




2
2


+


(


δ
2

/
n

)






C




F
2













δ
2





C


F
2






,




(

Eq
.

6.4

)







by setting δ12=O(δ/√{square root over (n)}). This satisfies the requirement (Eq. 6.1).


In addition, one can easily check that the estimate g for diag(C) satisfies the requirement (Eq. 6.2).


Then the sample complexity is as follows (see the discussion preceding (Eq. 2.24) and (Eq. 2.25)): the method performs single-qubit spectroscopy on O(n/δ12)=O(n22) samples, and multi-qubit spectroscopy on O(n222)=O(n32) samples. Hence the total sample complexity is O(n32).


In aspects, compressed sensing may be performed with m∝s log4n. The solution W(opt) for the custom-character1-minimization problem in (Eq. 5.12) and (Eq. 5.13) satisfies the RIP-based bound in Theorem 2. Two parameters, δ1 and δ2, are used to quantify the accuracy of the measurements, as in equations (3.19) and (3.23).


The estimator W(opt) satisfies the following error bound (see Theorem 2, and equation (5.48)):





W(opt)−C′∥F≤O(√{square root over (n)}δ1∥diag(C)∥22(√{square root over (n)}∥diag(C)∥2+√{square root over (2s)}∥C′∥F)).  (Eq. 6.5)


Now, given any δ>0:



















W

(
opt
)



-

C





F






1

2



δ







diag



(
C
)




2


+


1

2



δ





C




F












δ





C


F






,




(

Eq
.

6.6

)







by setting δ1=O(δ/√{square root over (n)}) and δ2=O(δ/√{square root over (max(n,s))}). (Eq. 6.7)


This satisfies the requirement (Eq. 6.1). In addition, one can easily check that the estimate g for diag(C) satisfies the requirement (Eq. 6.2). Then the sample complexity is as follows (see the discussion following (Eq. 3.19) and (Eq. 3.23)): the method performs single-qubit spectroscopy on O(n/q)=O(n22) samples, and multi-qubit spectroscopy on


O(m/δ22)≤O(smax(n,s)log4(n)/δ2) (Eq. 6.8) samples. Hence the total sample complexity is at most O(max(n,s)2 log4(n)/δ2). (Eq. 6.9)


This is less than the sample complexity of the naive method, provided the off-diagonal part of the correlation matrix is sufficiently sparse, i.e., when s≤O(n3/2/log2n).


In aspects, compressed sensing may be performed with m∝s log n. The LASSO optimization problem whose solution W(opt) satisfies the RIPless bound in Theorem 1. Two parameters, δ1 and δ2, are used to quantify the accuracy of the measurements, as in equations (3.19) and (3.23). In the following, the diagonal elements of C satisfy a bound of the form














diag



(
C
)








0



(


1

n








diag



(
C
)




2


)

.






(

Eq
.

6.1

)







This assumption may be used to get a stronger error bound for W(opt). The assumption (Eq. 6.10) indicates that none of the diagonal elements cjj is too much larger than the others. This is plausible for a quantum system 100 that consists of many qubits that are constructed in a similar way.


In order to make this intuition more precise, (Eq. 6.10) may be written in an equivalent form:













max



1

j

n





(

c
jj
2

)




0


(


1
n






j
=
1

n


c
jj
2



)



,




(

Eq
.

6.11

)







which indicates that the largest cjj2 is at most a constant factor larger than the average of all of the cjj2. Also, it is informative to consider how (Eq. 6.10) and (Eq. 6.11) compare to the (arguably more natural) assumption that











max

1

j

n






"\[LeftBracketingBar]"


c
jj



"\[RightBracketingBar]"





0



(


1
n






j
=
1

n




"\[LeftBracketingBar]"


c
jj



"\[RightBracketingBar]"




)

.






(

Eq
.

6.12

)







In fact, (Eq. 6.12) is actually a stronger assumption, in the sense that it implies (Eq. 6.10) and (Eq. 6.11), via the Cauchy-Schwarz inequality. The estimator W(opt) satisfies an error bound given by Theorem 1, and equation (5.39). Combining this with the assumption (Eq. 6.10):





W(opt)−C′∥F≤O(s log5/2(n)[δ1√{square root over (s log(n))}∥diag(C)∥22√{square root over (n)}∥diag(C)∥22√{square root over (2s)}∥C′∥F]).  (Eq. 6.13)


Now, given any S>0, ensure that


















W

(
opt
)


-

C





F






1

2



δ







diag



(
C
)




2


+


1

2



δ





C




F













δ
2





C


F
2






,




(

Eq
.

6.14

)














by


setting







δ
1


=

O

(

δ


s

3
/
2





log
3




(
n
)



)


,




(

Eq
.

6.15

)













and



δ
2


=


O

(

δ

s




log

5
/
2


(
n
)




max

(

n
,
s

)




)

.





(

Eq
.

6.16

)







This satisfies the requirement (Eq. 6.1).


In addition, one can easily check that the estimate g for diag(C) satisfies the requirement (Eq. 6.2). Then the sample complexity is as follows (see the discussion following (3.19) and (3.23)). The disclosed method performs single-qubit spectroscopy on O(n/δ12)=O(ns3 log6(n)/δ2) (Eq. 6.17)


In aspects, the disclosed method further includes performing single-qubit spectroscopy samples, and multi-qubit spectroscopy on O(m/δ22)≤O(s3 max(n,s)log6(n)/δ2) (Eq. 6.18) samples. Hence the total sample complexity is at most O(s3 max(n,s)log6(n)/δ2). (Eq. 6.19)


The total sample complexity is less than the sample complexity of the naive method, provided the off-diagonal part of the correlation matrix is sufficiently sparse, i.e., when s≤O(n2/3/log2n)


Referring to FIGS. 9A and 9B, an evolution time t is chosen. During the physical implementation of the measurements of the correlation matrix C, decay rates Γab may be estimated. To do this, quantum states |ψabcustom-character are prepared, allowing the quantum states to evolve for some time t, and then measure the quantum states in an appropriate basis. This works well when t is chosen appropriately, so that Γabt˜1.


For example, one method of estimating the decay rate may use an evolution time t such that







1
2




Γ

a

b



t


2.




An initial guess for t (τ0) is made. Next, a “binary search” is performed. The binary search includes running a sequence of experiments, where one observes the dephasing of the state |ψabcustom-character for some time t, and after each experiment, one adjusts the time t adaptively, multiplying and dividing by factors of 2, in order to get the “right amount” of dephasing. Generally, the sequence of experiments is about ˜|log(Γabτ0)| experiments. More precisely, the following procedure may be performed:


1. Fix some τ0>0; as the initial guess for the evolution time t.


2. For r=1, 2, . . . , Ntrials, perform the following: (Ntrials are set according to equation (7.9) below)


(a) Set s0=0 and t0=2s0τ0 (as the initial guess for t).


(b) For j=0,1,2, . . . , Nsteps−1, do the following: (Nsteps are set according to equation (7.5) below)


i. Prepare the state

















"\[LeftBracketingBar]"


ψ

a

b





=


1

2




(



"\[LeftBracketingBar]"

a






+




"\[RightBracketingBar]"




b



)

,




allow the state to dephase for time then measure in the basis













1

2




(



"\[LeftBracketingBar]"

a





±




"\[RightBracketingBar]"




b



)




ii. If the measurement returns














1

2




(



"\[LeftBracketingBar]"

a





+




"\[RightBracketingBar]"




b



)

,




then set










s

j
+
1


=

{






s

j
+
1





with


probability




e
-
1


e
+
1



,







s
j




otherwise
.











(

Eq
.

7.1

)







If the measurement returns














1

2




(



"\[LeftBracketingBar]"

a





-




"\[RightBracketingBar]"




b



)

,




then set sj+1=sj−1.


iii. Set tj+1=2sj+τ0. (This is the next guess for t.)


(c) Define τr to be the value of sj from the last iteration of the loop, that is, ξr=sNsteps.


3. Compute the average






ξ
=


1

N
trials







r
=
1


N
trials




ξ
r

.







Return {circumflex over (t)}=286 τ0. (This is the estimate for t.)


This procedure can be described intuitively as follows. The inner loop of this procedure (the loop indexed by j) can be viewed as a kind of stochastic gradient descent, which behaves like a random walk on real numbers of the form t=2sτ0 (s∈custom-character) (see the dashed curves in FIG. 9A).


In aspects, this random walk has a single basin of attraction at a point t*=2s0 that satisfies Γabt*≈1, that is, s*≈−log2abτ0) The random walk converges to this point: with high probability, the sequence s0, s1, s2, . . . will reach the point s* after O(|s*|)=O(|log(Γabτ0)|) steps; after that point, the sequence will remain concentrated around s*, with exponentially-decaying tail probabilities (see FIG. 9B). This claim is made precise in equations (7.5) and (7.6).


Finally, the outer loop of this procedure (the loop indexed by r) computes an estimate ξ of s*, by averaging over several independent trials (see the solid curves in FIG. 9A). This then yields an estimate {circumflex over (t)} of t*. The required number of trials, and the accuracy of the resulting estimate {circumflex over (t)}, are analyzed in equations (7.9) and (7.10).


Referring to FIGS. 9A and 9B, plots illustrating the evolution time t are shown. FIG. 9A shows the trajectories of the random walk. Start with an initial guess τ0, that can either be shorter or longer than the optimal time. The evolution time is then stochastically halved or doubled over Nsteps iterations, according to the algorithm described herein. This procedure is repeated (dashed curves) and the outcome is averaged (solid curves) to obtain an estimate {circumflex over (t)}. The region in which







1

2



<


Γ
ab



t
^


<
2




is shaded in green. FIG. 9B illustrates the accuracy of the final estimate t, as a function of the number of steps Nsteps and the starting point τ0. The green shading shows the region where {circumflex over (t)} satisfies the bound







1

2



<


Γ
ab



t
^


<
2.




At the boundary of the green region, Nsteps scales logarithmically with |Γabτ0|, as predicted by Eq. (7.5).


To choose t the random walk may be used. The variables sj, which are related to the tj via the identity tj=2sjτ0. It is easy to see that s0=0, sj+1∈{sj,sj−1,sj+1}, and













𝔼
[


s

j
+
1






"\[LeftBracketingBar]"


s
j



]

=



s
j

+


1
2



(

1
+

e


-

Γ
ab




t
j




)




e
-
1


e
+
1



-


1
2



(

1
-

e


-

Γ
ab




t
j




)









=



s
j

+


1

e
+
1





(


e

1
-


Γ
ab



t
j




-
1

)

.










(

Eq
.

7.2

)







Hence the sequence s0, s1, s2, . . . can be viewed as the trajectory of a random walk on a 1-dimensional chain, beginning at s0, with transition probabilities that vary along the chain. The expected behavior of the random walk can be bounded as follows:










𝔼
[


s

j
+
1






"\[LeftBracketingBar]"


s
j



]



{








s
j

+

μ


when



Γ
ab



t
j





1

2



,










s
j

-

μ


when



Γ
ab



t
j





2


,










s
j

+



e
-
1


e
+
1




when



Γ
ab



t
j




1

,










s
j

-


1

e
+
1




when







Γ
ab



t
j




1

,









(

Eq
.

7.3

)







where μ is a numerical constant,





μ=0.09.  (Eq. 7.4)


Hence the random walk will tend to converge towards some integer s* such that t*=2s0 satisfies







1

2





Γ
ab



t
*





2

.





Note that the expected position of the random walk moves towards s* at a rate that is lower-bounded by μ. So, in order to go from s0=0 to s*≈−log2abτ0), the random walk will take roughly








1
μ





"\[LeftBracketingBar]"


s
*



"\[RightBracketingBar]"



=


1
μ





"\[LeftBracketingBar]"



log
2

(


Γ
ab



τ
0


)



"\[RightBracketingBar]"







steps. It is easy to see that the stationary distribution of the random walk is centered around s*, with exponentially decaying tails; hence, once the walk reaches s*, it will remain concentrated around that point.


Set the parameter Nsteps so that, with high probability, after Nsteps steps, the random walk will converge. Given an upper bound h on the magnitude of s*, i.e., |s*|≤h, or equivalently 2−h≤Γabτ0≤2h. The random walk may be performed for a number of steps











N
steps

=


h
μ

+
η


,




(

Eq
.

7.5

)







where η≥0. Here,






h
μ




is (an upper bound on) the expected number of steps needed to reach s*. Take an additional η steps to ensure that the walk does indeed reach s* with high probability (the probability of failure decreases exponentially with η).


After Nsteps steps, the final position of the walk is close to s*, with exponentially decaying tail probabilities: for any custom-character≥1,














Pr
[




"\[LeftBracketingBar]"



s

N
steps


-

s
*




"\[RightBracketingBar]"








"\[RightBracketingBar]"




s
0


=
0

]





16

μ
2




exp

(



-


μ

(

μ
+
1

)

8





+


μ
2

4


)


+

2



exp

(


-

μ
16



min


{


μη
h

,
1

}



(


+
μη

)


)

.







(

Eq
.

7.6

)







In particular, when







η


h
μ


,




this bound can be slightly simplified:










Pr
[







"\[LeftBracketingBar]"



s

N
steps


-

s
*




"\[RightBracketingBar]"





|

s
0


=
0

]






1

6


μ
2




exp

(



-


μ

(

μ
+
1

)

8





+


μ
2

4


)


+

2



exp

(


-

μ

1

6





(


+
μη

)


)

.







(

Eq
.

7.7

)







The bound (Eq. 7.6) is proved using martingale techniques.


In aspects, the parameter Ntrials may be set, and an error bound on the estimator {circumflex over (t)} may be derived. The bound (Eq. 7.6) implies that the random variables ξr=sNsteps are sub-exponential, and their sub-exponential norms are bounded by some constant ∥ξrψ1≤K. Hence their average ξ satisfies a Bernstein-type concentration inequality: for every δ≥0,











Pr
[




"\[LeftBracketingBar]"


ξ
-

s
*




"\[RightBracketingBar]"



δ

]



2


exp

(


-
c


min


{



δ
2


K
2


,

δ
K


}



N

t

r

i

a

l

s



)



,




(

Eq
.

7.8

)







where c>0 is a universal constant.


Now, for any ∈>0, set










N
trials

=


1
c


max


{


K
δ

,


K
2


δ
2



}




log

(

2
ϵ

)

.






(

Eq
.

7.9

)







The following error bound on the estimator {circumflex over (t)}=2ξτ0: with probability ≥1−∈, provides |ξ−s*|<δ, which implies that





2−0.5−δ≤Γab√{square root over (t)}<20.5αδ,  (Eq. 7.10)


Assuming δ<½, this implies that








1
2

<


Γ

a

b




t
ˆ


<
2

,




as desired.


When the disclosed measurement protocols are implemented in an experiment, errors may occur during state preparation and measurement (SPAM errors). The effect of these errors on estimating the decay rates Γab may be investigated. Let ρ0 and E0 denote the noiseless initial state and observable of interest, respectively.



























ρ
0

=


E
0

=


1
2



(



"\[LeftBracketingBar]"

a










a





"\[RightBracketingBar]"



+



"\[LeftBracketingBar]"

b







b





"\[RightBracketingBar]"



+



"\[LeftBracketingBar]"

b







a





"\[RightBracketingBar]"



+



"\[LeftBracketingBar]"

a







b





"\[RightBracketingBar]"



)

.




(
8.1
)







Error channels εs and εm that act on state preparation and measurement operations may be considered as:





{tilde over (ρ)}=εs0)=ρ0+δρ  (Eq. 8.2)






{tilde over (E)}=ε
m(E0)=E0+δE,  (Eq. 8.3)


where ∥δρ∥tr≤∈s and ∥δE∥≤∈m, and ∈s and ∈m are small parameters. The outcome of the protocol is now given by






{tilde over (P)}
ab(t)=Tr[{tilde over (E)}εt({tilde over (ρ)})],  (Eq. 8.4)


where ∈t=exp(custom-charactert) is the evolution under the correlated dephasing noise (1.1).


The protocol is robust against these kinds of errors, and for short times t the decay of {tilde over (P)}ab(t) is still dominated by Γab. Using Eqs. (8.2) and (8.3):






Tr[{tilde over (E)}εt({tilde over (ρ)})]=Tr[E0εt0)]  (Eq. 8.5)





+Tr[E0εt(δρ)]+Trt0)]





+Trt(δρ)]


The first term is the outcome without errors:










Tr
[


E
0





t

(

ρ
0

)


]

=



1
+

e


-
t



Γ

a

b





2

.





(

Eq
.

8.6

)







Find the effect of errors on the second and third terms, by considering the effect of εt on ρ1 and E1. Specifically:






Tr[E0εt(δρ)]=ηsse−Γabt,  (Eq. 8.7)






Trt0)]=ηmme−Γab,  (Eq. 8.8)


where ηs,m and ζs,m are constants that are determined by δρ and δE, for s and m, respectively. Therefore, these terms decay at the same rate as the first case and do not affect the exponential decay. However, the last term can, in principle, contain different decay rates and can cause deviation from a single exponential decay. Bound the rate at which R(t)=Tr[δEεt(δρ)] grows:












"\[LeftBracketingBar]"



R
˙

(
t
)



"\[RightBracketingBar]"


=



"\[LeftBracketingBar]"






t



Tr
[

δ

E




c

(

δ

ρ

)


]




"\[RightBracketingBar]"







(

Eq
.

8.9

)













=




"\[LeftBracketingBar]"


Tr
[

δ

E




(
δρ
)


]



"\[RightBracketingBar]"




2


ϵ
m




ϵ
s

(

n
+
s

)




,




(

Eq
.

8.1

)







Therefore:













P
˜


a

b


(
t
)

=



1
2



(

1
+

η
s

+

η
m

+


(

1
+

ζ
s

+

ζ
m


)



e


-

Γ
ab



t




)


+

R

(
t
)



,




(

Eq
.

8.12

)







The deviations from a single exponential decay are attributed to R(t). Using Eqs. (8.11) and (8.12) the decay rate of {tilde over (P)}ab(t) is dominated by Γab for evolution times t≲1/(2∈sm(n+s))−1.


Referring to FIGS. 10A-10C, graphs illustrating the exponential decay of {tilde over (P)}ab(t) when there are SPAM errors is shown. The decay of a 3-qubit GHZ state is simulated. A correlation matrix C is selected with uniform single-qubit dephasing rate (cii0) and nearest-neighbor correlations ci,i±10/4. FIGS. 10A-10C show the decay of {tilde over (P)}ab(t) under different noise strengths: (FIG. 10A) Δ=0.01, (FIG. 10B) Δ=0.05, (FIG. 10C) Δ=0.1. The dashed lines show the decay with no SPAM errors, and solid lines show the decay with SPAM errors from randomly-sampled error channels. The solid lines resemble the dashed lines for short evolution times t.


Numerical simulations may be used to investigate the effect of SPAM errors on estimates of the decay rate Γab. Simulate SPAM errors by applying random error channels εs and εm, whose strengths are controlled by a parameter Δ. Compare the decay of {tilde over (P)}ab(t) with and without SPAM errors, for different values of Δ. Observe that, for short times t, the decay with SPAM errors matches the decay without SPAM errors, i.e., the decay rate is dominated by Γab, see FIGS. 10A-10C. This is consistent with the theoretical analysis.


The disclosed method for learning sparse correlated dephasing noise (FIGS. 3 and 4) can be extended to the most general case of the master equation (1.1), where the matrix C is complex, and there is an additional Hamiltonian term Hs.


In aspects, the decay rates may be complex. The complete dynamics imposed by the environment on the quantum system 100 can have a coherent evolution in addition to the decay. The evolution of the quantum system







1

0

0



d

ρ

dt


=



(
ρ
)





is in general given by the Lindblad generator













(
ρ
)

=


-

i
[

ρ
,

H
s


]


+




l
,
m




c

l

m


(



Z
l


ρ


Z
m


-


1
2



{



Z
l



Z
m


,
ρ

}



)




,




(

Eq
.

9.1

)













where



H
s


=




l
,
m




r

l

m




Z
l



Z
m







(

Eq
.

9.2

)







is sometimes called the (generalized) Lamb shift Hamiltonian, and C=(clm) is now a complex matrix. Matrix C may be decomposed as C=V+iT, (Eq. 9.3)


where the real and imaginary parts are separated into V=(vlm), a real symmetric matrix, and T=(tlm), a real skew-symmetric matrix, respectively. Moreover, the Lamb shift Hamiltonian can be encoded in the symmetric matrix R=(rlm).


In aspects, the operator custom-character acts on the off-diagonal matrix elements |acustom-charactercustom-characterb|. This involves “decay rates” that depend on R, V and T in a simple way, although these “decay rates” are now complex rather than real. Specifically,






custom-character(|acustom-charactercustom-characterb|)=(−Γab+iΩab)|acustom-charactercustom-characterb|,  (Eq. 9.4)


where Γab and Ωab are real numbers that capture the decay and oscillations of the matrix element, respectively. The convention is defined for states |acustom-character and |bcustom-character: Zi|acustom-characteri|acustom-character, and similarly for |bcustom-character and βi. Therefore, separating the real and imaginary parts of custom-character(|acustom-charactercustom-characterb|) it is found that:











Γ

a

b


=


-




l
,
m




v

l

m


(



α
l



β
m


-


1
2



α
l



α
m


-


1
2



β
l



β
m



)



=


1
2




(

α
-
β

)

T



V

(

α
-
β

)




,




(
9.5
)







The fact that V is symmetric can be used. Similarly:










Ω

a

b


=



-




l
,
m




r

l

m


(



α
l



α
m


-


β
l



β
m



)



+




l
,
m




v

l

m


(



α
l



β
m


-


1
2



α
l



α
m


-


1
2



β
l



β
m



)



=


-

(



α
T


R

α

-


β
T


R

β


)


+


1
2




(



α
T


T

β

-


β
T


T

α


)

.








(

Eq
.

9.6

)







R and T, are symmetric and skew-symmetric matrices, respectively. Therefore, the coherences, that is, the |acustom-charactercustom-characterb| elements of the density matrix ρ(t), exhibit both oscillations (at a frequency nab) and exponential decay (at a rate Γab) as a function of t. Note that it is possible to measure both Γab and Ωab, by estimating the matrix elements of ρ(t) at different times t, using the same types of Ramsey experiments described herein. In particular, one can extract these parameters using standard techniques for analyzing spectral lines in atomic physics. Here, the squared magnitude of the Fourier transform of the measurement time series is a Lorentzian function, the center of the Lorentzian peak provides the oscillation frequency, and the width of the peak provides the decay rate.


Given estimates of Γab and Ωab, the compressed sensing method can be extended to recover both the correlation matrix C 400 (FIG. 6) and the Hamiltonian Hs. As before, m˜s log n or m˜s log4n can be used for measurement settings. For each measurement setting, a and b are chosen uniformly at random in {0,1}11. From estimates of Γab, V can be recovered, the real part of C, exactly as before. In a similar manner, given estimates of Ωab, T may be recovered, the imaginary part of C, as well as R, the matrix that encodes the Hamiltonian Hs. This is possible, because the equation represents a measurement of R and T that has the needed isotropy and incoherence properties.


In aspects, isotropy and incoherence of the Measurements Ωab may be determined.


Ωab, viewed as a measurement of R and T, with a and b chosen uniformly at random in {0,1}n. This random measurement has the same isotropy and incoherence properties as before, and hence the compressed sensing method will still succeed using these measurements. The incoherence property is easy to see, but some work is required to show the isotropy property.


The measurement Ωab acts on T and R as










Ω

a

b


=




[




α
T




β
T




]

[




-
R





1
2


T







-

1
2



T



R



]

[



α




β



]

.





(

Eq
.

9.7

)







It is advantageous to consider the effect of measurements if the symmetries of R and T are enforced. Note that T and R are real skew-symmetric and symmetric matrices, respectively. Moreover, they are both traceless. Therefore:





ΩabΣi<jiβj−βiαj)Tij+2Σi<jiαj−βiβj)Rij)  (Eq. 9.8)


As before, let uvec be the linear operator that returns the upper-triangular part of a matrix (not including the diagonal elements), that is, uvec: Rcustom-character(Rij)i<j. (Eq. 9.9)


Then Ωab can be expressed as











Ω

a

b


=

q
[




u

v

e


c

(
T
)







2

uve


c

(
R
)





]


,




(

Eq
.

9.1

)







similar to Eq. (5.26), where q is a row vector of the form






q=[uvec(αβT−βαT),uvec(ααT−ββT)].  (Eq. 9.11)


Note that as described above, α and β are chosen uniformly and independently at random from {1, −1}n. It is then straightforward to see that q in this case satisfies the incoherence property (Eq. 5.30) with μ=1. Furthermore, one can check that q is centered and isotropic (up to a normalization factor of √{square root over (2)}), since:









{






𝔼
[


α
i



α
j


]

=


𝔼
[


α
i



β
j


]

=


𝔼
[


β
i



β
j


]

=
0









𝔼
[


(



α
i



β
j


-


β
i



α
j



)



(



α
k



β
l


-


β
k



α
l



)


]

=

2


δ

i

k




δ
jl









𝔼
[


(



α
i



α
j


-


β
i



β
j



)



(



α
k



α
l


-


β
k



β
l



)


]

=

2


δ

i

k




δ
jl









𝔼
[


(



α
i



β
j


-


β
i



α
j



)



(



α
k



α
l


-


β
k



β
l



)


]

=
0




,





(
9.12
)







where it is assumed that i<j and k<l in all cases. The second and the third lines in the above equation capture the correlations in the measurements of T and R, respectively, and the last line captures the cross-correlations between the two measurements.



FIG. 11 illustrates a flow diagram of the computer-implemented method 1100 for determining a two-qubit correlated dephasing error for the quantum system 100, such as the quantum system of FIG. 1. It is contemplated that the performance of the order of the steps of FIG. 11 may be changed. The steps of method 1100 may be performed by a processor.


Initially, at step 1102, a signal of a quantum system 100 is accessed. The signal includes a plurality of qubits 102 (FIG. 1). Every qubit has a nonzero dephasing rate. The signal includes information regarding a matrix (e.g., matrix C 400). The matrix includes diagonal elements and off-diagonal elements (FIG. 6). The off-diagonal elements of the matrix are 2s-sparse.


Next, at step 1104, randomized measurements of the off-diagonal elements of the matrix are performed. The randomized measurements may include preparing entangled GHZ states on random subsets of the plurality of qubits 102. The randomized measurements may be based on noise spectroscopy and quantum sensing.


Next, at step 1106, the matrix is recovered based on a direct measurement of the diagonal elements of the matrix. The recovered matrix may include a restricted isometry property.


In aspects, a two-qubit correlated dephasing error may be estimated based on the recovered matrix. In another aspect, a long-range correlated dephasing error may be detected based on the recovered matrix.


In aspects, a decay rate may be collected in a vector. The recovered matrix may be further based on the estimated decay rate. A relaxation time and/or decoherence time of the quantum system 100 may be estimated based on the estimated decay rate.


Certain embodiments of the present disclosure may include some, all, or none of the above advantages and/or one or more other advantages readily apparent to those skilled in the art from the drawings, descriptions, and claims included herein. Moreover, while specific advantages have been enumerated above, the various embodiments of the present disclosure may include all, some, or none of the enumerated advantages and/or other advantages not specifically enumerated above.


The embodiments disclosed herein are examples of the disclosure and may be embodied in various forms. For instance, although certain embodiments herein are described as separate embodiments, each of the embodiments herein may be combined with one or more of the other embodiments herein. Specific structural and functional details disclosed herein are not to be interpreted as limiting, but as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present disclosure in virtually any appropriately detailed structure. Like reference numerals may refer to similar or identical elements throughout the description of the figures.


The phrases “in an embodiment,” “in embodiments,” “in various embodiments,” “in some embodiments,” or “in other embodiments” may each refer to one or more of the same or different example embodiments provided in the present disclosure. A phrase in the form “A or B” means “(A), (B), or (A and B).” A phrase in the form “at least one of A, B, or C” means “(A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C).”


It should be understood that the foregoing description is only illustrative of the present disclosure. Various alternatives and modifications can be devised by those skilled in the art without departing from the disclosure. Accordingly, the present disclosure is intended to embrace all such alternatives, modifications, and variances. The embodiments described with reference to the attached drawing figures are presented only to demonstrate certain examples of the disclosure. Other elements, steps, methods, and techniques that are insubstantially different from those described above and/or in the appended claims are also intended to be within the scope of the disclosure.

Claims
  • 1. A computer-implemented method for detecting a two-qubit correlated dephasing error, the method comprising: accessing a signal of a quantum system, the quantum system includes a plurality of qubits, wherein every qubit has a nonzero rate of dephasing, wherein some qubits have a nonzero rate of correlated dephasing, wherein the signal includes information about a matrix, the matrix including diagonal elements and off-diagonal elements, and wherein the off-diagonal elements of the matrix are 2s-sparse;performing randomized measurements of the off-diagonal elements of the matrix; andrecovering the matrix based on a direct measurement of the diagonal elements of the matrix.
  • 2. The computer-implemented method of claim 1, further comprising estimating a two-qubit correlated dephasing error based on the recovered matrix.
  • 3. The computer-implemented method of claim 1, wherein performing the randomized measurements includes preparing entangled Greenberger-Horne-Zeilinger states (GHZ states) on random subsets of the plurality of qubits.
  • 4. The computer-implemented method of claim 1, wherein the randomized measurements are based on noise spectroscopy and quantum sensing.
  • 5. The computer-implemented method of claim 1, wherein the recovered matrix includes a restricted isometry property.
  • 6. The computer-implemented method of claim 1, further comprising estimating a vector of decay rates that are dependent on the matrix.
  • 7. The computer-implemented method of claim 6, wherein the recovered matrix is further based on the estimated decay rate.
  • 8. The computer-implemented method of claim 6, further comprising estimating a relaxation time of the quantum system based on the estimated vector of decay rates.
  • 9. The computer-implemented method of claim 6, further comprising estimating a decoherence time of the quantum system based on the estimated vector of decay rates.
  • 10. The computer-implemented method of claim 1, further comprising detecting a long-range correlated dephasing error based on the recovered matrix.
  • 11. A computer-implemented method for detecting a two-qubit correlated dephasing error, the method comprising: accessing a signal of a quantum system, the quantum system includes a plurality of qubits, wherein every qubit of the plurality of qubits has a nonzero rate of dephasing and wherein some qubits have a nonzero rate of correlated dephasing;dephasing entangled states of the plurality of qubits based on performing Ramsey spectroscopy using entangled states of random subsets of qubits of the plurality of qubits;measuring a linear function of a correlation matrix, where the correlation matrix corresponds to correlated Markovian dephasing between pairs of qubits of the plurality of qubits;generating a first vector and a second vector, wherein a plurality of elements of the first vector and of the second vector are randomly chosen; andestimating a decay rate based on the first vector and the second vector.
  • 12. The computer-implemented method of claim 11, further comprising generating a restricted isometry property-based recovery matrix, wherein the recovery matrix is further based on the estimated decay rate.
  • 13. The computer-implemented method of claim 12, further comprising detecting a long-range correlated dephasing error based on the recovery matrix.
  • 14. The computer-implemented method of claim 11, further comprising estimating a relaxation time of the quantum system based on the estimated decay rate.
  • 15. The computer-implemented method of claim 11, further comprising estimating a decoherence time of the quantum system based on the estimated decay rate.
  • 16. A system for detecting a two-qubit correlated dephasing error, the system comprising: a processor; anda memory, including instructions stored thereon, which when executed by the processor, cause the system to: access a signal of a quantum system, the signal includes a plurality of qubits, wherein every qubit has a nonzero dephasing rate; andgenerate a matrix C in Rn×n, wherein its off-diagonal part is 2s sparse and wherein the matrix C is based on the accessed signal.
  • 17. The system of claim 16, wherein the instructions, when executed by the processor, further cause the system to: perform randomized measurements based on: estimating (b−a)TC(b−a) for any a≠b in {0,1}n;choosing a sequence of random vectors r(1), . . . , r(m)in {1,0,−1}n; andestimating ΦC=(r(1){circumflex over ( )}TCr(1)), . . . , (r(m){circumflex over ( )}TCr(m), where Φ:Rn×n→Rm.
  • 18. The system of claim 17, wherein the instructions, when executed by the processor, further cause the system to: estimate a decay rate Γab based on any a≠b in {0,1}n; andestimate a decoherence time of the quantum system based on the decay rate Γab.
  • 19. The system of claim 18, wherein the instructions, when executed by the processor, further cause the system to: recover Matrix C based on: measuring a plurality of diagonal elements of matrix C directly; anddetermining an off-diagonal element of matrix C based on using 1-regularized least-squares regression.
  • 20. The system of claim 19, wherein the instructions, when executed by the processor, further cause the system to: detect a long-range correlated dephasing error based on the recovered matrix C.
CROSS-REFERENCE TO RELATED APPLICATION/CLAIM OF PRIORITY

This application claims the benefit of, and priority to, U.S. Provisional Patent Application No. 63/188,373, filed on May 13, 2021, the entire contents of which are hereby incorporated herein by reference.

GOVERNMENT SUPPORT

This invention was made with government support under W911NF1510397 awarded by the Army Research Office (ARO) and under FA95501810161 awarded by the Air Force Office of Scientific Research (AFOSR). The government has certain rights in the invention.

Provisional Applications (1)
Number Date Country
63188373 May 2021 US