The Cost of Improving the Precision of the Variational Quantum Eigensolver for Quantum Chemistry

New approaches into computational quantum chemistry can be developed through the use of quantum computing. While universal, fault-tolerant quantum computers are still not available, and we want to utilize today’s noisy quantum processors. One of their flagship applications is the variational quantum eigensolver (VQE)—an algorithm for calculating the minimum energy of a physical Hamiltonian. In this study, we investigate how various types of errors affect the VQE and how to efficiently use the available resources to produce precise computational results. We utilize a simulator of a noisy quantum device, an exact statevector simulator, and physical quantum hardware to study the VQE algorithm for molecular hydrogen. We find that the optimal method of running the hybrid classical-quantum optimization is to: (i) allow some noise in intermediate energy evaluations, using fewer shots per step and fewer optimization iterations, but ensure a high final readout precision; (ii) emphasize efficient problem encoding and ansatz parametrization; and (iii) run all experiments within a short time-frame, avoiding parameter drift with time. Nevertheless, current publicly available quantum resources are still very noisy and scarce/expensive, and even when using them efficiently, it is quite difficult to perform trustworthy calculations of molecular energies.


Introduction
Computer simulations of quantum systems constitute a crucial tool for a deeper understanding of behavior and properties of matter at the atomic scale. However, when investigating the electronic structure of larger molecules, we quickly encounter the limits of classical computers. The space and time requirements for describing the states, and even more so for performing optimizations on them, grow prohibitively. Thus, we resort to many clever types of approximations [1][2][3][4][5][6][7][8][9]. Oftentimes, though, they are not good enough, or start to scale badly.
It is natural to imagine that using one quantum mechanical system to simulate another could be a more efficient approach. The concept of a quantum computer, based on the laws of quantum physics, was first proposed by Richard Feynman [10]. He envisioned that the way to deal with the exponential amount of information appropriate to study physical systems was to use quantum systems as computers themselves. We have come a long way since then, and today we have access to the first small, programmable quantum chips, together with 25 years of development of quantum algorithms [11][12][13] and protocols whereĤ denotes the Hamiltonian operator, |ψ is the system wave function and E represents the energy of the system. Finding the ground state energy of the studied system is a difficult optimization problem. However, we now know several promising approaches for applications in quantum chemistry [21], relying on natural quantum encodings of the problems, and algorithms utilizing superpositions and entanglement, resulting in efficient search and energy evaluation. One way of approximately obtaining the minimum eigenvalue of a Hamiltonian is to start with an initial guess and iteratively search for optimal parameters. One such popular approach aimed at solving the many-particle problem is the hybrid classicalquantum variational quantum eigensolver (VQE) algorithm proposed by Peruzzo et al. [22]. It is an application of the time-independent variational principle that benefits from the cooperation of classical and quantum computers, and is suitable for use in near-term, imperfect quantum devices. For example, in [22], the authors used a small photonic quantum computer to calculate the He-H + system's ground-state energies for various atomic distances. Next, O'Malley et al. [23] studied the properties of the H 2 molecule using Google's digital quantum computer with superconducting qubits. The researchers used two different methods for finding the ground-state energies of H 2 : the phase-estimation algorithm (PEA) and the variational quantum eigensolver (VQE). The former method can in principle obtain the answer with arbitrary precision, but only if there are no errors in the process. As in practice, errors are always present, and so the VQE method works better. However, it also has its pitfalls, which we aim to elucidate.
In this work, we want to understand the possible sources of errors in VQE calculations when implemented on current, publicly available superconducting quantum processors. Our goal is to provide a recipe for the efficient use of the scarce resources, as well as practical guidance for the starting practitioner, with the aim to run quantum chemistry computations on current devices.
If our (superconducting) qubits were a closed system, the dynamical evolution of their state would be fully determined by their initial state and Hamiltonian. In reality, the system is partially open, the qubits interact with their environment, their internal interactions are not perfect, and so the overall state is not deterministic. A generalized theory of the interactions between a quantum state and its environment was derived by Bloch [24], manifesting as extra degrees of freedom we cannot control. These affect the state of the system and result in a loss of coherence [25]. Noise comes from controlling the qubits (e.g., via pulses applied to the qubits) or reading out the final states of the qubits (there is a finite probability that the recorded classical bit value will be flipped from the true outcome of a measurement). We also face errors from random fluctuations of parameters coupled to our qubits, such as (i) thermal voltage and current fluctuations in control lines or (ii) randomly fluctuating electric and magnetic fields in the local qubit environment.
Minimizing various sources of noise and errors is a very complex and demanding task involving materials science, fabrication engineering, electronics design, cryogenic engineering, and qubit design. Of course, our aim is fault-tolerant quantum processing, which is currently out of reach due to the precision and overhead requirements [26]. Meanwhile, we often employ methods for dealing with noisy data. There are many different approaches to correcting the noisy results (see Refs. [27][28][29][30][31][32][33][34][35][36] and references therein), based on different techniques of noise characterization.
Finally, even hypothetical noiseless quantum processors are prone to stochastic noise. Our calculations end with estimating the values of Pauli observables in the final state. This can only be achieved via sampling the outcome probabilities, with a limited number of shots. Moreover, stochastic noise comes in also through the heuristic nature of the classical optimizer part of the VQE algorithm. The good news is that we can investigate and control this noise, thanks to our selection of optimizer, and the parameters of the experimental runs. We will now attempt to characterize in detail how each source of noise influences the convergence of the VQE algorithm. We will then look at how to efficiently distribute our resources to achieve the desired precision of the calculations, if it is possible at all.

Methods
There are several preparatory steps before studying molecules on real quantum devices. We start with the molecular Hamiltonian in the second quantized form [37]. The fermionic Hamiltonian with one-and two-electron terms (h ij and h ijkl ) is given bŷ where a † i and a j are fermionic creation and annihilation operators. We then need to map this fermionic Hamiltonian into qubit operators represented in the Pauli operator basis [38]. This can be achieved in multiple ways, e.g., with the Bravyi-Kitaev [39] or Jordan-Wigner [40] transformations. Here, we chose the Bravyi-Kitaev transformation [39] of the hydrogen molecule Hamiltonian, produced using the publicly available qiskit package [41]. The code for this and all of our following calculations are located in [42]. We also invite the reader to read the introductory VQE tutorial [43].
Choosing the STO-3G basis set, and the distance between hydrogen atoms set to 0.725 Å, not taking into account the Coulomb repulsiom between nuclei, we arrive at a 4-qubit Hamiltonian The Z and X terms are Pauli operators, and the coefficients c i are integrals calculated using the qiskit.chemistry package [44]: Note that the Hamiltonian in Equation (4) commutes with Z 1 and with Z 3 . Therefore, the Hamiltonian is block-diagonal with 4 blocks, each corresponding to a particular computational basis setting of qubits 1 and 3. This can be used to find a Hamiltonian with the same ground energy expressed in 2-qubit space [44]: with the coefficients c 0 = −1.05016, c 1 = 0.40421, With two equivalent formulations (both Hamiltonians have the same ground-state energy) of the problem in hand, we were able to investigate which form is more amenable to VQE optimization. They gave us a chance to explore various problem sizes, parametrized state ansatzes, energy landscapes, and investigate how noise affects the procedures, as the optimizations will involve different numbers of gates. Note that the choice of the STO-3G basis set here was deliberately chosen to keep the studied Hamiltonian relatively small. However, it is possible to formulate the qubit Hamiltonian of the hydrogen molecule using another, possibly larger, basis set. This naturally leads to a larger problem size; for example, consider the 6-31G basis set, which leads to 8-qubit Hamiltonian with 185 terms.
We could then run the VQE algorithm. Its input is a Hamiltonian expressed in terms of qubits and its aim is to find the eigenvector with the lowest eigenvalue. For this, the VQE iterates these four steps:

1.
Quantum: prepare a parametrized quantum state on a quantum device; 2.
Classical: sum the expectation values of the Hamiltonian terms to estimate the energy of the parametrized state; 4.
Classical: use the energy value to update the parameters of the trial quantum state.
until we met the convergence criteria of the classical optimization method.
To minimize the expectation value of the energy, we chose the simultaneous perturbation stochastic approximation (SPSA) [45][46][47], classical optimization method implemented in qiskit. SPSA is a pseudo-gradient method for optimizing problems with varying numbers of unknown parameters. SPSA starts with the initial vector of parameters θ 0 . In each iteration, the parameter vector is simultaneously shifted twice as where c k is a preassigned positive number, k is the iteration number and ∆ k is a randomly generated vector (from the Bernoulli distribution). To approximate the gradient at θ k , we utilized the gradients at θ + k and θ − k as In each iteration step, we thus need to measure the energies of two quantum states, prepared with parameter settings θ + k and θ − k . We then updated the underlying parameters θ k to where a k is a preassigned positive number and g k ( θ k ) is the approximated gradient (8) that depends on θ + k and θ − k . The optimization procedure runs for a number of iterations controlled by the maxiter parameter (in qiskit's implementation). We then updated the parameter vector θ k , and measured the final value of energy for the underlying optimized parameters. There is a subtlety that influences the total number of function evaluations. Qiskit's implementation of SPSA includes additional initial exploration-a calibration phase that depends on the maximum number of iterations with min{maxiter/5, 25} steps.
There are many possible choices for the quantum circuit that prepares the trial state for simulating the molecules using VQE. We employed the hardware-inspired (easy to implement) R y R z and R y variational forms, accompanied by linear entanglement [48]. The circuits consist of two main layers: (i) a layer of parametrized R y or R y and R z singlequbit rotations applied to each qubit in the quantum register, alternating with (ii) an entanglement-creating layer of CNOT gates. We could easily alternate these two layers, creating circuits of varying depths, increasing the complexity of the ansatz and the number of parameters to be optimized.
The simplest variational circuit of depth d = 1 consists of just the entanglementcreating layer, nested between two parametrized rotation layers. The 4-qubit variational circuits of depth d = 2 can be seen in Figure 1. For q qubits and depth d, the R y R z variational circuit has 2q(d + 1) parameters, single-qubit rotations, while the R y variational circuit has q(d + 1) of them. Our qubit Hamiltonians (3) and (5) contain terms with the Pauli X operators. To estimate their expectation values, we needed to measure them in a non-diagonal basis. As only computational-basis measurements are available on the quantum processors, we needed to utilize basis-switching single-qubit gates (post-rotations). For the Pauli operator X, we thus used a π/2 rotation around the y axis, performed by gate R y (π/2), and subsequently took measurements on a computational (Z-)basis.
Let us note that we do not need to estimate all of the Pauli terms in our Hamiltonians individually. We can save resources and reduce the number of required measurements and state preparations by grouping the Pauli operators that require the same post-rotations in the tensor product basis sets [27].
To access the real quantum devices, we used the publicly available cloud-based quantum computing platform IBM Quantum [49]. We found that access to real quantum processors is still limited for a larger systematic study of different types of errors and noise. Therefore, we heavily relied on classical simulations of quantum processors. The simulations were performed both in an ideal noise-free manner and including noise. Importantly, we used the noise-related parameters of the IBM quantum processors, which are publicly available (and changing) on a daily basis. Thereby, our simulations could more closely mimic the actual gates and computations executed on a real device.
The published noise-related data include an approximate noise model consisting of: (i) single-qubit and two-qubit gate errors-depolarizing errors followed by thermalrelaxation errors; (ii) single-qubit readout errors on all measurements; and (iii) all errors including gate, readout and thermal-relaxation errors. Since calibration data change frequently, in noisy simulation results we used the error model from the same calibration for most of the trials.
Finally, in some tests involving real quantum hardware , we employed a simple errormitigation method. It uses least-squares fitting to obtain the error-mitigated outcome probabilities by using a calibration matrix [50].

Results and Discussion
There are several possible sources of error for the energies calculated by the VQE: first, the statistical errors in intermediate and final energy estimations, caused by the probabilistic nature of quantum mechanics; second, Hamiltonian representation and statepreparation ansatz errors, caused by approximations in the Hamiltonian (restricted basis set), as well as the space of states we search over, given by our parametrized quantum circuit; third, hardware errors present in noisy quantum devices running the quantum part of VQE. We study these using simulators of noisy devices, as well as real quantum processor runs. We will now investigate the influence of these errors, and discuss ways and the costs of mitigating them. We will do this for the H 2 molecular Hamiltonians from the previous section.
Note that there are several factors which limit the number of gates/circuit repetitions we can execute. First, we have only limited access to the quantum processors, which severely restricts the number of shots we can use for each datapoint. Second, we do not have unlimited time either. Third, the gates themselves are noisy, so increasing system sizes or gate numbers will not work without serious error mitigation. Thus, simply increasing the number of repetitions/circuit size is not a solution to these errors, and we must work harder on error mitigation to obtain trustworthy results-molecular energies with chemical precision.

Number of Quantum Computer Calls
While searching for optimal parameters, the VQE optimization requires a quantum subroutine: prepare a candidate state and measure its energy. Even if we imagined having a noiseless quantum processor, its probabilistic nature and impossibility of measuring noncommuting observables simultaneously would force us to rely on averages over several measurements, which carry stochastic noise. We thus start by studying how the number of circuit (gate) evaluations on a quantum processor influences the convergence of the VQE algorithm, on ideal quantum devices.
Our goal is to understand and efficiently fight the influence of stochastic errors, inherent in VQE regardless of the quantum hardware. We want to optimally utilize a limited number of gate executions. For this, we have two options. First, we can increase the number of shots (controlled by the shots parameter in the programs) for each evaluation of readout probabilities. Second, we can increase the maximum number of iterations (the parameter maxite) allowed for the classical optimization subroutine.
Increasing the number of shots, improves the precision of the measurement outcomes for each of the terms in (5) or (3). This makes the energy function more stable and easier for the classical optimization routine to minimize. Meanwhile, increasing the allowed number of iterations for the classical optimizer provides it with a better chance to get out of existing local minima and/or to fine-tune the final output value, but again this comes at the cost of increasing the number of times a quantum computer needs to be accessed. Intuitively, it can be expected that increasing the number of quantum gate executions in either case increases the precision of the output. However, presently, the number of gate executions on a quantum computer is a limiting factor for most users-access to real quantum processors is either restricted or quite costly [51]. Additionally, an excessive number of quantum gate executions significantly impacts the real running time of the algorithm.
In this light, it would be interesting to find the minimum number of quantum gate executions that result in VQE output within chemical precision of the real energy value. In our study, we ran the algorithm for different settings of shots and maxiter parameters, on the two-qubit hydrogen molecule Hamiltonian, using the noiseless quantum simulator.
Each combination of the settings was run 1000 times. We present the experimental results in detail in Table A2 (see Appendix B   We visualized the data using a boxplot. The box shows the middle two quartiles of the data (interquartile range, IQR), with a marked median. Outliers are determined using Tukey fences [52]: a distance of 1.5 times the IQR in each direction, as in Figure 2a, or the extremes of our data, as in Figure 2b, where all of our computed energies fall above the red line, close to the box.
From Figure 2a, we can see that with an ideal quantum computer and unrestricted iterations of the SPSA algorithm, VQE performs increasingly better with the increase in the shots parameter. The limit of infinitely many shots is simulated using a statevector simulator, which calculates the outcome probabilities directly from the laws of quantum mechanics. It is important to note that when using a finite number of shots, the algorithm can also output energy values lower than the actual minimum energy. Thanks to stochastic noise, the estimates of outcome probabilities for the Pauli terms in (5), calculated from a limited number of shots, can be non-physical. To obtain realistic final optimized energies, one should thus invest in a precise final energy readout for the optimized state, using a larger number of shots.
This also raises another interesting question: can the inaccuracy in outcome probabilities be overcome by the SPSA algorithm? In other words, are the states discovered by SPSA actually close to the minimum eigenvector of the Hamiltonian, and is most of the energy spread seen in the results caused by the small number of shots? We recalculated the energies of the discovered states in Figure 2a, using the statevector simulator. The results (see Figure 2b) somewhat surprisingly show that indeed, most of the discovered states have real energies within the chemical accuracy.
In the context of trying to minimize quantum computer calls, this suggests that the VQE algorithm can be run with a small number of shots during the run of the classical optimization function and after the optimization produces the final result, the energy of the candidate state with minimum energy needs to be recalculated once more, using a much larger number of shots. However, for this to work, we have to use a classical optimizer which can deal well with noisy data. We could use Bayesian methods [53,54], which could be quite slow, with the need to invert large matrices to guess the next points for a search. Alternatively, we could use other optimization methods such as NEWUOA [55], which can be adapted to work on noisy data. Note that, in the initial stages of this work, we have used the COBYLA optimization method [56], and concluded that very large numbers of shots were necessary for the method to converge at all. As described in the Section 2, in the end we decided on the use of SPSA, thanks to its natural noise resilience, as it explores random points around the current parameter vectors.
On the other hand, at least for our selected Hamiltonian, increasing the maxiter parameter and waiting longer for convergence seems to have diminishing returns after surpassing a value of approximately maxiter = 100. We confirm this in detail in Figure 3. There, we simulate 1000 VQE calculations for various combinations of maxiter/shots settings. For the same shots parameter value, increasing the maxiter parameter only decreases the number of outliers, but the spread of the data remains almost unchanged. Both of these observations suggest that only a small fraction of runs can benefit from the increase in the maxiter parameter beyond 100.
To evaluate the energies more precisely after the last step, we recalculated energies using state vector simulator for shots parameter set to 512 and 1024, with maxiter settings 50, 75 and 100. Results of this calculation can be found in Figure 4 and they suggest that the combination of shots = 1024 and maxiter = 75 already provides the median energy in the chemical accuracy for the 2-qubit H 2 Hamiltonian.

Choice of Hamiltonian and State-Preparation Ansatz
In this section, we study the influence of the size and form of the quantum circuit used in the VQE calculation. The most straightforward way to reduce the complexity of the VQE algorithm is to find the smallest possible qubit representation of the studied Hamiltonian. Indeed, much effort today is focused on efficient encodings of fermionic Hamiltonians (see e.g., [57,58] and references therein), as getting rid of extra qubits (and operations on them) decreases noise and can avoid space limitations. To highlight the importance of finding minimal representations, we have chosen two equivalent variants of H 2 molecule Hamiltonian, one using 2 and another one using 4 qubits (see Section 2).
Once we have chosen a Hamiltonian and its qubit implementation, VQE needs a choice of the variational circuit for state-preparation. We have a vast choice of parametrized gates, arrayed in multiple rounds. This results in varying numbers of classical parameters to be optimized, as well as different reachable quantum states (ansatz quality), with varying amounts of entanglement [59][60][61]. Here, we focus on two types of variational circuits-the R y and the R y R z linear forms with a varying number of rounds, shown in Figure 1 for 4 qubits (the 2-qubit circuits are obvious simplifications).
Note that all the results in this subsection were obtained using a noiseless quantum simulation to highlight the influence of the Hamiltonian/ansatz choice in the ideal case. Of course, once we consider noise and statistical readout errors, the Hamiltonian/circuit choice translates to more errors, thanks to wider circuits with a larger number of gates.
In Figure 5, we depict our results for the convergence of the VQE algorithm, depending on the Hamiltonian choice (2-or 4-qubit), and the ansatz complexity (type and number of rounds).
The two-qubit Hamiltonian (5) turns out to be simple enough that the depth of the variational ansatz (number of layers) does not significantly influence the final optimized energies. Moreover, we can see that the simpler R y ansatz outperforms R y R z ansatz.
This confirms the intuition that the simplest ansatz containing the solution will also have the best performance, thanks to solution space having less parameters to optimize over. Additionally in noisy regime, simpler ansatz requires less gates for implementation, leading to further advantage.
On the other hand, for the 4-qubit Hamiltonian (3), both R y and R y R z variational forms with 1 layer are not expressive enough and the search space does not contain the state with the lowest energy. Using two layers already alleviates this problem. Again, we see that the R y form performs better. Using a 12-parameter, two-level R y ansatz achieves better convergence than the 24-parameter, two-level R y R z ansatz. Moreover, if we allow the R y form with 24 parameters as well, the circuit has depth 5 and outperforms the R y R z ansatz even more significantly. This can be partially explained by the fact that for the same number of search space parameters, the R y ansatz can achieve more intricately entangled states, due to the increased number of entanglement layers. These results only underline the importance of an efficient choice of Hamiltonian encoding and ansatz parametrization. More qubits mean the need for more intricate ansatzes, while introducing more statistical errors when estimating the energies of individual terms, and thus affecting the overall readout precision, even in an ideal (noiseless) case. Further, more complex Hamiltonians can cause the optimization algorithm to converge to a local minimum, thus causing a systematic error. This is in more detail treated in Appendix A.

Imperfect Quantum Devices
So far, we studied the convergence of the VQE algorithm assuming a perfect quantum computer available for the quantum subroutines. This analysis was possible thanks to relative simplicity of classically simulating the few-qubit quantum circuits used during our VQE algorithm runs. We wish to eventually scale up the calculations, and obtain results from the hybrid classical-quantum VQE which are not available classically. For this, we will have to rely on real quantum devices. Of course, current quantum processors are far from perfect, so the calculations will be inherently noisy, influencing also the convergence of VQE algorithm.
Let us group the possible errors into two basic types-gate errors and readout errors. This is a natural divide, since the number of measurement (readout) errors scales only with the number of qubits and can be mitigated with classical techniques. Meanwhile, the gate errors inevitably scale with both depth of the computational ansatz and the number of qubits, unless we use error correction/mitigation techniques.
In Figures 6 and 7, we show the convergence of the VQE using simulations of noisy quantum processors. To obtain realistic parameters for the simulation, we used the error characterization of the real IBM devices, obtained directly from qiskit.  Our first observation is that the quality of calibration of recent experimental quantum processors changes frequently. Not only that, it significantly influences the convergence, as we demonstrate in Figure 6. There, we visualize the convergence of 1000 trials with the same settings, using calibration data from different dates, showing results differing by 10%.
Second, we observe that for these small circuits, the effect of measurement error is more significant than the gate errors. As we study this further in Figure 7 for the 4-qubit Hamiltonian (3), we find that readout errors still affect the convergence more, but the effect is less pronounced. In Figure 7a we use the 2-qubit, 1-round R y variational form, which requires 4 single-qubit gates + 1 CNOT, and up to 2 Pauli operators for readout. For Figure 7b, we recall results presented in Figure 5, and choose the 4-qubit, 2-round R y variational form, which requires 12 single-qubit gates + 6 CNOTs, and up to 2 Pauli operators for readout. The number of gates (circuit width × depth), and thus gate errors, grows faster than the number of estimation terms, which still scales with the circuit width. Thus, we expect that the effect of readout error will become less prominent for larger Hamiltonians compared to the gate errors. Moreover, we can also implement various readout error mitigating strategies-e.g., avoiding systematic bias by using bit flips before the final measurements.

Convergence of VQE on Real Quantum Hardware
Finally, in this subsection, we look at how the convergence predicted by the noisy simulator corresponds to convergence on real quantum hardware. For this, we performed an experimental runs on the ibm_lagos quantum processor, with maxiter = 75 and shots = 1024, and compared it to noisy simulation results with the same parameter settings. To mitigate the readout errors we performed a simple mitigation routine described at the beginning of this Section. In Figure 8, we observe that in fact the predictions agree fairly well with practice. Further, we tested our hypothesis that the discovered states are very close to optimal ones, but their energies have to be evaluated more precisely. To this end we calculated their energies using 40,000 shots on the ibm_lagos quantum processor (40,000 shots result in ≈0.5% error in energy calculations) as well as using a statevector simulator. In Figure 8, we can see that both of these techniques decrease the spread of the data points. However, only the statevector simulator consistently evaluates energies within the chemical accuracy from the exact ground-state value. This only shows that evaluating the energies using a large number of shots still suffers from hardware errors, which are not present when using the statevector simulator. We conclude that a simple VQE implementation on real quantum hardware does not consistently find the minimum energy within the chemical accuracy. Nevertheless, using both error-mitigation and precise energy readouts, one can obtain a trustworthy result, taking the lowest of the precisely evaluated energies. Figure 8. A comparison of results obtained with a noisy quantum simulator with error mitigation (orange points and box plot), with real runs of the ibm_lagos quantum processor (blue points). The energies of real runs were subsequently recalculated with greater precission using 40,000 shots of the ibm_lagos quantum processor (grey points). Finally, we recalculate their energies also with a statevector simulator (red diamonds). The red line represents the physical ground-state energy (−1.86712 Hartree) and the light-red background represents the chemical accuracy (±0.0015 Hartree).

Conclusions
In this work, we provided quantum chemistry calculations with the variational quantum eigensolver on real quantum hardware. We analyzed the influence of different types of errors on the convergence of VQE. We divided them into several categories and studied them separately to understand their impact and to find ways of avoiding them, while using our resources efficiently. For this, we used noiseless and noisy classical simulations, as well as publicly available superconducting quantum processors.
The errors resulting from the probabilistic nature of quantum mechanics significantly differ from the hardware errors. While the statistical error increases the spread of VQE outcomes around the optimum, gate and readout errors not only influence the spread of the outcomes, but often prevent the VQE from finding the optimum altogether.
If we could use noiseless hardware, the stochastic errors could be mitigated by increasing the number of shots, and obtaining precise energy estimates. This turns out not to be that important in the intermediate steps of iterative algorithm, but crucial at the end. Even with smaller numbers of shots per iteration, our optimization was able to find near-optimal states. However, to rule out unphysical results (energies below the actual minimum energy), and also to avoid overestimating the energy, we must invest shots into the final, precise readout. We learned that this approach can significantly reduce the overall cost (gate evaluations) for VQE, which is important, while access to real devices (and large runs) is limited, as well as time consuming.
Second, it is crucial to search for efficient encodings and good trial-state ansatzes (hardware, and problem-motivated). It saves on both gate evaluations, as well as on optimization complexity, reducing the number of required parameters to search over. We have demonstrated this in the comparison of the 4-qubit and 2-qubit Hamiltonians, and two types of ansatzes, with varying depths/numbers of parameters, for the same molecular hydrogen problem.
Finally, real devices' parametrization changes with time, and it is crucial to choose to run your experiments at times when they are calibrated well. Moreover, the current levels of noise are prohibitive, and add up so that the final results do not straightforwardly reach the desired chemical precision. To avoid this, error mitigation must be used, for the readouts (which contribute a surprising amount of error) as well as inside of the circuit.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript:

VQE
Variational Quantum Eigensolver STO-3G Slater-type Orbital basis set with each orbital expanded into 3 Gaussian functions.

6-31G
A double-zeta basis set, which uses six primitive Gaussians to describe each core atomic orbital. Further, each valence orbital is described by two basis functions. The first one is composed of a linear combination of 3 primitive Gaussian functions and the other one is composed of a single primitive Gaussian function.

Appendix A. Convergence to Local Minima
In this Appendix, we go beyond energy minimization and ask: are we finding the correct (ground) state? Our analysis above is primarily focused on identifying the impact of different types of noise and errors that are currently commonly occurring in physical realizations of superconducting quantum processors. We have purposely chosen the H 2 molecule as a material system of our case study because the ground state of H 2 is well known. In fact, it is a frequent textbook example in quantum chemistry and, therefore, it is rather straightforward to show the impact of noise. Importantly, there are two implicit underlying assumptions of our analysis. First, we expect the minimization procedure within our methodology to aim solely at the ground state. Second, we assume that all the obtained values, covering a wide energy range, are indeed ground-state-related solutions that differ from the known exact ground-state energy only due to the noise. These two aspects are mutually related and the validity of our assumptions sensitively depends on both the minimizing method and the studied Hamiltonian. Had the noise-related range of energies been wider than the energy separation to the next higher-energy eigenvalue(s), and had this excited-state eigenvalue been found by the minimization procedure, yet another "source of errors" would have affected our analysis. This error type would be methodological, rather than noise related. We address this issue below and show that the analysis focused on only the eigenvalues, i.e., energies, is insufficient. Note that while estimating the final energy, we performed many measurements, and thus obtained a list of (significant) probabilities. This is not a full tomography, which would be prohibitive for large-scale systems. We only estimate the expectation values of Pauli operators present in the Hamiltonian. In our case, we estimate probabilities for computational basis states to estimate expectation values of Pauli Z operators, while estimating Pauli X operators on two of the qubits forces us to carry out measurements on a rotated basis. We will now study how to utilize the resulting vectors of probabilities to distinguish different classes of the states produced by VQE. Figure A1 shows our results for the 4-qubit Hamiltonian (3) in an energy range of (−1.9, −1.1) Hartree. We can notice a set of clearly outlying values above the energy of −1.3 Hartree. They can be identified in a very straightforward manner because they are separated from the main set of ground-state-related energies by quite a wide energy range that is free of other data points.
To discern what these outlying energies represent we next evaluate higher-energy eigenvalues of the studied qubit Hamiltonians of H 2 . The eigenvalues of the 4-qubit Hamiltonian of H 2 molecule, as determined using classical techniques, are as follows: Their inspection reveals that while the ground-state energy of H 2 is very well separated from the other higher-energy eigenvalues, there are several eigenvalues in the energy range of (−1.3, −1.1) Hartree, albeit some of them degenerated. A challenge that we face is to tell (i) those cases that are noise-related to the ground-state energy from (ii) those that are (again noise-related) to the other eigenvalues. In what follows we propose to distinguish between sets of values corresponding to different eigenvalues by applying a suitable similarity index to the measured probabilities of different basis states. Before we start discussing similarity measures, it is worth noting a few facts related to the measured probabilities of basis states. They are sets of non-negative real numbers smaller or equal to one. The number of elements in each set is equal to n = 2 q , where q is the number of qubits, and the sum of all elements in each set of probabilities equals to one. As discussed above, the measured sets of probabilities are multiplied by values of integrals within the qubit Hamiltonian to evaluate the energies.
Our analysis of similarities of vectors of measured probabilities below includes two different measures. The first one is the Jaccard-Tanimoto (J-T) index, also known as the Jaccard similarity coefficient. It is used for gauging the similarity and diversity of sample sets. It was developed by Paul Jaccard [62] and independently formulated again by T. Tanimoto [63]. The Jaccard-Tanimoto index of two sets X and Y is defined in general as the ratio of intersection of the two sets over their union J − T(X, Y) = |X ∩ Y|/|X ∪ Y|. For two vectors {x i }, {y i } with all components non-negative (x i ≥ 0, y i ≥ 0) and the same length (i = 1, ..., n) it is evaluated as The maximum similarity is characterized by the J-T index equal to one, while very low similarity by the J-T index equal to zero. The very many applications of J-T index include, e.g., various forms of image analysis or identification of words that are mistyped by the users of numerous computer and cell-phone/smart-phone applications.
The second measure is the scalar (inner) product of the vectors representing the set of measured probabilities. It is worth noting that while the vectors of measured probabilities have the sum of components equal to one, their length in a vector sense is in general not equal to one. The length of such a vector is equal to one only when one of the basis states has the probability equal to one and all others have the probability equal to zero. It is also the maximum length in the vector sense. For any other distribution of probabilities the length is lower than one. The other extreme case with the shortest length of the vector has all n basis states with the same probability, equal to 1/n length, and the length of this vector is 1/ √ n. Therefore, all the probability vectors are normalized before we evaluate their scalar product in our analysis.
As individual additive parts of the Hamiltonian can be divided into two groups with each requiring a different topology of the quantum circuit, we have evaluated similarities as obtained for two sets of 200 vectors of measured probabilities that we below call circuit 0 (all qubits measured in the computational basis) and circuit 1 (qubits 1 and 3 measured in the computational basis and qubits 0 and 2 measured in the X basis). Figure A2 shows the values of both the Jaccard-Tanimoto (J-T) similarity index ( Figure A2a,b) and scalar product ( Figure A2c,d) for both the circuit 0 ( Figure A2a,c) and circuit 1 ( Figure A2b,d).   parts (a,b)) and scalar product (parts (c,d)) for both the circuit 0 (parts (a,c)) and circuit 1 (parts (b,d)). Figure A2 shows that the averaged similarity values are decreasing with increasing energy from its ground-state value. Without noise, the probability vector of circuit 0 has the |1100 basis state with probability close to 1 and the other basis states have probability close to 0. From the 200 calculations, 188 of them, i.e., 94% , exhibit variations of this scenario and these results cover energies up to −1.7 Hartree. The J-T similarity index values (see Figure A2a) show a weakly decreasing trend with increasing energy. In contrast, the averaged scalar product of all of these 188 probability vectors has nearly identical value (see Figure A2c). As another extreme, about 5% of vectors of probabilities have very low averaged J-T similarity value close to zero and the energies from the range between −1.3 and −1.2 Hartree. These we, therefore, identify as excited states because their vectors of probabilities are very different from the case when the |1110 basis state has the probability close to one. The remaining 1% cases plotted in Figure A2a,c are characterized by energies and similarity index values that are in between the region close to the ground state on one hand and that of excited states on the other hand. We believe that these are essentially erroneous calculations in which the SPSA algorithm failed to converge to a local minimum.
The above mentioned percentages (94% states similar to the ground state, 5% similar to the excited states and 1% erroneous states) explain the actual value of the averaged similarity index plotted in Figure A2. As 94% vectors of probability vectors are related to the ground state, they are mutually similar and contribute with values close to one. Their averaged similarity index is lower than one because it is reduced by their medium similarity to the erroneous states (1% of contributions into the sum) and nearly zero similarity value with the excited states (another 5% of contributions). Complementarily, the excited states have nearly zero similarity index with 94% of all probability vectors as well as erroneous states and, therefore, their averaged similarity index is close to zero. Quite a similar situation is that of the intermediate values of the erroneous states that are similar to both the ground-state related states as well as the excited states.
The situation is more complicated in the case of circuit 1 (see Figure A2b,d) when the measurements of the ground-state result in outcomes (|0100 , |0110 , |1100 and |1110 ) with rather similar probabilities (they differ by about 1/16). The J-T similarity index performs quite similarly (see Figure A2b) as in the case of circuit 0 (see Figure A2a) but some of the excited states show significant non-zero values (up to 0.4). This trend is a lot more pronounced in the case of scalar product (see Figure A2d) when the values can reach as high as 0.74.
While future studies focused on similarity analysis in other systems are critically needed, a few general conclusions can be drawn. First, the similarity analysis of vectors of measured probabilities can be very fruitful when assessing the computed energies. The similarities seem to be able to tell cases that are just noisy-variants of the ground state from those that are (i) related to other eigenvalues of higher-energy excited states or even (ii) plainly erroneous cases. Second, the two tested similarity measures, the Jaccard-Tanimoto (J-T) similarity index and the scalar product, perform much better when the ground-state and/or excited states are associated with the situation when one of the basis states has probability close to 1 while the others have it equal to 0. It would be thus advantageous to develop methods associating the eigenvalues with these cases. Third, while the similarity index based on the scalar product identifies states related to the ground state by noise a lot more clearly than the J-T index, it tends to overrate the similarity of the excited and plain erroneous states. Therefore, the J-T similarity index seems to better capture continuously varying levels of similarity of different states. Fourth, we have analyzed averaged similarity measures based on the similarity between each vector of probabilities with respect to all other such vectors in our data set. Our analysis does not profit from the fact that we can estimate the vector of probabilities related to the ground state (it has the lowest energy) and determine the similarity with respect to this particular state. As the VQE can be applied to determine also the excited states, the same procedure with determination of the similarity analysis can be applied also to the excited states and the their noise-related states can be more identified. Lastly, when an eigenvalue is associated with the probability vector where one of the basis states has the probability equal to one and others zero, all other vectors that are noise-related to it (and that have probabilities redistributed elsewhere due to the nose and errors), can be used for determination for effective error rates and reversely applied for correcting for these errors.

Appendix B. Data Tables
Let us now present the data behind the figures in the main text. First, in Table A1, we list the median ground-state energies and the percentages of data falling into chemical precision range (with and without precise recalculation of energies), for various settings of the number of shots and maxiter = 1000. This is followed by Tables A2 and A3 in which we elaborete in more detail on energy values obtainable by various shots and maxiter settings. Then, in Table A4, we list the results underlying Figure 5, comparing state preparation ansatzes for 2-and 4-qubit computations. Further, in Table A5, we list the results highlighting the variability of actual hardware when performing runs on different dates, plotted in Figure 7. Finally, in Table A6 we show energies underlying the real runs presented in Figure 8. Table A1. Summary of the medians of 1000 optimized ground-state energies and the percentage of outcomes that were found within chemical precision of exact energy (−1.86712 ± 0.0015 Ha) for different settings of shots parameter and maxiter = 1000. Additionally, in the middle two columns, the final energy is the direct outcome of the SPSA algorithm and the last two columns show the recalculation using the statevector simulator. The R y variational ansatz was used. The error margin of 3 percent was estimated based on the standard deviation of 1000 identically distributed trials. These data are the basis for Figure 2.

Simulator
Shots  Table A2. Summary of the medians of 1000 optimized ground-state energies and the percentage of outcomes that were found within chemical precision of exact energy (−1.86712 ± 0.0015 Ha) for different settings for the maximum number of iterations (maxiter) and number of shots. The R y variational ansatz was used. The error margin of 3 percent was estimated based on the standard deviation of 1000 identically distributed trials. These data are the basis for Figure 3.  3  Table A3. Summary of the medians of 1000 optimized ground-state energies and the percentage of outcomes that were found within chemical precision of exact energy (−1.86712 ± 0.0015 Ha) for different settings for the maximum number of iterations (maxiter) and number of shots. The R y variational ansatz was used. An error margin of 3 percent was estimated based on the standard deviation of 1000 identically distributed trials. The rightmost two columns are the recalculation of obtained energies using the statevector simulator. These data are the basis for Figure 4.  Table A4. A comparison of various settings of quantum circuits for 2-qubit and 4-qubit Hamiltonians of H 2 . The R y R z and R y variational forms accompanied by linear entanglement with different depths of circuit and numbers of parameters used. In the case of the 2-qubit Hamiltonian, we used an unrestricted maximum number of iterations of the SPSA optimizer, and in the case of 4-qubit Hamiltonian, the maxiter was set to 400. Columns labeled % represent the percentage of the runs that ended within chemical accuracy (−1.86712 ± 0.0015 Ha) of the optimum. The error margin of 3 percent was estimated based on the standard deviation of 1000 identically distributed trials. These data are the basis for Figure 5.  Table A5. A comparison of results from different dates and different noise models built from a real quantum processor, ibmq_santiago. The number of shots was set to 4096. The calculations were performed using both the R y R z and R y variational forms. The classical optimization method SPSA was used, for 2-qubit system the maxiter was unrestricted and for 4-qubit system, the maxiter was set to 400. The last column represents the percentage of the runs that ended within chemical accuracy (−1.86712 ± 0.0015 Ha) of the optimum. The error margin of 3 percent was estimated based on the standard deviation of 1000 identically distributed trials. These data are the basis for Figures 6 and 7.