Hybrid Quantum-Classical Neural Network for Calculating Ground State Energies of Molecules

We present a hybrid quantum-classical neural network that can be trained to perform electronic structure calculations and generate potential energy curves of simple molecules. The method combines parameterized quantum circuits with measurements. With unsupervised training, the neural network can generate electronic potential energy curves based on training at certain bond lengths. To demonstrate the proposed method, we present the results of using the quantum-classical hybrid neural network to calculate ground state potential energy curves of simple molecules such as H2, LiH, and BeH2. The results are very accurate and the approach could potentially be used to generate complex molecular potential energy surfaces.


Introduction
Quantum computing has shown great potential in advancing quantum chemistry research [1]. Many quantum algorithms have been proposed to solve quantum chemistry problems [2][3][4], such as the Phase Estimation Algorithm (Aspuru-Guzik et al. [5][6][7][8]) to calculate eigenstate energies of simple molecules; the Variational Quantum Eigensolver (VQE) [9][10][11] to solve electronic structure problems; quantum algorithms for open quantum dynamics [12]; and benchmark calculations for two-electron molecules conducted on quantum computers [13]. Using quantum computing techniques to perform machine learning tasks [14] has also received much attention recently, including quantum data classification [15,16], quantum generative learning [17,18], and quantum neural networks approximating nonlinear functions [19]. Applying these quantum machine learning techniques to quantum chemistry is a natural extension [20,21]. However, previous studies focused solely on quantum circuits with only a few nonlinear operations, which are introduced by data encoding [19,22] or by repeating measurements until success [23]. Moreover, Sim et al. [24] recently showed that increasing the number of layers of the parameterized quantum circuit (PQC) reaches saturation and may not improve the performance once the number of layers is large enough. Furthermore, nonlinearity is the most important part of the classical neural network [25]; it is what makes neural networks able to produce complex results [23,26,27]. Therefore, quantum machine learning should not focus solely on PQCs: nonlinear operations are also needed in a quantum neural network.
To address this problem, we introduce a new hybrid quantum-classical neural network that combines quantum computing and classical computing by placing measurements between the parameterized quantum circuits. In this paper, we first give a detailed description of the whole structure of the hybrid quantum-classical neural network. We then present numerical simulations that use the new hybrid quantum-classical neural network to calculate ground state energies of different molecules. In the proposed quantum-classical hybrid neural network, the linear part of the classical neural network is replaced by the quantum circuits and the nonlinear part is replaced by measurements.

Quantum Layer
The quantum layer is realized by a parameterized quantum circuit consisting of parameterized quantum gates, which allows the PQC to be optimized by adjusting the parameters to approximate the desired results. PQCs have been widely used in many areas of quantum computing and quantum machine learning, such as VQE [9][10][11], the quantum autoencoder [20], and quantum generative learning [17]. In the following sections, we provide details of the quantum layer, including encoding classical data into quantum circuits and the parameterized quantum circuits themselves.

Data Encoding
To implement the quantum layer, the first step is to encode the input classical data into a quantum state. Variational encoding [22] has been proposed to reduce the depth of quantum circuits and has been widely used in many quantum machine learning techniques [19,22,29,30]. Variational encoding prepares a set of quantum gates whose parameters are generated from the input data and then applies these gates to the basis state with all qubits in $|0\rangle$. For an array of data $\{a_0, a_1, \ldots, a_{n-1}\}$, an example of variational encoding onto n qubits is to prepare the gate G as
$$G = \bigotimes_{i=0}^{n-1} g_i(f_i(a_i)),$$
where $g_i$ is a set of single-qubit quantum gates on qubit i and $f_i$ is a classical function that encodes $a_i$ as the parameter of $g_i$. The encoded state is $G|0\rangle^{\otimes n}$. One simple example is given in our numerical simulations: we take the bond length, a, as the encoding data for each qubit. We choose $f_i$ as the identity function and $g_i$ as $R_y H$, where $R_y$ is the rotation-y gate and H is the Hadamard gate. Thus, the encoded quantum state is $\left(\bigotimes_{i=0}^{n-1} R_y(a)H\right)|0\rangle^{\otimes n}$. In most variational encodings the depth of the circuit needed to encode the data is O(1) [29], because the number of quantum gates needed to initialize the quantum state is fixed, which makes variational encoding more suitable for Noisy Intermediate-Scale Quantum (NISQ) devices [31]. Furthermore, it has recently been shown that variational encoding may help to introduce nonlinear features in quantum circuits [22,32]. Variational encoding can only be implemented at the beginning of the quantum circuit, but the connections between multiple PQCs also need to be nonlinear. To enable nonlinear connections, we introduce measurements as the connections between multiple PQCs. In the numerical simulations, we use variational encoding and discuss implementing the quantum circuits on NISQ devices.
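The encoding described here can be illustrated with a small state-vector sketch. It is only an illustration under stated assumptions, not the simulation code used in this work: the helper names are hypothetical, the state is stored as a plain Python list, and qubit 0 is taken as the leftmost tensor factor. Because H and R_y are real matrices, real amplitudes suffice.

```python
import math

def apply_single_qubit_gate(state, gate, qubit, n):
    """Apply a real 2x2 gate to `qubit` of an n-qubit real state vector.

    Assumption: qubit 0 is the leftmost tensor factor (most significant bit).
    """
    shift = n - 1 - qubit
    new_state = list(state)
    for index in range(len(state)):
        if (index >> shift) & 1 == 0:
            partner = index | (1 << shift)
            amp0, amp1 = state[index], state[partner]
            new_state[index] = gate[0][0] * amp0 + gate[0][1] * amp1
            new_state[partner] = gate[1][0] * amp0 + gate[1][1] * amp1
    return new_state

def hadamard():
    s = 1.0 / math.sqrt(2.0)
    return [[s, s], [s, -s]]

def ry(theta):
    # R_y(theta) = exp(-i * theta * sigma_y / 2), which is a real rotation matrix.
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    return [[c, -s], [s, c]]

def encode_bond_length(a, n):
    """Prepare (tensor_i R_y(a) H)|0...0> with the same value a on every qubit."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0  # |0...0>
    for qubit in range(n):
        state = apply_single_qubit_gate(state, hadamard(), qubit, n)
        state = apply_single_qubit_gate(state, ry(a), qubit, n)
    return state

encoded = encode_bond_length(0.7, 2)
```

Only two gates act on each qubit regardless of n, which is the fixed-depth property that the text attributes to variational encoding.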

Parameterized Quantum Circuit
A parameterized quantum circuit, also known as a variational quantum circuit [10,28], is a quantum circuit of fixed depth consisting of parameterized gates. It is the main part of the quantum layer and performs the calculation. The parameterized quantum circuit consists of one-qubit gates as well as CNOT gates. More complicated gates may also be used in a PQC, since they can be decomposed into one-qubit gates and CNOT [33]. In general, an n-qubit PQC can be written as
$$U(\vec{\theta})|\psi\rangle = \left(\prod_{i=1}^{m} U_i\right)|\psi\rangle,$$
where $U(\vec{\theta})$ is the set of universal gates and m is the number of quantum gates. $\vec{\theta}$ is the set of parameters $\{\theta_0, \theta_1, \ldots, \theta_{k-1}\}$, where k is the total number of parameters, and $|\psi\rangle$ is the encoded quantum state after data encoding. Each unitary gate $U_i$ may be a quantum gate that does not require a parameter or a quantum gate that takes parameters. Examples of unitary gates taking parameters are the rotation gates $R_x(\theta)$, $R_y(\theta)$, and $R_z(\theta)$, which are given by
$$R_x(\theta) = e^{-i\theta\sigma_x/2}, \quad R_y(\theta) = e^{-i\theta\sigma_y/2}, \quad R_z(\theta) = e^{-i\theta\sigma_z/2},$$
where $\sigma_x$, $\sigma_y$, and $\sigma_z$ are the Pauli matrices. The operation of U can be modified by changing the parameters $\vec{\theta}$. Thus, by optimizing the parameters used in $U(\vec{\theta})$, the output state of the PQC can be made to approximate the desired quantum state.
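For reference, the rotation gates can also be written out explicitly in the computational basis. These are the standard matrix forms of the exponentials $e^{-i\theta\sigma/2}$; they are included here only as a convenience and are not taken from the original text.

```latex
R_x(\theta) =
\begin{pmatrix}
\cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \\
-i\sin\frac{\theta}{2} & \cos\frac{\theta}{2}
\end{pmatrix},
\qquad
R_y(\theta) =
\begin{pmatrix}
\cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\
\sin\frac{\theta}{2} & \cos\frac{\theta}{2}
\end{pmatrix},
\qquad
R_z(\theta) =
\begin{pmatrix}
e^{-i\theta/2} & 0 \\
0 & e^{i\theta/2}
\end{pmatrix}.
```

Note that $R_y(\theta)$ is a real matrix, so a circuit built only from $R_y$, Hadamard, and CNOT gates keeps real amplitudes real.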

Classical Layer
The classical layer in our construction of the quantum-classical hybrid neural network serves as the activation function connecting different quantum layers. To achieve nonlinearity, the classical layer is realized by measurements, specifically expectation values of operators on each qubit of the PQC, for example $\sigma_z^i$ for each qubit i; these expectation values also serve as the nonlinear operations. Using expectation values of operators reduces the complexity, because full quantum tomography is exponentially hard. Although expectation values of operators may lose some information compared with quantum tomography, previous work has used them as the connection between quantum computation and classical computation with great success [34], which indicates that expectation values of operators are capable of extracting useful information from quantum circuits.
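To make the nonlinearity concrete, the sketch below evaluates the expectation value of sigma_z on one qubit of a state vector. It is an illustrative sketch with hypothetical function names, assuming qubit 0 is the leftmost tensor factor. For a single qubit prepared as R_y(a)H|0>, a short calculation gives an expectation value of -sin(a), which is a nonlinear function of the input a.

```python
import math

def expectation_z(state, qubit, n):
    """Return <sigma_z> on `qubit` of an n-qubit state vector.

    Assumption: qubit 0 is the leftmost tensor factor (most significant bit).
    """
    shift = n - 1 - qubit
    total = 0.0
    for index, amp in enumerate(state):
        sign = 1.0 if (index >> shift) & 1 == 0 else -1.0
        total += sign * abs(amp) ** 2
    return total

def encoded_single_qubit_state(a):
    """Amplitudes of R_y(a) H |0>, written out by hand."""
    c, s = math.cos(a / 2.0), math.sin(a / 2.0)
    r = 1.0 / math.sqrt(2.0)
    return [(c - s) * r, (c + s) * r]

a = 0.9
b = expectation_z(encoded_single_qubit_state(a), 0, 1)
```

Here `b` lies in [-1, 1], and only n such numbers are read out from an n-qubit circuit, instead of the exponentially many amplitudes that tomography would require.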

Numerical Simulations
To demonstrate the proposed quantum-classical hybrid neural network, we present results for calculating the ground state energies of simple molecular systems: H2, LiH, and BeH2. The inputs for the unsupervised learning are bond lengths and the outputs are the ground state energies. The whole procedure consists of first training the neural network at some bond lengths and then testing it at other bond lengths to generate the whole potential energy curve.

Constructions of the Quantum Layer
The quantum layer consists of two parts: the variational encoding part and the PQC part. We use variational encoding to decrease the depth of the quantum circuit so that it can be implemented on NISQ devices. The construction of the quantum layer follows [29,34]. The input state is initialized as $\left(\bigotimes_{i=0}^{n-1} R_y(a)H\right)|0\rangle^{\otimes n}$, where a is the bond length, H is the Hadamard gate, and $R_y$ is the rotation-y gate. Because we have only one bond length while the PQC has n qubits, we follow the variational encoding in [29] and encode each qubit with the same value. The number of qubits n is equal to the number of qubits of the corresponding Hamiltonian. The quantum computation part uses a simple PQC consisting of $R_y$ and CNOT gates, where w are adjustable parameters, $R_y$ represents the rotation-y gate, and $\mathrm{CNOT}_{m,n}$ represents the CNOT gate with m as the control qubit and n as the target qubit. To achieve better entanglement of the qubits before appending nonlinear operations, the n-qubit PQC has n repeated layers in our simulation. By optimizing the parameters, the general PQC tries to approximate arbitrary states so that it can be used for different specific molecules. The construction of the PQC for three qubits is illustrated in the blue part of Figure 2, and the construction of the PQC for four qubits is illustrated in the blue part of Figure 3.
In the circuit diagrams, the orange parts are the data encoding, the blue parts are parameterized quantum circuits, and the yellow parts are measurements. The first measurements serve as nonlinear operations connecting two PQCs. a is the input bond length, the b values are the expectation values of $\sigma_z$, and the w values are adjustable parameters.
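A layered R_y and CNOT circuit of this kind can be sketched as follows. The exact placement of the CNOT gates is not recoverable from this text, so the sketch assumes a nearest-neighbour chain (qubit i controls qubit i+1) after each layer of rotations; that arrangement, the function names, and the `weights[layer][qubit]` layout are all assumptions made for illustration.

```python
import math

def apply_ry(state, theta, qubit, n):
    """Apply R_y(theta) to `qubit` (qubit 0 = leftmost tensor factor)."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    shift = n - 1 - qubit
    new_state = list(state)
    for index in range(len(state)):
        if (index >> shift) & 1 == 0:
            partner = index | (1 << shift)
            amp0, amp1 = state[index], state[partner]
            new_state[index] = c * amp0 - s * amp1
            new_state[partner] = s * amp0 + c * amp1
    return new_state

def apply_cnot(state, control, target, n):
    """Flip `target` on every basis state where `control` is 1."""
    c_shift, t_shift = n - 1 - control, n - 1 - target
    new_state = list(state)
    for index in range(len(state)):
        if (index >> c_shift) & 1 == 1 and (index >> t_shift) & 1 == 0:
            partner = index | (1 << t_shift)
            new_state[index], new_state[partner] = state[partner], state[index]
    return new_state

def pqc(state, weights, n):
    """Repeated layers: one R_y per qubit, then an assumed CNOT chain."""
    for layer_weights in weights:
        for qubit in range(n):
            state = apply_ry(state, layer_weights[qubit], qubit, n)
        for qubit in range(n - 1):
            state = apply_cnot(state, qubit, qubit + 1, n)
    return state

n = 2
weights = [[math.pi, 0.0], [0.0, 0.0]]  # n layers with n parameters each
start = [1.0, 0.0, 0.0, 0.0]            # |00>
final = pqc(start, weights, n)
```

With n layers of n rotations, one such circuit carries n^2 parameters, and every operation is norm-preserving, so `final` remains a normalized state.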

Constructions of the Classical Layer
The classical layer is realized by expectation values of operators. In our numerical simulations, we use $\sigma_z^i$ for qubit i as the classical layer. The outputs from the classical layer are encoded into another quantum layer. The second quantum layer is the same as the first one except for the data encoding part, which is $\bigotimes_{i=0}^{n-1} R_y(b_i\pi)H$, where $b_i$ is the measured expectation value from qubit i. We multiply each $b_i$ by $\pi$ when encoding in order to change the range of the encoded data from $[-1, 1]$ to $[-\pi, \pi]$ [35]. The construction of our proposed hybrid quantum-classical neural network is illustrated in Figure 3.
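The data flow between the two quantum layers can be summarized in two lines. The symbols $|\psi_1\rangle$ (the output state of the first quantum layer) and $|\psi_2^{\mathrm{in}}\rangle$ (the encoded input state of the second quantum layer) are notation introduced here for illustration and do not appear in the original text.

```latex
b_i = \langle \psi_1 | \sigma_z^{i} | \psi_1 \rangle \in [-1, 1],
\qquad i = 0, \ldots, n-1,
\\[4pt]
|\psi_2^{\mathrm{in}}\rangle
= \left( \bigotimes_{i=0}^{n-1} R_y(\pi b_i)\, H \right) |0\rangle^{\otimes n}.
```

Since each $b_i$ is quadratic in the amplitudes of $|\psi_1\rangle$ and then enters the second layer through the sine and cosine of $\pi b_i / 2$, the map from the first layer to the second is not a linear operation on the state.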

Cost Function
The cost function is defined as
$$\mathrm{Cost} = \sum_j \langle\phi_j|H_j|\phi_j\rangle, \quad (5)$$
where j indexes the training bond lengths, $|\phi_j\rangle$ is the final state of the proposed hybrid quantum-classical neural network when the input is the j-th bond length, and $H_j$ is the Hamiltonian corresponding to the j-th bond length. The idea of the cost function is similar to VQE: by optimizing the parameters, the expectation energy of $|\phi_j\rangle$ is minimized to approximate the ground state energy. The evaluation of the Hamiltonian can be done by the techniques in [11]. The Hamiltonian can be written as a sum of tensor products of Pauli matrices, $H = \sum_i c_i P_i$, where $c_i$ is a coefficient and $P_i$ is a tensor product of Pauli matrices. Instead of evaluating the whole Hamiltonian, we can evaluate each term separately and obtain the expectation value of the Hamiltonian as $\langle H\rangle = \sum_i c_i \langle P_i\rangle$, which requires neither quantum tomography nor exponential complexity. The whole training procedure takes a set of bond lengths and the corresponding Hamiltonians and minimizes the cost function in equation (5). After the training, we test the model at other bond lengths.
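The term-by-term evaluation of the energy and the summed cost can be sketched as follows. This is a simplified illustration, not the authors' code: the function names are hypothetical, the Pauli strings are restricted to I, X, and Z so that real amplitudes suffice (molecular Hamiltonians generally contain Y terms as well), and the coefficients of the toy Hamiltonian are made up.

```python
import math

def apply_pauli_string(state, pauli, n):
    """Apply a Pauli string over {I, X, Z} to a real state vector.

    pauli[0] acts on qubit 0, taken as the leftmost tensor factor.
    """
    new_state = [0.0] * len(state)
    for index, amp in enumerate(state):
        target, sign = index, 1.0
        for qubit, label in enumerate(pauli):
            shift = n - 1 - qubit
            if label == "X":
                target ^= 1 << shift          # X flips the bit
            elif label == "Z" and (index >> shift) & 1 == 1:
                sign = -sign                  # Z gives -1 on |1>
        new_state[target] += sign * amp
    return new_state

def pauli_expectation(state, pauli, n):
    """<state| P |state> for a real state vector."""
    transformed = apply_pauli_string(state, pauli, n)
    return sum(x * y for x, y in zip(state, transformed))

def energy(state, hamiltonian, n):
    """<H> = sum_i c_i <P_i>, evaluated one Pauli term at a time."""
    return sum(c * pauli_expectation(state, p, n) for c, p in hamiltonian)

def cost(states, hamiltonians, n):
    """Sum of energies over the training bond lengths, as in the cost function."""
    return sum(energy(s, h, n) for s, h in zip(states, hamiltonians))

# Toy two-qubit Hamiltonian with made-up coefficients (not a molecular Hamiltonian).
toy_h = [(0.5, "II"), (0.3, "ZI"), (-0.2, "ZZ"), (0.1, "XX")]
r = 1.0 / math.sqrt(2.0)
state_01 = [0.0, 1.0, 0.0, 0.0]    # |01>
state_bell = [r, 0.0, 0.0, r]      # (|00> + |11>) / sqrt(2)
total = cost([state_01, state_bell], [toy_h, toy_h], 2)
```

In training, the states would be the network outputs for each bond length and `total` would be the quantity passed to the optimizer.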

Simulation Results
The Hamiltonians of the molecular systems are derived by transforming the corresponding second-quantization Hamiltonians into sums of tensor products of Pauli matrices. For H2, we use the Jordan-Wigner transformation [36] to get a 4-qubit Hamiltonian. To reduce the number of qubits of the LiH and BeH2 Hamiltonians [4,11], we apply the complete active space (CAS) approach [37,38], which divides the orbitals into inactive orbitals (always-occupied low-energy orbitals and always-unoccupied high-energy orbitals) and active orbitals; the reduced Hamiltonian acts only on the active orbitals. For LiH, we assume the two lowest-energy spin orbitals are always occupied and use the binary code transformation [39], which exploits spin symmetry to save two qubits. We get an 8-qubit LiH Hamiltonian. For BeH2, we assume the two lowest-energy spin orbitals are always occupied and the two highest-energy spin orbitals are never occupied, and again use the binary code transformation [39] with spin symmetry to save two qubits. We get an 8-qubit BeH2 Hamiltonian.
In the simulation, H2 used four qubits and 32 parameters. LiH and BeH2 both used eight qubits and 128 parameters. The gate and parameter complexity of the proposed hybrid quantum-classical neural network in this simulation is $O(n^2)$, where n is the number of qubits of the Hamiltonian. We present the results of using our proposed hybrid quantum-classical neural network for the ground state energies of H2, LiH, and BeH2 in Figures 4 and 5. These figures show that the training data points converge very close to the diagonalization results without prior knowledge of the ground state of the transformed Hamiltonian in Pauli matrix form. Furthermore, after training, inputting other bond lengths with the optimized parameters also yields good approximations to the ground state energies. BeH2 shows some deviation when the bond length is large, which may be solved by improving the parameterized quantum circuit. For example, the work in [24], which discusses the expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms, shows that increasing the depth of a PQC increases its expressibility and that different PQC constructions have different expressibility.
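The reported parameter counts are consistent with a simple count, assuming that each of the two quantum layers contains n repeated PQC layers and that each PQC layer carries one $R_y$ parameter per qubit. This per-layer breakdown is an inference from the reported totals rather than a statement from the text.

```latex
N_{\text{params}}
= \underbrace{2}_{\text{quantum layers}}
\times \underbrace{n}_{\text{PQC layers}}
\times \underbrace{n}_{R_y \text{ per layer}}
= 2n^2,
\qquad
n = 4 \Rightarrow 32,
\qquad
n = 8 \Rightarrow 128.
```

Under this assumption the quadratic growth in n matches the stated $O(n^2)$ parameter complexity.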
Furthermore, to show that the intermediate nonlinear measurements improve the performance, we compare our proposed hybrid quantum-classical neural network with a quantum neural network from which the intermediate measurements are removed. The setting of the quantum neural network without intermediate measurements is illustrated in Figure 6.
Figure 7 presents this comparison. Both networks are trained with the same set of bond lengths as in Figure 4. Without the intermediate nonlinear measurements, the quantum neural network performs poorly. With the intermediate nonlinear measurements added, the results converge closely to the diagonalization results.
The parameters of both the proposed hybrid quantum-classical neural network and the quantum neural network without intermediate measurements are initialized from a Gaussian distribution with mean 0 and standard deviation 0.1. Because different parameter initializations give different starting points for the optimization and may lead to different final results, we reduce the effect of initialization by presenting a quantitative comparison of the two constructions with four different parameter initializations, drawn from the same Gaussian distribution with different random seeds. All are trained with the same set of training bond lengths as in Figure 4. Table 1 shows that our proposed hybrid quantum-classical neural network performs better than the quantum neural network without intermediate measurements. Our simulation results show that adding intermediate nonlinear measurements helps to improve the expressibility of the PQC. Furthermore, adding intermediate measurements also decreases the circuit depth, which makes the network more suitable for current NISQ devices.

Materials and Methods
Orbital integrals in the second-quantization Hamiltonian are calculated in the STO-3G minimal basis using PySCF [40], and the transformation is done with OpenFermion [41]. The simulation is done with Qiskit [42]. The tensor product orders in OpenFermion and Qiskit are opposite. For n qubits, the tensor product order in OpenFermion is $q_0 \otimes q_1 \otimes \cdots \otimes q_{n-1}$, while the tensor product order in Qiskit is $q_{n-1} \otimes q_{n-2} \otimes \cdots \otimes q_0$. We follow the tensor product order of OpenFermion, so in the simulation we treat the qubit indices in Qiskit in reverse: for n qubits, the qubit indexed as $q_0$ in Qiskit is treated as $q_{n-1}$, the qubit indexed as $q_1$ in Qiskit is treated as $q_{n-2}$, and so on. In this way, the tensor product order in Qiskit becomes the same as in OpenFermion. The optimization is performed by the Broyden-Fletcher-Goldfarb-Shanno algorithm [43] with a maximum of 500 iterations and a gradient norm stopping tolerance of $10^{-5}$. In the simulation, the expectation value of an operator is computed by matrix multiplication with the operator matrix, and the Hamiltonian can be treated as a single operator. To save simulation time, instead of evaluating each $P_i$ to get $\langle H\rangle = \sum_i c_i \langle P_i\rangle$, we treat H as a single operator and evaluate it only once.
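The effect of the two tensor product orders can be illustrated without either library. The sketch below is plain Python and makes no OpenFermion or Qiskit calls; the function name is hypothetical. It reorders the amplitudes of a state vector so that tensor factor i becomes factor n-1-i, which is the index reversal described in the text.

```python
def reverse_qubit_order(state, n):
    """Reorder amplitudes so that tensor factor i becomes factor n-1-i.

    The basis index is read as an n-bit string and that string is reversed.
    """
    new_state = [0.0] * len(state)
    for index, amp in enumerate(state):
        reversed_index = int(format(index, f"0{n}b")[::-1], 2)
        new_state[reversed_index] = amp
    return new_state

# Basis state with q0 = 0, q1 = 0, q2 = 1 in the order q0 (x) q1 (x) q2: index 0b001.
state = [0.0] * 8
state[0b001] = 1.0
flipped = reverse_qubit_order(state, 3)
```

In the reversed order q2 (x) q1 (x) q0 the same physical state sits at index 0b100, and applying the reversal twice returns the original vector.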

Conclusions
In this work, we proposed a new hybrid quantum-classical neural network that combines PQCs and measurements to achieve nonlinear operations in quantum computing. We have shown that the proposed hybrid quantum-classical neural network can be trained to obtain the electronic energies at certain bond lengths and then generate the whole potential energy curve. The results for H2, LiH, and BeH2 are very accurate and demonstrate the power of the proposed hybrid quantum-classical neural network.
Furthermore, by comparison with a quantum neural network from which the intermediate measurements are removed, we showed that the intermediate nonlinear measurements are very important. The intermediate nonlinear measurements can also reduce the circuit depth, which makes the network more suitable for NISQ devices. Although the method is used here to generate one-dimensional potential energy curves, the approach is general and could be extended to multidimensional potential energy surfaces, for example by changing the inputs from bond lengths to multidimensional coordinates. This will be done in future work.
Author Contributions: S.K. designed the research. R.X. performed the calculations. Both discussed the results and wrote the paper. All authors have read and agreed to the published version of the manuscript.