The Methodology section outlines the design and implementation of the VQC used to realize the XOR function on quantum computers. It details the VQC architecture, including the choice of quantum gates and their roles in encoding nonlinear decision boundaries. It also describes the encoding scheme for the inputs and outputs, the training procedure using parameter-shift gradients within the PennyLane framework, and the transfer of the optimized parameters from simulation to physical hardware. Together, these steps connect the theoretical design to the execution of the XOR mapping on a desktop quantum device.
Figure 1 illustrates the hardware platform and implementation workflow of the proposed QNN, highlighting the parameter transfer from simulation to the NMR-based quantum hardware and the subsequent measurement and tomography procedures. The NMR hardware consists of an RF coil that interacts with the sample, applies RF pulses, and senses the resulting signal. It is also equipped with environmental sensors and control systems that sense and maintain a stable operating temperature.
2.1. Variational Quantum Circuit Architecture
XOR is a function that classical linear models cannot learn, whereas a VQC can inherently express nonlinearities [20]. In the VQC, the output is the measurement probability, where the measurement is performed on the second (readout) qubit of the two-qubit system, as shown in Equation (2):
$$P(y \mid \boldsymbol{\theta}) = \mathrm{Tr}\left[\rho(\boldsymbol{\theta})\,(I \otimes \Pi_y)\right], \qquad \Pi_y = |y\rangle\langle y|, \quad y \in \{0, 1\} \tag{2}$$
Here, $\rho(\boldsymbol{\theta})$ denotes the two-qubit density matrix prepared by the variational circuit with parameters $\boldsymbol{\theta}$, $I$ is the identity operator acting on the first qubit, and $\Pi_y$ is the projector corresponding to measuring the second (readout) qubit in the computational basis.
Owing to the rotation, superposition, interference, and entanglement of the quantum state, the final measurement probability becomes a highly nonlinear function, as expressed in Equation (3). Therefore, a VQC is suitable for imitating the nonlinear structure of neural networks.
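Next, we choose the rotation angle $\theta$ of the $R_y(\theta)$ gate as the training parameter because $R_y$ maps directly to an “amplitude change” on the Bloch sphere, which is most likely to affect the output probability. As shown in Figure 2, the $R_y$ gate rotates the state around the y-axis of the Bloch sphere, and $R_x$ and $R_z$ act in the same way around the x- and z-axes [21]. Because we ultimately measure in the z-basis, $|0\rangle$ and $|1\rangle$, $R_y$ directly changes the amplitudes of $|0\rangle$ and $|1\rangle$, as shown in Equation (4):
$$R_y(\theta)\,|0\rangle = \cos\frac{\theta}{2}\,|0\rangle + \sin\frac{\theta}{2}\,|1\rangle \tag{4}$$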
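As a simple single-qubit example, the probability of measuring the outcome $|1\rangle$ after an $R_y(\theta)$ rotation of $|0\rangle$ is given by Equation (5):
$$P(1) = \sin^2\frac{\theta}{2} = \frac{1 - \cos\theta}{2}, \qquad P(0) = 1 - P(1) \tag{5}$$
This probability is nonlinear and continuously differentiable in $\theta$, and it is directly related to the output probability.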
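The single-qubit relation above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not taken from the authors' code.

```python
import math

def ry_on_zero(theta):
    """Amplitudes of R_y(theta)|0> in the computational basis (|0>, |1>)."""
    return math.cos(theta / 2), math.sin(theta / 2)

def prob_one(theta):
    """Probability of measuring |1> after R_y(theta) acts on |0>."""
    _, amp1 = ry_on_zero(theta)
    return amp1 ** 2

# Sweep a few angles: the probability varies smoothly and nonlinearly with theta.
for theta in (0.0, math.pi / 2, math.pi):
    print(f"theta = {theta:.3f}  P(1) = {prob_one(theta):.3f}")
```

The printed values rise from 0 at $\theta = 0$ through 0.5 at $\theta = \pi/2$ to 1 at $\theta = \pi$, which is the smooth dependence on the rotation angle that makes $\theta$ usable as a trainable parameter.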
Additionally, $R_z$ only alters the phase and does not change the Z-basis measurement probability; thus, it is not suitable as the primary training parameter. $R_x$ can also cause amplitude changes but is highly sensitive to NISQ noise. $R_y$ is therefore central to training the quantum neural network: it directly and continuously alters the measurement probability in the computational basis, providing a stable, differentiable, and expressive parameter channel. Therefore, $R_y$ is the most natural and compatible choice for the software. However, in terms of hardware, a different rotation gate is superior; this is explained in the comparison section.
Because XOR is not linearly separable, it must rely on “interaction terms” to be expressed. In quantum circuits, entangling gates provide a mechanism for creating such interaction terms, making entanglement essential for the circuit to express the XOR function [22]. The truth table for the CNOT gate is presented in Table 2. In a two-qubit VQC, a single CNOT gate is sufficient to generate entanglement, whereas other entangling gates make hardware execution more difficult.
In the experiment, we have two qubit channels, each of which takes $|0\rangle$ or $|1\rangle$ as its input. In this article, $|0\rangle$ and $|1\rangle$ represent the computational basis states defined by two distinguishable energy levels in a specific physical system, rather than abstract mathematical constructs. In experiments, the qubits are first prepared in the ground state $|0\rangle$ through system initialization. This state typically corresponds to the natural ground state of the physical system or a stable state after active polarization. Subsequently, controlled external driving (such as RF or microwave pulses) induces a transition between the two energy levels under resonance conditions, thereby taking $|0\rangle$ to $|1\rangle$. This process has been realized on various physical platforms, including spin systems, quantum dot structures, and optical polarization systems. Therefore, $|0\rangle$ and $|1\rangle$ have clear physical correspondences and are achievable.
Before entering the entanglement layer, the qubits are in the computational basis states $|0\rangle$/$|1\rangle$. Applied to fixed basis states alone, a CNOT operation only maps one basis state to another; such inputs restrict the accessible quantum state space and limit the expressivity of the circuit. To introduce continuous degrees of freedom and enable a more flexible exploration of the quantum-state manifold, parameterized single-qubit rotations are applied prior to the entangling gate. These rotations do not create entanglement by themselves, but they modulate the input state such that the subsequent entangling operation can explore a richer set of quantum correlations. Therefore, we need two $R_y$ gates to transform the input $|0\rangle$/$|1\rangle$ into a trainable state, determine the type of entanglement that the CNOT will generate, and thereby lay the foundation for nonlinearity. Subsequently, we obtain the first quantum state, as shown in Equation (6):
$$|\psi_1\rangle = \left(R_y(\theta_1) \otimes R_y(\theta_2)\right)\,|x_1\rangle \otimes |x_2\rangle \tag{6}$$
Here, $|x_1\rangle$ and $|x_2\rangle$ are the computational basis states encoding the classical XOR inputs, and $R_y(\theta_i)$ denotes a single-qubit rotation around the y-axis with variational parameter $\theta_i$.
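This step creates the amplitudes $\cos(\theta_i/2)$ and $\sin(\theta_i/2)$. After passing through the CNOT gate, the quantum state becomes more complex, as shown in Equation (7):
$$|\psi_2\rangle = \mathrm{CNOT}\,|\psi_1\rangle \tag{7}$$
where $|\psi_1\rangle$ denotes the state after the first rotation layer (Equation (6)). If the state were measured directly at this point, the model would lack expressiveness: the output could not be adjusted, and the entangled quantum state could not be pushed to the four target points of XOR. Therefore, a second layer of $R_y$ gates is required to “shape” the entangled quantum state. We add two more $R_y$ gates to adjust the measurement results, expand the model’s expressive power, and realize the final decision boundary so that training converges to XOR. The final quantum state is given by Equation (8):
$$|\psi_{\mathrm{out}}\rangle = \left(R_y(\theta_3) \otimes R_y(\theta_4)\right)\,\mathrm{CNOT}\,\left(R_y(\theta_1) \otimes R_y(\theta_2)\right)\,|x_1\rangle \otimes |x_2\rangle \tag{8}$$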
Based on the above, we designed a VQC with an $R_y$-CNOT-$R_y$ structure, as shown in Figure 3.
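The layered structure described above (rotation layer, CNOT, rotation layer, readout on the second qubit) can be sketched as a small state-vector simulation. This is a minimal illustration using only the Python standard library, not the authors' PennyLane implementation; it assumes that the first qubit is the CNOT control and the second qubit is the target and readout, and the function names are hypothetical. Because $R_y$ and CNOT have real matrix elements, real amplitudes suffice.

```python
import math

def apply_ry(state, theta, qubit):
    """Apply R_y(theta) to one qubit of a 2-qubit state [a00, a01, a10, a11]."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    # Index pairs that differ only in the chosen qubit (qubit 0 is the left bit).
    pairs = [(0, 2), (1, 3)] if qubit == 0 else [(0, 1), (2, 3)]
    out = list(state)
    for i0, i1 in pairs:
        out[i0] = c * state[i0] - s * state[i1]
        out[i1] = s * state[i0] + c * state[i1]
    return out

def apply_cnot(state):
    """CNOT with qubit 0 as control and qubit 1 as target: swaps |10> and |11>."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]

def vqc_readout_prob(x1, x2, thetas):
    """P(readout qubit = 1) for the R_y-CNOT-R_y circuit on input |x1 x2>."""
    t1, t2, t3, t4 = thetas
    state = [0.0] * 4
    state[2 * x1 + x2] = 1.0           # computational basis input
    state = apply_ry(state, t1, 0)     # first rotation layer
    state = apply_ry(state, t2, 1)
    state = apply_cnot(state)          # entangling layer
    state = apply_ry(state, t3, 0)     # second rotation layer
    state = apply_ry(state, t4, 1)
    return state[1] ** 2 + state[3] ** 2   # outcomes |01> and |11>

# With all angles at zero the circuit reduces to a bare CNOT,
# which maps |x1 x2> to |x1, x1 XOR x2>.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(vqc_readout_prob(x1, x2, [0.0] * 4), 3))
```

Under these assumptions, setting all four angles to zero already reproduces the XOR truth table on the readout qubit, so the trainable rotations determine how the optimizer reaches a point of this kind from a random start.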
2.3. Training Procedure
As mentioned before, we need four $R_y$ gates, and we take their angles $\theta_1, \dots, \theta_4$ as the trainable parameters. The initial circuit parameters were sampled from a uniform distribution over $[0, 2\pi]$, corresponding to a mean value of $\pi$ and a variance of $\pi^2/3$. The first layer uses $\theta_1$ and $\theta_2$, and the second layer uses $\theta_3$ and $\theta_4$. For the loss function, we chose the mean squared error (MSE): because the output of the quantum circuit is a probability, the MSE is smooth, continuous, and differentiable [23], is suitable for small-scale VQCs, and makes it easier for the VQC to converge. The MSE applies directly to the probability output, with no additional transformation required, as shown in Equation (10):
$$\mathcal{L}(\boldsymbol{\theta}) = \frac{1}{4}\sum_{i=1}^{4}\left(p_i(\boldsymbol{\theta}) - y_i\right)^2 \tag{10}$$
where $p_i(\boldsymbol{\theta})$ is the output probability of the circuit for the $i$-th input and $y_i$ is the corresponding XOR target.
Traditional deep learning is based on backpropagation, but quantum circuits are unitary and cannot be backpropagated through directly [24]. Therefore, we use PennyLane’s GradientDescentOptimizer (Equation (11)).
This eliminates the need to derive gradients by hand, write shifted versions of the circuit, or implement backpropagation, and it enables automatic differentiation of all trainable parameters. Because the XOR data set has only four samples, we adopted full-batch training (updating the parameters with all four samples at once). Each step therefore traverses all four training points, so the gradient is that of the full loss. Unlike mini-batch training, there is no sampling variance from batch selection, and the update direction is more stable. The basic update rule of the gradient descent algorithm [25] is given by Equation (12):
$$\boldsymbol{\theta}^{(t+1)} = \boldsymbol{\theta}^{(t)} - \eta\,\frac{\partial \mathcal{L}}{\partial \boldsymbol{\theta}}\bigg|_{\boldsymbol{\theta} = \boldsymbol{\theta}^{(t)}} \tag{12}$$
Here, $\boldsymbol{\theta}$ denotes the four trainable angle parameters, $\eta = 0.4$ is the step size (learning rate) for each update, and $\partial \mathcal{L}/\partial \boldsymbol{\theta}$ is the partial derivative of the loss function with respect to the parameters. The initial values are randomly selected from 0 to $2\pi$. In each iteration, PennyLane computes the gradients of all parameters through the parameter-shift rule, updates the parameters using gradient descent, and returns the new $\boldsymbol{\theta}$ and the new loss. All experiments were performed with a fixed random seed (seed = 0) for parameter initialization to ensure reproducibility of the results. Training was terminated when the loss function fell below a preset convergence threshold or when the maximum number of training steps (200) was reached, to avoid unnecessary over-optimization.
After training, we observed that the loss decreased rapidly and converged stably; the parameters remained stable without fluctuations, and the output probabilities matched the truth table of the XOR operation.
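The training loop described in this section can be sketched without PennyLane to make the individual steps explicit. The following is a minimal standard-library illustration, not the authors' code: the circuit simulation, the CNOT orientation (first qubit as control), the function names, and the tolerance `tol` are assumptions, and Python's `random` generator with seed 0 does not reproduce the initial values of a NumPy-based run. The learning rate of 0.4, the 200-step limit, the full batch of four samples, and the MSE loss follow the text.

```python
import math
import random

XOR_DATA = [((0, 0), 0.0), ((0, 1), 1.0), ((1, 0), 1.0), ((1, 1), 0.0)]

def apply_ry(state, theta, qubit):
    """Apply R_y(theta) to one qubit of a real 2-qubit state [a00, a01, a10, a11]."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    pairs = [(0, 2), (1, 3)] if qubit == 0 else [(0, 1), (2, 3)]
    out = list(state)
    for i0, i1 in pairs:
        out[i0] = c * state[i0] - s * state[i1]
        out[i1] = s * state[i0] + c * state[i1]
    return out

def readout_prob(x, thetas):
    """P(second qubit = 1) after R_y layer, CNOT (control = first qubit), R_y layer."""
    state = [0.0] * 4
    state[2 * x[0] + x[1]] = 1.0
    state = apply_ry(apply_ry(state, thetas[0], 0), thetas[1], 1)
    state = [state[0], state[1], state[3], state[2]]   # CNOT swaps |10> and |11>
    state = apply_ry(apply_ry(state, thetas[2], 0), thetas[3], 1)
    return state[1] ** 2 + state[3] ** 2

def mse_loss(thetas):
    """Full-batch mean squared error over the four XOR samples."""
    return sum((readout_prob(x, thetas) - y) ** 2 for x, y in XOR_DATA) / len(XOR_DATA)

def shift_grad(thetas):
    """Loss gradient; each dp/dtheta_k is obtained with the parameter-shift rule."""
    grad = [0.0] * len(thetas)
    for x, y in XOR_DATA:
        p = readout_prob(x, thetas)
        for k in range(len(thetas)):
            plus, minus = list(thetas), list(thetas)
            plus[k] += math.pi / 2
            minus[k] -= math.pi / 2
            dp = 0.5 * (readout_prob(x, plus) - readout_prob(x, minus))
            grad[k] += 2.0 * (p - y) * dp / len(XOR_DATA)   # chain rule through the MSE
    return grad

def train(steps=200, lr=0.4, tol=1e-6, seed=0):
    """Plain gradient descent; `tol` is a placeholder convergence threshold."""
    rng = random.Random(seed)
    thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(4)]
    history = [mse_loss(thetas)]
    for _ in range(steps):
        if history[-1] < tol:
            break
        grad = shift_grad(thetas)
        thetas = [t - lr * g for t, g in zip(thetas, grad)]
        history.append(mse_loss(thetas))
    return thetas, history

thetas, history = train()
print("steps run:", len(history) - 1, "final loss:", history[-1])
```

The parameter-shift rule is exact here because each angle enters the circuit through a single $R_y$ gate, so the readout probability is sinusoidal in each parameter. Whether this particular seed converges to the XOR mapping is not asserted; the sketch only shows the structure of one update step.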
2.4. Parameter Transfer
After training with the PennyLane library, we obtained four $R_y$ rotation angle parameters. These $\theta_i$ correspond to the angles of the ideal quantum gates. The gate model of the desktop quantum computer is a pulse model (NMR implementation), and the $R_y$ gate is a rotation around the y-axis, corresponding to an RF pulse on the H-channel or P-channel, as shown in Equation (13):
$$R_y(\theta) = e^{-i\theta\sigma_y/2} = \begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix} \tag{13}$$
In this desktop quantum computer, the rotation angle is determined by the product of the pulse amplitude and duration, as expressed in Equation (14):
$$\theta = \gamma B_1 \tau \tag{14}$$
Here, $\gamma$ represents the gyromagnetic ratio, $B_1$ is the RF field amplitude (fixed by the platform), and $\tau$ is the pulse duration (which can be adjusted). Therefore, in a desktop quantum computer, “setting the pulse width $\tau$” is equivalent to setting the rotation angle $\theta$. The platform provides the calibrated duration $\tau_{\mathrm{ref}}$ of a reference pulse with rotation angle $\theta_{\mathrm{ref}}$; therefore, we can convert each $\theta_i$ into the pulse width of the corresponding gate, as shown in Equation (15):
$$\tau_i = \frac{\theta_i}{\theta_{\mathrm{ref}}}\,\tau_{\mathrm{ref}} \tag{15}$$
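The angle-to-pulse-width conversion can be illustrated with a short helper. This is a sketch under stated assumptions: the RF amplitude is fixed so that the rotation angle grows linearly with the pulse duration, a calibrated reference pulse of known angle is available, and trained angles are folded into $[0, 2\pi)$ (a full $2\pi$ rotation changes only the global phase). The function name, the 20 µs calibration, and the example angles are hypothetical placeholders, not values from the platform or the experiments.

```python
import math

def pulse_width(theta, ref_angle, ref_duration):
    """Pulse duration for rotation angle theta, scaled from a calibrated reference pulse.

    Assumes a fixed RF amplitude, so the angle is proportional to the duration.
    """
    if ref_angle <= 0 or ref_duration <= 0:
        raise ValueError("reference angle and duration must be positive")
    # Fold the angle into [0, 2*pi); the dropped 2*pi multiples are global phases.
    theta = theta % (2.0 * math.pi)
    return theta / ref_angle * ref_duration

# Hypothetical calibration: a pi/2 pulse lasting 20 microseconds.
trained_angles = [0.3, 1.2, 2.5, 4.0]   # placeholder angles in radians
widths_us = [pulse_width(t, math.pi / 2, 20.0) for t in trained_angles]
for t, w in zip(trained_angles, widths_us):
    print(f"theta = {t:.2f} rad  ->  pulse width = {w:.2f} us")
```

On real hardware, a negative angle would more likely be implemented by shifting the pulse phase than by a longer pulse; the folding used here is only the simplest way to keep the width non-negative.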
We validated the transfer by running a pulse-only version of the circuit and confirming that the output probability distribution matched the simulated distribution within the experimental precision.
2.5. Tomography Procedure
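Measurements are performed in the Pauli bases $\{X, Y, Z\}$, which together with the identity form a complete operator basis for single-qubit state reconstruction, allowing the estimation of the expectation values $\langle X\rangle$, $\langle Y\rangle$, and $\langle Z\rangle$. Quantum hardware can only measure in the computational basis (Z-basis), $\{|0\rangle, |1\rangle\}$, so measuring X or Y amounts to rotating the eigenstates of X or Y onto the Z-basis before readout. The eigenstates of X are $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$ and $|-\rangle = (|0\rangle - |1\rangle)/\sqrt{2}$. The Hadamard gate satisfies $H|+\rangle = |0\rangle$ and $H|-\rangle = |1\rangle$; that is, $H X H^{\dagger} = Z$. The eigenstates of Y are $|{+i}\rangle = (|0\rangle + i|1\rangle)/\sqrt{2}$ and $|{-i}\rangle = (|0\rangle - i|1\rangle)/\sqrt{2}$, which are mapped to $|0\rangle$ and $|1\rangle$ by $H S^{\dagger}$. Therefore, $(H S^{\dagger})\,Y\,(H S^{\dagger})^{\dagger} = Z$. The same measurement procedure is applied in both simulation and hardware experiments to ensure consistency.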
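The basis-change argument can be verified numerically: conjugating X with the Hadamard gate, or Y with a phase gate followed by a Hadamard, should yield Z. The sketch below uses plain Python lists as 2×2 complex matrices; it assumes the standard gate definitions ($S^{\dagger} = \mathrm{diag}(1, -i)$), and the helper names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def conjugate_by(u, op):
    """Return u * op * u^dagger, the operator seen after applying the basis change u."""
    return matmul(matmul(u, op), dagger(u))

def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

r = 1 / math.sqrt(2)
H = [[r, r], [r, -r]]
S_DG = [[1, 0], [0, -1j]]          # S^dagger = diag(1, -i)
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

U_X = H                            # basis change applied before an X measurement
U_Y = matmul(H, S_DG)              # apply S^dagger first, then H, before a Y measurement

print("X basis change maps X to Z:", close(conjugate_by(U_X, X), Z))
print("Y basis change maps Y to Z:", close(conjugate_by(U_Y, Y), Z))
```

Both checks print `True`, confirming that a Z-basis readout after the respective basis change returns the statistics of the X or Y observable.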
For each measurement basis, the circuit is executed with $N$ repeated shots to estimate the measurement outcome probabilities $p_0$ and $p_1$. The expectation values of the Pauli operators are then calculated using Equation (16):
$$\langle P \rangle = p_0 - p_1, \qquad P \in \{X, Y, Z\} \tag{16}$$
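The single-qubit density matrix is reconstructed via linear inversion using the measured expectation values according to Equation (17):
$$\rho = \frac{1}{2}\left(I + \langle X\rangle X + \langle Y\rangle Y + \langle Z\rangle Z\right) \tag{17}$$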
The reconstructed density matrices are subsequently used for quantitative and qualitative analyses. State fidelity is evaluated to quantify the agreement between the simulated and hardware-executed quantum states. In addition, the expectation values obtained from tomography are used to visualize quantum states on the Bloch sphere. This enables a direct and consistent state-level comparison between simulation and hardware results after parameter transfer.
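The reconstruction and comparison steps can be sketched as follows. This is a minimal illustration with hypothetical shot counts, not the experimental data. It estimates the Pauli expectation values from counts, builds the density matrix by linear inversion, and evaluates the fidelity between two single-qubit states using the closed form in terms of Bloch vectors, $F = \tfrac{1}{2}\left[1 + \mathbf{r}\cdot\mathbf{s} + \sqrt{(1 - |\mathbf{r}|^2)(1 - |\mathbf{s}|^2)}\right]$. Whether the authors use this convention for fidelity is an assumption.

```python
import math

def expectation(count0, count1):
    """Pauli expectation value p0 - p1 estimated from shot counts in the rotated basis."""
    shots = count0 + count1
    if shots == 0:
        raise ValueError("at least one shot is required")
    return (count0 - count1) / shots

def reconstruct_rho(ex, ey, ez):
    """Linear inversion: rho = (I + <X> X + <Y> Y + <Z> Z) / 2."""
    return [[(1 + ez) / 2, (ex - 1j * ey) / 2],
            [(ex + 1j * ey) / 2, (1 - ez) / 2]]

def qubit_fidelity(r, s):
    """Fidelity between two single-qubit states given by their Bloch vectors."""
    dot = sum(a * b for a, b in zip(r, s))
    # Shot noise can push a linearly inverted Bloch vector slightly outside
    # the unit sphere, so the purity terms are clamped at zero.
    pr = max(0.0, 1.0 - sum(a * a for a in r))
    ps = max(0.0, 1.0 - sum(b * b for b in s))
    return 0.5 * (1.0 + dot + math.sqrt(pr * ps))

# Hypothetical (outcome 0, outcome 1) counts for each measurement basis.
counts = {"X": (498, 502), "Y": (505, 495), "Z": (960, 40)}
bloch_hw = tuple(expectation(*counts[b]) for b in ("X", "Y", "Z"))
rho_hw = reconstruct_rho(*bloch_hw)

bloch_sim = (0.0, 0.0, 1.0)        # placeholder simulated state |0>
print("Bloch vector:", bloch_hw)
print("fidelity with simulated state:", qubit_fidelity(bloch_hw, bloch_sim))
```

The same Bloch vector serves both purposes named in the text: it is the input to the fidelity and the point plotted on the Bloch sphere.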