A Modified Depolarization Approach for Efficient Quantum Machine Learning

Quantum Computing in the Noisy Intermediate-Scale Quantum (NISQ) era has shown promising applications in machine learning, optimization, and cryptography. Despite the progress, challenges persist due to system noise, errors, and decoherence that complicate the simulation of quantum systems. The depolarization channel is a standard tool for simulating a quantum system's noise. However, modeling such noise for practical applications is computationally expensive when we have limited hardware resources, as is the case in the NISQ era. We propose a modified representation for a single-qubit depolarization channel with two Kraus operators based only on X and Z Pauli matrices. Our approach reduces the computational complexity from six to four matrix multiplications per execution of a channel. Experiments on a Quantum Machine Learning (QML) model on the Iris dataset across various circuit depths and depolarization rates validate that our approach maintains the model's accuracy while improving efficiency. This simplified noise model enables more scalable simulations of quantum circuits under depolarization, advancing capabilities in the NISQ era.


I. INTRODUCTION
Quantum computing has seen significant progress in recent years, with the development of quantum algorithms for a variety of applications, including machine learning [1]-[6], optimization [7]-[11], and cryptography [12]-[15]. However, the development of quantum algorithms is still in its infancy, and many of the algorithms that have been developed are not yet ready for practical use [16], [17]. Due to the susceptibility of NISQ device operations to errors and decoherence [18], simulating quantum systems remains a major challenge in developing quantum algorithms [17].
In the NISQ era, system noise is not merely a nuisance to be minimized but a fundamental force shaping the field of QML research. Interestingly, a considerable number of works have chosen to regard noise not as a challenge but as an opportunity to advance their research. Studies have shown that quantum learning of n-bit parity functions remains remarkably efficient under depolarizing noise, a testament to the inherent resilience of quantum algorithms compared to their classical counterparts [19]. This early work demonstrated the potential for quantum algorithms to maintain a learning advantage even in noisy conditions. While traditionally viewed as a detrimental factor to quantum computation, depolarization noise under certain conditions can enhance the robustness and functionality of quantum learning algorithms against adversarial attacks [20]-[24]. This counterintuitive finding highlights the potential of noise to endow quantum models with robustness against malicious attempts to manipulate the model's outputs.
However, harnessing the power of noise as a training tool requires careful consideration. The effectiveness of adversarial training techniques, for example, hinges on the assumption that the test attack and the training attack employ the same methods to generate adversarial examples. In real-world scenarios where attackers may employ diverse and unknown strategies, this advantage is not guaranteed [25], [26]. Therefore, deriving robust guarantees against worst-case scenarios remains crucial for building truly secure and resilient quantum learning algorithms.
The challenges posed by noise extend beyond algorithm design, impacting the very foundations of QML. The inherent noise in NISQ machines also presents significant challenges to the learning capabilities of Quantum Neural Networks [18]. The presence of system noise can significantly diminish the quantum kernel advantage [27], raising concerns about the viability of quantum kernel methods [28]. Additionally, calculating numerical gradients on noisy qubits presents a delicate balancing act: reducing the step size to improve accuracy can obscure subtle differences in the cost function for nearby parameter values [29]. This complexity necessitates a deeper exploration into controlled noise simulations, such as the depolarization channel, to better understand and counteract these effects.
In the worst-case scenario, we can use the depolarization channel to simulate the quantum system's noise [28]. However, executing depolarizing noise in a controlled manner on quantum hardware presents both critical challenges and intriguing opportunities for advancing our understanding and mitigation of this noise model. One of the primary challenges in executing depolarizing noise lies in its inherently probabilistic nature. Depolarization introduces errors onto the quantum state with a certain probability, often modeled by Kraus operators [30]. Implementing such probabilistic errors precisely on hardware requires sophisticated control techniques and careful calibration procedures. Inaccurate noise injection can lead to deviations from the expected noise model, compromising the validity of subsequent experiments and analyses. To navigate these challenges and unlock the full potential of QML, the development of robust error correction techniques is paramount [31]. Techniques like surface codes [32] and stabilizer codes [33] offer promising avenues for mitigating the detrimental effects of noise and safeguarding the integrity of quantum computations. Using noise-estimate circuits combined with other error correction techniques, we can estimate and correct such noise even in circuits with extensive CNOT gates [34]. We can further enhance the accuracy of error mitigation using multi-exponential error mitigation techniques [35]. Although these techniques may model depolarizing noise accurately, their computational cost can be prohibitive in the NISQ era [36].
The conventional model for representing depolarization noise employs the three Pauli matrices (X, Y, and Z) to capture isotropic noise processes affecting quantum states [30]. While comprehensive, this standard approach requires three Kraus operators, one for each Pauli matrix, resulting in six total matrix multiplications to simulate the noise channel. As discussed in [17], this mathematical formalism introduces significant computational overhead, particularly in near-term systems where hardware resources are limited.
To address these challenges, we propose a modified approach for single-qubit depolarization utilizing only two Kraus operators and the X and Z Pauli matrices. This reduced model not only simplifies the mathematical representation but also decreases the computational complexity from six to four matrix multiplications for each noise channel execution. Such an approach provides a more efficient means of simulating depolarization in resource-constrained quantum hardware, an essential capability in the NISQ era where computational resources are scarce. By developing simplified yet representative noise models, our work aims to enable more efficient and scalable approaches to simulating and correcting depolarization noise in deep quantum circuits.
The rest of the paper is organized as follows. Section II provides background on the standard depolarizing channel and derives the proposed modified channel representation. Section III experimentally analyzes the modified channel on a QML task on the Iris dataset. Section IV discusses the implications of our method, and Section V concludes with a summary of our contributions and an outlook on future research directions.

II. DERIVATION
A. Standard Expression of the Depolarizing Channel
The depolarizing channel represents a quantum noise process in which the qubit state is replaced by a completely mixed state with a certain probability [30]. The standard expression of the depolarizing channel $\mathcal{E}$ for a single qubit is given by:
$$\mathcal{E}(\rho) = (1 - p)\,\rho + \frac{p}{3}\left(X\rho X + Y\rho Y + Z\rho Z\right) \tag{1}$$
where ρ is the density matrix; p is the depolarization rate; and X, Y, and Z are the Pauli matrices for the X, Y, and Z quantum gates, respectively.
Eq. (1) implies that with probability (1 − p), the qubit state remains unchanged, and with probability p, it is subjected to an equal mixture of bit-flip, phase-flip, and combined bit- and phase-flip errors.
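We can represent (1) using Kraus operators as:
$$\mathcal{E}(\rho) = \sum_{i=0}^{3} K_i\,\rho\,K_i^{\dagger}, \qquad K_0 = \sqrt{1 - p}\,I,\quad K_1 = \sqrt{\tfrac{p}{3}}\,X,\quad K_2 = \sqrt{\tfrac{p}{3}}\,Y,\quad K_3 = \sqrt{\tfrac{p}{3}}\,Z \tag{2}$$
where I is the identity matrix. Having defined the standard depolarizing channel and associated Kraus operators, we will next derive an alternative representation of this channel.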
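As a quick numerical illustration of the standard channel, the following NumPy sketch evaluates the Pauli-sum form of (1) and the textbook Kraus form of the depolarizing channel and compares them. The function names (`standard_kraus`, `apply_kraus`, `depolarize_standard`) and the test state |+⟩⟨+| are illustrative choices, not taken from the paper's implementation.

```python
import numpy as np

# Identity and Pauli matrices
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def standard_kraus(p):
    """Textbook Kraus operators of the depolarizing channel with rate p."""
    return [
        np.sqrt(1 - p) * I2,
        np.sqrt(p / 3) * X,
        np.sqrt(p / 3) * Y,
        np.sqrt(p / 3) * Z,
    ]


def apply_kraus(rho, kraus_ops):
    """Apply a channel in Kraus form: sum_i K_i rho K_i^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)


def depolarize_standard(rho, p):
    """Direct evaluation of the Pauli-sum form of Eq. (1)."""
    return (1 - p) * rho + (p / 3) * (X @ rho @ X + Y @ rho @ Y + Z @ rho @ Z)


# Illustrative input state |+><+| and depolarization rate
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho_plus = np.outer(plus, plus.conj())
p = 0.1

out_kraus = apply_kraus(rho_plus, standard_kraus(p))
out_direct = depolarize_standard(rho_plus, p)
print(np.allclose(out_kraus, out_direct))
```

Each Pauli term costs two 2 × 2 matrix products, which is where the six multiplications quoted for the standard channel come from.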

B. Alternative Expression of the Depolarizing Channel
We introduce an alternative representation of the depolarization channel, characterized by fewer matrix multiplication operations, that uses only the X and Z Pauli matrices. We define this alternative representation of the depolarizing channel as:
$$\rho'_m = \left(1 - \frac{2p}{3}\right)\rho + \frac{2p}{3}\, Z\left((\rho X)^{T} X\right) Z \tag{3}$$
In this representation, the state is partly retained with a coefficient of $(1 - \frac{2p}{3})$ and partly subjected to a specific combination of Pauli X and Z operations with a coefficient of $\frac{2p}{3}$. This alternative expression is validated below to produce the same results as (1).
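Proof. Consider the Pauli matrices:
$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
Let us consider an arbitrary density matrix $\rho = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ for a single-qubit quantum state. Substituting ρ in (1) and (3), straightforward algebra gives in both cases:
$$\begin{pmatrix} \left(1 - \frac{2p}{3}\right)a + \frac{2p}{3}\,d & \left(1 - \frac{4p}{3}\right)b \\ \left(1 - \frac{4p}{3}\right)c & \frac{2p}{3}\,a + \left(1 - \frac{2p}{3}\right)d \end{pmatrix}$$
Hence, (1) and (3) are the same for a single qubit and for an arbitrary ρ.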
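The equality claimed in the proof can also be checked numerically. The sketch below compares the Pauli-sum form of (1) with the expression built from the terms $(1 - \frac{2p}{3})\rho$ and $\frac{2p}{3} Z((\rho X)^T X)Z$ on random density matrices. The helper names and the random sampling scheme are assumptions made for illustration; the superscript T is read as a plain transpose without complex conjugation.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def random_density_matrix(rng):
    """Random single-qubit density matrix (Hermitian, PSD, unit trace)."""
    a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = a @ a.conj().T
    return rho / np.trace(rho)


def depolarize_standard(rho, p):
    """Pauli-sum form of Eq. (1): six matrix products."""
    return (1 - p) * rho + (p / 3) * (X @ rho @ X + Y @ rho @ Y + Z @ rho @ Z)


def depolarize_modified(rho, p):
    """Form of Eq. (3): four matrix products; .T is a plain transpose."""
    return (1 - 2 * p / 3) * rho + (2 * p / 3) * (Z @ ((rho @ X).T @ X) @ Z)


rng = np.random.default_rng(seed=0)
max_diff = 0.0
for p in [0.0, 0.001, 0.01, 0.1, 0.5]:
    for _ in range(100):
        rho = random_density_matrix(rng)
        diff = np.abs(depolarize_standard(rho, p) - depolarize_modified(rho, p)).max()
        max_diff = max(max_diff, diff)

print(f"largest entrywise difference: {max_diff:.2e}")
```

The difference should sit at floating-point rounding level for every sampled state and rate.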
Next, we will define Kraus operators and prove their validity.
Theorem 2. The following Kraus operators for (3) are valid operators:
$$K_0 = \sqrt{1 - \frac{2p}{3}}\; I, \qquad K_1 = i\sqrt{\frac{2p}{3}}\; ZX$$
Proof. From (3), one can immediately see that the Kraus operator corresponding to the term $(1 - \frac{2p}{3})\rho$ is $K_0 = \sqrt{1 - \frac{2p}{3}}\, I$. Now let us consider the second term of (3), i.e., $\frac{2p}{3} Z((\rho X)^{T} X)Z$, which enables us to re-write without loss of generality the following:
Next, we want a Kraus operator $K_1$ such that $K_1 \rho K_1^{\dagger} = ZX\rho XZ$. Thus, intuitively, $K_1 = ZX$. Following the Kraus operator completeness constraint, we can write:
$$K_0^{\dagger} K_0 + K_1^{\dagger} K_1 = I \tag{5}$$
To satisfy (5), $K_1$ must have a magnitude of $\sqrt{\frac{2p}{3}}$. Therefore, $K_1 = \sqrt{\frac{2p}{3}}\, ZX$. We added the factor $i$ to correct a sign discrepancy while validating the operators, resulting in $K_1 = i\sqrt{\frac{2p}{3}}\, ZX$.
Validation of Kraus operators: Any valid set of Kraus operators satisfies the completeness property, that is, $\sum_i K_i^{\dagger} K_i = I$. Evaluating $K_i^{\dagger} K_i$ for each operator individually, we get $K_0^{\dagger} K_0 = (1 - \frac{2p}{3})\, I$ and $K_1^{\dagger} K_1 = \frac{2p}{3}\, I$, which sum to $I$. This proves that the Kraus operators proposed in Theorem 2 are valid Kraus operators.
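The completeness check in the proof is easy to reproduce numerically. The sketch below assumes the operators as read from the proof text, namely $K_0 = \sqrt{1 - 2p/3}\, I$ and $K_1 = i\sqrt{2p/3}\, ZX$. The function names are hypothetical, and the check covers only completeness and the stated action $K_1 \rho K_1^{\dagger} \propto ZX\rho XZ$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def modified_kraus(p):
    """Two-operator set, as read from the proof (an assumption of this sketch)."""
    k0 = np.sqrt(1 - 2 * p / 3) * I2
    k1 = 1j * np.sqrt(2 * p / 3) * (Z @ X)
    return k0, k1


def completeness_residual(kraus_ops):
    """Largest entry of |sum_i K_i^dagger K_i - I|; zero for a valid set."""
    total = sum(K.conj().T @ K for K in kraus_ops)
    return np.abs(total - I2).max()


rates = [0.0, 0.01, 0.1, 0.5]
residuals = {p: completeness_residual(modified_kraus(p)) for p in rates}

# Action of K_1 on an illustrative state |0><0|
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
p = 0.1
_, k1 = modified_kraus(p)
lhs = k1 @ rho0 @ k1.conj().T
rhs = (2 * p / 3) * (Z @ X @ rho0 @ X @ Z)
print(residuals, np.allclose(lhs, rhs))
```

The global phase $i$ cancels in $K_1 \rho K_1^{\dagger}$, so it affects neither the completeness sum nor the channel output.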
The above derivations show that the modified depolarization channel expression is equivalent to the standard equation. We further proved that the proposed Kraus operators for (3) are valid Kraus operators. The next step is to show that the matrix given by the modified channel is a valid density matrix. To do this, we need to prove that (3) is Hermitian, positive semidefinite, and has a unit trace.
Theorem 3. The matrix given by (3) is a valid density matrix.
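Proof. First, let us show that $\rho'_m$ is Hermitian.
A matrix is Hermitian if it equals its own conjugate transpose, $A = A^{\dagger}$.
To show that $\rho'_m$ is Hermitian, we calculate its conjugate transpose using its definition in terms of the Kraus operators, given by (6):
$$(\rho'_m)^{\dagger} = \left(K_0 \rho K_0^{\dagger} + K_1 \rho K_1^{\dagger}\right)^{\dagger} = K_0 \rho^{\dagger} K_0^{\dagger} + K_1 \rho^{\dagger} K_1^{\dagger}$$
Since ρ is Hermitian ($\rho^{\dagger} = \rho$), the right-hand side equals $K_0 \rho K_0^{\dagger} + K_1 \rho K_1^{\dagger} = \rho'_m$; hence $\rho'_m$ is Hermitian.
Second, let us show that $\rho'_m$ is positive semi-definite. Given the Kraus operators $K_0$ and $K_1$ and the density matrix $\rho'_m$, we want to show that for any vector v, the expectation $v^{\dagger} \rho'_m v \geq 0$. This expands to:
$$v^{\dagger} \rho'_m v = v^{\dagger} K_0 \rho K_0^{\dagger} v + v^{\dagger} K_1 \rho K_1^{\dagger} v$$
We can express each term as:
$$v^{\dagger} K_0 \rho K_0^{\dagger} v = w_0^{\dagger} \rho\, w_0, \qquad v^{\dagger} K_1 \rho K_1^{\dagger} v = w_1^{\dagger} \rho\, w_1$$
where $w_0 = K_0^{\dagger} v$ and $w_1 = K_1^{\dagger} v$ are vectors in the Hilbert space.
Since ρ is a positive semi-definite density matrix, we have $w^{\dagger} \rho\, w \geq 0$ for any vector w. Applying this to $w_0$ and $w_1$ gives $w_0^{\dagger} \rho\, w_0 \geq 0$ and $w_1^{\dagger} \rho\, w_1 \geq 0$. Therefore, the sum is also non-negative:
$$w_0^{\dagger} \rho\, w_0 + w_1^{\dagger} \rho\, w_1 \geq 0$$
which establishes that $\rho'_m$ is positive semi-definite.
Third, let us show that $\rho'_m$ has unit trace, i.e., $\mathrm{Tr}(\rho'_m) = 1$. The trace of $\rho'_m$ is given by:
$$\mathrm{Tr}(\rho'_m) = \mathrm{Tr}\left(K_0 \rho K_0^{\dagger}\right) + \mathrm{Tr}\left(K_1 \rho K_1^{\dagger}\right)$$
Using the cyclic property of the trace, we can rewrite this as:
$$\mathrm{Tr}(\rho'_m) = \mathrm{Tr}\left(K_0^{\dagger} K_0\, \rho\right) + \mathrm{Tr}\left(K_1^{\dagger} K_1\, \rho\right)$$
Computing $K_0^{\dagger} K_0$ and $K_1^{\dagger} K_1$, we get:
$$K_0^{\dagger} K_0 = \left(1 - \frac{2p}{3}\right) I, \qquad K_1^{\dagger} K_1 = \frac{2p}{3}\, I$$
Thus, the trace of $\rho'_m$ simplifies to:
$$\mathrm{Tr}(\rho'_m) = \left(1 - \frac{2p}{3}\right)\mathrm{Tr}(\rho) + \frac{2p}{3}\,\mathrm{Tr}(\rho) = \mathrm{Tr}(\rho)$$
As ρ is a density matrix with unit trace, $\mathrm{Tr}(\rho) = 1$, this leads to $\mathrm{Tr}(\rho'_m) = 1$. Therefore, $\rho'_m$ maintains the unit trace property, as required for any density matrix.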
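The three properties in the proof (Hermiticity, positive semi-definiteness, unit trace) can be tested numerically on a Kraus-form output. This is a minimal sketch: the helper `is_density_matrix`, its tolerance, and the form $K_1 = i\sqrt{2p/3}\, ZX$ (read from the preceding proof text) are assumptions for illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def is_density_matrix(rho, tol=1e-10):
    """Check Hermiticity, positive semi-definiteness, and unit trace."""
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    # eigvalsh expects a Hermitian input, so symmetrize before calling it
    eigenvalues = np.linalg.eigvalsh((rho + rho.conj().T) / 2)
    psd = eigenvalues.min() >= -tol
    unit_trace = abs(np.trace(rho) - 1) <= tol
    return bool(hermitian and psd and unit_trace)


def kraus_output(rho, p):
    """K_0 rho K_0^dagger + K_1 rho K_1^dagger with the assumed operators."""
    k0 = np.sqrt(1 - 2 * p / 3) * I2
    k1 = 1j * np.sqrt(2 * p / 3) * (Z @ X)
    return k0 @ rho @ k0.conj().T + k1 @ rho @ k1.conj().T


rng = np.random.default_rng(seed=1)
results = []
for p in [0.0, 0.01, 0.1, 0.5]:
    a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = a @ a.conj().T
    rho = rho / np.trace(rho)          # random valid input state
    results.append(is_density_matrix(kraus_output(rho, p)))

print(results)
```

The same helper rejects matrices that fail any one of the three conditions, for example a unit-trace diagonal matrix with a negative entry.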
Next, we derive how the state in (3) evolves when the depolarization channel is applied m times to a quantum state ρ, which leads us to the following lemma.
Lemma 1. When a depolarization channel with depolarizing rate p is applied to a quantum state ρ up to m times, the resulting quantum state is given, up to first order in p, by (7), and, for an observable O, the expectation value is given by (8).
In the following section, we verify that the difference between the results obtained from (8) and the standard channel simulations is negligible. We will demonstrate through experimental evidence that (3), (7), and (8) can be effectively used for training machine learning models.
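To make the first-order statement concrete, the sketch below compares repeated application of the standard channel with a linearized state. The linearized form used here, $(1 - \frac{4mp}{3})\rho + \frac{2mp}{3} I$, is an assumption of this sketch: it follows from expanding $(1 - \frac{4p}{3})^m$ to first order in p for a unit-trace ρ, and it is not necessarily written the same way as the lemma's own expression. All function names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def depolarize_standard(rho, p):
    """Pauli-sum form of the single-qubit depolarizing channel."""
    return (1 - p) * rho + (p / 3) * (X @ rho @ X + Y @ rho @ Y + Z @ rho @ Z)


def apply_m_times(rho, p, m):
    """Exact state after m successive applications of the channel."""
    for _ in range(m):
        rho = depolarize_standard(rho, p)
    return rho


def first_order_state(rho, p, m):
    """Assumed linearization in p (valid only for small m * p)."""
    return (1 - 4 * m * p / 3) * rho + (2 * m * p / 3) * I2


def expectation(observable, rho):
    return np.trace(observable @ rho).real


def expectation_gap(p, m):
    """|<Z>_exact - <Z>_first-order| for the input state |0><0|."""
    rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
    exact = expectation(Z, apply_m_times(rho0, p, m))
    approx = expectation(Z, first_order_state(rho0, p, m))
    return abs(exact - approx)


for p in [0.001, 0.01, 0.1]:
    for m in [1, 5, 15]:
        print(f"p={p:<6} m={m:<3} gap={expectation_gap(p, m):.2e}")
```

The gap grows roughly like $(mp)^2$, so the linearization is only trustworthy when the product of depth and rate is small.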

III. EXPERIMENT
We start this section by showing that the results from (8) are consistent with simulation results of (1) for multiple values of m and p. Later, we empirically show that a depolarization rate up to a threshold w does not affect the performance of the single-qubit QML model on the Iris dataset. For the scope of this experiment, we consider binary classification. We used the first (Setosa) and third (Virginica) flower classes and only the first two features, sepal length and sepal width. We conducted the experiment in Python with PennyLane for quantum circuit simulation and quantum computation. Unless otherwise mentioned, we used (3) for the depolarization channel and (8) for the depolarization channel applied m times.

A. Quantum Circuit behavior analysis under Depolarization channel up to m times
Eq. (8) posits that when a depolarization channel with rate p is applied m times to a quantum state ρ, the resultant state $\rho'^{\,m}_{m}$ follows a predictable transformation that is linear in p to first order for small values of p. This suggests that despite the iterative application of noise, the overall system's behavior under the depolarization channel can be approximated linearly. We conduct a series of simulations to substantiate this theory.
Fig. 1 presents a scatter plot of the expectation value differences between (7) and the standard depolarization model as a function of both p and m. There is a minimal deviation between the expectation values of the standard and modified channels for low depolarization rates across all gate counts. This alignment implies that the modified equation retains fidelity to the standard model's predictions in the low-noise regime. The plots exhibit a uniform trend where, for small values of p, the difference in expectation values is negligible across all values of m. This negligible difference remains consistent as the number of gates increases, emphasizing the robustness of the modified model. Building on these results, we analyze the modified channel's performance in QML by training the QML model on the Iris dataset under the modified depolarization channel. We map the classical data into quantum Hilbert space via a feature map.

B. Data Encoding
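We start by initializing the qubit in the computational basis state $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. Let ϕ : x → ϕ(x) be a mapping from input space X to a quantum Hilbert space R. Let us define ϕ(x) = RX(x_1)RY(x_0), where x_0 and x_1 are features of the input vector, and
$$RX(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \\ -i\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}, \qquad RY(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}$$
are rotational single-qubit quantum gates. We chose the angle encoding scheme because it linearly separates input data in the Bloch sphere better than amplitude encoding, as shown in Fig. 2. Thus, for an input vector x, the data-encoded state is defined as:
$$|\phi(x)\rangle = RX(x_1)\, RY(x_0)\, |0\rangle$$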
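As an illustration of the encoding map ϕ(x) = RX(x_1)RY(x_0) acting on |0⟩, the following NumPy sketch builds the encoded state and its Bloch vector. It assumes the usual rotation-gate convention $R_P(\theta) = e^{-i\theta P/2}$ and feeds raw feature values as angles; whether the paper rescales features before encoding is not stated, so the sample input is purely illustrative.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rx(theta):
    """Rotation about the x-axis, exp(-i * theta * X / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=complex)


def ry(theta):
    """Rotation about the y-axis, exp(-i * theta * Y / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def encode(x):
    """Angle encoding: RX(x[1]) RY(x[0]) applied to |0>."""
    ket0 = np.array([1, 0], dtype=complex)
    return rx(x[1]) @ ry(x[0]) @ ket0


def bloch_vector(state):
    """Bloch vector (<X>, <Y>, <Z>) of a pure single-qubit state."""
    rho = np.outer(state, state.conj())
    return np.array([np.trace(P @ rho).real for P in (X, Y, Z)])


# Illustrative two-feature input (sepal length, sepal width), used unscaled
sample = np.array([5.1, 3.5])
state = encode(sample)
print(state, bloch_vector(state))
```

Because the encoding is unitary, every encoded sample lies on the surface of the Bloch sphere; only the subsequent noise channel pulls states toward the center.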

C. Variational Layers
Similar to the encoding scheme, we applied a series of variational gates (RY and RX with parameters) whose parameters θ can be optimized during training. For N trainable parameters θ, we define the operation of the variational layers U as:
$$U(\theta) = \prod_{i=1}^{N} \mathrm{Gate}_i(\theta_i)$$
where Gate_i(θ_i) is either an RY or RX gate with parameter θ_i. Thus, we can define a variational circuit as:
$$|\Psi\rangle = U(\theta)\, \phi(x)\, |0\rangle$$
Let ρ = |Ψ⟩⟨Ψ| be a density matrix. The system undergoes evolution through a depolarization channel. This channel, denoted as E(ρ), transforms the state ρ of the qubit by mixing it towards a maximally mixed state as the depolarization rate p increases, as given by equations (1) and (3). Let this state be ρ′. Now we measure the Pauli-Z observable, $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, to get a quantum machine learning model defined by the expectation value $\mathrm{Tr}(Z\rho')$. In the following section, we train this QML model on the Iris dataset under a modified depolarization channel and present its result.
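The model described above can be sketched end to end in NumPy. Several details here are assumptions rather than the paper's stated design: the variational gates alternate RY, RX, RY, ...; the depolarization channel is applied once after every variational gate; and the function name `qml_model` is hypothetical. The channel uses the X/Z-only expression with the plain transpose.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rx(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=complex)


def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def depolarize_modified(rho, p):
    """X/Z-only channel expression; .T is a plain transpose."""
    return (1 - 2 * p / 3) * rho + (2 * p / 3) * (Z @ ((rho @ X).T @ X) @ Z)


def qml_model(x, thetas, p):
    """Return Tr(Z rho') for input x, parameters thetas, and noise rate p."""
    ket0 = np.array([1, 0], dtype=complex)
    psi = rx(x[1]) @ ry(x[0]) @ ket0              # data encoding
    rho = np.outer(psi, psi.conj())
    for i, theta in enumerate(thetas):
        gate = ry(theta) if i % 2 == 0 else rx(theta)   # assumed alternation
        rho = gate @ rho @ gate.conj().T
        rho = depolarize_modified(rho, p)               # assumed noise placement
    return np.trace(Z @ rho).real


x = np.array([5.1, 3.5])                 # illustrative input
thetas = np.array([0.3, -1.2, 0.8])      # illustrative parameters
clean = qml_model(x, thetas, p=0.0)
noisy = qml_model(x, thetas, p=0.05)
print(clean, noisy)
```

Since the depolarizing channel shrinks the Bloch vector uniformly, each noisy layer in this sketch scales the output by $(1 - \frac{4p}{3})$, so the noisy prediction keeps the sign of the noise-free one whenever $p < 0.75$.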

D. Training
We trained the model for various values of m = [1, 3, 5, 10, 15] and p = [0.0, 0.001, 0.005, 0.01, 0.05, 0.08, 0.1, 0.5]. The initial learnable parameters were chosen randomly. We used the Adam optimizer with a step size of 0.1, the square loss as the loss function, and the parameter-shift rule to compute the gradients. We trained the model for 30 epochs for each combination of p and m.
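The parameter-shift rule mentioned above can be illustrated on a small noisy single-qubit model. The sketch below is not the paper's training code: the circuit layout (alternating RY/RX gates, noise after each gate), the helper names, and the comparison against a central finite difference are all assumptions chosen for illustration. The rule applies here because each trainable gate has the form $e^{-i\theta P/2}$ and the noise channel does not depend on θ.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rot(pauli, theta):
    """exp(-i * theta * pauli / 2) for a Pauli matrix."""
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * pauli


def depolarize(rho, p):
    return (1 - p) * rho + (p / 3) * (X @ rho @ X + Y @ rho @ Y + Z @ rho @ Z)


def model(x, thetas, p):
    """Tr(Z rho') for an assumed noisy single-qubit circuit."""
    psi = rot(X, x[1]) @ rot(Y, x[0]) @ np.array([1, 0], dtype=complex)
    rho = np.outer(psi, psi.conj())
    for i, theta in enumerate(thetas):
        gate = rot(Y, theta) if i % 2 == 0 else rot(X, theta)
        rho = depolarize(gate @ rho @ gate.conj().T, p)
    return np.trace(Z @ rho).real


def parameter_shift_grad(x, thetas, p):
    """Gradient via shifts of +/- pi/2 on one parameter at a time."""
    grad = np.zeros_like(thetas)
    for i in range(len(thetas)):
        shift = np.zeros_like(thetas)
        shift[i] = np.pi / 2
        grad[i] = 0.5 * (model(x, thetas + shift, p) - model(x, thetas - shift, p))
    return grad


def finite_difference_grad(x, thetas, p, eps=1e-6):
    """Central finite difference, used only as a cross-check."""
    grad = np.zeros_like(thetas)
    for i in range(len(thetas)):
        step = np.zeros_like(thetas)
        step[i] = eps
        grad[i] = (model(x, thetas + step, p) - model(x, thetas - step, p)) / (2 * eps)
    return grad


x = np.array([5.1, 3.5])
thetas = np.array([0.3, -1.2, 0.8])
ps_grad = parameter_shift_grad(x, thetas, p=0.05)
fd_grad = finite_difference_grad(x, thetas, p=0.05)
print(ps_grad, fd_grad)
```

For the square loss $(f - y)^2$, the chain rule turns these model gradients into loss gradients, $2(f - y)\,\partial f/\partial\theta_i$, which an optimizer such as Adam can then consume.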
The pictorial representation of the depolarization noise at each layer of the circuit and its influence on the QML model training is presented in Fig. 3. The decision boundaries presented in the right column of the figure illustrate the model's behavior across varying levels of noise and circuit depths. We observe that with an increase in circuit depth, while the expressivity of the model may improve, there is a concurrent increase in its vulnerability to noise, which adversely affects the quality of the decision boundary. Notably, the model's performance appears to be robust for circuit depths ranging from 1 to 5. The model is capable of creating a decision boundary at a depth of 5, even when the depolarization rate is 0.1, or at a depth of 15 for a 0.05 error rate. This consistent performance of the model at lower noise levels across all circuit depths indicates that QML models can be robust to noise up to a certain threshold.
However, we observe that the model has difficulty making accurate predictions, even with one or three trainable gates when the noise level is 0.5 or higher.This indicates a possible threshold for both circuit depth and noise level that maximizes QML model performance.In our experiments, reaching a plateau at a depth of five suggests that the model's capacity for feature representation may be sufficiently saturated by a depth of five.Beyond this point, we observed a decline in precision, F1 score, and accuracy.Hence, the results indicate that while QML models exhibit robustness in lower noise environments and at shallower circuit depths, their performance diminishes with increased circuit complexity and higher noise levels.
In the next section, we discuss the advantages and limitations of a modified depolarization channel and discuss the trade-off between model complexity and noise resilience.

IV. DISCUSSIONS
We argue that (3) and (8) provide a computational advantage for simulating depolarization noise. As long as p is small, similar to (8), we can define the density matrix of a system under the depolarization channel applied up to m times for (1) as in (13), and the expectation value of an observable O as in (14). We conducted a study to analyze the computational efficiencies of the two depolarizing channel models, given by (1) and (3), applied to quantum states. We focused on the matrix multiplication overhead and the operational requirements, specifically the use of Pauli gates.
Our findings suggest that the standard depolarizing channel, given by (1) and (13), requires a significantly higher computational investment. It requires six matrix multiplications for state evolution and an additional four for computing expectation values, as shown in (14), totaling ten multiplications. This approach also demands the use of all three Pauli gates (X, Y, Z), which can introduce complexity in gate operations and potential errors in practical quantum computing environments.
On the other hand, the modified depolarizing channel, given by (3) and (7), presents a more efficient alternative. It requires only four matrix multiplications for state evolution and two additional ones for expectation value computations, according to (8), totaling six multiplications. The modified model also eliminates the need for direct Pauli-Y gate applications, thereby simplifying the operational framework.
Our analysis highlights the modified channel's potential to reduce computational overhead and operational complexity. In practical scenarios, such as cloud-based quantum computing environments, users often face lengthy queues, leading to extended wait times, sometimes spanning hours, for executing a single operation. By reducing the number of matrix multiplications, our approach reduces the computational load by at least 16 scalar multiplications and 8 scalar additions per channel execution (each 2 × 2 matrix product takes 8 multiplications and 4 additions, and the modified channel saves two such products). Considering that a typical user might only manage to perform one operation per hour on real quantum hardware due to these queuing constraints, our method could result in substantial time savings. This makes it well suited to quantum algorithms where efficiency and gate operation minimization are important.
We further argue that the modified channel can be extended to train the QML model.The results on QML model behavior demonstrate the nuanced interplay between circuit depth, noise levels, and the model's performance.The results suggest that there exists a level of quantum circuit complexity where the representational power of the model is optimal.However, as we extend the circuit depth beyond this optimal point, we observe diminishing returns in model performance, highlighting a critical trade-off between the expressiveness of deeper quantum circuits and their susceptibility to noise.

V. CONCLUSION
Our work presents the computational benefits of using a modified depolarizing channel, suggesting a more resource-efficient alternative to the standard model. The modified channel requires fewer matrix multiplications and simplifies operations by avoiding the use of the Pauli-Y gate, thereby providing a practical approach for NISQ-era quantum algorithms. When applied to the training of a QML model on the Iris dataset, this model demonstrates an optimal balance between circuit depth and noise resilience, with performance metrics indicating a peak in representational power at intermediate circuit depths. Beyond this threshold, the model's performance weakens, emphasizing the delicate interplay between circuit depth, noise rate, and dataset complexity. The findings of our study not only advance the understanding of depolarizing noise in quantum systems but also guide the development of QML models, ensuring they harness the computational advantage. Future work includes extending the application of the modified depolarization channel to multi-qubit systems and exploring its effectiveness in training on complex and higher-dimensional datasets.

Fig. 1: Scatter plots of the difference between the expectation values of the standard and modified depolarization channels. Each channel was applied to quantum circuits with 3, 8, and 15 single-qubit gates: plot (a) shows the result for 3 gates, while plots (b) and (c) show the results for 8 and 15 gates, respectively. The x-axis of each plot gives the number of times m the noisy channel was applied, while the y-axis gives the depolarization rate.

Fig. 3: Experimental results for a QML model on the Iris dataset, with training dynamics in the left column and decision boundary evolution in the right column, for varied noise levels (p) and the depolarization channel applied up to (m) times. The decision boundaries are plotted for depths of 1, 3, 5, 10, and 15, at noise levels ranging from 0.0 to 0.5. The rows are ordered by increasing circuit depth. Accuracy and loss graphs display the model's performance over 30 epochs and highlight the impact of noise rate and circuit depth on learning efficacy.