Article

QU-Net: Quantum-Enhanced U-Net for Self-Supervised Embedding and Classification of Skin Cancer Images

by Khidhr Halab 1, Nabil Marzoug 1, Othmane El Meslouhi 2, Zouhair Elamrani Abou Elassad 3 and Moulay A. Akhloufi 1,*

1 Perception, Robotics, and Intelligent Machines (PRIME) Laboratory, Department of Computer Science, Université de Moncton, Moncton, NB E1A 3E9, Canada
2 LMPEQ Laboratory, National School of Applied Sciences, Cadi Ayyad University, Safi 46000, Morocco
3 LISI Laboratory, Faculty of Sciences Semlalia, Cadi Ayyad University, Marrakech 40000, Morocco
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2026, 10(1), 12; https://doi.org/10.3390/bdcc10010012
Submission received: 20 October 2025 / Revised: 7 December 2025 / Accepted: 24 December 2025 / Published: 30 December 2025

Abstract

Background: Quantum Machine Learning (QML) has attracted significant attention in recent years. With quantum computing's achievements in computationally costly domains, exploring its potential to improve the performance and efficiency of deep learning models in medical imaging has become a promising field of research. Methods: We investigate QML in healthcare by developing a novel quantum-enhanced U-Net (QU-Net). We experiment with six configurations of parameterized quantum circuits, varying the encoding technique (amplitude vs. angle), depth, and entanglement. Using the ISIC-2017 skin cancer dataset, we compare QU-Net with a classical U-Net on self-supervised image reconstruction and on binary classification of benign and malignant skin cancer, where we combine bottleneck embeddings with patient metadata. Results: Our findings show that amplitude encoding stabilizes training, whereas angle encoding introduces fluctuations. The best performance is obtained with amplitude encoding and one layer. For reconstruction, QU-Net with entanglement converges faster (25 epochs vs. 44) with a lower mean squared error per image (0.00015 vs. 0.00017) on unseen data. For classification, QU-Net embeddings with no entanglement reach a 79.03% F1-score compared with 74.14% for U-Net, despite compressing images to a smaller latent space (7 vs. 128 dimensions). Conclusions: These results demonstrate that the quantum layer enhances U-Net's expressive power with efficient data embedding.

1. Introduction

Quantum machine learning (QML) [1] is an emerging field that uses quantum computing's capabilities to enhance traditional machine learning algorithms. With quantum computing's success in achieving exponential speed-ups on several classical tasks [2,3], a natural question is whether similar gains can be achieved to accelerate machine learning algorithms. Recently, the field has witnessed a surge of research investigating this potential quantum advantage.
However, while many papers report that adding quantum components improves results, it remains unclear whether such improvements generalize to other tasks. The reported quantum advantage may also be influenced by the commercialization of quantum technologies. Moreover, as shown by [4], recent research in the field appears biased toward the assumption of eventual quantum supremacy, often aiming to confirm this assumption rather than conducting an objective investigation. This raises concerns about the rigor and impartiality of current studies, potentially distorting the understanding of Quantum Machine Learning's true capabilities and limitations.
Skin cancer is among the most common types of cancer, in which skin cells grow abnormally. It occurs most often on parts of the skin that are exposed to ultraviolet (UV) radiation, whether from the sun or artificial sources such as tanning beds. Skin cancer is caused by DNA damage in the cell nucleus, leading to mutations that result in uncontrolled growth. There are three main types of skin cancer: basal cell carcinoma (BCC), squamous cell carcinoma (SCC), and melanoma, which is the least common but the most dangerous type, as it spreads rapidly to other parts of the body, making it particularly lethal if not detected and treated early [5].
In this work, we propose a quantum-enhanced U-Net (QU-Net) that integrates a quantum layer as its bottleneck. QU-Net projects the image data from the ISIC-2017 dataset [6] into a lower-dimensional expressive space using a self-supervised approach [7,8], by training the model to reconstruct the images. The new embeddings are the measurements of qubit states manipulated by a parameterized quantum circuit (PQC) in the quantum layer, and we experiment with several PQC configurations that vary in encoding strategy (amplitude vs. angle), circuit depth, and entanglement structure. To ensure fairness across all comparisons, all models are trained under identical conditions, using the same optimizer, GPU hardware, hyperparameters, and framework implementation. The quantum circuits in QU-Net are executed on the PennyLane default.qubit simulator. To evaluate the expressiveness of the new embeddings, we compare them to the embeddings extracted from a classical U-Net. In theory, if the new embeddings preserve the information in the lesion images, we should obtain comparable or higher classification metrics when distinguishing between benign and malignant cases. If this is the case, it would indicate that a significant amount of the original information contained in the lesion images has been compressed into a lower-dimensional representation while maintaining comparable classification performance with reduced computational cost during inference.
Typically, an end-to-end approach obtains superior inference performance, as it enables the model to establish a direct mapping function between input data and output predictions. However, the ISIC-2017 dataset includes additional patient metadata in a structured format (CSV files). To incorporate this multimodal information [9], we adopt a two-phase prediction pipeline. In the first phase, image features are extracted through the self-supervised learning process. In the second phase, the quantum embeddings [10] obtained from the PQC are concatenated with the structured patient metadata. This approach based on feature representation learning [11] allows the usage of multiple data modalities in the prediction process.
Our experiments show that amplitude encoding stabilizes training, while angle encoding often introduces fluctuations. Among all configurations, the best performance is obtained with amplitude encoding and a single rotation layer. For reconstruction, QU-Net with entanglement converges faster than U-Net (25 epochs vs. 44) and reaches a lower MSE (0.00015 vs. 0.00017). For classification, QU-Net with no entanglement achieves an F1-score of 79.03% compared with 74.14% for U-Net, despite compressing images into a smaller latent space (7 vs. 128).
These results suggest that incorporating a quantum layer as the bottleneck enhances the U-Net with faster convergence and improved generalization. It also allows for compressing the data into a more efficient and expressive embedding.

2. Related Works

Classical deep learning approaches have demonstrated significant progress in automated skin cancer diagnosis, though challenges in data efficiency, model interpretability, and generalization remain active areas of research [12].
For the ISIC 2017 dataset, numerous studies employed deep learning architectures for skin lesion classification. For instance, ref. [13] proposed an ensemble framework combining a modified Inception-ResNet-v2 with EfficientNet-B4, enhanced with soft-attention, and achieved an F1-score of 79%. Ref. [14] introduced a ViT model with an optimized weighted cross-entropy loss function to address the dataset imbalance, adding a regularization term to encourage the model to focus on the lesion area. They compared their ViT with well-established vision models, reporting an F1-score of 88.4% for their improved ViT, compared with 85.2%, 83.5%, 86%, and 87% for ResNet50, VGG19, ResNeXt, and a standard ViT, respectively. Other approaches rely on transfer learning, such as the modified VGG-16 model by ref. [15], which obtained an F1-score of 71% on ISIC-2017.
These results are summarized in Table 1.
Recently, many machine learning applications in healthcare have integrated quantum components into their model design, mostly in the form of a dressed quantum circuit [16]. For instance, in an effort to improve personalized medical treatment, ref. [17] predicted the drug response of cancer patients by employing a graph convolutional network (GCN) to extract drug features and a convolutional neural network (CNN) to extract patient cell-line features. The combined features are fed to a deep quantum neural network (QNN), where predictions are made. The proposed model outperforms its classical counterpart based on a fully connected layer (FCL) by a 15% margin. Ref. [18] adopted hybrid quantum-classical transfer learning for the detection of Alzheimer's disease: a pretrained ResNet34 extracts 512 features from brain MRI images, which are classified using a quantum support vector machine (QSVM) [19], obtaining an accuracy of 97.2% compared with 92.2% for its classical counterpart. Also on brain MRI images, a quantum convolutional neural network (QCNN) was tested by [20] on the binary classification of instances from different datasets; the QCNN obtained a 98.72% accuracy score, outperforming the classical CNN's 94.23%. In another work, ref. [21] used a multimodal approach in which the different types of data transmitted via the internet of medical things (IoMT) to the cloud are combined in the prediction task. A QCNN is employed to extract important features from medical images, which are fused with features from other modalities. The combined information is used to train a variational quantum circuit (VQC). The proposed strategy was tested on two tasks, breast cancer diagnosis and COVID-19 diagnosis, where the model obtained accuracies of 97.07% and 97.61%, respectively. In their hybrid quantum model, ref. [22] integrated what they call a modified hardware-efficient ansatz (MHEA) with the goal of classifying multi-class cardiac pathologies. They also adopted a multimodal approach in which discriminative features extracted from heart MRI images are combined with patient and clinical data. They benchmarked their proposed model against other classical and quantum models, including a well-established ansatz such as the variational quantum eigensolver (VQE). The authors selected the best features using recursive feature elimination, keeping only 30 features. Tests were conducted using different simulators and optimization techniques. Their proposed model, incorporating MHEA, outperformed all its counterparts with a maximum average improvement of 7.77% in accuracy, while also speeding up the testing process by up to 60%. For the classification of thyroid cancers, ref. [23] employed random quantum circuits in a QCNN to extract discriminative features from thyroid ultrasound images. The proposed QCNN obtained an accuracy of 97.63%, outperforming the classical models, which only reached 93.87%. For skin cancer, using the HAM10000 dataset [24], ref. [25] conducted extensive benchmarking comparing various configurations of qubit rotations and Pauli gates in their proposed quanvolutional neural network (QuanvNN). Additionally, they evaluated the performance of several pretrained models, both in conjunction with a quantum support vector machine (QSVM) and independently.
They found that combining the Y-axis qubit rotation RY with the Pauli-Z gate in the quantum convolutional layer produced the best classification accuracy of 82.86%, outperforming the best classical result, a pretrained MobileNet with an accuracy of 73.42%.
These results are summarized in Table 2.
To investigate quantum advantage, ref. [4] conducted a large-scale study that tested 12 quantum models on 6 binary classification tasks and compared their efficiency with out-of-the-box classical models. Their findings show that vanilla classical models outperform the quantum models. Moreover, they show that removing entanglement from the design of these quantum models improves their performance, suggesting that quantumness may not always lead to improvement. Ref. [4] also shows that, out of 55 publicly available papers containing the terms quantum machine learning and outperform, 40% claim that quantum models outperform classical models, and 50% claim that improving a quantum method, such as the optimizer [26,27], the pre-training strategy [28], or the ansatz used [29,30], leads to better performance than the original models. Only 4% [31,32,33] found that the quantum models did not outperform the classical ones. These attribute the result to noise levels in current quantum hardware, suggesting that hardware limitations, rather than the quantum models themselves, were responsible for the lack of superior performance.

3. Proposed Method

In this section, we provide details about the proposed approach and methodology for QU-Net.

3.1. Data

The ISIC 2017 [6] dataset, released by the International Skin Imaging Collaboration (ISIC), is an open-source collection of high pixel-density dermoscopic images aimed at advancing melanoma detection through automated image analysis. All images in this dataset are dermoscopic RGB photographs captured using standardized optical devices. It contains the following:
  • 2750 skin cancer images of different sizes;
  • Classification labels;
  • Segmentation masks;
  • Metadata (age and sex).
The dataset comprises 2000 training images, 150 validation images and 600 test images, each accompanied by corresponding ground truth data and additional information about the patient. It is structured to support two primary tasks: lesion segmentation and disease classification. The dataset includes images from various international clinical centers, ensuring a diverse representation of skin lesion types and imaging conditions.
As shown in Figure 1, it is challenging to visually distinguish between benign and malignant skin cancer due to their similar appearances.
After conducting exploratory data analysis on the ISIC-2017 dataset, we identified two major issues. First, the dataset includes images from various sources, resulting in inconsistent image sizes. Second, the distribution of the skin labels is severely imbalanced: 81.3% of the cases in the dataset are benign, while 18.7% are malignant, meaning the benign class outnumbers the malignant class approximately 4 to 1.
To address the issue of heterogeneous image sizes while maintaining a high resolution, we standardize the images by resizing them to a uniform shape of (512, 512, 3) using bilinear interpolation. Additionally, we crop images to (400, 400, 3) to focus on regions of interest where the tumors are centered, while excluding areas of normal skin, which reduces the model's trainable parameters. We also augment the training data to improve model performance and reduce the effect of class imbalance on learning. Figure 2 shows the image processing steps.
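To make these steps concrete, the following is a minimal PyTorch/torchvision sketch of the preprocessing pipeline described above. The specific augmentation operations (flips, rotation) are illustrative assumptions, since the exact augmentations are not specified here.

```python
from torchvision import transforms

# Sketch of the preprocessing described above (resize, crop, tensorize).
preprocess = transforms.Compose([
    # Resize to (512, 512) with bilinear interpolation.
    transforms.Resize((512, 512),
                      interpolation=transforms.InterpolationMode.BILINEAR),
    # Crop to (400, 400) around the centered lesion region.
    transforms.CenterCrop(400),
    transforms.ToTensor(),  # yields a (3, 400, 400) tensor in [0, 1]
])

# Training-time augmentation to mitigate class-imbalance effects;
# these particular transforms are assumptions.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(degrees=20),
])
```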

3.2. Proposed Model

In 2015, ref. [34] introduced the U-Net architecture. While its original goal was image segmentation, where it reduced reliance on handcrafted techniques [35], it became widely used for many purposes such as image de-noising, generative AI, and image embedding, which is the use case in our work.
As shown in Figure 3, U-Net is composed of four main parts:
  • Contracting Path (Encoder): The encoder is responsible for capturing context by progressively down-sampling the input image. This is achieved by a series of convolutional and max-pooling layers that reduce the spatial dimensions of the feature maps while increasing the depth.
  • Bottleneck: The bottleneck serves as a bridge between the contracting and the expansive paths. Its design compresses the data flow into a limited set of values to capture the most essential information.
  • Expansive Path (Decoder): The decoder is responsible for producing the desired output by expanding the feature maps. It consists of a series of up-sampling techniques (transposed convolutions, unpooling, …) and information fusion operations (concatenation, pixel-wise addition, …) with corresponding high-resolution features from the contracting path.
  • Skip Connections: They serve to fuse up-sampled information from the bottleneck with high-resolution feature maps extracted by the encoder. They connect each layer in the encoder with its corresponding symmetric layer on the decoder.
Given the current limitations of quantum hardware in the noisy intermediate-scale quantum (NISQ) era [36,37] (see Appendix A), constraints such as a limited number of qubits are imposed. Using a quantum layer as the bottleneck of the U-Net architecture aligns with the very concept of a bottleneck, which is designed to compress the data flow into a limited set of values, making efficient use of the available quantum resources.
To test the efficiency of QU-Net, we compare it with a classical U-Net architecture. We define the two models with the same architecture, hyperparameters, optimizer, loss function, and training procedure. The only difference is that the QU-Net’s bottleneck layer is a parameterized quantum circuit (PQC), while the classical U-Net uses a fully connected layer.

3.2.1. Parametrized Quantum Circuits

Parameterized Quantum Circuits (PQCs) [38] are the building blocks of Quantum Neural Networks (QNNs). They consist of quantum circuits with tunable parameters, which can be optimized similarly to the weights in classical neural networks. These parameters are adjusted iteratively to minimize a predefined cost function, which enables the quantum model to learn and generalize from data. Figure 4 shows a simple PQC of two qubits parametrized by a Y-axis rotation angle:
A PQC [38] consists of a sequence of quantum gates, which can be classified into two types:
  • Fixed gates: These are predefined operations that establish entanglement or transformations that do not change during training (CNOT, Hadamard, etc.).
  • Parameterized gates: These are quantum operations that depend on adjustable parameters, usually represented as rotation angles (Rx, Ry, etc.).
The most fundamental quantum gates are listed in Table 3.
A PQC is generally represented as a sequence of unitary transformations, parameterized by a set of angles $\theta$:
$$U(\theta) = U_L(\theta_L) \cdots U_2(\theta_2)\, U_1(\theta_1)$$
where
  • $U_i(\theta_i)$ represents the unitary transformation at the i-th layer. It may contain multiple parameters or be parameter-free.
  • $L$ is the total number of layers.
  • $\theta = \{\theta_1, \theta_2, \ldots, \theta_L\}$ denotes the set of all trainable parameters.
The optimization of these parameters is typically performed using a hybrid quantum-classical approach, where a classical optimizer updates the parameters based on a quantum-measured cost function.
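As a minimal illustration of this hybrid loop, the PennyLane sketch below optimizes the rotation angles of a toy two-qubit PQC against an arbitrary cost; the circuit, cost function, and target value are illustrative assumptions, not the bottleneck used in QU-Net.

```python
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(theta):
    qml.RY(theta[0], wires=0)   # parameterized gates
    qml.RY(theta[1], wires=1)
    qml.CNOT(wires=[0, 1])      # fixed entangling gate
    return qml.expval(qml.PauliZ(0))

def cost(theta):
    # Toy cost: drive the measured expectation value toward 0.5.
    return (circuit(theta) - 0.5) ** 2

opt = qml.GradientDescentOptimizer(stepsize=0.2)
theta = np.array([0.1, 0.1], requires_grad=True)
for _ in range(100):
    theta = opt.step(cost, theta)   # classical update of quantum parameters
```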

3.2.2. Quantum State Preparation and Encoding

Since quantum circuits operate exclusively on quantum states, an essential step before applying PQCs is the encoding of classical data into quantum states [39]. This encoding process determines how classical information is mapped onto the Hilbert space. Various encoding techniques exist, including:
  • Basis Encoding: Directly encoding classical bits into qubit states.
  • Amplitude Encoding: Encoding classical data as quantum amplitudes, allowing compact representation.
  • Angle Encoding: Encoding data using rotation angles of quantum gates.
The choice of encoding method impacts the expressiveness and efficiency of the quantum model. Once the data are encoded, PQCs manipulate the quantum states through entanglement and transformation operations to extract meaningful patterns for tasks such as classification, regression, or generative modeling. In this context, we will experiment with amplitude and angle encoding.
In amplitude encoding, classical data values are encoded into the probability amplitudes of a quantum state. Given a classical data vector $x = [x_1, x_2, \ldots, x_N]$, $n$ qubits are required to encode $x$ into a quantum state, where $N \leq 2^n$. The state is prepared as follows:
$$|\psi\rangle = \sum_{i=0}^{2^n - 1} \frac{x_i}{\|x\|}\, |i\rangle$$
Here, $|i\rangle$ is the computational basis state corresponding to the binary representation of $i$, expressed as $|b_1 b_2 \ldots b_n\rangle$, where $b_j \in \{0, 1\}$. The $\frac{x_i}{\|x\|}$ are the normalized data amplitudes, where $\|x\|$ is the Euclidean norm of the vector $x$, defined as follows:
$$\|x\| = \sqrt{\sum_{i=0}^{2^n - 1} |x_i|^2}$$
The normalization ensures that the following condition is satisfied:
$$\sum_{i=0}^{2^n - 1} \left| \frac{x_i}{\|x\|} \right|^2 = 1$$
In amplitude encoding, the number of qubits $n$ required to encode the data is logarithmic in the size of the data vector $x$. Specifically, for a data vector of size $N$, the number of qubits needed is as follows:
$$n_{\text{qubits}} = \lceil \log_2 N \rceil$$
This means that $n$ qubits can represent $2^n$ data points, enabling the encoding of an exponentially large dataset into a quantum state. We use amplitude encoding to prepare quantum states, allowing us to load up to $2^7 = 128$ features from the QU-Net encoder onto the quantum layer:
$$|\psi\rangle = \sum_{i=0}^{127} \frac{x_i}{\|x\|}\, |i\rangle = \frac{x_0}{\|x\|}\, \underbrace{|0000000\rangle}_{7\ \text{qubits}} + \frac{x_1}{\|x\|}\, |0000001\rangle + \cdots + \frac{x_{127}}{\|x\|}\, |1111111\rangle$$
In angle encoding, each classical feature is encoded as a rotation angle applied to a single qubit. Given a data vector $x = [x_1, x_2, \ldots, x_n]$, one qubit is used per feature, and the quantum state is prepared as follows:
$$|\psi(x)\rangle = \bigotimes_{j=1}^{n} R_Y(x_j)\, |0\rangle = \bigotimes_{j=1}^{n} \left( \cos\frac{x_j}{2}\, |0\rangle + \sin\frac{x_j}{2}\, |1\rangle \right)$$
Each feature $x_j$ is mapped to a rotation angle (typically after scaling), and no global normalization is required. The number of qubits scales linearly with the number of features:
$$n_{\text{qubits}} = n$$
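A short PennyLane sketch contrasting the two encodings on a 7-qubit register follows; the random feature vectors are placeholders standing in for encoder outputs.

```python
import numpy as np
import pennylane as qml

n_qubits = 7
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def amplitude_encode(x):
    # 2^7 = 128 features become the amplitudes of a 7-qubit state.
    qml.AmplitudeEmbedding(x, wires=range(n_qubits), normalize=True)
    return qml.state()

@qml.qnode(dev)
def angle_encode(x):
    # 7 features become RY rotation angles, one qubit per feature.
    qml.AngleEmbedding(x, wires=range(n_qubits), rotation="Y")
    return qml.state()

state_amp = amplitude_encode(np.random.rand(128))  # placeholder features
state_ang = angle_encode(np.random.rand(7))
```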
Figure 5 shows the parametrized quantum circuits used. They employ 7 qubits simulated using PennyLane's default.qubit backend.
The encoding is followed by the parametrized part, which applies a general three-angle rotation to each qubit: every qubit is transformed by a Rot gate $R(\phi, \theta, \omega)$. For a single qubit, the rotation matrix is as follows:
$$R(\phi, \theta, \omega) = \begin{pmatrix} e^{-i(\phi + \omega)/2} \cos(\theta/2) & -e^{i(\phi - \omega)/2} \sin(\theta/2) \\ e^{-i(\phi - \omega)/2} \sin(\theta/2) & e^{i(\phi + \omega)/2} \cos(\theta/2) \end{pmatrix}$$
For a single qubit in the state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, applying $R(\phi, \theta, \omega)$ results in the following:
$$|\psi'\rangle = R(\phi, \theta, \omega)\, |\psi\rangle = \begin{pmatrix} e^{-i(\phi + \omega)/2} \cos(\theta/2) & -e^{i(\phi - \omega)/2} \sin(\theta/2) \\ e^{-i(\phi - \omega)/2} \sin(\theta/2) & e^{i(\phi + \omega)/2} \cos(\theta/2) \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$$
These parameterized rotations allow for manipulating the qubits’ states by tunable angles ϕ , θ , and ω .
After applying the rotational gates, entanglement can be introduced using CNOT gates. The CNOT gate acts on two qubits and is represented by the following matrix:
$$\text{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$
For example, if the control qubit is in a superposition:
$$|\psi_{\text{control}}\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right)$$
and the target qubit is in the $|0\rangle$ state:
$$|\psi_{\text{target}}\rangle = |0\rangle$$
the combined state is as follows:
$$|\psi_{\text{combined}}\rangle = |\psi_{\text{control}}\rangle \otimes |\psi_{\text{target}}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}$$
Applying the CNOT gate results in the final entangled state:
$$|\psi_{\text{entangled}}\rangle = \text{CNOT} \cdot |\psi_{\text{combined}}\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \left( |00\rangle + |11\rangle \right)$$
When all qubits are connected with CNOT gates, a change in one qubit state influences the states of the remaining qubits.
The circuit is concluded with Pauli-Z measurements on each qubit, extracting the expectation values of the qubits in the Z-basis. The Pauli-Z operator is defined as follows:
$$Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
For a single qubit in the state
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$$
where $\alpha$ and $\beta$ are complex amplitudes, the probabilities of observing $|0\rangle$ and $|1\rangle$ are given by the following:
$$P(0) = |\alpha|^2, \qquad P(1) = |\beta|^2$$
The expectation value of the Pauli-Z operator for this state is calculated as follows:
$$\langle Z \rangle = \langle \psi | Z | \psi \rangle$$
Substituting $|\psi\rangle = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$, we have the following:
$$\langle Z \rangle = \begin{pmatrix} \alpha^* & \beta^* \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$$
Performing the matrix multiplication:
$$\langle Z \rangle = \begin{pmatrix} \alpha^* & \beta^* \end{pmatrix} \begin{pmatrix} \alpha \\ -\beta \end{pmatrix}$$
This simplifies to the following:
$$\langle Z \rangle = |\alpha|^2 - |\beta|^2$$
Thus, the expectation value of the Pauli-Z operator corresponds to the difference in probabilities of the qubit being in the $|0\rangle$ and $|1\rangle$ states:
$$\langle Z \rangle = P(0) - P(1)$$
Hence, the Pauli-Z measurement is analogous to the ArgMax operation in classical machine learning, where the index of the highest predicted probability, corresponding to the predicted class, is returned. These measurements convert the quantum state back into a classical representation, which is subsequently passed to the decoder part of the model.
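Putting these pieces together, a plausible PennyLane/PyTorch sketch of the 7-qubit bottleneck follows: amplitude encoding, one layer of trainable Rot gates, CNOT entanglement, and Pauli-Z readout. The exact entanglement wiring (a ring of CNOTs here) and weight shapes are assumptions consistent with the description above, not the verified implementation.

```python
import torch
import pennylane as qml

n_qubits = 7
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev, interface="torch")
def bottleneck(inputs, weights):
    # Load 128 encoder features into the amplitudes of 7 qubits.
    qml.AmplitudeEmbedding(inputs, wires=range(n_qubits), normalize=True)
    # One variational layer: a trainable Rot(phi, theta, omega) per qubit.
    for q in range(n_qubits):
        qml.Rot(*weights[q], wires=q)
    # Entanglement: a ring of CNOTs connecting all qubits (assumed layout).
    for q in range(n_qubits):
        qml.CNOT(wires=[q, (q + 1) % n_qubits])
    # Pauli-Z readout: 7 expectation values form the embedding.
    return [qml.expval(qml.PauliZ(q)) for q in range(n_qubits)]

weight_shapes = {"weights": (n_qubits, 3)}
quantum_layer = qml.qnn.TorchLayer(bottleneck, weight_shapes)

x = torch.rand(128)            # placeholder for flattened encoder output
embedding = quantum_layer(x)   # 7-dimensional quantum embedding
```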
Figure 6 shows the architecture of the proposed QU-Net and the evolution of the image shape throughout the model. Initially, we employ a self-supervised learning approach, wherein the objective of QU-Net is to reconstruct the input images. The mean squared error between the model's output and the input image itself is minimized during training. This method enables the model to learn a general representation of the images while preserving task independence. Consequently, the bottleneck layer generates a compressed representation of the image that is not influenced by the optimization of any specific task. This ensures that the captured features are general and not biased toward a particular objective; therefore, they can represent the original data.
The rationale behind this approach is to take advantage of the compressed representations learned for the classification of skin cancer, where we try to distinguish between benign and malignant cases. As illustrated in Figure 7, by concatenating these representations with metadata features, we can implement a multi-modal approach that integrates both image data and relevant metadata for skin cancer classification. This approach not only structures the new data representation and allows space-efficient storage, but also enables the use of various classifiers such as Random Forest, XGBoost, and MLP to classify the images.
Similarly, we train a classical U-Net architecture as illustrated in Figure 8 to compare its reconstruction performance with that of QU-Net.
Additionally, we evaluate the learned embedding from the classical U-Net against the quantum embedding in the classification task to assess their respective capabilities in representing the image data. As shown in Figure 9, the output of the encoder is concatenated with other relevant features to create a structured dataset that can be classified using classical models such as SVM, MLP, or random forest.
In QU-Net, the use of 7 qubits in the quantum layer means the skin images are embedded into a vector of 7 values, whereas the classical U-Net generates an image embedding vector of 128 values. All models have approximately 3.6M parameters.

4. Experimental Results

In this section, we compare the results of the different PQC configurations. The best-performing configuration is then compared with a classical U-Net on the reconstruction task and on the classification of the images represented by the bottleneck embedding, thereby assessing their capability to compress the information in a meaningful way.

4.1. Reconstruction

Figure 10 shows skin cancer images, their reconstructed versions generated by the different models, and the corresponding reconstruction errors.
Although all models achieved comparable results in image reconstruction (Figure 10), the loss evolution across experiments reveals notable differences in generalization capability among the architectures.
While training curves showed similar convergence behavior overall (Figure 11), inconsistencies became more apparent on unseen data (Figure 12). Angle encoding-based QU-Nets showed more fluctuations on unseen data. Classical U-Net exhibited more oscillations during validation, whereas QU-Net variants based on amplitude encoding showed smoother and more stable learning dynamics despite operating with a smaller latent representation (7 latent values vs. 128 for U-Net).
In particular, the best reconstruction performance on unseen data was achieved by the Amplitude encoding QU-Net with a single fully entangled layer, converging in 25 epochs and producing the lowest reconstruction error among all the models evaluated, as shown in Figure 13 and Figure 14.
More broadly, amplitude-based PQCs displayed faster and more stable convergence trends compared with angle-based configurations (13, 25, 29 epochs vs. 55, 69, 77 epochs), with the classical U-Net falling in between (44 epochs). These results suggest that amplitude encoding may act as a stabilizing and generalizing factor, particularly in scenarios where the latent bottleneck is extremely compressed.
Table 4 compares the average mean squared error (MSE) per image of U-Net and QU-Net variants on train, validation and test sets. The results suggest that QU-Net with amplitude encoding and single entangled layer gives the best reconstruction of skin cancer images.

4.2. Classification

We quantify the quality of the learned embeddings through the performance of simple models, such as logistic regression, k-nearest neighbors, and tree-based classifiers, trained on these embeddings to classify instances as either benign or malignant. Logistic Regression (LR) serves to measure the linear separability of the classes. K-Nearest Neighbors (KNN) examines whether instances from similar classes are grouped close to each other under the Euclidean distance. Tree-based models such as Random Forest (RF) are also used to verify the purity of leaf splits. Given the severe class imbalance, we use the F1-score as our main metric of model performance. We also use stratified cross-validation with 10 folds, where class distributions in the train and test sets are preserved across all folds.
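This evaluation protocol can be sketched with scikit-learn as follows; the embedding matrix and labels here are synthetic stand-ins for the bottleneck features concatenated with metadata.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-ins: e.g., 7 quantum features + age + sex per sample.
rng = np.random.default_rng(0)
X = rng.random((2000, 9))
y = rng.integers(0, 2, 2000)   # benign (0) vs. malignant (1)

# Stratified 10-fold CV preserves the class distribution in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="f1")
print(f"F1: {scores.mean():.4f} +/- {scores.std():.4f}")
```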
Figure 15 shows the performance of the classification models on the embeddings of the different QU-Net variants. Best results are obtained with XGBoost.
A comparison of the classification models on the QU-Net variants embeddings is visualized using a heatmap in Figure 16. Best classification results for each of the embeddings are shown in Figure 17.
Figure 18 shows the average F1-score and accuracy across all models for each encoding technique. Overall, the amplitude encoding versions led to slightly better measures. Figure 19 illustrates the mean impact of the entanglement strategy and the number of layers on the F1-score. On average, the PQC with a single layer and no entanglement obtained better results. A trend can be observed: as the PQC becomes more complex, the results tend to decrease, suggesting that, unlike classical ANNs, increasing PQC depth does not necessarily lead to performance improvement.
Table 5 shows a comparison of the classification results of the selected models using the embeddings from U-Net and the QU-Net variant with amplitude encoding, no entanglement, and one layer. Both were concatenated with metadata.
The results presented in Table 5 demonstrate that QU-Net embeddings consistently outperform U-Net embeddings across all the evaluated classifiers, both in terms of F1-score (79.03% vs. 74.14%) and accuracy (81.83% vs. 81.33%). This indicates that the QU-Net representations are more discriminative and informative, although they have significantly fewer dimensions than the U-Net embeddings (7 vs. 128).
An observation emerges from the performance of the soft voting classifier. Soft voting works by averaging the class probabilities predicted by each base classifier and selecting the class with the highest average probability. When using U-Net embeddings, the soft voting strategy underperforms relative to some of its individual models. This suggests that the probability magnitudes, which reflect the confidence of the base classifiers, were not sufficient to achieve an improvement through averaging. In contrast, with QU-Net embeddings, the soft voting strategy achieves the highest performance metrics compared with the individual classifiers. This indicates that models trained on QU-Net embeddings were more confident in their predictions, allowing the ensemble to benefit from probabilistic consensus.
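For reference, a minimal scikit-learn sketch of such a soft-voting ensemble is given below; the base classifiers and their settings are examples, not the exact configuration used in this study.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

# voting="soft" averages the predicted class probabilities across the
# base classifiers and selects the class with the highest mean.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
        ("rf", RandomForestClassifier(n_estimators=100)),
    ],
    voting="soft",
)
# Usage: ensemble.fit(X_train, y_train); ensemble.predict_proba(X_test)
```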
Table 6 shows a comparison of existing methods with our approach. Although no hyperparameter tuning, feature selection, or classification improvement techniques were used in this study, our results seem competitive with works focused on classification.
An additional experiment was conducted where metadata features were not appended to the latent embeddings of U-Net and QU-Net.
Table 7 shows the difference in classification results on the embeddings without metadata. We can observe that generally, QU-Net embeddings obtained better F1-scores.
Table 8 shows a comparison of the classification results of QU-Net and U-Net with and without the multi-modal approach.
We can observe that across all classifiers, the inclusion of metadata improved the classification performance of QU-Net embeddings. The multi-modal approach further enhanced the leaf purity of the tree-based classifiers. Additionally, the improvement in the Logistic Regression F1-score suggests that the metadata increased the linear separability of the classes. Although a decrease in accuracy can be observed, it can be explained by accuracy's dependency on class balance. Also, the improvement in KNN performance suggests that the metadata reinforced the distance structure of the latent space by reducing intra-class distances while increasing inter-class separability, which aligns well with KNN's reliance on Euclidean proximity for classification. This indicates that the quantum-enhanced latent space can benefit from auxiliary and meaningful information.
The effect of the multi-modal approach on U-Net embeddings was less consistent. While tree-based classifiers did show performance gains, Logistic Regression produced identical results with and without metadata, suggesting that the added metadata did not introduce any new linear separability into the learned feature space. Furthermore, the same metadata degraded the performance of KNN when added to U-Net embeddings. This implies that the additional features reduced the distance-based consistency of the class clusters. These observations suggest that QU-Net embeddings naturally position similar instances close to each other, and metadata further improves this geometric alignment, whereas U-Net embeddings appear more sensitive to such feature concatenation.
Furthermore, this finding underscores the effectiveness of a multi-modal approach that integrates image-based features with structured and meaningful patient data in the classification of skin cancer.

5. Conclusions and Future Perspective

In this work, we proposed a novel architecture, illustrated in Figure 6, that incorporates a quantum layer in the form of a parametrized quantum circuit as the bottleneck of U-Net. We refer to it as QU-Net. The results show that this change led to several improvements, including enhanced generalization to unseen images in the reconstruction of skin cancer images. Both models had similar training performance (Figure 11), but QU-Net generalized better to unseen images (Figure 12), meaning that adding a quantum layer helped reduce overfitting to the training data. The quantum layer also enabled faster convergence: the epochs required for QU-Net and U-Net to reach the stable zone are 25 and 44, respectively, as shown in Figure 12. Additionally, QU-Net obtained a better reconstruction error on training, validation, and test data (Table 4). Furthermore, the quantum layer led to a more representative embedding of skin cancer images and more efficient data compression (7 vs. 128). This is demonstrated by the results in Table 5, where QU-Net embeddings enabled better classification compared with U-Net embeddings. Also, the multi-modal approach, where we train QU-Net (Figure 7) and U-Net (Figure 9) to learn a general, task-independent structured representation of skin images and combine it with metadata such as patient age and sex, further improved the classification results (Table 8).
We also evaluated six PQC configurations, varying the encoding method (amplitude vs. angle), circuit depth, and entanglement strategy. The findings show that amplitude encoding stabilizes training, whereas angle encoding introduces fluctuations. The best reconstruction performance was achieved using amplitude encoding with one variational layer and entanglement, while the best classification performance resulted from the same configuration without entanglement. This suggests that lighter PQCs may generalize better.
Based on our results, it appears that the proposed quantum-enhanced U-Net outperformed the classical version in the reconstruction and classification tasks. Due to the lack of quantum computing resources, a hybrid approach like ours, where we combine classical and quantum machine learning techniques, is necessary to make efficient use of available quantum resources.
In the future, we plan to explore the potential of the proposed architecture in the segmentation task. The quantum circuit used as the bottleneck can be replaced with a more sophisticated circuit that puts more emphasis on the margin between different class embeddings in the quantum Hilbert space. We also plan to explore an end-to-end hybrid quantum classifier, so that gradients back-propagate from the error computation all the way to the input data processing.

Author Contributions

Conceptualization, K.H., N.M., O.E.M. and M.A.A.; methodology, K.H., N.M., O.E.M., Z.E.A.E. and M.A.A.; software, K.H. and N.M.; validation, K.H., N.M., O.E.M., Z.E.A.E. and M.A.A.; formal analysis, K.H., N.M., O.E.M., Z.E.A.E. and M.A.A.; writing—original draft preparation, K.H.; writing—review and editing, K.H. and M.A.A.; visualization, K.H., N.M. and M.A.A.; supervision, O.E.M., Z.E.A.E. and M.A.A.; funding acquisition, M.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was enabled in part by support provided by the Natural Sciences and Engineering Research Council of Canada (NSERC), funding reference number RGPIN-2024-05287; the AI in Health Research Chair at the Université de Moncton and the Mitacs Globalink Research Award, Grant Number IT39642.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study is publicly available from the ISIC-2017 dataset: https://challenge.isic-archive.com/data/#2017 (accessed on 3 July 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
QML      Quantum Machine Learning
ISIC     International Skin Imaging Collaboration
QU-Net   Quantum-Enhanced U-Net
PQC      Parameterized Quantum Circuit
QNN      Quantum Neural Network
BCC      Basal Cell Carcinoma
SCC      Squamous Cell Carcinoma
NISQ     Noisy Intermediate-Scale Quantum
MSE      Mean Squared Error
CNN      Convolutional Neural Network
QSVM     Quantum Support Vector Machine
QCNN     Quantum Convolutional Neural Network
GCN      Graph Convolutional Network
FCL      Fully Connected Layer
VQC      Variational Quantum Circuit
IoMT     Internet of Medical Things
VQE      Variational Quantum Eigensolver
MHEA     Modified Hardware Efficient Ansatz
QuanvNN  Quanvolutional Neural Network
QPCA     Quantum Principal Component Analysis
RF       Random Forest
KNN      K-Nearest Neighbors
LR       Logistic Regression

Appendix A. Quantum Machine Learning

Appendix A.1. Quantum Neural Networks

Quantum Neural Networks (QNNs) [40,41] represent one of the most promising intersections of quantum computing and machine learning. By taking advantage of quantum effects like superposition and entanglement, QNNs have become an active field of research in which researchers try to achieve a quantum speed-up similar to that already demonstrated in computational tasks.
Figure A1. Quantum neural network.
Quantum Neural Networks [40,41] are quantum circuits that mimic the structure and function of classical neural networks. In a QNN, parametrized quantum gates play the role of neurons, where the weights are replaced by the gate parameters and the quantum states of qubits represent the data being processed. QNNs typically consist of a sequence of layers of quantum gates that are applied to transform quantum states into a more desirable output. Similar to classic ANNs, a QNN tries to map an input to a desired output by optimizing a loss function that quantifies the difference between predictions and ground truth. An optimization step leads to a back-propagation that adjusts the parameters of the gates.

Structure of a Quantum Neural Network

A typical QNN consists of a sequence of the following components:
  • Data Encoding Layer: in this layer, we embed the classical representation of the data into a quantum state in the Hilbert space so that it can be manipulated by the subsequent parametrized gates. Several encoding techniques exist, each with its advantages, such as amplitude encoding and angle encoding [39].
  • Parameterized Quantum Layers: this is the part where the quantum states are manipulated and prepared by parametrized gates; the parameters of these gates are adjusted during training to minimize a cost function.
  • Measurement Layer: after applying the quantum layers, we take measurements of the quantum states to make predictions and compare them to the ground-truth labels.

Appendix A.2. Quantum Models Are Kernel Methods

Kernel methods are a class of machine learning models that take the following form:
$$f(x) = \sum_i \alpha_i\, k(x_i, x)$$
where $k(x_i, x) = \langle \phi(x_i), \phi(x) \rangle$ is a function that measures the similarity between instances, $\phi$ is the feature map, and the $\alpha_i$ are scalars. This formula can be understood as projecting the data into a higher-dimensional space where linear separation becomes possible, and then performing an $\alpha_i$-weighted k-nearest-neighbours prediction.
Figure A2. A linear classifier in 3D is equivalent to a non-linear decision boundary in 2D. The purple shaded region represents the linear hyperplane in the higher-dimensional space, and the blue line indicates the corresponding non-linear decision boundary in 2D.
In kernel methods, any kernel-based model is equivalent to a linear classifier operating in a high-dimensional feature space induced by the kernel function. This linear separation in the high-dimensional space corresponds to a non-linear decision boundary in the original input space:
$$k(x, x') = \langle \phi(x) | \phi(x') \rangle$$
where $\phi$ is the feature map and $k$ is the inner product (the kernel) between the mapped instances.
After the introduction of QSVM [19] and QPCA [42], the potential of quantum models as kernel methods [43,44] became evident. These models not only project data into a high-dimensional space, such as Hilbert space, but also use the kernel trick by directly computing inner products within that space. This inspired many researchers to further enhance these quantum kernels in various ways, but the trainability of a quantum circuit would not directly be possible due to the quantum nature of the computations.
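As an illustration, the following PennyLane sketch estimates a quantum kernel value as the squared overlap between two encoded states, using the standard adjoint-embedding trick; the angle encoding and qubit count here are illustrative choices.

```python
import numpy as np
import pennylane as qml

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def overlap_probs(x1, x2):
    # Prepare |phi(x1)>, then apply the inverse of the x2 encoding:
    # the probability of |00> equals |<phi(x2)|phi(x1)>|^2.
    qml.AngleEmbedding(x1, wires=range(n_qubits))
    qml.adjoint(qml.AngleEmbedding)(x2, wires=range(n_qubits))
    return qml.probs(wires=range(n_qubits))

def quantum_kernel(x1, x2):
    return overlap_probs(x1, x2)[0]

x = np.array([0.3, 1.2])
print(quantum_kernel(x, x))   # prints 1.0 for identical inputs
```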

Appendix A.3. The Parameter Shift Rule

The parameter-shift rule [45,46] provides a way to calculate quantum circuit gradients, since circuit expectation values are typically trigonometric functions of the gate parameters.
To understand the parameter shift rule, let us consider a simple example:
The derivative of $\sin(x)$ is $\cos(x)$. Interestingly, this can also be expressed as half the difference of shifted sine functions:
$$\cos(x) = \frac{1}{2} \left[ \sin\left(x + \frac{\pi}{2}\right) - \sin\left(x - \frac{\pi}{2}\right) \right]$$
This important result can be generalized by the following expression:
$$\frac{\partial \langle \hat{O} \rangle}{\partial \theta} = \frac{1}{2} \left[ \langle \hat{O} \rangle\left( \theta + \frac{\pi}{2} \right) - \langle \hat{O} \rangle\left( \theta - \frac{\pi}{2} \right) \right]$$
which shows that the derivative of such a trigonometric expectation value with respect to a parameter $\theta$ can be obtained by evaluating the function itself at two shifted values of $\theta$.
The shift rule [45,46] also implies that both the computations and their derivatives can be obtained using the same quantum circuit by simply shifting the parameters and re-evaluating the circuit. We can efficiently compute the necessary gradients without needing to modify the circuit structure itself. With this rule, it became possible to compute the gradients and make the circuit parameters trainable.
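The rule is easy to verify numerically. In the sketch below, the expectation $\langle Z \rangle = \cos(\theta)$ of a single RY rotation is differentiated via the two shifted evaluations and matches the analytic derivative $-\sin(\theta)$.

```python
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def expval_z(theta):
    qml.RY(theta, wires=0)
    return qml.expval(qml.PauliZ(0))   # equals cos(theta)

theta = 0.7
shift = np.pi / 2
# Parameter-shift gradient: the same circuit, evaluated twice.
grad = 0.5 * (expval_z(theta + shift) - expval_z(theta - shift))
print(grad, -np.sin(theta))   # both print approximately -0.6442
```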

Appendix B. CO2 Emissions

Table A1 shows the CO2 emissions of this project.

Table A1. CO2 emissions of the project.

| Parameter | Value |
|---|---|
| CO2 emission per hour | 244.125 g |
| Hours per week | 35 |
| Total duration | 10.43 weeks |
| Total hours worked | 365.05 h |
| Total CO2 emission | 89.12 kg |
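The totals follow directly from the stated rate and duration (a quick consistency check):
$$35\ \tfrac{\text{h}}{\text{week}} \times 10.43\ \text{weeks} = 365.05\ \text{h}, \qquad 365.05\ \text{h} \times 244.125\ \tfrac{\text{g}}{\text{h}} \approx 89{,}118\ \text{g} \approx 89.12\ \text{kg}$$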

References

  1. Cerezo, M.; Verdon, G.; Huang, H.Y.; Cincio, L.; Coles, P.J. Challenges and opportunities in quantum machine learning. Nat. Comput. Sci. 2022, 2, 567–576. [Google Scholar] [CrossRef] [PubMed]
  2. Shor, P. Algorithms for quantum computation: Discrete logarithms and factoring. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science, Santa Fe, NM, USA, 20–22 November 1994; pp. 124–134. [Google Scholar] [CrossRef]
  3. Grover, L.K. A fast quantum mechanical algorithm for database search. In Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, New York, NY, USA, 22–24 May 1996; STOC ’96. pp. 212–219. [Google Scholar] [CrossRef]
  4. Bowles, J.; Ahmed, S.; Schuld, M. Better than Classical? The Subtle Art of Benchmarking Quantum Machine Learning Models. arXiv 2024, arXiv:2403.07059. [Google Scholar] [CrossRef]
  5. Government of Canada. Skin Cancer. 2025. Available online: https://www.canada.ca/en/public-health/services/sun-safety/skin-cancer.html (accessed on 10 April 2025).
  6. Codella, N.C.F.; Gutman, D.; Celebi, M.E.; Helba, B.; Marchetti, M.A.; Dusza, S.W.; Kalloo, A.; Liopyris, K.; Mishra, N.; Kittler, H.; et al. Skin lesion analysis toward melanoma detection: A challenge at the 2017 International symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC). In Proceedings of the 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, DC, USA, 4–7 April 2018; pp. 168–172. [Google Scholar] [CrossRef]
  7. Rani, V.; Kumar, M.; Gupta, A.; Sachdeva, M.; Mittal, A.; Saluja, K. Self-supervised learning for medical image analysis: A comprehensive review. Evol. Syst. 2024, 15, 1607–1633. [Google Scholar] [CrossRef]
  8. Azizi, S.; Culp, L.; Freyberg, J.; Mustafa, B.; Baur, S.; Kornblith, S.; Chen, T.; Tomasev, N.; Mitrović, J.; Strachan, P.; et al. Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nat. Biomed. Eng. 2023, 7, 756–779. [Google Scholar] [CrossRef] [PubMed]
  9. Stahlschmidt, S.; Ulfenborg, B.; Synnergren, J. Multimodal deep learning for biomedical data fusion: A review. Brief. Bioinform. 2022, 23, bbab569. [Google Scholar] [CrossRef]
  10. Lloyd, S.; Schuld, M.; Ijaz, A.; Izaac, J.; Killoran, N. Quantum embeddings for machine learning. arXiv 2020, arXiv:2001.03622. [Google Scholar] [CrossRef]
  11. Wang, F.; Hengjin, K.; Cai, C. Deep Wavelet Self-Attention Non-negative Tensor Factorization for non-linear analysis and classification of fMRI data. Appl. Soft Comput. 2025, 182, 113522. [Google Scholar] [CrossRef]
  12. Goyal, M.; Knackstedt, T.; Yan, S.; Hassanpour, S. Artificial intelligence-based image classification methods for diagnosis of skin cancer: Challenges and opportunities. Comput. Biol. Med. 2020, 127, 104065. [Google Scholar] [CrossRef]
  13. Amiri, S.; Nasrolahzadeh, M.; Mohammadpoory, Z.; Zare Kordkheili, A. Skin Lesion Classification via ensemble method on deep learning. Multimed. Tools Appl. 2024, 84, 19379–19397. [Google Scholar] [CrossRef]
  14. Hu, J.; Xiang, Y.; Lin, Y.; Du, J.; Zhang, H.; Liu, H. Multi-Scale Transformer Architecture for Accurate Medical Image Classification. In Proceedings of the 2025 International Conference on Artificial Intelligence and Computational Intelligence, Kuala Lumpur, Malaysia, 14–16 February 2025; Association for Computing Machinery: New York, NY, USA, 2025; pp. 409–414. [Google Scholar] [CrossRef]
  15. Cinthya, M.; Yustanti, W.; Nuryana, I.K.D.; Putra, C.D.; Putra, R.W.; Dayu, D.P.K.; Faradilla, A.E.; Kurniasari, C.I. Automated Skin Cancer Classification Using VGG16-Based Deep Learning Model. In Proceedings of the E3S Web of Conferences, Surabaya, Indonesia, 5 July 2025; Volume 645, p. 04003. [Google Scholar] [CrossRef]
  16. Mari, A.; Bromley, T.R.; Izaac, J.; Schuld, M.; Killoran, N. Transfer learning in hybrid classical-quantum neural networks. Quantum 2020, 4, 340. [Google Scholar] [CrossRef]
  17. Sagingalieva, A.; Kordzanganeh, M.; Kenbayev, N.; Kosichkina, D.; Tomashuk, T.; Melnikov, A. Hybrid quantum neural network for drug response prediction. Cancers 2023, 15, 2705. [Google Scholar] [CrossRef]
  18. Shahwar, T.; Zafar, J.; Almogren, A.; Zafar, H.; Rehman, A.U.; Shafiq, M.; Hamam, H. Automated detection of Alzheimer’s via hybrid classical quantum neural networks. Electronics 2022, 11, 721. [Google Scholar] [CrossRef]
  19. Rebentrost, P.; Mohseni, M.; Lloyd, S. Quantum Support Vector Machine for Big Data Classification. Phys. Rev. Lett. 2014, 113, 130503. [Google Scholar] [CrossRef] [PubMed]
  20. Choudhuri, R.; Halder, A. Brain MRI tumour classification using quantum classical convolutional neural net architecture. Neural Comput. Appl. 2022, 35, 4467–4478. [Google Scholar] [CrossRef]
  21. Qu, Z.; Li, Y.; Tiwari, P. QNMF: A quantum neural network based multimodal fusion system for intelligent diagnosis. Inf. Fusion 2023, 100, 101913. [Google Scholar] [CrossRef]
  22. Shoieb, D.; Younes, A.; Youssef, S.; Fathalla, K. HQMC-CPC: A Hybrid Quantum Multiclass Cardiac Pathologies Classification Integrating a Modified Hardware Efficient Ansatz. IEEE Access 2024, 12, 18295–18314. [Google Scholar] [CrossRef]
  23. Swathi, G.; Altalbe, A.; Kumar, R.P. QuCNet: Quantum-Inspired Convolutional Neural Networks for Optimized Thyroid Nodule Classification. IEEE Access 2024, 12, 27829–27842. [Google Scholar] [CrossRef]
  24. Tschandl, P.; Rosendahl, C.; Kittler, H. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data 2018, 5, 180161. [Google Scholar] [CrossRef]
  25. Reka, S.; Karthikeyan, L.; Shakil, J.; Venugopal, P.; Muniraj, M. Exploring Quantum Machine Learning for Enhanced Skin Lesion Classification: A Comparative Study of Implementation Methods. IEEE Access 2024, 12, 104568–104584. [Google Scholar] [CrossRef]
  26. Ito, K.; Fujii, K. SantaQlaus: A resource-efficient method to leverage quantum shot-noise for optimization of variational quantum algorithms. arXiv 2023, arXiv:2312.15791. [Google Scholar] [CrossRef]
  27. Wiedmann, M.; Hölle, M.; Periyasamy, M.; Meyer, N.; Ufrecht, C.; Scherer, D.D.; Plinge, A.; Mutschler, C. An Empirical Comparison of Optimizers for Quantum Machine Learning with SPSA-Based Gradients. In Proceedings of the 2023 IEEE International Conference on Quantum Computing and Engineering (QCE), Bellevue, WA, USA, 17–22 September 2023; IEEE: New York, NY, USA, 2023; pp. 450–456. [Google Scholar] [CrossRef]
  28. Kashif, M.; Rashid, M.; Al-Kuwari, S.; Shafique, M. Alleviating Barren Plateaus in Parameterized Quantum Machine Learning Circuits: Investigating Advanced Parameter Initialization Strategies. In Proceedings of the 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), Valencia, Spain, 25–27 March 2024; pp. 1–6. [Google Scholar] [CrossRef]
  29. Le, I.N.M.; Kiss, O.; Schuhmacher, J.; Tavernelli, I.; Tacchino, F. Symmetry-invariant quantum machine learning force fields. New J. Phys. 2025, 27, 023015. [Google Scholar] [CrossRef]
  30. West, M.T.; Sevior, M.; Usman, M. Reflection equivariant quantum neural networks for enhanced image classification. Mach. Learn. Sci. Technol. 2023, 4, 035027. [Google Scholar] [CrossRef]
  31. Piatkowski, N.; Gerlach, T.; Hugues, R.; Sifa, R.; Bauckhage, C.; Barbaresco, F. Towards Bundle Adjustment for Satellite Imaging via Quantum Machine Learning. In Proceedings of the 2022 25th International Conference on Information Fusion (FUSION), Linköping, Sweden, 4–7 July 2022; pp. 1–8. [Google Scholar] [CrossRef]
  32. Bordoni, S.; Stanev, D.; Santantonio, T.; Giagu, S. Long-Lived Particles Anomaly Detection with Parametrized Quantum Circuits. Particles 2023, 6, 297–311. [Google Scholar] [CrossRef]
  33. Schreiber, F.J.; Eisert, J.; Meyer, J.J. Classical Surrogates for Quantum Learning Models. Phys. Rev. Lett. 2023, 131, 100803. [Google Scholar] [CrossRef]
  34. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Proceedings, Part III 18. Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241. [Google Scholar] [CrossRef]
  35. Madabhushi, A.; Metaxas, D. Combining low-, high-level and empirical domain knowledge for automated segmentation of ultrasonic breast lesions. IEEE Trans. Med. Imaging 2003, 22, 155–169. [Google Scholar] [CrossRef] [PubMed]
  36. Preskill, J. Quantum computing in the NISQ era and beyond. Quantum 2018, 2, 79. [Google Scholar] [CrossRef]
  37. Bharti, K.; Cervera-Lierta, A.; Kyaw, T.H.; Haug, T.; Alperin-Lea, S.; Anand, A.; Degroote, M.; Heimonen, H.; Kottmann, J.S.; Menke, T.; et al. Noisy intermediate-scale quantum algorithms. Rev. Mod. Phys. 2022, 94, 015004. [Google Scholar] [CrossRef]
  38. Schuld, M.; Bocharov, A.; Svore, K.M.; Wiebe, N. Circuit-centric quantum classifiers. Phys. Rev. A 2020, 101, 032308. [Google Scholar] [CrossRef]
  39. Ranga, D.; Rana, A.; Prajapaat, S.; Kumar, P.; Kumar, K.; Vasilakos, A. Quantum Machine Learning: Exploring the Role of Data Encoding Techniques, Challenges, and Future Directions. Mathematics 2024, 12, 3318. [Google Scholar] [CrossRef]
  40. Schuld, M.; Sinayskiy, I.; Petruccione, F. The quest for a Quantum Neural Network. Quantum Inf. Process. 2014, 13, 2567–2586. [Google Scholar] [CrossRef]
  41. Beer, K.; Bondarenko, D.; Farrelly, T.; Osborne, T.; Salzmann, R.; Scheiermann, D.; Wolf, R. Training deep quantum neural networks. Nat. Commun. 2020, 11, 808. [Google Scholar] [CrossRef] [PubMed]
  42. Lloyd, S.; Mohseni, M.; Rebentrost, P. Quantum Principal Component Analysis. Nat. Phys. 2014, 10, 631–633. [Google Scholar] [CrossRef]
  43. Schuld, M. Supervised quantum machine learning models are kernel methods. arXiv 2021, arXiv:2101.11020. [Google Scholar] [CrossRef]
  44. Hubregtsen, T.; Wierichs, D.; Gil-Fuster, E.; Derks, P.J.H.S.; Faehrmann, P.K.; Meyer, J.J. Training quantum embedding kernels on near-term quantum computers. Phys. Rev. A 2022, 106, 042431. [Google Scholar] [CrossRef]
  45. Crooks, G.E. Gradients of parameterized quantum gates using the parameter-shift rule and gate decomposition. arXiv 2019, arXiv:1905.13311. [Google Scholar] [CrossRef]
  46. Schuld, M.; Bergholm, V.; Gogolin, C.; Izaac, J.; Killoran, N. Evaluating analytic gradients on quantum hardware. Phys. Rev. A 2019, 99, 032331. [Google Scholar] [CrossRef]
Figure 1. Visualization of different skin cancer images from the ISIC-2017 dataset. Top: benign cases; bottom: malignant cases.
Figure 2. Data processing pipeline showing the steps from original images to preprocessed inputs.
Figure 3. Example of a U-Net architecture showing the encoder-decoder structure with skip connections.
Figure 4. Example of a Parametrized Quantum Circuit with two qubits.
Figure 5. The quantum circuits used as the bottleneck of QU-Net, with 7 qubits. Circled-plus and filled-dot symbols represent CNOT gates, which are responsible for entangling qubits.
Figure 6. QU-Net is trained in a self-supervised manner to produce task-independent quantum embeddings, which are then used for the classification of skin cancer images.
Figure 7. A multi-modal approach where quantum embeddings of skin images are concatenated with metadata and used to classify skin cancer.
Figure 8. U-Net model trained for reconstruction; its compressed data representation is used for classification.
Figure 9. Classical embeddings concatenated with metadata to classify skin cancer.
Figure 10. Skin cancer images and their reconstructions by the QU-Net variants and by U-Net.
Figure 11. Loss curves on training data; the models converge similarly. Shaded regions represent ±1 standard deviation.
Figure 12. Loss curves on unseen data. Shaded regions represent ±1 standard deviation.
Figure 13. Validation loss over the last 20 epochs.
Figure 14. Loss comparison between the different models.
Figure 15. F1-score by embedding type.
Figure 16. Heatmap of the performance of the models on the different data embeddings.
Figure 17. Best F1-score per embedding type (10-fold CV).
Figure 18. Average performance: amplitude vs. angle encoding.
Figure 19. Impact of entanglement strategy and number of layers on F1-score.
Table 1. Deep learning models for skin lesion classification on ISIC-2017.

| References | Method | Description | F1-Score |
| [13] | Inception-ResNet-v2 + EfficientNet-B4 | Ensemble with Soft-Attention | 79% |
| [14] | Improved ViT | Weighted loss + lesion-focused regularization | 88.4% |
| [14] | ResNet50/VGG19/ResNeXt/ViT | Baseline comparison models | 85.2%, 83.5%, 86%, 87% |
| [15] | Modified VGG-16 | Transfer learning approach | 71% |
Table 2. Quantum-enhanced models across different tasks.

| References | Task | Method | Results |
| [17] | Patient drug response prediction | GCN for drugs + CNN for cell lines + QNN | 15% better than FCL |
| [18] | Alzheimer detection | ResNet34 + QSVM | 97.2% (QSVM) vs. 92.2% (classical) |
| [20] | Brain MRI binary classification | QCNN | 98.72% (QCNN) vs. 94.23% (CNN) |
| [21] | Breast cancer and COVID-19 diagnosis | QCNN + VQC | 97.07% (breast), 97.61% (COVID) |
| [22] | Cardiac pathology classification | QHEA + multimodal data | 3.19–7.77% better than other models |
| [23] | Thyroid cancer classification | Quantum filter + QCNN | 97.63% vs. 93.87% (classical) |
| [25] | Skin cancer classification (HAM10000) | QuanvNN + QSVM | 82.86% (QNN) vs. 73.42% (MobileNet) |
Table 3. Quantum gates: symbols, matrices, and descriptions (circuit-symbol images omitted).

| Gate Name | Qubits | Unitary Matrix | Description |
| Pauli-X (NOT) | 1 | $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ | Analogous to the classical NOT gate: switches $|0\rangle$ to $|1\rangle$ and vice versa |
| Pauli-Y | 1 | $\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$ | Rotation through $\pi$ radians around the Bloch sphere Y-axis |
| Pauli-Z (phase flip) | 1 | $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ | Rotation through $\pi$ radians around the Bloch sphere Z-axis |
| X-Rotation | 1 | $\begin{pmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \\ -i\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}$ | Rotates the state vector about the Bloch sphere X-axis by $\theta$ |
| Y-Rotation | 1 | $\begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}$ | Rotates the state vector about the Bloch sphere Y-axis by $\theta$ |
| Z-Rotation | 1 | $\begin{pmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{pmatrix}$ | Rotates the state vector about the Bloch sphere Z-axis by $\theta$ |
| Hadamard | 1 | $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ | Transforms a basis state into an even superposition of the two basis states |
| CNOT (Controlled-NOT) | 2 | $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$ | Applies Pauli-X to the target qubit if the control qubit is $|1\rangle$ |
| SWAP | 2 | $\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$ | Swaps the states of two qubits |
| Rot $R(\phi,\theta,\omega)$ | 1 | $\begin{pmatrix} e^{-i(\phi+\omega)/2}\cos(\theta/2) & -e^{i(\phi-\omega)/2}\sin(\theta/2) \\ e^{-i(\phi-\omega)/2}\sin(\theta/2) & e^{i(\phi+\omega)/2}\cos(\theta/2) \end{pmatrix}$ | A general rotation gate that combines rotations around the Z and Y axes: $R_Z(\omega)R_Y(\theta)R_Z(\phi)$. Provides flexible state preparation and transformation |
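As a sanity check on the last row of Table 3, the short NumPy snippet below verifies numerically that the general Rot gate factorizes as $R(\phi,\theta,\omega) = R_Z(\omega)\,R_Y(\theta)\,R_Z(\phi)$; it is illustrative only and uses the matrix definitions from the table.

```python
# Numerical check of the Rot decomposition in Table 3.
import numpy as np

def rz(a):
    # Z-rotation: diag(e^{-ia/2}, e^{ia/2}).
    return np.diag([np.exp(-1j * a / 2), np.exp(1j * a / 2)])

def ry(a):
    # Y-rotation: real 2x2 rotation by a/2.
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]])

def rot(phi, theta, omega):
    # Closed form of the Rot gate from Table 3.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [np.exp(-1j * (phi + omega) / 2) * c, -np.exp(1j * (phi - omega) / 2) * s],
        [np.exp(-1j * (phi - omega) / 2) * s,  np.exp(1j * (phi + omega) / 2) * c],
    ])

phi, theta, omega = 0.3, 1.1, -0.7
assert np.allclose(rot(phi, theta, omega), rz(omega) @ ry(theta) @ rz(phi))
```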
Table 4. Average train, validation, and test MSE per image of QU-Net variants and U-Net.

| Loss (MSE) | Amplitude 1-Ent | Amplitude 2-Ent | Amplitude 1-NoEnt | Angle 1-Ent | Angle 2-Ent | Angle 1-NoEnt | U-Net |
| Train | 0.000106 | 0.000079 | 0.000075 | 0.000084 | 0.000071 | 0.000072 | 0.000115 |
| Validation | 0.000133 | 0.000160 | 0.000155 | 0.000170 | 0.000146 | 0.000149 | 0.000147 |
| Test | 0.000151 | 0.000172 | 0.000172 | 0.000190 | 0.000165 | 0.000163 | 0.000172 |
Bold values indicate the best (lowest) result in each row.
Table 5. Comparison of classification performance on U-Net and QU-Net embeddings.

| Classifier | U-Net F1 (%) | U-Net Acc. (%) | QU-Net F1 (%) | QU-Net Acc. (%) |
| Random Forest (RF) | 74.14 | 81.33 | 75.86 | 80.17 |
| LightGBM (LGBM) | 72.93 | 79.33 | 76.68 | 80.17 |
| XGBoost (XGB) | 77.38 | 81.33 | 78.27 | 80.50 |
| Logistic Regression | 70.78 | 79.33 | 73.07 | 80.17 |
| K-Nearest Neighbors | 70.43 | 74.00 | 76.05 | 78.33 |
| Soft-Voting (RF-XGB-LGBM) | 73.33 | 80.00 | 79.03 | 81.83 |
Bold values indicate the best (highest) result in each column.
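The best QU-Net result in Table 5 comes from the soft-voting ensemble, which averages the predicted class probabilities of the three tree-based models. A minimal scikit-learn sketch of such an ensemble follows; the hyperparameters are illustrative placeholders, not the values used in this work, and the xgboost and lightgbm packages are assumed to be installed.

```python
# Minimal sketch of a soft-voting ensemble over RF, XGBoost, and LightGBM.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=300, random_state=0)),
        ("xgb", XGBClassifier(n_estimators=300, eval_metric="logloss", random_state=0)),
        ("lgbm", LGBMClassifier(n_estimators=300, random_state=0)),
    ],
    voting="soft",  # average class probabilities rather than hard labels
)
# X: embeddings (optionally concatenated with metadata); y: benign/malignant labels.
# ensemble.fit(X_train, y_train); predictions = ensemble.predict(X_test)
```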
Table 6. Comparison of deep learning models for skin lesion classification on ISIC-2017. Our results represent the performance of baseline classifiers on quantum embeddings.

| References | Method | Description | F1-Score |
| [15] | Modified VGG-16 | Transfer learning approach | 71% |
| [13] | Inception-ResNet-v2 + EfficientNet-B4 | Ensemble with Soft-Attention | 79% |
| Ours | QU-Net (quantum-enhanced U-Net) | Quantum embeddings + metadata + baseline classifier | 79.03% |
| [14] | ResNet50/VGG19/ResNeXt/ViT | Baseline comparison models | 85.2%, 83.5%, 86%, 87% |
| [14] | Improved ViT | Weighted loss + lesion-focused regularization | 88.4% |
Table 7. F1-score comparison of U-Net vs. QU-Net (no metadata).

| Classifier | U-Net F1 | QU-Net F1 | Difference (Q − U) |
| Random Forest | 71.11 | 72.97 | +1.86 |
| LightGBM | 72.29 | 73.23 | +0.94 |
| XGBoost | 69.77 | 73.73 | +3.96 |
| Logistic Regression | 70.78 | 71.80 | +1.02 |
| K-Nearest Neighbors | 72.39 | 71.87 | −0.52 |
| Soft-Voting (RF-XGB-LGBM) | 71.00 | 72.78 | +1.78 |
Bold values indicate the best (highest) result in each row.
Table 8. Comparison of classification performance on U-Net and QU-Net embeddings with and without metadata.

| Classifier | U-Net, With Metadata | U-Net, Without Metadata | QU-Net, With Metadata | QU-Net, Without Metadata |
| Random Forest | F1: 74.14 / Acc: 81.33 | F1: 71.11 / Acc: 80.00 | F1: 75.86 / Acc: 80.17 | F1: 72.97 / Acc: 80.00 |
| LightGBM | F1: 72.93 / Acc: 79.33 | F1: 72.29 / Acc: 80.00 | F1: 76.68 / Acc: 80.17 | F1: 73.23 / Acc: 78.83 |
| XGBoost | F1: 77.38 / Acc: 81.33 | F1: 69.77 / Acc: 77.33 | F1: 78.27 / Acc: 80.50 | F1: 73.73 / Acc: 78.33 |
| Logistic Regression | F1: 70.78 / Acc: 79.33 | F1: 70.78 / Acc: 79.33 | F1: 73.07 / Acc: 80.17 | F1: 71.80 / Acc: 80.50 |
| K-Nearest Neighbors | F1: 70.43 / Acc: 74.00 | F1: 72.39 / Acc: 76.00 | F1: 76.05 / Acc: 78.33 | F1: 71.87 / Acc: 76.00 |
| Soft-Voting (RF-XGB-LGBM) | F1: 73.33 / Acc: 80.00 | F1: 71.00 / Acc: 78.00 | F1: 79.03 / Acc: 81.83 | F1: 72.78 / Acc: 78.83 |
All values in %. Bold values indicate the best result between the with- and without-metadata configurations for each model type (U-Net/QU-Net) per row.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
