Predicting Brain Age and Gender from Brain Volume Data Using Variational Quantum Circuits

The morphology of the brain undergoes changes throughout the aging process, and accurately predicting a person’s brain age and gender using brain morphology features can aid in detecting atypical brain patterns. Neuroimaging-based estimation of brain age is commonly used to assess an individual’s brain health relative to a typical aging trajectory, while accurately classifying gender from neuroimaging data offers valuable insights into the inherent neurological differences between males and females. In this study, we aimed to compare the efficacy of classical machine learning models with that of a quantum machine learning method called a variational quantum circuit in estimating brain age and predicting gender based on structural magnetic resonance imaging data. We evaluated six classical machine learning models alongside a quantum machine learning model using both combined and sub-datasets, which included data from both in-house collections and public sources. The total number of participants was 1157, ranging from ages 14 to 89, with a gender distribution of 607 males and 550 females. Performance evaluation was conducted within each dataset using training and testing sets. The variational quantum circuit model generally demonstrated superior performance in estimating brain age and gender classification compared to classical machine learning algorithms when using the combined dataset. Additionally, in benchmark sub-datasets, our approach exhibited better performance compared to previous studies that utilized the same dataset for brain age prediction. Thus, our results suggest that variational quantum algorithms demonstrate comparable effectiveness to classical machine learning algorithms for both brain age and gender prediction, potentially offering reduced error and improved accuracy.


Introduction
Neuroimaging-derived brain age serves as a valuable biomarker for monitoring the progression of brain-related conditions and aging [1]. This metric, often termed "brain age," is obtained by applying machine learning algorithms to magnetic resonance imaging (MRI) data to predict an individual's chronological age. The disparity between the predicted brain age and the actual chronological age reflects deviations from typical aging trajectories and is used to assess brain health [1]. Elevated brain age relative to chronological age has been correlated with diminished cognitive abilities. Moreover, neurological and psychiatric conditions, such as Alzheimer's disease [2], mild cognitive impairment [2], focal epilepsy [3], multiple sclerosis [4], traumatic brain injury [5], schizophrenia [6,7], bipolar disorder [8], and major depressive disorder [9], have been associated with an increased brain age difference. These findings underscore the significance of the brain age difference as a biomarker for assessing brain health. The number of publications on this topic is increasing every year [1].
Gender classification based on neuroimaging data has emerged as a crucial area of research with significant implications across various domains, including neuroscience, medicine, and psychology [10-12]. The ability to accurately classify gender from neuroimaging data offers valuable insights into the inherent neurological differences between males and females [10-12]. For instance, Flint et al. [10] demonstrated an increased misclassification in transgender women when employing structural MRI data for biological sex classification. Understanding these characteristics is essential for unraveling the complexities of brain structure, function, and development, as well as for addressing gender-related disparities in health and cognition. Moreover, gender classification from neuroimaging data contributes to the elucidation of sex-specific brain disorders and conditions. This capability enables researchers and clinicians to discern gender-specific patterns, thereby facilitating early detection, intervention, and treatment of neurological disorders that may manifest differently between males and females.
Quantum machine learning has emerged as a promising tool to enhance classical machine learning techniques [24]. Research indicates that both quantum and quantum-inspired computing models have the potential to optimize the training process of conventional models, resulting in improved prediction accuracy for target functions with reduced iteration requirements [25,26]. Several studies have highlighted the practical advantages of quantum machine learning algorithms, demonstrating their superior performance over classical counterparts in predicting complex medical outcomes [25] and image restoration [26]. Among various quantum machine learning methods, parameterized quantum circuits (PQCs), variational quantum circuits (VQCs), or quantum neural networks (QNNs) stand out as particularly promising. For instance, researchers have utilized hybrid quantum neural networks to discover drug molecules [25] and recover contaminated ghost images [26], showcasing superior performance compared to classical counterparts with fewer iterations and higher accuracy, especially when dealing with limited datasets. This suggests their potential for addressing pharmacological and medical challenges, such as predicting patient responses to different medications or evaluating patient prognosis and diagnosis.
This study investigates the application of VQC in predicting brain age and gender using brain morphological features derived from structural MRI data. To the best of our knowledge, these applications represent a novel endeavor. We aim to assess the performance of VQC in comparison with classical machine learning algorithms. The goal of this study is to explore the potential of quantum machine learning models in predicting brain age and gender based on brain morphometric data, providing insights into age-related disorders.

Description of Dataset
In this study, we utilized three primary datasets: the IXI dataset (n = 563, age range 18-88 years, https://brain-development.org [14] (accessed on 27 April 2023)), the CAU dataset (n = 156, age range 55-83 years [15]), and an in-house collected dataset (n = 438, age range 14-89 years [27-29]). All participants included in our analysis underwent careful screening following local study protocols to confirm their status as healthy individuals without a history of neurological, psychiatric, or major medical conditions. T1-weighted MRI scans were acquired using either 1.5T or 3T scanners. Detailed information regarding the acquisition protocols for each dataset can be found in the corresponding references [14,15,27-29]. Figure 1 and Table 1 provide an overview of the age and gender distributions across our datasets. The distributions for each individual dataset are provided in Supplementary Figure S1 and Tables S1-S3. Ethical approvals and informed consents were locally obtained for each dataset to ensure compliance with relevant research ethics guidelines.

Image Processing and Feature Extraction
The structural brain T1-weighted MRI scans of all subjects were processed using FastSurfer v2.1.0 [30], except for the CAU dataset, which had been processed using FreeSurfer [31] run on Ubuntu Linux operating system version 22.04 LTS and was provided in a spreadsheet format, not as raw images. FastSurfer, an alternative version of FreeSurfer, employs deep learning techniques for structural MRI processing. The FastSurfer brain segmentations were carried out on Google Colab using the 'Tutorial_FastSurferCNN_QuickSeg.ipynb' notebook. In brief, cortical and subcortical segmentation for each subject was conducted on their T1-weighted image through a series of steps, including skull stripping, segmentation of cortical gray and white matter, and identification of subcortical structures. Further technical details about the pipeline can be found in reference [30]. Notably, this method is highly efficient, taking only a few minutes per subject.
This study utilized estimated subcortical and cortical volume parcellation data. Based on previous studies [14,15], we selected 34 segmentation features from the available 95 labels (refer to Table 2), and later reduced these to 17 features using principal component analysis (PCA) decomposition for age and gender prediction models. This reduction was partly necessitated by the limited number of qubits available for quantum machine learning algorithms.
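The reduction from 34 volumes to 17 components can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the study's code: the study names scikit-learn's MinMaxScaler and PCA, whereas the helpers `min_max_scale` and `pca_reduce` and the synthetic 100-by-34 matrix below are hypothetical stand-ins. The ordering (scaling first, then PCA) follows the description in the Model Training and Evaluation section.

```python
import numpy as np

# Synthetic stand-in for 100 subjects x 34 regional volumes (not real data).
rng = np.random.default_rng(0)
X = rng.normal(loc=5000.0, scale=800.0, size=(100, 34))

def min_max_scale(X):
    """Scale every column to the range [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)

def pca_reduce(X, n_components):
    """Project centered data onto its leading principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)  # explained-variance ratio per component
    return Xc @ vt[:n_components].T, explained[:n_components]

X_scaled = min_max_scale(X)
X_reduced, ratio = pca_reduce(X_scaled, 17)  # one component per qubit
print(X_reduced.shape)
```

In practice the scaling limits and the PCA projection would usually be estimated on the training set only and then applied to the test set; the text does not state which convention was followed.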

Machine Learning Algorithms
Brain age and gender prediction were performed using the scikit-learn library [32] for classical machine learning algorithms and the tensorcircuit package [33] for quantum machine learning algorithms. The tensorcircuit package was selected for its efficiency and ability to utilize a relatively large number of qubits in our experimental environment, allowing us to employ up to 17 qubits in our case. All machine learning algorithms were executed on Google Colab.
In quantum machine learning models, we used variational quantum circuits for age prediction and gender classification tasks. Our VQC model was implemented based on the 'Quantum Machine Learning for Classification Tasks' tutorial notebook [33]. We replaced the Ising ZZ coupling gates of the tutorial circuit with CNOT gates (Figure 2). The quantum logic gates used in this study are detailed in Table 3. The quantum circuit in Figure 2 was drawn using the PennyLane framework [34].

Model Training and Evaluation
Before applying PCA embedding to the model input, we utilized MinMaxScaler from the scikit-learn library to scale the features between zero and one. This normalized feature vector serves as the input for both classical and quantum machine learning algorithms. Focusing on the quantum model, the feature vector underwent transformation into a quantum layer within the VQC. This quantum layer comprised three components: embedding (PCA embedding was employed here), variational layers, and measurement. In our study, we utilized 17 qubits and constructed 10 repeated blocks for the VQC architecture (Figure 2). The normalized classical features were encoded into the quantum Hilbert space, with the resulting quantum state representing the input data from the preceding classical layer. Each variational layer within the VQC consisted of two parts: single-qubit rotations with trainable parameters and two-qubit controlled gates, implemented here as CNOT operations (Figure 2 and Table 3). The rotations transformed the encoded input data according to the variational parameters, whereas the CNOT operations entangled the qubits in the quantum layer, creating correlations that single-qubit rotations alone cannot produce. Each block contained three layers. In the measurement component, all qubits were measured and the results were summed at a single node. Subsequently, a sigmoid activation function was applied to produce the final output. Thus, the output of the VQC provided predictions of brain age or gender values. The performance of the VQC model was compared with that of classical machine learning models.
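To make the embedding, variational, and measurement stages concrete, the following is a small pure-Python statevector sketch. It is an illustration under several assumptions, not the tensorcircuit implementation used in the study: it uses 3 qubits instead of 17, encodes each feature as an RX rotation angle, applies the trainable Rx, Ry, Rz rotations before a chain of nearest-neighbour CNOT gates in each block, and reads out the Pauli-Z expectation of every qubit. The function names (`vqc_forward`, `apply_1q`, and so on) are hypothetical.

```python
import cmath
import math

def rx(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def rz(theta):
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q of a statevector stored as a list."""
    new, bit = list(state), 1 << q
    for i in range(len(state)):
        if (i & bit) == 0:
            a0, a1 = state[i], state[i | bit]
            new[i] = gate[0][0] * a0 + gate[0][1] * a1
            new[i | bit] = gate[1][0] * a0 + gate[1][1] * a1
    return new

def apply_cnot(state, control, target):
    """Swap the target-bit amplitudes wherever the control bit is 1."""
    new, c, t = list(state), 1 << control, 1 << target
    for i in range(len(state)):
        if (i & c) and not (i & t):
            new[i], new[i | t] = state[i | t], state[i]
    return new

def expect_z(state, q):
    """Expectation value of Pauli-Z on qubit q."""
    bit = 1 << q
    return sum((-1 if i & bit else 1) * abs(a) ** 2 for i, a in enumerate(state))

def vqc_forward(x, params):
    """x: one feature per qubit; params[block][qubit] = (a, b, c) angles."""
    n = len(x)
    state = [0j] * (2 ** n)
    state[0] = 1 + 0j
    for q, xi in enumerate(x):                 # embedding (assumed RX encoding)
        state = apply_1q(state, rx(xi), q)
    for block in params:                       # variational blocks
        for q, (a, b, c) in enumerate(block):  # trainable rotations
            for gate in (rx(a), ry(b), rz(c)):
                state = apply_1q(state, gate, q)
        for q in range(n - 1):                 # entangling CNOT chain
            state = apply_cnot(state, q, q + 1)
    total = sum(expect_z(state, q) for q in range(n))  # sum at a single node
    return 1.0 / (1.0 + math.exp(-total))               # sigmoid output

x = [0.1, 0.5, 0.9]
params = [[(0.1, 0.2, 0.3)] * 3 for _ in range(2)]
print(vqc_forward(x, params))
```

The sigmoid output lies between 0 and 1, so it can be read directly as a class probability; for age regression it would have to be compared against a rescaled age target, a detail the text does not specify. A 17-qubit state has 2^17 amplitudes, which is why a dedicated simulator is preferable to a pure-Python loop at full size.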
For model training, the preprocessed data were shuffled and split once into training (80%) and testing (20%) sets. We selected this split ratio to ensure sufficient data for training the model while preserving a reasonable portion for testing purposes. The split was performed randomly.
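A single random 80/20 split can be sketched as below. The helper `train_test_split_indices`, the seed, and the rule of rounding the test count up are assumptions made for illustration; the rounding rule was chosen because it reproduces the 925-sample training set reported for the combined dataset of 1157 participants.

```python
import math
import random

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices once and return (train, test) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(test_fraction * n_samples)  # assumed rounding rule
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = train_test_split_indices(1157)
print(len(train_idx), len(test_idx))  # 925 232
```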
To enhance the representativeness of the dataset and mitigate inadvertent biases, the order of the samples in the training set was shuffled at each epoch, whereas it remained unchanged in the test set. For optimization, we employed the adaptive moment estimation (Adam) optimizer with a learning rate of 0.01.
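For readers unfamiliar with the optimizer, the Adam update rule is sketched below on a toy one-parameter problem. Only the learning rate of 0.01 comes from the text; the decay rates (0.9 and 0.999) and epsilon (1e-8) are the commonly used defaults and are assumed here, and `adam_minimize` and the quadratic objective are hypothetical.

```python
import math

def adam_minimize(grad_fn, w0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Minimize a scalar function with Adam, given its gradient."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g        # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Toy objective f(w) = (w - 3)^2 with gradient 2 (w - 3); the minimum is at w = 3.
w_opt = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_opt)
```

In the VQC setting the scalar `w` would be replaced by the full vector of rotation angles, with the same update applied elementwise.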
To evaluate the effectiveness of the brain age prediction model, we primarily used the mean absolute error (MAE) metric. This metric measures the discrepancy between the predicted brain age (ŷ) and the corresponding chronological age (y) for each sample in our dataset. The MAE is defined as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$

where N is the number of samples in the dataset. Lower MAE values indicate better model performance. Other regression metrics, such as the mean squared error (MSE), root mean squared error (RMSE), and R-squared (R²), were also computed.
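The four regression metrics can be computed as in the following sketch. The function `regression_metrics` and the three toy age pairs are illustrative assumptions, not values from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R-squared for paired true/predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - (mse * n) / ss_tot                    # 1 - SS_res / SS_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

ages = [20.0, 40.0, 60.0]        # toy chronological ages
predicted = [22.0, 38.0, 63.0]   # toy predicted brain ages
metrics = regression_metrics(ages, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, reported pairs of these two values can be checked against each other.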
On the other hand, to evaluate the performance of the gender classification model, we primarily used the accuracy score defined as follows:

$$\mathrm{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$$
Other classification metrics such as precision, recall, and F1-score were also estimated. All models were implemented in Python and executed on Google Colab. The classical machine learning algorithms were implemented with scikit-learn, while the quantum machine learning algorithm was implemented with the tensorcircuit framework. Additionally, we conducted an experiment by training the classical and quantum machine learning models with the same hyperparameters on varying sizes of training data, including 57, 115, 231, 462, 694, and 925 samples.
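Accuracy, precision, recall, and F1-score for a binary task can be computed from the confusion counts, as in this sketch. The function `classification_metrics`, the choice of label 1 as the positive class, and the toy labels are assumptions; the text does not state which gender was coded as positive.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # toy labels
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]  # toy predictions
scores = classification_metrics(y_true, y_pred)
print(scores)
```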

Algorithm Performance for Brain Age Prediction
The performance of each algorithm in the combined dataset is depicted in Figure 3 and Table 4 for both the training set (left four panels) and the hold-out test set (right four panels). Additional metrics, including mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and R-squared (R²), are presented. For a more comprehensive view, performance metrics for various training sample sizes are detailed in Supplementary Tables S4-S8. The prediction performance varied with the regression algorithms. When the training sample size was 925, the best prediction performance was achieved using VQC (MAE = 6.744, MSE = 80.092, RMSE = 8.949, and R² = 0.798), whereas the worst perfor-

Algorithm Performance for Gender Prediction
The gender classification performance of each algorithm on the combined dataset is visualized in Figure 4 and summarized in Table 5 for both the training and holdout test sets. The prediction performance varied across classification algorithms. Key metrics such as accuracy, precision, recall, and F1-score values are presented. The detailed results for the different training sample sizes are shown in Supplementary Tables S9-S13.
In gender classification tasks, accuracy, precision, recall, and F1-score values varied across different training sample sizes. For instance, with a training sample size of 925, accuracy ranged from 0.753 to 0.818, with the highest accuracy achieved by VQC and the lowest by XGBoost. Similarly, precision values ranged between 0.754 and 0.885, recall values ranged between 0.744 and 0.815, and F1-score values ranged from 0.770 to 0.837. VQC consistently demonstrated the best prediction performance across various sample sizes, whereas XGBoost exhibited the lowest performance. The performance trends for the other sample sizes followed a similar pattern, with VQC consistently outperforming the other algorithms in terms of accuracy, precision, recall, and F1-score metrics.

Comparative Study for Brain Age Prediction
For comparative analysis, we constructed a VQC model to predict brain age using the IXI and CAU sub-datasets. The model's performance metrics were as follows: in the IXI dataset, the model achieved an MAE of 6.265, MSE of 65.812, RMSE of 8.106, and R² of 0.759 on the training set (N = 450), and an MAE of 7.201, MSE of 83.074, RMSE of 9.114, and R² of 0.679 on the test set (N = 113) (Table 6).
degraded notably with smaller sample sizes, indicating susceptibility to overfitting or inadequate model complexity. In contrast, the multi-layer perceptron (MLP) demonstrated robust performance across various sample sizes, indicating its adaptability to diverse dataset characteristics.
Similarly, in gender prediction tasks, the observed variations in accuracy, precision, recall, and F1-score values across different training sample sizes underscore the significance of both algorithm selection and dataset characteristics. Once again, VQCs consistently outperformed the classical machine learning algorithms across varying sample sizes, achieving superior metrics for all performance measures. This consistent superiority highlights the potential of quantum machine learning (QML), particularly VQCs, in gender classification tasks, which we attribute to its ability to capture complex data relationships and generalize across different sample sizes.
Furthermore, the comparative analysis of the VQC model's performance in predicting brain age using IXI and CAU sub-datasets provided valuable insights into its effectiveness across diverse datasets. Our findings indicate that VQC outperforms previous studies [14,15] that utilized Automatic Relevance Determination (Table 6) and Bayesian Ridge (Table 7) algorithms, achieving better brain age prediction metrics using brain morphometric data. These results suggest the superiority of VQC in accurately predicting brain age across different datasets.
One interesting finding is that, although VQC did not demonstrate superior performance compared to other algorithms in the training set for both brain age prediction and gender classification tasks, it exhibited excellent performance in the test set (Figures 3 and 4). This result implies that quantum machine learning (QML) may possess better generalization capabilities than classical machine learning (CML) algorithms. A quantum advantage may have contributed to this enhanced performance [35].
Overall, our study contributes to the expanding body of literature on QML applications in healthcare and neuroscience. While our findings demonstrate promising results for VQC in brain age regression and gender classification tasks, further research is warranted to explore its generalizability and integration into clinical practice for neurological research.
This study has several limitations. First, we could not demonstrate that our model outperforms the deep-learning-based models reported in previous studies [17-23]. For instance, in the brain age prediction task, Wang et al. [36] examined a T1-weighted MRI dataset of 3688 dementia-free participants with a mean age of 66 years, utilizing a convolutional neural network (CNN) deep learning algorithm to predict brain age. They achieved a mean absolute error (MAE) of 4.45 years. Hwang et al. [18] explored the feasibility and clinical relevance of brain age prediction using axial T2-weighted images of healthy subjects with a deep CNN model. The CNN model was trained with 1530 scans, and performance was evaluated as the MAE between the predicted age and the chronological age on internal and external test datasets. The model showed MAEs of 4.22 years in the internal test set and 9.96 years in the external test set. Mendes et al. [11] employed two public datasets, ABIDE-II and ADHD-200, comprising healthy controls (HC, N = 894), autism spectrum disorder (ASD, N = 251), and attention deficit hyperactivity disorder (ADHD, N = 357) individuals, for age prediction and gender classification tasks. They utilized T1-weighted sMRI scans and preprocessed gray and white matter images using Voxel-Based Morphometry (VBM), and subsequently trained models with 3D convolutional neural networks (CNNs). Their best-performing model, trained on the ADHD-200 dataset, achieved an MAE of 1.43 years and an R² score of 0.62 for age prediction on the test set. For gender classification, the model achieved an AUC-ROC of 0.85, with precision, recall, and F1-score values of 0.84, 0.81, and 0.83, respectively. Conversely, when using the ABIDE-II dataset, the age prediction model yielded an MAE of 1.63 and an R² score of 0.54, while the gender classification model achieved an AUC-ROC of 0.82, with precision, recall, and F1-score values of 0.87, 0.80, and 0.83, respectively.
As our study did not employ the same datasets as those mentioned above, a direct comparison is difficult. However, it appears that the deep-learning-based studies cited above demonstrated higher performance metrics than ours, likely owing to commonalities in their methodologies. Specifically, many of these studies minimized or completely avoided the preprocessing steps, trained deep learning models directly on raw images, or used minimal transformations. In contrast, our study involved preprocessing to extract brain morphometry features, and the limited number of qubits available for VQCs prevented us from training the model on all the features, potentially leading to information loss. To address these challenges, hybrid approaches that combine classical and quantum machine learning [25,26] and employ techniques such as quanvolutional neural networks [37,38] or data reuploading [39] could potentially yield better results. In addition, our study did not demonstrate the clinical utility of age prediction and gender classification, which may require disease-specific or atypical data. Therefore, future research should focus on applying improved models to a broader range of applications, including clinical scenarios, to demonstrate their practical relevance.

Conclusions
In conclusion, our study compared quantum and classical machine learning algorithms for brain age regression and gender classification. We found that variational quantum circuits (VQCs) outperformed or were comparable to classical algorithms across both tasks. Despite this performance, limitations such as information loss due to preprocessing and qubit constraints were noted. Future research should explore hybrid approaches or advanced techniques to address these challenges and demonstrate their practical relevance in clinical scenarios.

Figure 1. Age and sex distribution across datasets: (A) the datasets exhibit bimodal-like age distribution (e.g., young and elderly). (B) Sex distribution across the datasets reveals a balanced representation of male and female samples.

Figure 2. Variational quantum circuit (VQC) architecture for brain age regression and gender classification. T1-weighted structural MRI data undergo segmentation and feature selection, resulting in 34 features. These features are normalized and reduced to 17 elements using principal component analysis (PCA). The 17 features are then fed into the VQC, where trainable operations (Rx, Ry, Rz) and CNOT operations are applied across 10 blocks. After measurements, the outputs are combined into a single layer for brain age prediction or gender classification. Note that the blue arrows represent the direction of forward processing, and the blue circles denote individual features.

Table 3. Quantum logic gates used in this study.

Figure 3. Relationship between training sample sizes and performance of classical (LR, BR, XGB, RF, SVR, MLP) and quantum machine learning (VQC) models for brain age predictions (plots display MAE (A,C), MSE (B,D), RMSE (E,G), and R² (F,H) values for both train (A,B,E,F) and test (C,D,G,H) sets against training size).

both the training and holdout test sets.The prediction performance varied across classification algorithms.Key metrics such as accuracy, precision, recall, and f1-score values are presented.The detailed results for the different training sample sizes are shown in Supplementary Tables S9-S13.

Figure 4. Relationship between training sample sizes and performance of classical (LR, KNN, XGB, RF, SVC, MLP) and quantum machine learning (VQC) models for gender predictions (plots display accuracy (A,C), precision (B,D), recall (E,G), and f1-score (F,H) for both train (A,B,E,F) and test (C,D,G,H) sets against training size).

Table 1. Demographics of subjects included in this study.

Table 2. The 34 selected features from MRI brain volume segmentation data.

Table 4. Age prediction performance of various machine learning regressors.

Table 5. Gender prediction performance of various machine learning classifiers.

Table 6. Comparative study of IXI dataset for brain age prediction in the training data (N = 450) and prediction performance (N = 113).