1. Introduction
Alzheimer’s disease (AD) is a fatal degenerative dementia [1]. It affects patients’ memory, thinking, and communication abilities, and accounts for 60% to 70% of progressive cognitive impairment cases in the elderly [2]. Approximately 50 million individuals worldwide are currently affected by Alzheimer’s disease and related dementias. Owing to rising life expectancy, the global patient population is expected to reach 139 million by 2050 [3,4], which places a significant burden on society, on patients, and on their families. Alzheimer’s disease progresses slowly, and there is still no effective treatment to prevent, halt, or reverse it [5]. In this context, artificial intelligence is emerging as a promising tool not only for enhancing the efficiency and effectiveness of healthcare services, but also for providing optimal strategies to clinicians [6]. In early diagnosis specifically, functional magnetic resonance imaging (fMRI) and deep learning methods play a critical role [7]. Functional MRI is a neuroimaging technique that uses measurements of cerebral blood flow to visualize brain activity [8]. It has been widely used for the automated diagnosis of brain disorders and provides a rich source of information about the brain’s functional connectivity. Recent studies suggest that functional connectivity undergoes extensive changes in AD, providing important clues for understanding the disease mechanism and searching for effective treatments. However, diagnostic accuracy remains a challenge, and there is a need for non-invasive biomarkers that can detect prodromal or early pathophysiological changes in AD. The blood oxygen level-dependent (BOLD) signal in fMRI, which reflects underlying vascular and respiratory factors in the brain, has recently been investigated as a potential biomarker for AD. The variability of the BOLD signal, especially at cardiorespiratory frequencies, has been found to increase in AD patients and may be associated with impairment of the glymphatic clearance system, which clears soluble proteins and metabolites from the central nervous system. Investigating BOLD signal variability could therefore provide important insights into the pathophysiological changes of AD. The study by Tuovinen et al. [9] provides evidence that BOLD signal variability measured with fMRI can serve as a non-invasive, highly sensitive biomarker for diagnosing incipient AD, which could contribute to the development of effective treatments and early interventions.
The data generated by fMRI are typically presented as images that represent the signal intensity of brain regions at distinct time points. These images are commonly aligned with standard brain region templates and processed into adjacency matrices, in which each value signifies the connection strength between the corresponding pair of brain regions. Deep learning methods such as convolutional neural networks (CNNs) have shown promising classification results on such matrices. However, CNN convolution and pooling operations rely on the local similarity of Euclidean data, whereas brain networks, as represented by their adjacency matrices, exhibit non-Euclidean properties owing to the complex, irregular, and high-dimensional structure of the underlying connectivity patterns. CNNs therefore cannot effectively capture all the properties of brain networks. In contrast, a Graph Neural Network (GNN) excels at handling such complex structures. GNNs operate directly on the graph structure and aggregate, or ‘pool’, information from a node’s neighbors, that is, the nodes to which it is directly connected. This aggregation forms a new representation of the node that encapsulates information from its local neighborhood. As a result, GNNs are particularly well suited to non-Euclidean data and have recently been widely used for brain network classification. However, many GNNs have limited representation capabilities and cannot distinguish certain simple graph structures [10]. Moreover, they struggle to explain classification results in a neuroscientifically interpretable manner [11]. Among GNNs, the Graph Isomorphism Network (GIN), a variant introduced by Xu et al. [10], has demonstrated superior performance. The GIN builds upon traditional GNNs by better capturing the rich topological information present in graph data. It employs a sum-based update rule that combines each node’s own features with those of its neighbors before passing the result through a multilayer perceptron, ensuring a more comprehensive representation of graph structures. Additionally, GIN is designed to effectively distinguish different graph structures, an ability not shared by all GNN models. This capacity is particularly crucial in tasks where understanding the detailed structure and relationships between nodes is vital. Most notably, GIN’s strength lies in its discriminative power, which Xu et al. showed to match that of the Weisfeiler-Lehman graph isomorphism test: it maps isomorphic graphs to the same representation and distinguishes most non-isomorphic ones. This feature is integral to its success in tasks such as chemical compound classification, social network analysis, and other applications where subtle variations in graph structure can have significant impacts. While GINs provide a powerful tool for graph data representation, they can be further enhanced by coupling them with multi-task learning (MTL) [12]. MTL is a machine learning approach that improves model performance by learning multiple related tasks simultaneously. In contrast to learning each task independently, MTL allows a model to share knowledge across tasks, which helps it learn general features and patterns and thereby improves generalization. In our research, incorporating MTL into the GIN model allows the model to learn not only from the graph structure, but also from the related tasks of age prediction and gender classification. This, in turn, can provide a more holistic view of Alzheimer’s disease and contribute to improved classification and diagnosis. In neuroimaging research, age and gender have been shown to be related to the occurrence and progression of Alzheimer’s disease. Age is the greatest risk factor, with the incidence rate increasing significantly with age [13]. Gender differences may also affect structural and functional changes in the brain, thereby influencing the incidence and progression of the disease [14]. Therefore, incorporating age prediction and gender classification as auxiliary tasks can help us better classify and diagnose Alzheimer’s disease. Although multi-task learning [12] has been successfully applied in Alzheimer’s disease classification research, previous studies have not adequately addressed task conflict during training, particularly in terms of dynamic weight balancing [15]. Each task may push the shared model parameters in a different direction, leading to interference between tasks and adversely affecting model performance.
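As a rough illustration of the pipeline described above, the following minimal sketch builds a binary adjacency matrix from toy regional time series and applies one GIN-style update. It is a sketch under stated assumptions, not the model used in this work: the helper names (`pearson`, `connectivity_graph`, `gin_layer`, `mean_layer`), the absolute-correlation threshold of 0.5, the constant node features, and the identity function standing in for the learned multilayer perceptron are all hypothetical choices made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long, non-constant time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def connectivity_graph(series, threshold=0.5):
    """Connect two regions when |correlation| exceeds the (assumed) threshold."""
    n = len(series)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(series[i], series[j])) > threshold:
                adj[i][j] = adj[j][i] = 1
    return adj

def gin_layer(adj, feats, eps=0.0, mlp=lambda v: v):
    """GIN-style update: h_v' = mlp((1 + eps) * h_v + sum of neighbor features).
    The identity default for `mlp` is a stand-in for a learned network."""
    out = []
    for v, row in enumerate(adj):
        agg = [(1 + eps) * x for x in feats[v]]
        for u, connected in enumerate(row):
            if connected:
                agg = [a + b for a, b in zip(agg, feats[u])]
        out.append(mlp(agg))
    return out

def mean_layer(adj, feats):
    """Mean aggregation over a node and its neighbors, for comparison."""
    out = []
    for v, row in enumerate(adj):
        group = [feats[v]] + [feats[u] for u, c in enumerate(row) if c]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

# Toy signals for four hypothetical brain regions (five time points each).
series = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 4, 3, 2, 1],
    [1, -1, 1, -1, 1],
]
adj = connectivity_graph(series)
feats = [[1.0] for _ in series]   # identical features on every node
gin_out = gin_layer(adj, feats)   # [[3.0], [3.0], [3.0], [1.0]]
mean_out = mean_layer(adj, feats) # [[1.0], [1.0], [1.0], [1.0]]
```

With identical node features, mean aggregation gives every node the same value, so the isolated region is indistinguishable from the three connected ones. The sum-based update keeps the neighborhood size and separates them, which is the kind of structural distinction the text attributes to GIN.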
The core motivation and objective of this work is to develop an efficient classification method for Alzheimer’s disease that can play a crucial role in early diagnosis. We identified limitations of traditional machine learning and graph neural networks in processing graph data, and we acknowledge potential weight imbalance issues in multi-task learning for Alzheimer’s disease classification. We also emphasize the interpretability of the model, which is of paramount importance in neuroscience because it aids in understanding the underlying mechanisms of the disease and provides clues for finding effective treatments.
A new Dynamic Multi-Task Graph Isomorphism Network (DMT-GIN) model for Alzheimer’s disease classification was developed. The main component of the model is a Graph Isomorphism Network (GIN), whose primary task is the classification of Alzheimer’s disease from fMRI images. An improved GIN was adopted to analyze the brain functional network and learn node features on the adjacency matrices, which significantly enhanced the network’s representational capability. In addition, age prediction and gender classification were incorporated as auxiliary tasks to guide the primary classification task. The GradNorm algorithm was employed to dynamically adjust the weights of the different tasks during model training, which accelerated the learning process, reduced interference between tasks, and improved learning efficiency.
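The dynamic weighting idea can be sketched as a single GradNorm-style weight update. This is a simplified illustration and not the training code of this work: the function name `gradnorm_step`, the hyperparameters `alpha` and `lr`, and all numeric values are assumptions. In real training, the per-task gradient norms would be obtained by automatic differentiation with respect to the shared layer; here they are given as plain numbers.

```python
def gradnorm_step(weights, grad_norms, losses, initial_losses,
                  alpha=1.5, lr=0.025):
    """One simplified GradNorm-style update of the task weights.

    weights        -- current loss weight w_i of each task
    grad_norms     -- norm of each unweighted task-loss gradient on the shared layer
    losses         -- current loss of each task
    initial_losses -- loss of each task at the start of training
    """
    n = len(weights)
    # Weighted gradient norm G_i = w_i * ||grad L_i|| and its average.
    weighted = [w * g for w, g in zip(weights, grad_norms)]
    mean_norm = sum(weighted) / n
    # Relative inverse training rate: above 1 means the task is learning slowly.
    ratios = [l / l0 for l, l0 in zip(losses, initial_losses)]
    mean_ratio = sum(ratios) / n
    rel_rates = [r / mean_ratio for r in ratios]
    # Target gradient norm for each task, treated as a constant.
    targets = [mean_norm * r ** alpha for r in rel_rates]
    # Gradient of sum_i |G_i - target_i| with respect to w_i is sign(G_i - target_i) * g_i.
    updated = []
    for w, g, big_g, t in zip(weights, grad_norms, weighted, targets):
        sign = (big_g > t) - (big_g < t)
        updated.append(max(w - lr * sign * g, 1e-6))  # keep weights positive
    # Renormalize so that the weights sum to the number of tasks.
    scale = n / sum(updated)
    return [w * scale for w in updated]

# Hypothetical state for three tasks: AD classification, age, gender.
weights = [1.0, 1.0, 1.0]
grad_norms = [1.0, 4.0, 1.0]       # the age task dominates the shared gradient
losses = [0.9, 0.2, 0.5]           # the AD task has improved the least
initial_losses = [1.0, 1.0, 1.0]
new_weights = gradnorm_step(weights, grad_norms, losses, initial_losses)
```

In this toy state the slowly improving classification task receives a larger weight and the dominant, fast-learning age task receives a smaller one, while the weights still sum to the number of tasks.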
The remainder of this paper is structured as follows.
Section 2 presents the related work.
Section 3 introduces our proposed methodology, including the main model, the network architecture employed, and the integration of multi-task learning.
Section 4 provides a comparison of our model with other models, as well as ablation experiments for the model itself.
Section 5 offers a discussion and conclusion.
5. Discussion and Conclusions
A DMT-GIN model is proposed for the classification of Alzheimer’s disease from fMRI images. The novelty of this model lies in three aspects. Firstly, the node features of brain network data are learned by a GIN with a self-attention mechanism in the readout layer, which effectively captures the spatial information and topological structure in fMRI images. Secondly, the model adopts a multi-task training approach, with age prediction and gender classification as auxiliary tasks that guide the classification of Alzheimer’s disease. Lastly, the GradNorm algorithm is used to dynamically allocate weights to the different tasks, adaptively balancing their training and thus improving overall performance. To enhance the interpretability of the model, the SHAP method is employed to identify the key brain regions that contribute significantly to the Alzheimer’s disease classification task.
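The attribution idea behind SHAP can be illustrated with an exact Shapley value computation on a toy model. The sketch below does not use the SHAP library and is not the analysis performed in this work: `shapley_values`, `toy_model`, the three hypothetical region features, and the all-zero baseline are assumptions chosen so that the result can be checked by hand. Exact enumeration is feasible only for a handful of features, and SHAP relies on approximations for realistic models.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over all orders in which features are switched from baseline to x."""
    n = len(x)
    values = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]
            new = model(current)
            values[i] += new - prev
            prev = new
    return [v / len(orders) for v in values]

def toy_model(regions):
    """Hypothetical score: region 0 acts alone, regions 1 and 2 interact."""
    return 2.0 * regions[0] + regions[1] * regions[2]

x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)  # [2.0, 0.5, 0.5]
```

The attributions sum to the difference between the model output at the input and at the baseline, and the interaction between the second and third features is split evenly between them. Ranking brain regions by such values is what allows the most influential regions to be identified.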
The method was compared with seven baseline models, and ablation experiments were conducted on the proposed framework for an extensive performance evaluation. In terms of accuracy, recall, and AUC, the experimental results show that DMT-GIN significantly outperforms the baseline models.
Our research has certain limitations. The first is the limited scale of the dataset: we used publicly available datasets for experimentation, and these are relatively small. To fully validate the performance of the model, future research could use larger and more diverse datasets. Second, while we have incorporated age and gender as auxiliary tasks, other auxiliary tasks may carry higher informational value. These could draw on variables such as occupation, family environment, genetic information, lifestyle, and health-related factors such as smoking status, respiratory problems, hypertension, and diabetes, which are recognized as key risk factors for numerous health conditions. Such variables have been successfully employed in past studies, as exemplified by the application of photoplethysmography signals for detecting cardiorespiratory disorders [57]. Including a wider range of auxiliary tasks could not only enrich our understanding of the subjects, but also enhance the predictive capabilities of our model. Future research could therefore explore additional auxiliary tasks to further optimize model performance. Finally, we recognize the potential value of incorporating anomaly detection methodologies into future research. Given the complexity and nonlinearity of the data structures involved, anomaly detection could provide valuable insights and opportunities for performance optimization.