In this section, the methods applied to the classification of AD from medical imaging are reviewed in two categories. The first covers traditional machine learning approaches, and the second covers the deep learning-based methods employed in the classification of AD.
2.1. Machine Learning-Based Methods
Traditional machine learning methods offer several advantages for the classification of AD. Given large amounts of data, they can capture the trends associated with Alzheimer’s detection. Researchers have used these insights to address the issues associated with AD classification. The machine learning algorithms used for AD classification include K-Nearest Neighbors (KNN), Decision Trees, and Support Vector Machines (SVM), among others.
Machine learning relies on handcrafted features for AD classification. For instance, Gao et al. developed a novel method that uses a gray-level co-occurrence matrix for feature extraction and an Extreme Learning Machine (ELM) for the classification of AD [6]. A similar approach proposed by Sudharsan et al. uses an Informative Vector Machine (IVM), a Regularized Extreme Learning Machine (RELM), and an SVM for the classification of AD [7]. Additionally, Principal Component Analysis (PCA) was employed for feature selection and dimensionality reduction. Further, Yi et al. presented a method in which morphometric and texture features were extracted and classification was performed by an SVM with a Radial Basis Function (RBF) kernel [8].
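To make the texture-feature step concrete, the following is a minimal sketch of how a gray-level co-occurrence matrix (GLCM) can be computed and reduced to two common descriptors. It is an illustration under stated assumptions, not the implementation used in [6] or [8]: the function names, the toy image, the horizontal offset, and the choice of contrast and homogeneity as descriptors are all hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i is followed by gray level j at offset (dx, dy).

    This directional version is not symmetrized.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def texture_features(matrix):
    """Normalize the GLCM to probabilities and derive two texture descriptors."""
    total = sum(sum(row) for row in matrix)
    contrast = 0.0
    homogeneity = 0.0
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            p = count / total
            contrast += p * (i - j) ** 2            # large for abrupt intensity changes
            homogeneity += p / (1 + (i - j) ** 2)   # large for smooth regions
    return {"contrast": contrast, "homogeneity": homogeneity}


# Toy 4x4 "image" with 4 gray levels (hypothetical values, not real MRI data).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
toy_glcm = glcm(toy_image, levels=4)
toy_features = texture_features(toy_glcm)
```

In a pipeline of the kind described above, descriptors such as these would form the feature vector passed to a classifier such as an ELM or an SVM.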
A multilayer ensemble decision tree was presented by Naganjaneyulu et al. for the classification of AD [9]. A weighted feed-forward neural network and an improved decision tree for feature selection were designed for the classification of AD. Another ensemble approach for early AD diagnosis was proposed by Rohini et al., in which Naive Bayes, KNN, and SVM classifiers were combined for multiclass AD classification [10]. Cabrera-León et al. used resampling techniques to address class imbalance and compared non-neural ensemble networks with the counterpropagation network for AD classification [11]. Another method, built on a Gaussian discriminant analysis-based Computer-Aided Diagnosis (CAD) system, was presented by Fang et al. [12]. Feature selection methods based on variance analysis and incremental analysis were employed for AD screening.
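One simple way to combine several base classifiers is hard majority voting. The sketch below illustrates the idea only. The reviewed text does not state how the classifiers in [10] were combined, so the voting rule is an assumption, and the labels (CN, MCI, AD) and per-classifier outputs are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by hard voting."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest count; labels with
        # equal counts are ordered by first occurrence among the votes.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three base classifiers for five subjects
# (CN = cognitively normal, MCI = mild cognitive impairment, AD).
naive_bayes = ["CN", "MCI", "AD", "AD", "CN"]
knn = ["CN", "AD", "AD", "MCI", "MCI"]
svm = ["MCI", "AD", "AD", "MCI", "MCI"]

ensemble = majority_vote([naive_bayes, knn, svm])
```

Weighted or probability-based (soft) voting are common alternatives when the base classifiers differ in reliability.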
Machine learning methods are used in many AD classification studies. However, researchers have shifted to deep learning algorithms for better performance in AD classification using neuroimaging data. The main reason is that deep learning methods tend to provide better accuracy on diverse data. Furthermore, machine learning methods require domain knowledge for proper feature selection, whereas deep learning methods extract the important features automatically for precise classification.
2.2. Deep Learning-Based Methods
Several studies have applied deep learning approaches to the classification of AD. Convolutional Neural Networks (CNNs) are widely used in image-based disease diagnosis for three reasons: (1) They can handle a large number of contextual features, including pathological information. (2) Their processing is hierarchical and makes use of spatial relationships across the input image. (3) They are computationally efficient owing to their use of convolution, pooling, and parameter sharing.
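The three operations named in point (3) can be illustrated with a small pure-Python sketch. It is a toy example rather than a practical CNN layer: the 5x5 input, the 2x2 edge kernel, and the dense-layer comparison are assumptions chosen to keep the arithmetic easy to follow.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation: the same kernel weights are reused at every position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: keep the strongest response in each window."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Toy 5x5 input containing a vertical edge (hypothetical values).
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to left-to-right intensity increases

feature_map = conv2d(image, edge_kernel)  # 4x4 map, nonzero only at the edge
pooled = max_pool2d(feature_map)          # 2x2 map after downsampling

# Parameter sharing: one 2x2 kernel has 4 weights regardless of image size,
# whereas a dense layer mapping 25 inputs to 16 outputs needs 25 * 16 weights.
shared_weights = len(edge_kernel) * len(edge_kernel[0])
dense_weights = 25 * 16
```

The kernel responds only where the intensity changes, pooling halves the spatial resolution while keeping that response, and the weight count stays at 4 however large the input grows.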
Pre-trained CNN models are used in transfer learning, where the parameters learned by one model serve as the starting parameters of another model that is then used to make predictions. It is most effective when the target data is similar to the data on which the model was pre-trained. A substantial amount of work has been performed in the field of AD classification using transfer learning. Deep neural network architectures such as AlexNet, VGG-16, and ResNet have been successfully used for AD classification. For instance, Bae et al. incorporated a modified ResNet-50 architecture for binary AD classification [13]. A similar approach presented by Yadav et al. used axial and sagittal slices of a brain scan in a custom 2D CNN architecture along with ResNet-50 for early AD classification [14]. Similarly, Sun et al. proposed a modified ResNet-50 architecture that employs spatial transformer networks (STN) and a non-local attention mechanism for early AD diagnosis [15]. Jain et al. proposed a method that uses the VGG-16 architecture for feature extraction, with the final classification performed by fully connected layers [16]. Jiang et al. likewise used VGG-16 for transfer learning, with the Lasso algorithm for feature selection and an SVM for AD classification [17]. In an effort to improve accuracy, Kang et al. proposed a multi-modality approach that uses sMRI and Diffusion Tensor Imaging (DTI) images to detect AD using VGG-16 and an SVM classifier [18]. Shanmugam et al. compared GoogLeNet, AlexNet, and ResNet-18 for the classification of AD and MCI, and observed that ResNet-18 performed better than the other architectures [19]. Savaş compared several pre-trained networks, including EfficientNetB0, DenseNet, ResNet-50, and Xception, for AD classification [20]. From the results, it was inferred that EfficientNetB0 performed slightly better than the other architectures. In a similar effort, Ashraf et al. examined second-generation neural networks and spiking neural networks for AD classification [21]. It was inferred that DenseNet gave better results in terms of accuracy for the three-way classification of AD.
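The feature-extraction form of transfer learning described above can be sketched in a few lines: the backbone parameters stay frozen and only a new classification head is fitted on the target data. The example below is a deliberately tiny stand-in and not a reproduction of any cited method. The fixed weight matrix plays the role of a pre-trained network such as ResNet-50 or VGG-16, and the data, names, and hyperparameters are hypothetical.

```python
import math

# Stand-in for a pre-trained backbone: fixed weights that are never updated.
# (Hypothetical values; a real backbone would be pre-trained convolutional layers.)
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x):
    """Frozen feature extractor: linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=500, lr=0.5):
    """Fit only the new head (logistic regression) with stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = extract_features(x)
            error = sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias) - y
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    feats = extract_features(x)
    return int(sigmoid(sum(w * f for w, f in zip(weights, feats)) + bias) >= 0.5)


# Tiny hypothetical target dataset: two input values per subject, binary label.
target_x = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
target_y = [1, 1, 0, 0]

backbone_before = [row[:] for row in FROZEN_WEIGHTS]
head_weights, head_bias = train_head(target_x, target_y)
predictions = [predict(x, head_weights, head_bias) for x in target_x]
```

Fine-tuning differs from this sketch in that some or all backbone weights are also updated, usually with a small learning rate.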
Convolutional neural networks are also used to build custom models because they are flexible and can produce better results than pre-trained models. AbdulAzeem et al. developed a new five-layered customized CNN for the classification of AD, using data augmentation and adaptive thresholding to process the images [22]. Similarly, Spasov et al. developed a feature extractor sub-network based on grouped and separable convolutions to perform AD classification [23]. Katabathula et al. proposed a dense CNN architecture that combines global shape representations with hippocampal segmentations for AD classification [24]. Moreover, the combined score of demographic information, genetic status, and standard cognitive tests was used along with MRI images. Basaia et al. proposed another method that uses data augmentation and a custom 3D CNN in which standard convolutional layers are used instead of max pooling layers [25]. In contrast to the previous method, Li et al. presented an approach using residual blocks in the CNN for feature extraction to differentiate between the AD classes [26]. Basheera et al. designed a custom CNN architecture with five convolution layers for gray matter (GM) atrophy-based AD classification [27]. Later, they developed a novel CNN model to perform binary and multi-class classifications of AD [28]. They incorporated inception and residual blocks in the CNN model for deeper feature extraction for early AD classification. The gray matter segmentation from slices was performed using Enhanced Independent Component Analysis (EICA) [29]. Raju et al. designed a custom 3D CNN architecture to extract image features and adopted an SVM with an RBF kernel to perform AD classification [30]. Another work using a 3D CNN along with an SVM was presented by Feng et al. for the classification of AD [31]. Similarly, Shen et al. used a custom CNN model to extract salient features, and an SVM was then employed to predict the likelihood of patients’ conversion from MCI to AD [32].
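Several of the custom architectures above rely on residual blocks, whose defining feature is a skip connection that adds the block input to its output. The following is a minimal sketch under simplifying assumptions: small weight matrices stand in for convolutions, and the names and values are hypothetical rather than taken from [26] or [28].

```python
def relu(vector):
    return [max(0.0, v) for v in vector]


def linear(vector, weights):
    """Multiply a vector by a square weight matrix (stand-in for a convolution)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two linear maps with a ReLU in between."""
    fx = linear(relu(linear(x, w1)), w2)
    return relu([f + xi for f, xi in zip(fx, x)])


x = [1.0, 2.0]

# With all-zero weights F(x) = 0, so the skip connection passes x through unchanged.
zero = [[0.0, 0.0], [0.0, 0.0]]
identity_out = residual_block(x, zero, zero)

# With non-zero weights the block learns a correction that is added to x.
w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[0.5, 0.0], [0.0, -0.5]]
out = residual_block(x, w1, w2)
```

Because the block only has to learn a correction to its input, stacking many such blocks tends to be easier to optimize than stacking plain layers.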
As single-modality data can characterize only a few of the degenerative alterations linked to AD, the performance of the classifier may be limited. Hence, considerable research has been devoted to classification techniques that integrate multi-modal information. For instance, Huang et al. designed a VGG-like network to perform multimodal AD classification [33]. Another approach was proposed by Venugopalan et al., where stacked denoising auto-encoders were utilized to extract information from genetic and clinical data [34]. Furthermore, the authors used KNN, decision trees, random forests, and SVMs as classifiers. Similarly, Zhou et al. developed a novel approach that used a Generative Adversarial Network (GAN) and a Fully Convolutional Network (FCN) on 1.5 T and 3 T MRI scans [35]. To improve diagnostic performance, Yu et al. presented a novel GAN that uses 3D transposed convolution to generate MRI images [36]. In contrast to the previous methods, Han et al. designed two approaches for three-way AD classification [37]. In the first, the convolution module and the Cascade of Enhancement Nodes Broad Learning System (CEBLS) module were combined for the classification of AD. In the second, the convolution module extracted features while the Broad Learning System (BLS) module performed AD classification. To improve the results, Choi et al. presented a novel approach that uses a deep Ensemble Generalization Loss (EGL) for weight optimization to perform AD classification with an ensemble of many deep CNNs [38]. In addition, Zeng et al. employed a novel Deep Belief Network (DBN) that uses dropout and zero-masking strategies to enhance the stability and generalization ability of the model [39]. Rashid et al. developed a novel architecture called the Biceph-Net module that is used in addition to a 2D CNN [40]. This module was employed to extract the intra-slice and inter-slice information of the 2D MRI.
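Dropout and zero masking, the two regularization strategies mentioned for the DBN in [39], both inject randomness during training. The sketch below shows generic versions of each. How they are configured in [39] is not described here, so the rates, the inverted-dropout rescaling, and the function names are assumptions.

```python
import random


def zero_mask(inputs, corruption_rate, rng):
    """Denoising-style corruption: randomly set a fraction of the inputs to zero."""
    return [0.0 if rng.random() < corruption_rate else v for v in inputs]


def dropout(activations, drop_rate, rng, training=True):
    """Inverted dropout: drop units at random and rescale the surviving ones."""
    if not training or drop_rate == 0.0:
        return list(activations)  # no randomness at inference time
    keep = 1.0 - drop_rate
    return [v / keep if rng.random() < keep else 0.0 for v in activations]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
features = [0.8, 0.1, 0.5, 0.9, 0.3, 0.7]  # hypothetical activations

corrupted = zero_mask(features, corruption_rate=0.3, rng=rng)
hidden = dropout(features, drop_rate=0.5, rng=rng)
inference = dropout(features, drop_rate=0.5, rng=rng, training=False)
```

Zero masking corrupts the input so that the model must learn to reconstruct or classify from partial evidence, while dropout perturbs hidden units so that no single unit is relied on too heavily.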
The wide use of convolutional neural networks to detect AD from neuroimaging data continues to improve classification performance, and there remains scope for further improvement.