Advanced Integration of Machine Learning Techniques for Accurate Segmentation and Detection of Alzheimer’s Disease

Ali, Esraa H.; Sadek, Sawsan; El Nashef, Georges Zakka; Makki, Zaid F.

doi:10.3390/a17050207

¹ Doctoral School of Sciences and Technologies (EDST), Lebanese University, Beirut 1003, Lebanon

² Computer Science Department, College of Science, Al-Nahrain University, Baghdad 10001, Iraq

³ College of Engineering and Technology, American University of the Middle East, Egaila 54200, Kuwait

⁴ College of Engineering and Information Technology, Alshaab University, Baghdad 10001, Iraq

* Author to whom correspondence should be addressed.

Algorithms 2024, 17(5), 207; https://doi.org/10.3390/a17050207

Submission received: 6 April 2024 / Revised: 28 April 2024 / Accepted: 2 May 2024 / Published: 10 May 2024

(This article belongs to the Section Algorithms for Multidisciplinary Applications)

Abstract

Alzheimer’s disease is a common type of neurodegenerative condition characterized by progressive neural deterioration. The anatomical changes associated with individuals affected by Alzheimer’s disease include the loss of tissue in various areas of the brain. Magnetic Resonance Imaging (MRI) is commonly used as a noninvasive tool to assess the neural structure of the brain for diagnosing Alzheimer’s disease. In this study, an integrated Improved Fuzzy C-means method with improved watershed segmentation was employed to segment the brain tissue components affected by this disease. These segmented features were fed into a hybrid technique for classification. Specifically, a hybrid Convolutional Neural Network–Long Short-Term Memory classifier with 14 layers was developed in this study. The evaluation results revealed that the proposed method achieved an accuracy of 98.13% in classifying segmented brain images according to different disease severities.

Keywords:

Alzheimer’s disease (AD); magnetic resonance imaging (MRI); fuzzy C-means clustering; CNN-LSTM

1. Introduction

Alzheimer’s disease (AD) is a neurological disorder characterized by the degeneration of brain cells, leading to memory loss and impairment in performing daily tasks. Despite extensive research, the exact cause of AD remains elusive [1], making it challenging to develop a definitive cure. However, effective management strategies can significantly enhance the quality of life for individuals affected by AD. Symptoms usually progress slowly until they reach a point where daily activities and self-care become impossible. Many individuals require ongoing support for issues related to motor skills and language skills. The prevalence of AD is substantial, with approximately 5.3 million individuals being diagnosed with AD in 2015, and projections suggest this number could escalate to 16 million by 2050 [2]. Various image processing and machine learning techniques have been proposed for diagnosing and predicting the progression of AD. In particular, MRI and PET (positron emission tomography) are widely used biomedical imaging modalities that offer valuable insights into the pathology of AD. Biomedical information takes various forms, including images and videos, each capturing different features of medical records [3,4]. These data provide valuable insights into neuropathology, helping researchers and clinicians to gain a deeper understanding of various brain diseases. Currently, image processing and AI-enabled diagnosis of Alzheimer’s disease are at the leading edge of medical research and technology. The combination of artificial intelligence and healthcare has facilitated the adoption of new methods for the early detection of AD [5]. The Alzheimer’s Disease Neuroimaging Initiative (ADNI) database serves as a valuable resource for researchers, offering MRI scans of patients at various stages of AD progression. The data are organized into four categories: Mild, Moderate, Very Mild, and Normal.
Mild AD represents the transitional phase between age-related memory decline and significant dementia-related impairment, while Moderate AD is characterized by moderate cognitive and functional deficits. Patients in the Very Mild category experience rapid-onset dementia and exhibit challenges with language and judgment. Leveraging deep learning models enables the swift and accurate identification of disease-related features, facilitating early diagnosis and intervention [6].

Our study emphasizes the vital role of segmentation, feature extraction, and classification in neuroimaging research, particularly for diagnosing Alzheimer’s disease. The proposed Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM) algorithm integrates multiple layers to analyze MRI data, offering a novel approach compared to existing models that often utilize single-layer architectures or focus solely on MRI data. Additionally, CNN-LSTM demonstrates proficiency in handling diverse datasets and employs a multitask learning approach, allowing the network to simultaneously analyze the progression and diagnosis of AD. The main contributions of our research are as follows:

The evaluation results indicate that all the existing models evaluated achieved accuracies below 90%. Notably, CNNs demonstrated superior performance due to their simple yet effective architecture, enabling rapid training and testing.
This study aimed to develop a lightweight hybrid model for enhanced performance. By combining CNN and LSTM models in parallel, we introduce a novel hybrid Deep Neural Network (DNN) scheme.
Utilizing convolutional kernels of different sizes enhances the network’s ability to capture essential features. Moreover, combining multiple features increases the representativeness. Therefore, our hybrid model incorporates two smaller kernels (3 × 3 and 5 × 5) to replace the traditional large convolutional filter.
The integration of the enhanced Fuzzy C-means method with two different watershed steps and feature extraction enhances model performance, achieving an average accuracy of 98.13%. Furthermore, our hybrid model significantly reduces the number of variable parameters, resulting in a faster computational speed.
This study’s comparative analysis with existing models, including state-of-the-art approaches, demonstrates the robust performance of our hybrid model.

This paper is organized as follows: Section 1.1 reviews the most recent research related to our work. Section 2 presents the methodology, which comprises preprocessing, the AD brain segmentation techniques developed using the Fuzzy C-means and watershed models, postprocessing, and the proposed hybrid CNN-LSTM model. Section 3 discusses the results, and Section 4 concludes the paper.

1.1. Literature Review

In recent years, researchers have explored various approaches to diagnose and classify AD. This section provides a brief overview of some significant advancements reported in recent studies. In one recent study [7], MRI and PET scans were employed to differentiate AD from Normal Cognition (NC), progressive Mild Cognitive Impairment (pMCI) from NC, and stable MCI (sMCI) from NC. The study introduced a novel approach, initially employing a 3D CNN to extract fundamental features. Subsequently, instead of the conventional fully connected (FC) layer, FSBi-LSTM was used to enhance the spatial precision. A SoftMax classifier was then employed for feature classification. Additionally, the number of filters in the convolution layer was reduced to address overfitting concerns. Another study [8] used a spectral graph Convolutional Neural Network (graph-CNN) to analyze T1-weighted MRI data from the ADNI-2 cohort. The objective was to identify MCI and AD and predict the onset of AD in both ADNI-1 and an Asian cohort. The graph-CNN achieved notable accuracy in distinguishing controls vs. AD patients (85.8%) and Early MCI (EMCI) vs. AD (79.2%) within the ADNI-2 cohort, outperforming other deep learning methods. It accurately predicted the conversion from EMCI to AD (75%) and from Late MCI (LMCI) to AD (92%). Furthermore, the fine-tuned graph-CNN demonstrated promising accuracy in NC vs. AD classification in both cohorts (ADNI-1: 89.4%; Asian cohort: >90%). In another recent investigation [9], researchers developed a three-dimensional CNN aimed at AD detection. The model was trained using 1230 PET scans collected from 988 individuals, including 169 cases of AD, 661 cases of MCI, and 400 NC individuals sourced from the ADNI database. Preprocessing involved stripping and normalizing the raw scans to eliminate non-cerebral structures, reducing the computational complexity and processing time.
The network achieved a comparable accuracy of 88.76% in NC/AD classification tasks. Moreover, in the research presented in [10], a transfer learning strategy leveraging a pre-trained AlexNet was presented for multiclass Alzheimer’s disease classification using MRI brain images from the OASIS database, achieving an accuracy rate of 92.85%. In [11], researchers introduced a multi-modal ensemble deep learning (DL) approach using a stacked CNN-BiLSTM to identify AD progression. This method involved extracting local and longitudinal features from each modality and incorporating background knowledge to extract the local features. Subsequently, all extracted features were fused for regression and classification tasks, resulting in an accuracy of 92.62%. In [12], a novel four-dimensional deep learning algorithm (C3d-LSTM) tailored for AD classification, specifically handling functional MRI (fMRI) data, was introduced. This model leverages spatial information by integrating multiple 3D CNN models to extract data from each volume within the sequence of three-dimensional static images obtained from fMRI scans. Subsequently, the extracted features are processed using LSTM techniques to capture temporal information within the dataset. The outcomes underscore the effectiveness of the C3d-LSTM model in managing four-dimensional fMRI data and accurately discerning their spatiotemporal attributes for AD diagnosis. The investigation detailed in [13] introduces and assesses various deep learning models and architectures, encompassing both two- and three-dimensional CNNs and Recurrent Neural Networks (RNNs). One approach involves employing a 2D CNN on 3D MRI volumes by dividing each MRI scan into 2D slices, disregarding interconnections among the slices. Alternatively, a CNN model can be followed by an RNN, allowing the two-dimensional CNN + RNN model to capture connections across sequences of two-dimensional slices obtained from MRIs.
Through the utilization of a 3D voxel-based technique coupled with transfer learning, the study achieved a classification accuracy rate of 96.88%. In [14], a Multiplan CNN technique was proposed and applied to 1500 MRI datasets sourced from the ADNI database for the classification of AD, MCI, and NC. The method incorporates the Brain Extraction Tool (BET2) to eliminate non-brain areas from the MRI scans. The suggested architecture relies on a sequential CNN approach to discern spatial structural data. Through experimentation, an overall classification accuracy of 93% was attained across the three classes. In [15], a 2D CNN method was introduced for the classification of AD and MCI utilizing 3312 MRI scans. BET2 was employed for skull stripping during the image preprocessing. The proposed model is built upon LeNet-5, with modifications to the activation function (Leaky ReLU) and output function (sigmoid). Moreover, batch normalization was incorporated to enhance the stability of the learning process. This fine-tuned model achieved a highest accuracy of 84% in classifying AD. In [16], the authors introduced a fine-tuned ResNet18 model designed to classify MCI, AD, and NC from MRI and PET data. Their fine-tuned model incorporated transfer learning and a weighted loss function to ensure balanced class weights. Furthermore, the Mish activation function was utilized to enhance the classification accuracy. The model achieved a classification accuracy of 88.3%.

In the study presented in [17], two hybrid algorithms were employed to segment MR brain images of Alzheimer’s disease patients. These algorithms combined nature-inspired techniques, employing a fusion of particle swarm optimization and a genetic algorithm (PSO_GA), and a combination of the whale optimization algorithm and a genetic algorithm (WOA_GA). The objective was to enhance the performance of SVM and AdaSVM classifiers. The experimental results indicated that PSO_GA achieved the highest accuracy, albeit requiring more computational time compared to the WOA and WOA_GA methods. Notably, WOA_GA exhibited superior accuracy compared to the majority of the algorithms tested.

In the research in [18], quantitative susceptibility mapping (QSM) and voxel-based morphometry (VBM) were used for the early detection of Alzheimer’s disease. The voxel extraction algorithm was refined to improve its detection performance, and the revised index was compared with the traditional VBM-based index on specific voxels in QSM and VBM images. The index was calculated using a linear support vector machine. The proposed improved index achieved an AUC of 0.94 in discriminating AD from NC, indicating its effectiveness in early detection.

In another study [19], MRI datasets from the Open Access Series of Imaging Studies (OASIS) were employed, and a blend of K-means and watershed algorithms was applied to segment the hippocampus, a region of the brain affected by Alzheimer’s disease. The results illustrated that the integrated segmentation accurately identified and diagnosed this disease, showcasing the potential of such techniques in medical imaging analysis.

Our combined use of the Improved Fuzzy C-means, watershed, and CNN-LSTM techniques under the developed framework brings complementary advantages. Specifically, CNN-LSTM, with an improved structure containing few layers, allows the identification of Alzheimer’s disease abnormalities with high accuracy, while the combined Improved Fuzzy C-means and watershed segmentation (ImFCm-WS) extracts useful features from MRI data. These components were integrated to improve the accuracy and reliability of Alzheimer’s disease classification and to support early identification and personalized medicine approaches. The aim of the current research was to test the efficacy of the proposed method in AD classification and investigate its suitability for AD diagnosis. The results were compared to those of the currently leading classification approaches in terms of precision, sensitivity, and specificity, with our method reaching values of up to 98%. Overall, we designed the current investigation to guide the development of AD diagnosis and to refine the diagnosis of this disease through computational techniques.

2. Materials and Methods

This study utilized Improved Fuzzy C-means clustering (ImFCm) to segment brain tissue into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) due to its effectiveness in segmenting homogeneous-intensity regions in MRI scans. This research introduces a hybrid approach combining ImFCm with watershed segmentation, leveraging the strengths of both methods to enhance segmentation accuracy and improve classification performance. The outputs of these methods were integrated into an advanced hybrid model (CNN-LSTM) to enhance the accuracy and robustness of Alzheimer’s disease detection systems. The findings were validated using the ADNI database, which comprises approximately 6400 MRI brain scans categorized as Mild, Moderate, Very Mild, and Normal. The proposed approach is illustrated in Figure 1.

The figure shows four distinct blocks, labeled A, B, C, and D. Block A represents the ADNI dataset, consisting of 6400 MRI scans. Block B covers the preprocessing techniques, encompassing Bayes noise removal, CLAHE filtering, cropping, and normalization. Block C shows the Improved Fuzzy C-means clustering method coupled with watershed segmentation. Finally, block D covers the classification process, which uses a hybrid approach that combines a CNN with LSTM.

2.1. Preprocessing

Preprocessing is one form of image enhancement tailored to the specific needs of a dataset and approach. It includes cropping and augmentation to improve a system’s performance. Cropping removes unnecessary elements from an image, while augmentation increases the quantity of images to enhance the outcomes of denoising algorithm training. In this paper, preprocessing involved converting images to grayscale, applying a Contrast Limited Adaptive Histogram Equalization (CLAHE) filter, and utilizing the Bayes wavelet transform (BWT) to remove unwanted noise and fine-tune the brightness and contrast levels. Additionally, cropping and normalization were applied. Figure 1 illustrates the preprocessing phase in block B, and the steps used are presented below.

(a): Noise removal based on the Wavelet Transform

The wavelet transform is widely used for compression and noise reduction in image processing. One common preprocessing step before applying the discrete wavelet transform (DWT) is to use a noise reduction tool on the original image. This helps to improve the quality of the image data before further analysis or processing, especially in scenarios where noise may degrade the accuracy or visual clarity of the image. Here, this step aims to reduce noise in the MRI data. A wavelet such as DB3 can then be used to decompose the image, and the noise standard deviation is employed to determine the threshold(s) for the wavelet detail coefficients. The wavelet can be any of the options provided by pywt.wavelist; in this case, the chosen wavelet was bior6.8. The method parameter determines the type of denoising to be performed. Soft thresholding was used to find the best match for the original image in the presence of additive noise. Algorithm 1 for this technique is described below:

Algorithm 1: Noise removal

-: Convert the image into grayscale.
-: Transform the image into a floating-point representation.
-: Assign the standard deviation of the noise to sigma.
-: Compute the noise standard deviation of the image.
-: Specify the Bayes Shrink technique as a parameter.
-: Set the thresholding type to soft mode.
-: Set the number of wavelet decomposition levels to 3.
-: Choose Biorthogonal 6.8 as the wavelet type.
-: Rescale the predicted noise standard deviation if rescale_sigma = True.
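The thresholding at the core of Algorithm 1 can be illustrated with a minimal sketch. It assumes the standard BayesShrink rule (threshold = sigma² / sigma_x per subband) and the usual median-based noise estimate. A plain list of numbers stands in for one subband of wavelet detail coefficients, and the wavelet decomposition itself is not shown. The function names are hypothetical. The parameter names in Algorithm 1 resemble those of scikit-image's `denoise_wavelet`, although the source does not name the library it used.

```python
import math
import statistics


def estimate_noise_sigma(detail_coeffs):
    """Robust noise estimate from detail coefficients: median(|d|) / 0.6745."""
    return statistics.median(abs(d) for d in detail_coeffs) / 0.6745


def bayes_shrink_threshold(detail_coeffs, sigma):
    """BayesShrink threshold T = sigma^2 / sigma_x for one subband."""
    # Observed second moment of the (assumed zero-mean) subband coefficients.
    var_y = sum(d * d for d in detail_coeffs) / len(detail_coeffs)
    # Signal standard deviation, clipped at zero when noise dominates.
    sigma_x = math.sqrt(max(var_y - sigma * sigma, 0.0))
    if sigma_x == 0.0:
        # Subband looks like pure noise: threshold every coefficient away.
        return max(abs(d) for d in detail_coeffs)
    return sigma * sigma / sigma_x


def soft_threshold(detail_coeffs, t):
    """Soft mode: shrink each coefficient toward zero by t."""
    return [math.copysign(max(abs(d) - t, 0.0), d) for d in detail_coeffs]


# Illustrative values only (not taken from the paper's data).
subband = [0.1, -0.2, 4.0, 0.05, -3.5, 0.15, -0.1, 0.2]
sigma = estimate_noise_sigma(subband)
denoised = soft_threshold(subband, bayes_shrink_threshold(subband, sigma))
```

Soft thresholding shrinks the large coefficients as well as removing the small ones, which tends to give smoother results than hard thresholding.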

(b): Contrast optimization based on Contrast Limited Adaptive Histogram Equalization (CLAHE)

Contrast Limited Adaptive Histogram Equalization (CLAHE) is an image enhancement technique used to improve the contrast of an image. Unlike traditional histogram equalization methods that operate on the entire image, CLAHE works on small regions, or tiles, within the image. This localized approach prevents over-amplification of the contrast in regions with varying intensities, leading to more balanced and visually pleasing results. The technique is an enhanced version of Adaptive Histogram Equalization (AHE) with the added benefit of limiting contrast amplification, which helps to avoid artifacts and maintain natural-looking enhancements. Artificial boundaries between tiles are eliminated by blending neighboring tiles using bilinear interpolation. In our study, CLAHE was employed to enhance image contrast. Two parameters should be noted when utilizing CLAHE: the clipLimit parameter, which sets the threshold for constraining contrast (3.0 in our case), and the tileGridSize parameter, which sets the number of tiles along the rows and columns (8 × 8 in our case). Algorithm 2 for this method is outlined below:

Algorithm 2: CLAHE

-: Normalize the input array (image) to preserve values within the specified range (0 to 255).
-: Convert the output to an 8-bit unsigned integer.
-: Create a CLAHE object with the provided parameters:
-: Set the clipLimit to 3.0 to limit the improvement of contrast.
-: Specify the tileGridSize parameter to establish the number of regions for contrast control. Set it to (8, 8).
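The contrast-limiting idea behind the clip limit can be sketched for a single tile as follows. This is a simplified illustration, not the library implementation: the clip limit here is an absolute pixel count per bin (an assumption made for readability), the excess is spread uniformly in a single pass, and the bilinear blending between tiles is omitted. The function names are hypothetical.

```python
def clip_histogram(hist, clip_limit):
    """Clip a tile histogram at clip_limit and spread the excess uniformly."""
    excess = sum(max(h - clip_limit, 0) for h in hist)
    clipped = [min(h, clip_limit) for h in hist]
    share, remainder = divmod(excess, len(hist))
    # Every bin receives an equal share; the first `remainder` bins get one extra.
    return [c + share + (1 if i < remainder else 0) for i, c in enumerate(clipped)]


def equalization_map(hist, levels=256):
    """Map each bin index to an output gray level via the cumulative histogram."""
    total = sum(hist)
    mapping, running = [], 0
    for h in hist:
        running += h
        mapping.append(round((levels - 1) * running / total))
    return mapping


# Toy 4-bin tile histogram dominated by a single gray level.
tile_hist = [0, 10, 2, 0]
limited = clip_histogram(tile_hist, clip_limit=4)
lookup = equalization_map(limited, levels=4)
```

Because the redistributed counts are added after clipping, a bin can end slightly above the limit again; practical implementations either accept this or repeat the redistribution.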

(c): Cropping and Normalization

Cropping and normalizing MRI scans constitutes the final phase of the preprocessing stage, as depicted in Figure 1. Cropping involves removing irrelevant areas and surroundings from images, and is a common technique used in computer imaging. Normalization, on the other hand, aims to reduce the variation in intensity among pixel values. Normalization strategies include histogram stretching and contrast stretching. Algorithm 3 for this process is outlined below:

Algorithm 3: Cropping and normalization

-: Compute the bounding rectangle of the image’s contour and assign it to the variables x, y, w, h.
-: Crop the image using the filtered coordinates (y:y + h, x:x + w) to determine the revised dimensions of the image.
-: Normalize the cropped image employing the normalize min-max function, with initial parameters set at alpha = 0 and beta = 150, to regulate the intensity levels effectively.
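The steps of Algorithm 3 can be sketched on a small nested-list image. This is a minimal illustration under stated assumptions: the bounding box is taken from the nonzero pixels rather than from a detected contour, and the helper names are hypothetical. Only the x, y, w, h variables and the alpha = 0, beta = 150 range come from the text.

```python
def bounding_box(mask):
    """Return (x, y, w, h) of the nonzero region of a 2D mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    x, y = cols[0], rows[0]
    return x, y, cols[-1] - x + 1, rows[-1] - y + 1


def crop(image, x, y, w, h):
    """Keep rows y..y+h and columns x..x+w, as in image[y:y + h, x:x + w]."""
    return [row[x:x + w] for row in image[y:y + h]]


def min_max_normalize(image, alpha=0.0, beta=150.0):
    """Linearly rescale intensities so the minimum maps to alpha and the maximum to beta."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[alpha for _ in row] for row in image]
    scale = (beta - alpha) / (hi - lo)
    return [[alpha + (v - lo) * scale for v in row] for row in image]


# Toy image: a 2 x 2 bright region surrounded by a black border.
image = [
    [0, 0, 0, 0],
    [0, 10, 20, 0],
    [0, 30, 50, 0],
    [0, 0, 0, 0],
]
x, y, w, h = bounding_box(image)
normalized = min_max_normalize(crop(image, x, y, w, h))
```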

2.2. Segmentation

In medical research and related fields, digital imaging plays a crucial role. Image segmentation techniques significantly contribute to precise diagnoses and treatment planning [20,21]. The human brain consists of three essential components: white matter, cerebrospinal fluid, and gray matter. Brain lesions exhibit diverse characteristics, making their extraction from MR scans challenging. This study focuses on segmenting brain lesions using feature-based clustering and proposes the use of soft clustering techniques.

Clustering involves grouping a set of objects into different categories, or clusters, based on the similarities in their features. While many clustering algorithms share common principles, they differ in how they assess similarity or distance and assign labels to clusters. Various strategies address these challenges, including discriminative clustering, hierarchical clustering, model-based clustering, fuzzy clustering, and density-based clustering [21,22]. In our study, we enhanced the accuracy and efficiency of image analysis by combining Improved Fuzzy C-means (ImFCm) clustering with watershed segmentation. This approach represents a significant advancement over current methodologies. The Improved Fuzzy C-means technique provides a robust foundation, improving the initial segmentation phase. Watershed segmentation contributes by defining boundaries and refining segments using gradient information, thereby enhancing the entire process.

(a): Improved Fuzzy C-means Clustering (ImFCm)

Fuzzy theory is instrumental in handling situations involving ambiguity or vagueness caused by incomplete, unreliable, or conflicting information. One of its applications is data clustering through methods such as Fuzzy C-means (FCM), a classic approach to fuzzy clustering. FCM divides a dataset into c clusters, where each data point belongs to each cluster to a certain degree. For example, a point near a cluster center has a high membership degree in that cluster, while one farther away has a lower membership degree. Unlike traditional clustering, where each sample is allocated to a single cluster, fuzzy clustering permits a single sample to be associated with multiple clusters at different levels of membership. This flexibility enables FCM to handle complex and overlapping data structures effectively. Our objective was to improve the method described in [23], which is based on the FCM algorithm and minimizes the following objective function:

J_{m} = \sum_{i = 1}^{c} \sum_{k = 1}^{n} u_{i k}^{m} d_{i k}^{2} = \sum_{i = 1}^{c} \sum_{k = 1}^{n} u_{i k}^{m} ‖ x_{k} - v_{i} ‖^{2}

(1)

where m is the fuzziness exponent, a value greater than one; u_ik is the degree of membership of the kth data sample in the ith cluster; d_ik is the distance between the kth data sample and the ith cluster center in the feature space; x_k is the kth data sample; and v_i is the center of the ith cluster.
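For reference, minimizing J_m subject to the usual constraint that the memberships of each sample sum to one gives the alternating updates of textbook FCM, written here in the same symbols. These are the standard updates, not the improved variant developed in this study.

```latex
u_{ik} = \left[ \sum_{j = 1}^{c} \left( \frac{d_{ik}}{d_{jk}} \right)^{\frac{2}{m - 1}} \right]^{-1},
\qquad
v_{i} = \frac{\sum_{k = 1}^{n} u_{ik}^{m} \, x_{k}}{\sum_{k = 1}^{n} u_{ik}^{m}},
\qquad
\text{subject to } \sum_{i = 1}^{c} u_{ik} = 1 \ \text{for every } k .
```

The two updates are applied in turn until the memberships change by less than a chosen tolerance.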

In our study, we aimed to enhance the robustness of the Fuzzy C-means clustering method. Adjustments were made to specific components, as illustrated in Figure 2.

The first part involved filtering the image by computing a distance window and identifying steady groups across a sliding window traversing the entire image. Initially, a padding was created based on half of the kernel size to cover the image borders during sliding. The image mean was computed using the padding, leveraging the cv2.copyMakeBorder function to incorporate edges during the sliding process. The sliding window algorithm was then configured by specifying parameters such as the neighbor effect and window size, while considering the identification of steady pixel groups for clustering. The procedure for detecting stable clusters in Algorithm 4 operates by identifying pixels with Gaussian filter values below or equal to the square root of the mean of the squared differences between the Gaussian values and the window size. The outcome of this part was a filtered image, which was then used in part three for computing fuzziness.

The second part involved determining the histogram of the image using the CLAHE filter to enhance the image contrast and compute the intensity distribution of the improved image. This segment was utilized for calculating the centroid of the cluster.

In the third part, several critical functions were included. Firstly, the membership function was initialized, indicating the degree of pixel belongingness to each cluster. The function for computing the centroid of clusters involves dividing the numerator by the denominator. The numerator constitutes the summation of the degree of fuzziness multiplied by the histogram and intensity with the power of membership, while the denominator is the summation of the histogram multiplied by the power of membership. Finally, the function for computing the weights depends on the centroid function. It comprises a division of the numerator by the denominator, wherein the numerator calculates the absolute differences of the powered intensity and computed cluster centroids, and the denominator calculates the summation of absolute differences powered to the degree of fuzziness.

Algorithm 4: ImFCm

Step 1: Initialize the following parameters:
-: Number of bits.
-: Number of clusters.
-: Degree of fuzziness.
-: Maximum iteration count.
-: Epsilon threshold for convergence check.
Step 2: Image Filtering Procedure:
-: Generate a padded image using a sliding window with dimensions (kernel_size/2, kernel_size/2).
-: Compute the mean based on the padded mask.
-: Pad the resulting mean image to create borders.
-: Utilize a sliding window to account for neighbor effects and kernel size.
-: Determine center coordinates using the spatial distance window (Minkowski distance): Des_win = (abs(win_size_x - center_coordinate_x) ** p + abs(win_size_y - center_coordinate_y) ** p) ** (1/p), where p = 2.
-: Identify the stable group matrix using a Gaussian filter.
-: Obtain the final filtered image using the formula: Final_image = sum(weighted_coefficients * old_window) / sum(weighted_coefficients)
-: Perform Contrast Limited Adaptive Histogram Equalization (CLAHE).
Step 3: Weight Initialization: Initialize a two-dimensional matrix based on the number of clusters and gray levels to compute weights.
Step 4: Compute Cluster Centroids:
-: X = sum (histogram * number of gray levels) * power (weight * number of fuzziness)
-: Y = sum (histogram) * power (weight * number of fuzziness)
-: Z = X/Y
Step 5: Weight Computation Method:
-: Set power = −2/number of fuzziness.
-: X = (gray levels − centroid values) * power
-: Y = sum (gray levels − centroid values) * power
-: Z = X/Y
Step 6: Check Convergence: Determine whether the maximum absolute value of (Step 5 − Step 2) is less than the epsilon threshold. If so, stop; otherwise, return to Step 4.
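The histogram-based iteration in Steps 3 to 6 can be illustrated with a minimal sketch of standard fuzzy C-means on a gray-level histogram. It is not the improved variant: the filtering of Step 2 is omitted, the exponent follows the textbook form 2/(m − 1), and the evenly spaced centroid initialization is an assumption. The function name is hypothetical.

```python
def fcm_histogram(hist, n_clusters=3, m=2.0, max_iter=100, eps=1e-4):
    """Standard fuzzy C-means on a gray-level histogram.

    hist[g] is the number of pixels with gray level g.
    Returns (centroids, memberships) with memberships[i][g] = u_ig.
    """
    levels = [g for g, h in enumerate(hist) if h > 0]
    lo, hi = levels[0], levels[-1]
    # Assumption: centroids start evenly spaced over the occupied intensity range.
    centroids = [lo + (i + 0.5) * (hi - lo) / n_clusters for i in range(n_clusters)]
    power = 2.0 / (m - 1.0)
    u = [[0.0] * len(hist) for _ in range(n_clusters)]
    for _ in range(max_iter):
        # Membership update: u_ig = 1 / sum_j (d_ig / d_jg) ** (2 / (m - 1)).
        new_u = [[0.0] * len(hist) for _ in range(n_clusters)]
        for g in levels:
            dists = [abs(g - v) for v in centroids]
            if 0.0 in dists:
                # Gray level coincides with a centroid: crisp membership.
                new_u[dists.index(0.0)][g] = 1.0
                continue
            for i in range(n_clusters):
                new_u[i][g] = 1.0 / sum((dists[i] / d) ** power for d in dists)
        # Centroid update, weighted by histogram counts and u ** m.
        for i in range(n_clusters):
            num = sum(hist[g] * new_u[i][g] ** m * g for g in levels)
            den = sum(hist[g] * new_u[i][g] ** m for g in levels)
            if den > 0:
                centroids[i] = num / den
        # Convergence check: largest membership change below eps.
        change = max(abs(new_u[i][g] - u[i][g]) for i in range(n_clusters) for g in levels)
        u = new_u
        if change < eps:
            break
    return centroids, u


# Toy histogram with three intensity peaks (illustrative values only).
hist = [0] * 256
hist[20], hist[21], hist[120], hist[200] = 100, 100, 80, 120
centroids, memberships = fcm_histogram(hist)
```

Working on the histogram rather than on individual pixels keeps the cost per iteration proportional to the number of gray levels instead of the image size.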

The block diagram depicted in Figure 2 outlines the procedure for conducting the Improved Fuzzy C-means clustering.

(b): Watershed Segmentation

Watershed segmentation relies on image morphology and operates as a region-based technique. It necessitates selecting at least one marker, often called a “seed” point, within each object in the image, with the background being treated as a distinct entity. These markers are typically chosen either by an operator using application-specific knowledge or automatically through an automated process. Once these markers are identified, they undergo growth via morphological watershed transformation, allowing for accurate delineation and segmentation of distinct regions within the image. One significant method in this domain relies on the concept of morphological watersheds. In the context of watershed segmentation, an image is depicted in three dimensions, with the (x, y) coordinates representing the x- and y-axes, and the intensity depicted along the z-axis. This approach conceptualizes an image as a topographical surface, where pixel intensities correspond to peak elevations. Each intensity level represents a distinct elevation plane. The topographical analogy categorizes image points into three groups: regional minima, catchment basins, and watershed lines. The catchment basin group encompasses locations where a hypothetical water droplet would ultimately converge to a single regional minimum. Conversely, watershed lines denote locations where a water droplet might potentially travel to more than one regional minimum [24].

Let M_1, M_2, …, M_R denote the regional minima of an image g(x, y). We denote by T[n] the set of points whose intensity lies below the plane at level n, where n ranges from the image’s lowest intensity to its highest intensity. This can be expressed mathematically as follows:

T [n] = \{(s, t) | g (s, t) < n\}

(2)

Let C_n(M_i) represent the set of points in the catchment basin associated with the regional minimum M_i that are flooded at stage n. This can be computed as follows:

C_{n} (M_{i}) = C (M_{i}) \cap T [n]

(3)

C(Mi) represents the set of catchment basin points linked with the regional minimum Mi. The union of all flooded catchment basins at a specific stage n is represented as C[n]:

C [n] = \bigcup_{i = 1}^{R} C_{n} (M_{i})

(4)
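Equations (2) to (4) map directly onto set operations, as the following minimal sketch shows. The catchment basins C(M_i) are assumed to be given as sets of coordinates, since computing them is the job of the watershed algorithm itself. The function names are hypothetical.

```python
def points_below(g, n):
    """T[n]: coordinates (s, t) with g[s][t] < n."""
    return {(s, t) for s, row in enumerate(g) for t, v in enumerate(row) if v < n}


def flooded_basin(basin, g, n):
    """C_n(M_i): the part of catchment basin C(M_i) that lies in T[n]."""
    return basin & points_below(g, n)


def flooded_union(basins, g, n):
    """C[n]: union of C_n(M_i) over all regional minima."""
    return set().union(*(flooded_basin(b, g, n) for b in basins))


# One-row toy image with two minima separated by a ridge at column 2.
g = [[1, 3, 5, 2, 0]]
basin_1 = {(0, 0), (0, 1)}  # C(M_1), assumed known
basin_2 = {(0, 3), (0, 4)}  # C(M_2), assumed known
flooded_at_3 = flooded_union([basin_1, basin_2], g, 3)
```

As n increases, T[n] grows and each flooded basin expands until neighboring basins are about to merge, which is where the watershed lines are built.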

The marker-controlled watershed segmentation method is applied to the ImFCm output twice, first with marker values in the range 10–90 to capture inner brain features, and then with marker values in the range 10–200 to emphasize outer brain details. All three images are then combined to generate the final image. Algorithm 5 below outlines the steps of this technique.

Algorithm 5: Watershed segmentation

-: Utilize Otsu’s binarization to estimate the objects present in the image.
-: Apply morphological opening to eliminate any white noise present in the image, and perform morphological closing to address small holes within the objects.
-: Employ the dilate method to create a separation between the background and the image.
-: Utilize distance transform and thresholding techniques to isolate the foreground from the background.
-: Determine the unknown areas by subtracting the foreground from the background. These areas lacking clarity will be assigned zero values in the markers.
-: Label the regions of the foreground using the connected components method as markers, and increment them by one to ensure all background regions are marked as ones.
-: Employ the distance values obtained from step 5 and the markers from step 6 as input parameters for the watershed method to generate the final segmentation map.
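
Two of the steps in Algorithm 5, Otsu binarization and marker labeling, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `otsu_threshold` and `label_markers`, the 4-connectivity rule, and the tiny input arrays are all hypothetical, and an image-processing library would normally be used instead.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance
    of the two classes {v <= t} and {v > t} (Otsu's criterion)."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of their intensities
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def label_markers(sure_fg, unknown):
    """Label 4-connected foreground regions, add 1 so that the sure
    background becomes 1, and set unknown pixels to 0."""
    rows, cols = len(sure_fg), len(sure_fg[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if sure_fg[r][c] and labels[r][c] == 0:
                next_label += 1
                labels[r][c] = next_label
                stack = [(r, c)]
                while stack:  # flood fill one connected region
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and sure_fg[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            stack.append((ny, nx))
    return [[0 if unknown[r][c] else labels[r][c] + 1 for c in range(cols)]
            for r in range(rows)]

t = otsu_threshold([10, 10, 12, 200, 198, 202])
sure_fg = [[1, 0, 0, 0, 1],
           [1, 0, 0, 0, 1]]
unknown = [[0, 1, 0, 1, 0],
           [0, 1, 0, 1, 0]]
markers = label_markers(sure_fg, unknown)
print(t, markers)
```

In the resulting marker map, each foreground region carries its own label greater than one, the sure background is one, and the zero-valued pixels are the ones the watershed transform is left to assign.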

Figure 3 illustrates the watershed method in a block diagram.

2.3. Postprocessing

Gamma correction serves as a data augmentation technique in which the gamma value is adjusted to modify the image intensity. Gamma defines a non-linear function used to encode and decode an image’s intensity or brightness, and changing it alters the overall illumination of the image. This method is particularly beneficial for images with low contrast or suboptimal lighting conditions [25]. Gaussian noise, a distortion introduced into the image by adding random values drawn from a Gaussian distribution, is another approach. Because it resembles the noise introduced during image acquisition and preprocessing, it can help the model become less susceptible to variations in image quality. The amount of noise applied to the image is controlled by the standard deviation of the Gaussian distribution: a higher standard deviation adds more noise [26]. In our study, we apply these two methods as postprocessing techniques for the dataset to address lost contrast and brightness. Additionally, they introduce a subtle blurring factor that smooths pixel value changes and softens the image edges. The coefficients and filter kernel size that yielded the best segmentation results were a gamma value of 2 and an alpha value of 0.5, together with a Gaussian kernel of size 3 and a sigma value of 0.2. Figure 4 shows the final segmentation map generated using our method after gamma correction and Gaussian blur were applied with these parameters.
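
The effect of the reported parameters can be illustrated with a short sketch. It is not the authors' code and rests on assumptions: the power-law convention out = 255 · (in / 255)^γ is assumed (some tools use the exponent 1/γ instead), the role of the alpha value is not specified in the text and is therefore omitted, and the helper names are hypothetical.

```python
import math
import random

def gamma_correct(pixel, gamma=2.0, max_value=255):
    """Power-law mapping out = max * (in / max) ** gamma (assumed convention)."""
    return round(max_value * (pixel / max_value) ** gamma)

def gaussian_kernel_1d(size=3, sigma=0.2):
    """Normalized 1-D Gaussian weights; a 2-D kernel is their outer product."""
    half = size // 2
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def add_gaussian_noise(pixels, std, seed=0, max_value=255):
    """Add zero-mean Gaussian noise with the given standard deviation,
    then clip the result to the valid intensity range."""
    rng = random.Random(seed)
    return [min(max_value, max(0, round(p + rng.gauss(0.0, std))))
            for p in pixels]

print([gamma_correct(p) for p in (0, 64, 128, 255)])
print(gaussian_kernel_1d())
print(add_gaussian_noise([100, 100, 100], std=5.0))
```

With a kernel of size 3 and sigma 0.2, almost all of the weight falls on the central pixel, which is consistent with the description of the blur as subtle.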

2.4. Classification

In recent years, extensive research has been conducted on deep learning across various domains, notably in computer vision, image processing, authentication systems, and speech recognition. CNNs represent a kind of classification model [27]. In contrast to a traditional neural network, a CNN can learn intricate features, which makes it highly effective in tasks such as image classification, object recognition, and medical image analysis. The fundamental premise of a CNN lies in its capability to extract localized features from the input data in its early layers and propagate them to deeper layers, which construct more complex feature representations. A CNN architecture typically comprises three types of layers, as illustrated in Figure 2 [28]: (i) the convolution layer, responsible for feature extraction; (ii) the pooling layer, which reduces dimensionality; and (iii) the fully connected layer, which handles classification and transforms two-dimensional matrices into one-dimensional vectors [29]. In the convolutional layer, a learnable filter extracts features from an input image.

Within this network, non-linearity is introduced through the activation function, which applies a non-linear transformation to the neuron’s inputs. In a binary classifier, the sigmoid function is employed in the output layer, yielding the probability, in the range of 0 to 1, that a data point belongs to a specific class, as given in Equation (5). Owing to the limitations of the sigmoid, a Rectified Linear Unit (ReLU) is used for all hidden layers, as expressed in Equation (6). ReLU outputs zero for negative input values, so the corresponding neurons remain inactive, which expedites computation and training.

f_{\mathrm{sigmoid}} (x) = \frac{1}{1 + \exp (- x)}

(5)

f_{\mathrm{ReLU}} (x) = \max (0, x)

(6)

In the proposed CNN multi-classifier, the SoftMax function was utilized [30], which computes the probability of data values belonging to certain classes.
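
The three activation functions mentioned here can be written in a few lines. The sketch below is a generic illustration rather than the code used in the study; the function names and the sample logits are assumptions.

```python
import math

def sigmoid(x):
    """Equation (5), evaluated in a numerically stable way for negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def relu(x):
    """Equation (6): pass positive inputs through, output zero otherwise."""
    return max(0.0, x)

def softmax(logits):
    """Turn a list of class scores into probabilities that sum to one."""
    m = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for a four-class problem.
probs = softmax([1.0, 2.0, 3.0, 4.0])
print(sigmoid(0.0), relu(-3.0), relu(2.0))
print(probs, probs.index(max(probs)))
```

The class with the largest score receives the largest probability, so taking the index of the maximum of the SoftMax output gives the predicted class.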

Our research employed a Long Short-Term Memory (LSTM) network as the classification layer, which was fed by a Convolutional Neural Network (CNN). The LSTM architecture [31,32] tackles challenges faced by traditional RNNs, such as vanishing and exploding gradients. Its crucial departure from standard RNN units lies in its incorporation of a cell memory, which facilitates long-term memory encoding. LSTM efficiently retrieves and integrates data spanning previous time intervals up to the present moment by means of three gates: a forget gate, an input gate, and an output gate. The current input is xt, while the previous and updated cell memories are ct−1 and ct, respectively. The current and previous outputs are ht and ht−1. The LSTM model architecture is shown in Figure 5.

The operational principle of the input gate in LSTM is outlined by the following equations:

i_{t} = \sigma (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i}),

(7)

\hat{C}_{t} = \tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C}),

(8)

C_{t} = f_{t} C_{t - 1} + i_{t} \hat{C}_{t}

(9)

Equation (7) quantifies how much of the information from ht−1 and xt is admitted into the cell. The previous hidden state ht−1 and the present data point xt are also passed through a layer with a tanh activation function, as expressed in Equation (8); the output of this layer is the candidate memory $\hat{C}_{t}$. Equation (9) merges the candidate derived from the current input, $\hat{C}_{t}$, weighted by $i_{t}$, with the previous long-term memory, $C_{t - 1}$, weighted by the forget gate activation $f_{t}$, to obtain the current cell memory, $C_{t}$. The symbols Wi and bi represent the weight and the bias of the input gate within the LSTM network, and $W_{C}$ and $b_{C}$ are the weight and the bias of the tanh layer.

To elaborate on how LSTM contributes to improving the model’s understanding of disease progression, let us break down each equation into its constituent parts and explain the significance of each term. We will particularly focus on the input gate, forget gate, and output gate in controlling information flow within the LSTM layer.
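
One way to make the role of each gate concrete is a single LSTM time step on scalar values. The sketch below uses the standard LSTM formulation, which extends Equations (7)–(9) with a forget gate and an output gate that are not written out in the text; the function `lstm_step`, the parameter layout, and the numeric values are assumptions made for illustration, not the trained model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One scalar LSTM step. params maps a gate name ('f', 'i', 'c', 'o')
    to a tuple (w_h, w_x, b): weight on h_{t-1}, weight on x_t, and bias."""
    def affine(name):
        w_h, w_x, b = params[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(affine("f"))          # forget gate: share of old memory kept
    i_t = sigmoid(affine("i"))          # input gate, Equation (7)
    c_hat = math.tanh(affine("c"))      # candidate memory, Equation (8)
    c_t = f_t * c_prev + i_t * c_hat    # cell update, Equation (9)
    o_t = sigmoid(affine("o"))          # output gate: share of memory exposed
    h_t = o_t * math.tanh(c_t)          # new hidden state
    return h_t, c_t

# Hypothetical parameters: a strongly open forget gate and a closed input
# gate, so the cell keeps its previous memory almost unchanged.
keep = {"f": (0.0, 0.0, 20.0), "i": (0.0, 0.0, -20.0),
        "c": (0.0, 0.0, 0.0), "o": (0.0, 0.0, 0.0)}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.5, c_prev=2.0, params=keep)
print(h_t, c_t)
```

Changing the biases in `keep` shows the complementary behavior: a closed forget gate discards the previous memory, and an open input gate lets the candidate value overwrite it.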

In summary, the LSTM layer enhances the model’s understanding of disease progression by selectively retaining relevant information over time. The input gate, forget gate, and output gate mechanisms within the LSTM equations enable controlled information flow, facilitating the model’s ability to capture long-term dependencies and dynamics in disease progression patterns.

This work analyzed MRI brain scans from individuals diagnosed with Alzheimer’s disease, classifying them into four groups: Normal, Mild, Moderate, and Severe. Figure 6 depicts the proposed CNN-LSTM architecture comprising 14 layers.

The architecture includes four convolutional layers, arranged in two blocks of two convolutions, with each block followed by pooling and dropout layers for regularization. Next, there is one fully connected layer, followed by an LSTM layer and one classifier layer. Before being input into the CNN model, the original images, sized at 176 × 208 pixels, were resized to 200 × 200 pixels. The filter weights were adjusted during the forward and backward passes through the CNN layers to detect features. The batch size was set to 32, and the model was trained for 50 epochs. After all epochs were completed, the model with the highest classification performance was selected and used to classify the test set, which yielded the model’s final classification accuracy.

Table 1 provides a breakdown of the CNN model’s internal structure, while Table 2 specifies the hyperparameters utilized. The optimizer utilized during the training phase was Adam, with a learning rate of 0.001.

3. Results

3.1. Experimental Dataset

The T1-weighted images used in this study were obtained from the ADNI database. Out of the available ADNI samples, a total of 6400 were selected for analysis after excluding a few samples with incorrect information. This dataset includes 3140 normal samples, 896 samples of early Mild Cognitive Impairment, 64 samples of Moderate Cognitive Impairment, and 2240 samples of Severe Impairment. The images are in PNG format with a resolution of 176 × 208 pixels.

3.2. Evaluation Metrics

The evaluation of machine and deep learning recognition systems to assess the feasibility of accurately diagnosing AD relies on several performance metrics, including the accuracy (Acy), sensitivity (Sny)/recall, specificity (Spy), precision (Prn), and F1 score. Different performance indicators offer diverse insights into the detection model. All of these metrics can be defined as shown in the following Table 3.

3.3. Experimental Results

The code was developed in Python 3.8.18 with IPython 8.12.2 using the Spyder IDE. The local machine had an Intel Core i5 7200 CPU (seventh generation, four logical processors) and 4096 MB of memory. For classification and rapid assessment of the results, we used Kaggle’s P100 GPU environment, which offers a memory capacity of up to 73.1 GB. This GPU acceleration enabled efficient training and evaluation of the deep learning model. In this section, we present and discuss our most significant experimental results from the segmentation and classification tasks. Additionally, we provide comparisons with both traditional methods and state-of-the-art models to showcase the effectiveness of our approach.

(a): Comparison of Results of Traditional FCM and ImFCm

Using the ImFCm algorithm, convergence was achieved at the 35th iteration, before the maximum iteration threshold was reached. As the iterations proceeded, the cost decreased significantly, from 385.01 to 0.049. This substantial reduction indicates that the algorithm approached an optimal solution and demonstrates the efficiency of our method in delivering results rapidly. In contrast, the standard FCM algorithm reached convergence at the 70th iteration, starting from a higher cost value of 4907.9 and requiring more time than the proposed method. Figure 7 illustrates the cost values for five different MRI scans, analyzed using both FCM and ImFCm techniques. This comparison highlights the improved performance and efficiency of the ImFCm algorithm.

(b): CNN-LSTM Results with Traditional FCM

Traditional FCM clustering was used for segmentation, followed by a CNN for AD classification. Figure 8 displays the training and validation results from the segmented brain MRI dataset. The classification task achieved a test accuracy of 50% over 50 epochs. One panel of the figure shows the validation loss in red and the training loss in blue, while the other shows the training accuracy in red and the validation accuracy in blue. The figure suggests that both the training and validation accuracies stabilized after the first 5 epochs, with the validation accuracy reaching 0.48 and remaining constant from epoch 5 to epoch 50. By epoch 50, the loss had decreased and converged to approximately 1.04 for both training and validation.

(c): CNN-LSTM Results with ImFCm-WS

Figure 1 illustrates the structure of the proposed ImFCm-WS-CNN-LSTM methodology, which outlines a systematic processing framework. Initially, Bayes noise reduction was performed to enhance the acquired MRI data, followed by CLAHE for contrast adjustment before the images were transformed into grayscale. ImFCm was then employed to cluster the images. Subsequently, watershed segmentation was conducted twice: first with marker values (10–90) to capture inner brain features, and again with marker values (10–200) to reveal outer brain details. The resulting images were combined to produce the final segmentation. Finally, postprocessing steps including gamma correction and Gaussian blur were applied to enhance the images. The final segmentation map from the dataset was then fed into the hybrid CNN-LSTM classifier after multiple rounds of tuning to determine the appropriate structure and hyperparameters, as detailed in Table 1 and Table 2.

Figure 9 presents the accuracy and loss curves for the training and validation sets, obtained after the image data were segmented using the ImFCm-WS algorithm and processed with the classification model outlined in this article. The figure shows that the training accuracy and loss converged by the third epoch, indicating high training and testing accuracy levels. The training accuracy was approximately 100%, with a loss of around 7.2176 × 10⁻⁶, while the validation accuracy was close to 97.8%, with a validation loss of approximately 0.1826. The curves show that the disparity between the training accuracy and validation accuracy, as well as between the training loss and validation loss, is minimal. Therefore, 50 epochs are deemed sufficient for training and validating our model. Figure 9 also indicates that after six epochs, the accuracy for both training and validation remained consistent. Based on the results obtained, we can conclude that the proposed method enhances the efficiency of model training and validation within fewer epochs.

By combining two robust segmentation techniques and two hybrid classification models, the updated model reveals enhanced details. The features acquired from these methods differ, and their integration strengthens the feature set, thereby improving the classification results. Table 4 displays the classification report of the trained model for each class. The precision values for the Mild, Normal, Moderate, and Very Mild classes are 97%, 100%, 99%, and 97%, respectively. The recall rates for these classes are 99%, 87%, 98%, and 99%, respectively. Lastly, the F1 scores stand at 98%, 93%, 98%, and 98%. These outcomes demonstrate consistently strong performance and underscore the effectiveness of employing sophisticated MRI segmentation to enhance AD diagnostic classification. Following training, the system was evaluated on a separate testing set comprising images not included in the training process. With our recommended segmentation technique, the CNN-LSTM model achieved an accuracy of 98.13% on the MRI data.

(d): Comparison with Other Classification Models

The metrics for comparison were calculated based on the formulae presented in Table 3. The effectiveness of the classifiers was evaluated through several criteria: accuracy (Acy), sensitivity (Sny) or recall, specificity (Spy), precision (Prn), and the F1 score. These metrics were derived for several classifiers, including a Support Vector Machine (SVM), a Light Gradient-Boosting Machine (LGBM) model, and an ensemble SVM and LGBM model. Table 5 displays the performance metrics for these classifiers and shows that our proposed model achieved the highest accuracy, 98.13%. It also attained the highest or joint-highest recall and F1 score for every class, although the LGBM and ensemble models reported marginally higher precision for the Mild class, as did the ensemble model for the Very Mild class.

(e): Comparative Examination with Other Research Models

Our proposed methodology is validated by comparing its results with previous research on identifying the earliest signs of Alzheimer’s disease using a four-class categorization system, as presented in Table 6. The suggested ImFCm-WS-CNN-LSTM model exhibits enhanced accuracy, precision, and F1 score, discerning the four stages of Alzheimer’s disease with 98.13% accuracy, 98.0% precision, and a 97% F1 score.

4. Limitations and Challenges

Despite the high accuracy achieved in classifying segmented brain images, it is crucial to acknowledge the challenges and limitations encountered during our study. These include dealing with data imbalances, enhancing model interpretability, and diversifying datasets by incorporating different image modalities and real patient cases.

Data imbalances pose a significant challenge in training machine learning models as they can lead to biased predictions. To overcome this, future research should focus on strategies such as data augmentation, oversampling, or using specialized algorithms designed to handle imbalanced datasets.
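
As one example of such a strategy, inverse-frequency class weights can be derived directly from the class counts. The sketch below is a suggestion, not part of the reported experiments: it uses the sample counts quoted in Section 3.1, the common heuristic w_c = N / (K · n_c), and hypothetical class labels and function names.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights w_c = N / (K * n_c), where N is the total
    number of samples, K the number of classes, and n_c the class size."""
    total = sum(counts.values())
    k = len(counts)
    return {name: total / (k * n) for name, n in counts.items()}

# Class sizes as quoted in Section 3.1; the short labels are assumptions.
counts = {"normal": 3140, "mild": 896, "moderate": 64, "severe": 2240}
weights = balanced_class_weights(counts)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
```

The smallest class receives by far the largest weight, so errors on its samples contribute more to a weighted loss than errors on the majority class.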

Enhancing model interpretability is another key aspect. While achieving high accuracy is important, understanding why a model makes certain predictions is equally crucial, especially in medical applications like Alzheimer’s disease diagnosis. Techniques such as model explainability methods, feature importance analysis, and visualization tools can aid in interpreting model decisions.

Diversifying datasets by including various image modalities (e.g., MRI and CT scans) and real patient cases from different demographics and conditions can improve the robustness of the model. This ensures that the model generalizes well across diverse scenarios and demographics, making it more reliable and applicable in real-world clinical settings.

By addressing these challenges and limitations, future research can further enhance the accuracy, interpretability, and robustness of machine learning models for brain image classification in Alzheimer’s disease diagnosis.

5. Conclusions

This study presents a novel approach to identify the progression of Alzheimer’s disease (AD) by integrating feature extraction using the ImFCm-WS technique and optimizing a hybrid CNN-LSTM architecture. The standard ADNI dataset was utilized to train and evaluate this model for classifying different stages of Alzheimer’s disease. Upon analysis, the CNN-LSTM incorporating ImFCm-WS features outperformed alternative methods on the ADNI dataset, achieving an accuracy of 98.13%.

This methodology can effectively identify brain regions associated with Alzheimer’s disease and provide a valuable decision-supporting tool for physicians in assessing the severity of the illness based on the level of dementia. Furthermore, the robustness of the proposed method lies not only in its high accuracy but also in its potential for real-time clinical applications.

By providing a precise categorization of Alzheimer’s disease stages, healthcare professionals can make informed decisions regarding patient care and treatment strategies. Moreover, the integration of ImFCM-Ws features with CNN-LSTM architecture enhances the interpretability of the results, allowing clinicians to understand the underlying neurobiological processes contributing to Alzheimer’s disease progression. Ultimately, this innovative approach holds promise for improving early diagnosis and intervention, potentially leading to better outcomes and quality of life for individuals affected by Alzheimer’s disease.

Author Contributions

E.H.A., S.S., G.Z.E.N. and Z.F.M. all contributed to this work and agreed to publish this manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

This submission contains original results and figures that have not been previously published or are being considered by another publisher. Data are contained within the article.

Conflicts of Interest

There are no conflicts of interest acknowledged by any of the authors.

Abbreviations

Definition	Abbreviation
Magnetic Resonance Imaging	MRI
Alzheimer’s Disease Neuroimaging Initiative	ADNI
Convolutional Neural Network–Long Short-Term Memory	CNN-LSTM
Alzheimer’s disease	AD
positron emission tomography	PET
Deep Neural Network	DNN
Normal Cognition	NC
prodromal Mild Cognitive Impairment	pMCI
single-modality MCI	sMCI
fully connected	FC
deep learning	DL
Recurrent Neural Network	RNN
Improved Fuzzy C-means clustering	ImFCm
gray matter	GM
white matter	WM
cerebrospinal fluid	CSF
Contrast Limited Adaptive Histogram Equalization	CLAHE
Bayes wavelet transform	BWT
discrete wavelet transform	DWT
Adaptive Histogram Equalization	AHE
Rectified Linear Unit	ReLU
accuracy	Acy
Fuzzy C-means	FCM
sensitivity	Sny/recall
specificity	Spy
precision	Prn
true positive	TP
true negative	TN
false positive	FP
false negative	FN

References

1. Beheshti, I.; Demirel, H.; Alzheimer’s Disease Neuroimaging Initiative. Feature-ranking-based Alzheimer’s disease classification from structural MRI. Magn. Reson. Imaging 2016, 34, 252–263.
2. Belleville, S.; Fouquet, C.; Duchesne, S.; Collins, D.L.; Hudon, C.; The CIMA-Q group. Consortium for the early identification of Alzheimer’s disease-Quebec detecting early preclinical Alzheimer’s disease via cognition, neuropsychiatry, and neuroimaging: Qualitative review and recommendations for testing. J. Alzheimer’s Dis. 2014, 42, S375–S382.
3. Bavkar, S.; Iyer, B.; Deosarkar, S. Detection of alcoholism: An EEG hybrid features and ensemble subspace K-NN based approach. Lect. Notes Comput. Sci. 2019, 11319, 161–168.
4. Ali, E.H.; Sadek, S.; Makki, Z.F. A Review of AI techniques using MRI Brain Images for Alzheimer’s disease detection. In Proceedings of the 2023 Fifth International Conference on Advances in Computational Tools for Engineering Applications (ACTEA), Zouk Mosbeh, Lebanon, 5–7 July 2023; pp. 76–82.
5. Shanmugavadivel, K.; Sathishkumar, V.; Cho, J.; Subramanian, M. Advancements in computer-assisted diagnosis of Alzheimer’s disease: A comprehensive survey of neuroimaging methods and AI techniques for early detection. Ageing Res. Rev. 2023, 91, 102072.
6. Alzheimer’s Disease Neuroimaging Initiative (ADNI). Available online: http://adni.loni.usc.edu/ (accessed on 25 July 2021).
7. Feng, C.; Elazab, A.; Yang, P.; Wang, T.; Zhou, F.; Hu, H.; Xiao, X.; Lei, B. Deep learning framework for Alzheimer’s Disease diagnosis via 3D-CNN and FSBi-LSTM. IEEE Access 2019, 7, 63605–63618.
8. Wee, C.-Y.; Liu, C.; Lee, A.; Poh, J.S.; Ji, H.; Qiu, A. Cortical graph neural network for AD and MCI diagnosis and transfer learning across populations. NeuroImage Clin. 2019, 23, 101929.
9. Islam, J.; Zhang, Y. Understanding 3D CNN behavior for Alzheimer’s disease diagnosis from brain PET scan. arXiv 2019, arXiv:1912.04563.
10. Maqsood, M.; Nazir, F.; Khan, U.; Aadil, F.; Jamal, H.; Mehmood, I.; Song, O.Y. Transfer Learning Assisted Classification and Detection of Alzheimer’s Disease Stages Using 3D MRI Scans. Sensors 2019, 19, 2645.
11. El-Sappagh, S.; Abuhmed, T.; Islam, S.R.; Kwak, K.S. Multimodal multitask deep learning model for Alzheimer’s disease progression detection based on time series data. Neurocomputing 2020, 412, 197–215.
12. Li, W.; Lin, X.; Chen, X. Detecting Alzheimer’s disease based on 4D fMRI: An exploration under deep learning framework. Neurocomputing 2020, 388, 280–287.
13. Ebrahimi, A.; Luo, S.; for the Alzheimer’s Disease Neuroimaging Initiative. Convolutional neural networks for Alzheimer’s disease detection on MRI images. J. Med. Imaging 2021, 8, 024503.
14. Angkoso, C.V.; Tjahyaningtijas, H.P.A.; Purnomo, M.H.; Purnama, I.K.E. Multiplane convolutional neural network (Mp-CNN) for Alzheimer’s disease classification. Int. J. Intell. Eng. Syst. 2022, 15, 329–340.
15. Heising, L.; Angelopoulos, S. Operationalising fairness in medical AI adoption: Detection of early Alzheimer’s disease with 2D CNN. BMJ Health Care Inform. 2022, 29, e100485.
16. Oktavian, M.W.; Yudistira, N.; Ridok, A. Classification of Alzheimer’s disease using the convolutional neural network (CNN) with transfer learning and weighted loss. arXiv 2022, arXiv:2207.01584.
17. Agarwal, P.; Dutta, A.; Agrawal, T.; Mehra, N.; Mehta, S. Hybrid Nature-Inspired Algorithm for Feature Selection in Alzheimer Detection Using Brain MRI Images. Int. J. Comput. Intell. Appl. 2022, 21, 2250016.
18. Sato, R.; Kudo, K.; Udo, N.; Matsushima, M.; Yabe, I.; Yamaguchi, A.; Tha, K.K.; Sasaki, M.; Harada, M.; Matsukawa, N.; et al. A diagnostic index based on quantitative susceptibility mapping and voxel-based morphometry may improve early diagnosis of Alzheimer’s disease. Eur. Radiol. 2022, 32, 4479–4488.
19. Holilah, D.; Bustamam, A.; Sarwinda, D. Detection of Alzheimer’s disease with segmentation approach using K-Means Clustering and Watershed Method of MRI image. J. Phys. Conf. Ser. 2021, 1725, 012009.
20. Minaee, S.; Boykov, Y.Y.; Porikli, F.; Plaza, A.J.; Kehtarnavaz, N.; Terzopoulos, D. Image Segmentation Using Deep Learning: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 3523–3542.
21. UmaMaheswari, S.; SrinivasaRaghavan, V. Lossless medical image compression algorithm using tetrolet transformation. J. Ambient Intell. Humaniz. Comput. 2021, 12, 4127–4135.
22. Ali, E.H.; Sadek, S.; Makki, Z.F. Novel Improved Fuzzy C-means Clustering for MR Image Brain Tissue Segmentation to Detect Alzheimer’s Disease. In Proceedings of the 2023 International Conference on Computer and Applications (ICCA), Cairo, Egypt, 27–28 February 2023; pp. 1–6.
23. Ruspini, E.H.; Bezdek, J.C.; Keller, J.M. Fuzzy clustering: A historical perspective. IEEE Comput. Intell. Mag. 2019, 14, 45–55.
24. Singh, T.; Saxena, N.; Khurana, M.; Singh, D.; Abdalla, M.; Alshazly, H. Data clustering using moth-flame optimization algorithm. Sensors 2021, 21, 4086.
25. Zhong, A.; Li, X.; Wu, D.; Ren, H.; Kim, K.; Kim, Y.-G.; Buch, V.; Neumark, N.; Bizzo, B.; Tak, W.; et al. Deep metric learning-based image retrieval system for chest radiograph and its clinical applications in COVID-19. Med. Image Anal. 2021, 70, 101993.
26. Sirazitdinov, I.; Kholiavchenko, M.; Kuleev, R.; Ibragimov, B. Data augmentation for chest pathologies classification. In Proceedings of the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 8–11 April 2019; pp. 1216–1219.
27. Ullah, A.; Ahmad, J.; Muhammad, K.; Sajjad, M.; Baik, S.W. Action recognition in video sequences using deep bi-directional LSTM with CNN features. IEEE Access 2018, 6, 1155–1166.
28. Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional neural networks: An overview and application in radiology. Insights Into Imaging 2018, 9, 611–629.
29. Mahmud, M.; Kaiser, M.S.; McGinnity, T.M.; Hussain, A. Deep learning in mining biological data. Cogn. Comput. 2021, 13, 1–33.
30. Jain, R.; Jain, N.; Aggarwal, A.; Hemanth, D.J. Convolutional neural network-based Alzheimer’s disease classification from magnetic resonance brain images. Cogn. Syst. Res. 2019, 57, 147–159.
31. Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270.
32. Staudemeyer, R.C.; Morris, E.R. Understanding LSTM: A tutorial into long short-term memory recurrent neural networks. arXiv 2019, arXiv:1909.09586.

Figure 1. Proposed method (A) Dataset, (B) Pre-processing, (C) Segmentation, (D) Post-Processing, (E) Classification Model.

Figure 2. Segmentation phase one (ImFCm).

Figure 3. Watershed segmentation.

Figure 4. Final segmentation map.

Figure 5. LSTM.

Figure 6. Proposed CNN model structure.

Figure 7. Cost and time for FCM and ImFCm.

Figure 8. The accuracy and loss curves of classification for traditional FCM.

Figure 9. Accuracy and loss curves of the classification results.

Table 1. CNN model characteristics.

Layer	Image Dimensions	No. of Filters	Size of Filter	Pooling Layer Size	Parameters
conv2d (Conv2D)	(200, 200)	64	5 × 5	2 × 2	2432
conv2d_1 (Conv2D)	(200, 200)	64	5 × 5	2 × 2	25,632
MaxPooling2D	(100, 100)	64		2 × 2	0
Dropout	(100, 100)	64			0
conv2d_2 (Conv2D)	(100, 100)	128	3 × 3		18,496
conv2d_3 (Conv2D)	(100, 100)	128	3 × 3		36,928
max_pooling2d_1	(50, 50)	128		2 × 2	0
dropout_1	(50, 50)	128			0
flatten (Flatten)	(None, 160,000)				0
dense (Dense)	(None, 256)				40,960,256
lstm(LSTM)	(None, 256)				264,192
dropout_2	(None, 256)				0
dense_1 (Dense)	(None, 4)				1028
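
The parameter counts in Table 1 can be cross-checked with the standard formulas for convolutional, dense, and LSTM layers. The sketch below is a worked calculation, not the authors' code, and depends on assumptions that the table does not state: a three-channel input, filter counts of 32, 32, 64, and 64 for the four convolutions, and an LSTM that receives one feature per time step. Under these assumptions the formulas reproduce every reported count and the flattened size of 160,000; with the 64 and 128 filters listed in the filter column they would not.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights (kernel x kernel x in_channels per filter) plus one bias each."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_in, n_out):
    """Full weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out

def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and biases."""
    return 4 * (units * (input_dim + units) + units)

# Assumed configuration (see the lead-in); reported values come from Table 1.
computed = {
    "conv2d": conv2d_params(5, 3, 32),
    "conv2d_1": conv2d_params(5, 32, 32),
    "conv2d_2": conv2d_params(3, 32, 64),
    "conv2d_3": conv2d_params(3, 64, 64),
    "dense": dense_params(50 * 50 * 64, 256),
    "lstm": lstm_params(1, 256),
    "dense_1": dense_params(256, 4),
}
reported = {
    "conv2d": 2432, "conv2d_1": 25632, "conv2d_2": 18496, "conv2d_3": 36928,
    "dense": 40960256, "lstm": 264192, "dense_1": 1028,
}
for name in computed:
    print(name, computed[name], computed[name] == reported[name])
print("total", sum(computed.values()))
```

Almost all of the trainable parameters sit in the first dense layer, because it connects the 160,000 flattened features to 256 units.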

Table 2. Values of the hyperparameters.

Hyperparameter	Value
Split data	3840 train, 1281 validate
Dropout	0.3, 0.3, 0.5
Batch size	32
Learning rate	0.001
Num. of epochs	50

Table 3. Metric details.

Metric	Explanation	Math. Exp.
Accuracy (Acy)	Calculated by dividing the number of correct predictions by the total number of predictions made. TP and TN represent true positives and true negatives, respectively; FP and FN denote false positives and false negatives, respectively.	$A_{cy} = \frac{T_{P} + T_{N}}{T_{P} + F_{P} + T_{N} + F_{N}}$
Sensitivity (Sny)	The sensitivity metric indicates the model’s effectiveness in detecting AD patients.	$S_{ny} = \frac{T_{P}}{T_{P} + F_{N}}$
Specificity (Spy)		$S_{py} = \frac{T_{N}}{T_{N} + F_{P}}$
Precision (Prn)	Precision evaluates the reliability of the diagnosis, that is, the proportion of individuals identified by the system who are actually affected by the disease.	$P_{rn} = \frac{T_{P}}{T_{P} + F_{P}}$
F1	The F1 score of the model is the harmonic mean of sensitivity and precision.	$F1 = 2 \times \frac{S_{ny} \times P_{rn}}{S_{ny} + P_{rn}}$
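
The formulas in Table 3 translate directly into code. The following sketch is a generic illustration with hypothetical confusion-matrix counts, not values from this study; the function name `classification_metrics` is an assumption.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the Table 3 metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * sensitivity * precision / (sensitivity + precision)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Hypothetical counts for one class treated as the positive class.
result = classification_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For a multi-class problem such as the four-class task here, these quantities are usually computed per class in a one-versus-rest fashion, as in the per-class rows of a classification report.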

Table 4. Report of classification.

Class Name	Precision	Recall	F1 Score
Mild	0.97	0.99	0.98
Normal	1.00	0.87	0.93
Moderate	0.99	0.98	0.98
Very Mild	0.97	0.99	0.98
Accuracy	98.13%	for predictions

Table 5. Report of classification comparisons.

Classifier	Accuracy (%)	Class Name	Precision	Recall	F1 Score
SVM	96.56	Mild	0.95	0.98	0.96
		Normal	1.00	0.81	0.90
		Moderate	0.97	0.98	0.97
		Very Mild	0.97	0.94	0.96
LGBM	96.17	Mild	0.98	0.98	0.98
		Normal	1.00	0.31	0.48
		Moderate	0.96	0.98	0.97
		Very Mild	0.95	0.95	0.95
Ensemble SVM-LGBM	95	Mild	0.98	0.95	0.97
		Normal	0.33	0.50	0.40
		Moderate	0.95	0.95	0.97
		Very Mild	0.98	0.92	0.95
Proposed Model	98.13	Mild	0.97	0.99	0.98
		Normal	1.00	0.87	0.93
		Moderate	0.99	0.98	0.98
		Very Mild	0.97	0.99	0.98

Table 6. A summary of current studies implementing techniques using DL.

Ref.	Year	Image	Dataset	Classifier	Acc	Others
[6]	2019	MRI-PET	ADNI	3DCNN-FSBi-LSTM	94.82	Multiclass
[7]	2019	MRI	ADNI	CNN	89.4	AD vs. NC
[8]	2019	PET	ADNI	3D-CNN	88.76	AD vs. NC
[9]	2019	MRI	OASIS	AlexNet	92.85	Multiclass
[10]	2020	MRI-PET	ADNI	Stacked CNN-BiLSTM	92.62	Multiclass
[11]	2020	MRI	ADNI	C3d-LSTM	97	AD vs. NC
[12]	2021	MRI	ADNI	3D-CNN	96.88	Multiclass
[13]	2021	MRI	ADNI	2D-CNN	93	Multiclass
[14]	2022	MRI	ADNI	2D-CNN-LeNet-5	88.7	Multiclass
Proposed	2024	MRI	ADNI	CNN-LSTM	98.13	Multiclass


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
