Integrating Demographics and Imaging Features for Various Stages of Dementia Classification: Feed Forward Neural Network Multi-Class Approach

Background: MRI magnetization-prepared rapid gradient-echo (MPRAGE) imaging is an easily available modality for dementia diagnosis. Previous studies suggested that volumetric analysis plays a crucial role in classifying the various stages of dementia. In this study, volumetry, radiomics and demographics were integrated as inputs to develop an artificial intelligence model for classifying the stages of dementia, namely Alzheimer’s disease (AD), mild cognitive impairment (MCI) and cognitive normal (CN). Method: The Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset was separated into training and testing groups, and the Open Access Series of Imaging Studies (OASIS) dataset was used as the second testing group. The MRI MPRAGE images were reoriented via statistical parametric mapping (SPM12). Freesurfer was employed for brain segmentation, and 45 regional brain volumes were retrieved. The 3D Slicer software was employed to extract 107 radiomics features from the whole brain. Data on patient demographics were collected from the datasets. The feed-forward neural network (FFNN) and other common artificial intelligence algorithms, including support vector machine (SVM), ensemble classifier (EC) and decision tree (DT), were used to build the models using various features. Results: The integration of brain regional volumes, radiomics and patient demographics attained the highest overall accuracy at 76.57% and 73.14% in ADNI and OASIS testing, respectively. The subclass accuracies in MCI, AD and CN were 78.29%, 89.71% and 85.14%, respectively, in ADNI testing, as well as 74.86%, 88% and 83.43% in OASIS testing. Balanced sensitivity and specificity were obtained for all subclass classifications in MCI, AD and CN. Conclusion: The FFNN yielded good overall accuracy for MCI, AD and CN categorization, with balanced subclass accuracy, sensitivity and specificity. The proposed FFNN model is simple, and it may support the triage of patients for further confirmation of the diagnosis.


Introduction
With an increasingly aging global population, the incidence of dementia is rapidly increasing. In 2016, there were 47 million people living with dementia worldwide. This figure is projected to increase to more than 131 million by 2050 [1]. The most common cause of dementia is Alzheimer's disease (AD), which accounts for approximately 40% of all dementia cases. With recent pharmacological advancements, drug therapies for ameliorating the progression of AD [2] and improved preventive measures and therapies for AD have been developed. The early detection and accurate diagnosis of the prodromal stage of dementia, i.e., mild cognitive impairment (MCI), are crucial to reduce mortality, improve the quality of life and extend the lifespan of patients with dementia. MRI magnetization-prepared rapid gradient-echo (MPRAGE) imaging is a downstream imaging modality, which captures high tissue contrast with superior spatial resolution in a short scan time [3]. The three-dimensional application of whole-brain scans has been extensively used for AD diagnosis and disease progression monitoring. It provides detailed structural images of the brain, allowing physicians to visualize and assess the brain abnormalities associated with dementia. It is easily available and plays a crucial role in dementia diagnosis.
One application of MRI MPRAGE images is brain volumetric analysis. A significant volume reduction in the medial temporal lobe, including the hippocampus, precuneus, posterior cingulate, amygdala, parahippocampal gyrus and entorhinal cortex, is a signature for AD patients [4][5][6][7][8]. Through detailed hippocampal volume assessment [9,10], sub-regional corpus callosum atrophy [11,12] and connectivity-based segmentation of amygdala nuclei [13], AD can be effectively distinguished from a cognitive normal (CN) state. Recent developments in automatic brain regional volume segmentation have improved the segmentation accuracy and can handle large amounts of data effectively. This allows for the comprehensive analysis of yearly MRI MPRAGE images for disease monitoring. Previous studies suggested that AD progression can be predicted based on the rate of volume reduction by monitoring the hippocampal volume change [14,15]. However, for the prodromal stage of AD, which is MCI, the brain regional volume change is subtle and cannot be easily detected by the naked eye. Distinguishing MCI from AD requires either supplementary imaging with an up-stream modality or extensive experience and knowledge from clinical experts. Neither is commonly available in memory clinics.
In recent years, the radiomics analysis of MRI MPRAGE has been widely applied in medical imaging. It is a novel technique that incorporates gray-level invariant features (GLIFs) into a data classification algorithm. It has the potential to reveal disease heterogeneity characteristics, which are related to the gray-level matrices. This method has been adopted for cancer prognosis and recurrence prediction [16][17][18], the prediction of distant metastasis [19] and treatment response [20]. For dementia classification, an exploratory study was conducted by Li et al. 2020 using pure radiomics, and 55.9-56% accuracy was achieved in diagnosing preclinical AD. However, the accuracy improved to 76.1% when combined with other high-frequency features [21].
Biological differences and aging are other perspectives on dementia development. Previous studies suggested that women in many cohorts have a higher risk of developing AD [22,23]. Also, a higher incidence of dementia in elderly individuals is observed around the world, and the prevalence ranges from 5 to 7%, even after age standardization [24]. Including age and sex as parameters in the prediction model may therefore help to discriminate AD, MCI and CN from a different perspective.
During the past two decades, many studies have applied artificial intelligence to dementia classification using traditional classifiers, including logistic regression [25,26], decision tree (DT) [27], random forest [28][29][30], naïve Bayes [31], K-nearest neighbor [32], support vector machines (SVMs) [33][34][35][36][37][38] and ensemble classifier (EC) [39]. With the improvements in computer processing power, more studies have focused on using discriminative approaches, such as neural networks, in recent years. Neural networks are a branch of artificial intelligence that simulates the human brain, both in terms of structural and learning patterns. Compared to traditional classifiers, they allow for the processing of complicated high-level information by connecting a large number of inputs [40]. In addition, a multiple-layer neural network can capture complex non-linear relationships in data, as well as learning the relevant features automatically. The feed-forward neural network (FFNN) is one of the most popular neural networks being employed. It processes information from the input layer, through hidden layers to the output layer in one direction, without any feedback connections. It has only a few hidden layers, which requires less computation power to process, and is able to provide a good classification with a smaller dataset when compared to deep learning models. Previous studies showed good accuracies in identifying AD from CN (>85%) and MCI from CN (>80%) [41][42][43][44]. However, most studies relied on a single dataset to train and test the model. The models were not tested against unseen data, which may affect the generalizability of the built model and limit its application in clinical settings. Also, a binary classifier, i.e., to classify AD from CN or MCI from CN, was employed in most studies. In real-world scenarios, patients' images come from multiple stages of disease, so each image may need to be passed through several binary classifiers to confirm the diagnosis. Instead of training and managing multiple binary classifiers, one for each class, a multi-class classifier is designed to handle multiple classes simultaneously. Although it is more challenging to train and yields lower accuracy [45], the deployment of the multi-class model provides only one output for the various stages of disease. It is a simple and efficient way to identify these stages.
In this study, we aimed to develop a reliable artificial intelligence multi-class model to classify AD, MCI and CN using patient demographics and MRI imaging features. The first objective was to use various combinations of demographics and image features to build the models using FFNN and various traditional artificial intelligence algorithms, including DT, EC and SVM, as a comparison. The second objective was to compare the classification performances of the FFNN with those of the above-developed models to identify the algorithm that could provide a more accurate classification of AD, MCI and CN.

Materials and Methods
In this study, two cohorts of patients were used to build and validate the artificial intelligence models. For each patient, the demographics were recorded. In addition, the brain regional volumes, as well as radiomics from the whole brain, were retrieved from the MRI images as image features. The patients' demographics, brain regional volumes and radiomics were integrated as inputs for model building using FFNN, DT, EC and SVM algorithms. The model classification performances were analyzed.

Patient Dataset
The datasets used in this study are the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu) [46] and the Open Access Series of Imaging Studies (OASIS) database (oasis-brains.org) [47]. The use of the above datasets was approved by the institutional review board at each site, and all participants provided written consent. All eligible participants underwent brain MRI MPRAGE scanning and clinical diagnosis with demographics collected.

ADNI Dataset
Twenty-five memory centers in the USA joined the ADNI project, and a total of 582 images were collected from them. Of these, 406 images (70% of all images) from 21 memory centers were partitioned as the training dataset, and 176 images (30% of all images) from the remaining 4 centers were used as the testing dataset. The distribution of images is listed in Table 1.
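
A center-wise partition of this kind can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the record layout (`center`, `label`) and the helper name `split_by_center` are hypothetical, and because the text does not state how the held-out centers were chosen, the sketch simply draws them at random.

```python
import random

def split_by_center(records, n_test_centers, seed=0):
    """Hold out whole centers so that no center contributes to both sets."""
    centers = sorted({r["center"] for r in records})
    rng = random.Random(seed)
    # Assumption: held-out centers are drawn at random.
    test_centers = set(rng.sample(centers, n_test_centers))
    train = [r for r in records if r["center"] not in test_centers]
    test = [r for r in records if r["center"] in test_centers]
    return train, test, test_centers

# Toy data: 25 hypothetical centers with 4 images each (not the ADNI counts).
records = [{"center": c, "label": "CN"} for c in range(25) for _ in range(4)]
train, test, held_out = split_by_center(records, n_test_centers=4)
```

Splitting by center rather than by image keeps scanner- and site-specific effects out of the training data for the held-out centers.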

OASIS Dataset
An independent cohort dataset (OASIS dataset) was collected from the Washington University Knight Alzheimer Disease Research Center. The entire OASIS dataset consists of 1552 patients. From these, 176 patients (28 AD, 91 MCI and 57 CN) were picked randomly. The total number of patients and the distribution of subclasses were the same as in the testing dataset from the ADNI 4 centers. This ensured that the results of testing on the ADNI 4 centers dataset and on the OASIS dataset would not be influenced by differences in the number of patients or in the subclass distribution.
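
Drawing a random subset that matches a target subclass distribution can be sketched as below. The target counts (28 AD, 91 MCI, 57 CN) come from the text; the helper name `sample_matching`, the record layout and the composition of the toy pool are hypothetical.

```python
import random

def sample_matching(pool, target_counts, seed=0):
    """Randomly pick target_counts[label] records of each label from pool."""
    rng = random.Random(seed)
    picked = []
    for label, n in target_counts.items():
        candidates = [p for p in pool if p["label"] == label]
        picked.extend(rng.sample(candidates, n))  # sampling without replacement
    return picked

target = {"AD": 28, "MCI": 91, "CN": 57}  # subclass counts stated in the text

# Toy pool with a made-up composition, standing in for the full cohort.
toy_labels = ["AD"] * 100 + ["MCI"] * 200 + ["CN"] * 300
pool = [{"id": i, "label": lab} for i, lab in enumerate(toy_labels)]
subset = sample_matching(pool, target)
```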

Brain Segmentation and Regional Volume Analysis
The FreeSurfer v7.1.0 image analysis suite was employed to perform brain segmentation and volumetric analysis. The procedures and algorithms employed were documented in previous publications [48][49][50][51][52][53][54] and are freely available from the website (http://surfer.nmr.mgh.harvard.edu/ (accessed on 22 January 2023)). Forty-five brain regional volumes were obtained. Details of the brain regions are listed in Table 2 and illustrated in Figure 1.
Table 2. Details for the 45 brain regional volumes.

Radiomics Features
Reorientation of images was performed for each of the MPRAGE MRI images by SPM12 software [55]. The individualized whole-brain mask template was fused onto the MPRAGE image for brain regional configuration, which is shown in Figure 2. Further, 3D Slicer software (The Slicer Community; V.4.11.20210226) with the PyRadiomics extension (Computational Imaging and Bioinformatics Lab, Harvard Medical School) was employed for the radiomics feature extraction [56]. One hundred and seven radiomics features were extracted within the whole brain from the MRI MPRAGE image for every patient. The definition of radiomics features was subdivided into eight classes [57], which are listed in Table 3.

Demographics
The MRI MPRAGE dataset and patients' demographics were retrieved from the ADNI and OASIS websites. The demographics of age and sex were recorded.

Integration of Patients' Demographics and Image Features
The patients' demographics, brain regional volumes and radiomics were integrated in the following 5 groups, which were used as inputs for building the artificial intelligence models: radiomics only with 107 features (R only), radiomics and patients' demographics with 109 features (RD), volumes only with 45 features (V only), volumes and patients' demographics with 47 features (VD) and volumes, radiomics and patients' demographics with 154 features (VRD). Details are listed in Table 4.
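
The five input groups and their feature counts can be reproduced with a short sketch. The function name `build_feature_groups`, the placeholder values and the numeric encoding of age and sex are assumptions for illustration; only the group names and the counts (107 radiomics features, 45 volumes, 2 demographic variables) come from the text.

```python
def build_feature_groups(radiomics, volumes, demographics):
    """Concatenate per-patient feature vectors into the five input groups."""
    return {
        "R only": radiomics,
        "RD": radiomics + demographics,
        "V only": volumes,
        "VD": volumes + demographics,
        "VRD": volumes + radiomics + demographics,
    }

# Placeholder vectors for one patient (values are not real measurements).
radiomics = [0.0] * 107
volumes = [0.0] * 45
demographics = [72.0, 1.0]  # hypothetical encoding: age in years, sex as 0/1

groups = build_feature_groups(radiomics, volumes, demographics)
sizes = {name: len(vec) for name, vec in groups.items()}
```

The VRD group has 45 + 107 + 2 = 154 features, which matches the count given above.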

Model Building
Patients from the ADNI dataset of 21 centers were used to build the models. The 5 groups of features obtained in Section 2.5 were used as input to build the models.
The proposed FFNN was built using the Matlab® (R2021a) Neural Network toolbox. The neural network training employed Levenberg-Marquardt as the training algorithm with the random data division method. The network had 5 layers, including 1 input layer, 3 hidden layers and 1 output layer. The input layer received one of the 5 groups of features in Section 2.5. The 3 hidden layers included 50 nodes in the first layer, 30 nodes in the second layer and 20 nodes in the last hidden layer for processing. In each hidden layer, the weight (w) and bias (b) are valued as a single vector, as shown in Figure 3. The FFNN is trained to fit input data; then, its weight and bias values are formed (+) into a vector (curve in the diagram) and fitted to the next layer. The output layer gave the result of classification. In this model, each patient was classified into one of three subclasses: AD, MCI or CN.
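
The layer structure described above can be sketched as a forward pass in plain Python. This is not the Matlab implementation: the layer sizes (154 inputs for the VRD group, hidden layers of 50, 30 and 20 nodes, 3 outputs) follow the text, but the tanh hidden activation, the softmax output, the random weight initialization and all function names are assumptions, and Levenberg-Marquardt training is not shown.

```python
import math
import random

LAYER_SIZES = [154, 50, 30, 20, 3]  # VRD input, three hidden layers, three classes

def init_params(sizes, seed=0):
    """Random weights and zero biases for each layer (illustrative only)."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        params.append((w, b))
    return params

def dense(x, w, b):
    """Affine map: one weighted sum plus bias per output node."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def forward(x, params):
    """Pass the input through the hidden layers, then the output layer."""
    for w, b in params[:-1]:
        x = [math.tanh(v) for v in dense(x, w, b)]  # assumed hidden activation
    w, b = params[-1]
    return softmax(dense(x, w, b))  # assumed output normalization

params = init_params(LAYER_SIZES)
probs = forward([0.5] * 154, params)
predicted = ["AD", "MCI", "CN"][probs.index(max(probs))]
```

Information flows in one direction only, from input to output, which is what distinguishes a feed-forward network from architectures with feedback connections.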
During model building, the hyper-parameter optimization algorithm was employed to control the learning process so as to optimally solve the problem. The maximum epoch was set to 50, and no training time limit was applied. The FFNN, with its three hidden layers, was trained using the mean squared error performance function and a regularization value of 0.01. This was an early stopping-based optimization, which stopped training when the performance function and a regularization value of 0.01 were achieved. Details of the FFNN model building are shown in Figure 3.
In addition, the Matlab Classification Learner toolbox was employed to build the models using traditional artificial intelligence algorithms, including DT, EC and SVM, as a comparison. In DT, EC and SVM model building, hyper-parameter tuning was employed with Bayesian optimization as the optimizer, expected improvement per second plus as the acquisition function, a maximum of 30 iterations and no training time limit, so as to reduce the instability and provide simple models [58].
To improve the generalizability of the built models and avoid overfitting, 10-fold cross-validation was employed during each of the model-building processes. The dataset was divided into ten groups with an equal number of samples. The first neural network training process used the initial nine groups as training data and the remaining group as testing data. The second training process continued with another nine groups as training data and the rest as testing data. This process was undertaken 10 times. The performance of each model was the average result computed in these 10 rounds of training [59].
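
The fold rotation described above can be sketched as an index generator. The helper name `k_fold_indices` and the shuffling seed are hypothetical. Note that 406 samples cannot be divided into ten exactly equal groups, so in this sketch the fold sizes differ by at most one; how the toolbox handled this is not stated in the text.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(k_fold_indices(406, k=10))
```

A model would be fitted on each `train` list and scored on the matching `test` list, and the ten scores would then be averaged.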
The performance of each model was assessed in terms of the overall accuracy and the classification ability for each subclass (MCI, AD and CN), measured by class accuracy, sensitivity and specificity.

Model Testing and Data Analysis
Each model was tested using two independent cohorts of patients: patients from the 4 centers of the ADNI dataset and those from the OASIS dataset. The performance of each model was assessed in terms of the overall accuracy and the classification ability for each subclass (MCI, AD and CN), measured by class accuracy, sensitivity and specificity.
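
One common way to obtain these metrics from a three-class confusion matrix is a one-vs-rest reduction, sketched below. The text does not give the exact formulas used, so treating class accuracy as (TP + TN) / total is an assumption; the function name and the confusion matrix values are hypothetical and are not the study's results.

```python
def one_vs_rest_metrics(confusion, labels):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    per_class = {}
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # class k, predicted otherwise
        fp = sum(row[k] for row in confusion) - tp  # other classes, predicted k
        tn = total - tp - fn - fp
        per_class[name] = {
            "accuracy": (tp + tn) / total,
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    overall = sum(confusion[k][k] for k in range(len(labels))) / total
    return overall, per_class

labels = ["MCI", "AD", "CN"]
# Hypothetical counts for 176 test patients (rows: true class, columns: predicted).
confusion = [
    [68, 9, 14],
    [5, 21, 2],
    [11, 2, 44],
]
overall, per_class = one_vs_rest_metrics(confusion, labels)
```

Under this definition, the accuracy for each subclass can be higher than the overall accuracy, because correctly rejected samples from the other two classes count as true negatives.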

Results
We used five groups of features (R only, V only, RD, VD and VRD) to build models using four algorithms (FFNN, DT, EC and SVM); as a result, 20 models were built. Firstly, the value of the integration of multiple features was assessed through a performance evaluation using the same algorithm, with various features included in building the models. Secondly, the performance of the proposed FFNN was evaluated for various stages of dementia classification.

Dataset Demographics
The ADNI dataset comprises patients from 25 centers. Of these, 406 patients from 21 centers (ADNI 21 centers) were selected to build the model, and 176 patients from the remaining 4 centers (ADNI 4 centers) were used to test the model. Another independent dataset from the OASIS database was used, with 176 patients used for secondary validation on the models built in Section 2.6. The demographics of the study cohort are shown in Table 5.
When comparing models built using the same model-building algorithm, the models built using volumes performed better than those built using radiomics, with higher overall accuracy, higher accuracy in characterizing MCI, AD and CN, and higher sensitivity and specificity. Including demographics as features for either volumes or radiomics improved the overall accuracy when compared to the use of volumes or radiomics alone. However, in SVM algorithms, the specificity of AD classification was zero when using VD or VRD features. Overall, in all models, the integration of volumes, radiomics and demographics attained the highest overall accuracy, balanced sensitivity and specificity, as well as the best accuracy in classification in MCI, AD and CN. The results are listed in Table 6.

Performance Evaluation of FFNN when Compared to Traditional Classifiers
The results from Section 3.2 suggest that the models using features from volumes, radiomics and demographics achieved the highest overall accuracy when compared to those built from either volumes or radiomics alone. Thus, we focused on analyzing models using all three features. In Table 7 e, it can be seen that the performance of FFNN was the best when compared to traditional classifiers. FFNN showed 76.57% and 73.14% overall accuracy in tests for patients from ADNI 4 centers and the OASIS database, respectively. In particular, the FFNN model attained good sensitivity and specificity.
Our previous study suggested that structural MRI images aided in differentiating AD and MCI from CN using artificial intelligence [30]. However, that study was limited to binary classification, i.e., differentiating AD from CN, AD from MCI or MCI from CN. In clinical situations, CN may progress to MCI and then to AD in a matter of years. Multi-class classification is more useful because it considers all three stages of disease. For two decades, brain regional volumes have been employed to distinguish AD from CN. Hippocampal atrophy is a widely used biomarker for the diagnosis of AD, but its low sensitivity and specificity limit its application as a confirmation of diagnosis [60]. Sørensen and his team suggested using other imaging features, including cortical thickness, hippocampal shape and its texture for the differential diagnosis of MCI from AD, and they achieved a classification accuracy of 62.7% for CN from AD and MCI [61]. Similar results were obtained by Koikkalainen and his team, where 74% of AD could be accurately classified from other types of dementia using structural MRI [62]. Both authors suggested that other features might be required to attain higher accuracy in classification. Our results demonstrated that, using the volumes of only 45 brain regions, the overall classification accuracy achieved was 73.14% and 68% (EC) in validation for patients from ADNI 4 centers and OASIS, respectively. The results are similar to those obtained in previous studies. In subclass classification, however, the sensitivity was under 70% in AD and CN for all four algorithms. This suggested that the models built using brain regional volumes alone were unsatisfactory in identifying AD from CN.
In recent years, radiomics has been employed in the classification of AD, MCI and CN. Du and his team used radiomics features of the hippocampus for diagnosing early-onset and late-onset AD, which achieved 77% and 78%, respectively. However, their sample size was small, with only 144 patients included in training (36 patients in each group) and another 60 patients (15 patients in each group) for testing [63]. The limited sample size may restrict the generalizability of the classification model. Our results demonstrated that, using radiomics as the only feature for model building, the overall accuracy ranged from 40.57% (SVM) to 51.43% (FFNN) in tests using patients from ADNI 4 centers and from 35.43% (SVM) to 58.29% (EC) for patients from OASIS. The models built using radiomics alone were well below satisfactory.
To improve the classification accuracy, previous studies suggested building the model using multiple image features. Li and his team included 30,128 image features, including 24,910 features from structural MRI, 4988 features from functional MRI and 200 features from MRI Diffusion Tensor Imaging (DTI). They achieved overall accuracy of 90.2% and sensitivity and specificity of 79.8% and 86%, respectively [21]. The current study included only structural MRI image features, and, by adding patients' demographics to the brain regional volumes and radiomics, the overall accuracy improved to 76.57% and 73.14% (FFNN) in tests for patients from ADNI 4 centers and OASIS, respectively. Also, the accuracy of MCI, AD and CN was 78.29%, 89.71% and 85.14% in tests for patients from ADNI 4 centers, which remained consistently high in tests for OASIS patients (74.86%, 88% and 83.43%).
In addition to the overall accuracy, the sensitivity and specificity, which refer to a model's ability to classify patients with AD as AD, and to classify patients who were not AD as MCI and CN, respectively, were balanced. This illustrated that the model demonstrated high capability in classifying the corresponding groups accurately. To address the issue of an imbalanced subclass dataset, we used the precision and F1 score to evaluate the models. Precision was used to measure how many predictions for one group made by the model were correct. Recall was used to measure the number of one-class samples present in the dataset that were correctly identified by the model. The F1 score combines precision and recall using their harmonic mean. The high F1 score illustrates maximized precision and recall simultaneously. In the current study, FFNN demonstrated the highest precision and F1 score when compared to other algorithms, which demonstrated that the FFNN model can concurrently attain high precision and high recall, indicating well-balanced performance.
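
The definitions of precision, recall and F1 score given above can be written out for a single class as follows. The function name and the confusion matrix values are hypothetical and are not the study's results.

```python
def precision_recall_f1(confusion, k):
    """Metrics for class k; confusion[i][j] = true class i predicted as class j."""
    tp = confusion[k][k]
    predicted_k = sum(row[k] for row in confusion)  # all samples predicted as k
    actual_k = sum(confusion[k])                    # all samples truly in class k
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical counts (rows: true MCI, AD, CN; columns: predicted MCI, AD, CN).
confusion = [
    [68, 9, 14],
    [5, 21, 2],
    [11, 2, 44],
]
precision_ad, recall_ad, f1_ad = precision_recall_f1(confusion, k=1)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a high F1 score requires both precision and recall to be high, which makes it informative for the smallest subclass.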

The Value of the Feed-Forward Neural Network in Classification of AD, MCI and CN
In this study, the FFNN showed the best performance in terms of accuracy, specificity and sensitivity for dementia classification when compared to other traditional algorithms. FFNN is a multi-layer artificial neural network, with a connection between the input layer, hidden layers and output layers. The network passes information in one direction, from the input layer through the hidden layers to the output layer, without feedback loops [64]. This simulates the thinking process of a physician in clinical decision making and diagnosis confirmation, based on information from patients' demographics and imaging features.
The FFNN networks built in this study were relatively small in view of network training. There were only five layers (one input layer, three hidden layers and one output layer). The processing time is within 2 min when running on most computers in the clinical setting. The Levenberg-Marquardt algorithm used in FFNN offered significant accuracy, with fewer errors during the training, validation and testing phases [65].

The Value of Multi-Classes in Classification of AD, MCI and CN
Previous studies achieved good classification accuracies; for example, Zhang et al. 2019 achieved 96% accuracy in discriminating AD from CN, with sensitivity and specificity of 89% and 98%, respectively [66]. In addition, Mendoza-Leon and his team developed an auto-encoder model, which achieved accuracy of 90%, with sensitivity and specificity of 85% and 95%, respectively, in discriminating AD from CN [67]. Ning and his team demonstrated over 95% accuracy in classifying AD from CN [43]. A previous study from our team also achieved excellent classification accuracy, with 99.92% in differentiating MCI from CN, 99.86% in differentiating MCI from AD and 99.94% in differentiating AD from CN. However, these models were binary classifiers [30]. In real-world scenarios, patients can present at any stage. The multi-class model is a one-stop model, which can differentiate AD, MCI and CN distinctively. Technically, Borchert et al. 2023 highlighted that building a multi-class classifier model is more challenging than a binary classifier in view of the machine learning algorithm, and it usually yielded lower accuracy, sensitivity and specificity [68]. Two similar studies illustrate this: Moore and his team reported accuracies of 99%, 59% and 29% in CN, MCI and AD, respectively [69], and Cárdenas-Peña and his team reported 71.4%, 53.4% and 75.1% in CN, MCI and AD, respectively [42]. By comparison, our proposed FFNN with VRD features yielded accuracies of 83.43%, 74.86% and 88% in CN, MCI and AD, respectively, in the OASIS test dataset. The balanced accuracies in various stages demonstrated that the model has the capability to classify all three stages of disease with satisfactory results, leading to a precise stage classification in real-world scenarios.

The Value of Testing against Independent Cohort of Patients
Another important asset of the current study is the use of an independent dataset for validation. Compared to studies using cross-validation or other similar methods, studies using an independent dataset demonstrated much lower accuracy [70]. For instance, Cohen et al. 2019 achieved accuracies of 93.1%, 82.3% and 88.6% in CN, MCI and AD, respectively [44]; however, their algorithm was not tested against unseen data. A review study concluded that, when compared to studies using cross-validation alone, studies using an unseen dataset for validation usually reported lower accuracy, especially when using a local population [68]. Recent studies have addressed the risk of overfitting for models built using a single dataset [71,72] and suggested conducting model validation using an independent dataset to report the model accuracy. In the current study, we reported accuracies for both validations: validation by an independent part of the same dataset (i.e., Test 1, the ADNI 4 centers) and by an unseen independent dataset (i.e., the OASIS dataset). Our proposed FFNN with VRD features yielded accuracies of 85.14%, 78.29% and 89.71% in CN, MCI and AD, respectively, in the Test 1 dataset, and 83.43%, 74.86% and 88% in CN, MCI and AD, respectively, in the OASIS test dataset. Both results were satisfactory and indicate the model's capability to generalize to new data.

Potential Clinical Application and Development of the Proposed Model
In the current study, the brain regional volumes and radiomics were retrieved from the MRI images manually using the chosen software. With improved computer power and database management, this retrieval can be scripted. The retrieval of brain regional volumes and radiomics can then be carried out after image acquisition in the image storage database. Together with the demographics obtained from the patient management system, the obtained features can be passed to the proposed neural network as the input for dementia classification. The predicted diagnosis from the neural network may help to triage AD and MCI patients from CN and give them a higher priority for clinicians to determine the diagnosis.

Main Findings of Study
In this study, we utilized the brain regional volumes and radiomics retrieved from MPRAGE MRI images, together with patients' demographics, to build a classification model for dementia patients; further, we evaluated the performance of the networks built in terms of overall accuracy, subclass accuracy, sensitivity and specificity. The proposed FFNN model using all three types of features demonstrated the best distinguishing ability and achieved very good performance in dementia classification.

Study Limitations and Future Directions
In this study, two cohorts of neurodegenerative patients from public databases (ADNI and OASIS) were used for model development and testing. The sample size was relatively small, even though it consisted of balanced samples in the various groups. Small sample sizes provide less reliable estimates of the underlying data distribution, meaning that the developed model may miss subtle patterns in the data [70]. Further study using another independent local cohort of patients with a larger sample size is recommended to verify the proposed model.
Radiomics of MPRAGE MRI images and demographic data were used as input to develop the classification model. In future studies, the model can be improved by incorporating image features from other imaging modalities, e.g., PET/CT with 18F-flutemetamol as a radiotracer for an amyloid study [73], T2-weighted MRI for white matter hyperintensity [74], arterial spin labeling MRI for cerebral blood flow study [75] and resting-state functional MRI for interhemispheric functional connectivity [76], so as to develop a more comprehensive model.
Furthermore, other clinical parameters, such as the Montreal Cognitive Assessment (MoCA) result and plasma amyloid-β level [77,78], can be included as input to develop or modify the networks, so as to improve the classification capabilities with more relevant parameters.

Conclusions
This study established a feed-forward neural network model by integrating image features and demographics for various stages of dementia classification. The FFNN yielded good overall accuracies for MCI, AD and CN classification, with balanced subclass accuracy, sensitivity and specificity. The proposed FFNN model is simple and can be operated using a general-purpose computer in radiology departments. The application can be used as a classification tool to prioritize patients with AD or MCI over CN individuals. It may support the triage of patients for further testing, which may shorten the diagnosis confirmation pathway.

Figure 2. Individualized whole-brain mask (the green region) was used to quantify the whole brain for the retrieval of 107 radiomics features.

Figure 3. Details of the feed-forward neural network.

Table 1. Images collected from ADNI and OASIS databases.

Table 3. Eight classes of radiomics features.

Table 4. Integration of patients' demographics and image features.

Table 5. Demographics of the ADNI and OASIS datasets.

Table 6. Various features employed for model building using 4 algorithms. Training was the result when building the model using patients from 21 ADNI centers; Test 1 was the result when testing the model on patients from 4 ADNI centers; Oasis was the result when testing the model on patients from the OASIS database. Red font highlights results over 70%. (a) Various features employed for model building using support vector machine (SVM). (b) Various features employed for model building using ensemble classifier (EC). (c) Various features employed for model building using decision tree (DT). (d) Various features employed for model building using feed-forward neural network (FFNN).

Table 7. Five groups of features employed for model building using 4 algorithms. Training was the result when building the model using patients from 21 ADNI centers; Test 1 was the result when testing the model on patients from 4 ADNI centers; Oasis was the result when testing the model on patients from the OASIS database. Red font highlights results over 70%. (a) Model performance using radiomics only in various model-building algorithms. (b) Model performance using RD in various model-building algorithms. (c) Model performance using volumes only in various model-building algorithms. (d) Model performance using VD in various model-building algorithms. (e) Model performance using VRD in various model-building algorithms.