Differential Expression Analysis of Blood MicroRNA in Identifying Potential Genes Relevant to Alzheimer’s Disease Pathogenesis, Using an Integrated Bioinformatics and Machine Learning Approach

Alzheimer’s disease (AD) is a neurodegenerative disease characterized by cognitive and functional impairment. Recent research has focused on the deregulation of microRNAs (miRNAs) in blood as the potential biomarkers for AD. As such, a differential expression analysis of miRNAs was conducted in this study using an integrated framework that utilized the advantages of statistical and machine learning approaches. Three miRNA candidates that showed the strongest significance and correlation with each other, namely hsa-miR-6501-5p, hsa-miR-4433b-5p, and hsa-miR-143-3p, were identified. The roles and functions of the identified differentiated miRNA candidates with AD development were verified by predicting their target mRNAs, and their networks of interaction in AD pathogenesis were investigated. Pathway analysis showed that the pathways involved in contributing to the development of AD included oxidative phosphorylation, mitochondrial dysfunction, and calcium-mediated signalling. This study supports evidence that the miRNA expression changes in AD and indicates the need for further study in this area.


Introduction
Alzheimer's disease (AD) is the most common neurodegenerative disease that causes a dementing syndrome. It is clinically recognized by cognitive dysfunction, such as memory loss and behavioural changes that significantly impact functional ability [1]. AD is characterized pathologically by the abnormal accumulation of extracellular amyloid-β peptide (Aβ) plaques and intraneuronal neurofibrillary tangles (NFTs) composed of hyperphosphorylated tau protein in the brain [2,3]. The abnormal accumulation of these proteins is thought to lead sequentially to neuroinflammation, neuronal cell death, synaptic dysfunction, and finally, cognitive impairment [2]. AD appears to be genetically dichotomous, with rare mutations in amyloid precursor protein (APP), presenilin 1 (PSEN1), and presenilin 2 (PSEN2) associated with early-onset familial AD, and apolipoprotein E4 (APOE4) polymorphism associated with an increased risk of late-onset AD [4].
The diagnostic accuracy for AD has increased with the use of new neuroimaging modalities, such as amyloid or tau positron emission tomography (PET) scans, and the evaluation of biomarkers in cerebrospinal fluid (CSF), obtained via lumbar puncture [5]. However, these procedures are not suitable for the screening of normal populations as they are prohibitively expensive or invasive [5]. Hence, attention has been drawn to the application of blood-based biomarkers, which are comparatively more accessible and well-tolerated in regular clinical practice, to investigate and identify AD [5,6].
MicroRNAs (miRNAs), which circulate in the peripheral blood system, may be potential biomarkers for AD. The emergence of next-generation sequencing (NGS) technology, such as RNA sequencing (RNA-seq) of small RNAs, enables the reading of thousands or millions of miRNA molecules, making it possible to investigate their roles in neurodegenerative diseases. miRNAs are small (approx. 18-25 nucleotides long), non-coding RNA molecules that regulate posttranscriptional gene expression by binding to the 3′ untranslated regions of messenger RNAs (mRNAs). The changes in expression of a miRNA can repress the translation of many mRNAs (gene silencing), influencing the amounts and functions of numerous proteins. A miRNA can target multiple mRNAs, including mRNAs that exert contradicting effects within the same molecular pathway [7]. Several miRNAs that regulate the synthesis of activity-mediated proteins, affecting the underlying processes of cognitive function and disease risk/progression in AD, have previously been identified. miRNAs are abundant and stable in human bodily fluids, including the blood and the CSF, as compared to mRNAs, making miRNAs easier to evaluate and study [8].
Studies investigating the possibility of miRNAs as biomarkers for AD suggested that the dysregulation of miRNAs in blood may be able to reflect the pathological process of neuronal impairment that occurs in AD [5,6,9]. Aberrant expressions of miRNAs have been identified in AD, such as miR-101, miR-20a, and miR-17, which appear to negatively regulate the expression of APP [10,11]. Others, such as miR-22-3p and miR-340, were found to significantly alleviate Aβ levels in AD [11], whereas miR-107 levels were found to be negatively correlated with APOE4 [10-13]. The suppression of miR-203 was also found to downregulate APOE4 and tau in mice [11].
The analysis of complex and highly heterogeneous AD expression data requires strong computational power to untangle the network of interactions between the miRNAs and to select the most likely candidates with the highest sensitivity and specificity in relation to AD [14]. The "curse of dimensionality", caused by the presence of a large number of variables but a small sample size in a dataset, often poses the biggest challenge in the analysis of AD data [15]. The unbalanced ratio of the variables to the number of samples gives rise to the problem of overfitting and can increase false-positive results [14]. Although some statistical methods have been reported to perform well with such data comprising smaller sample sizes and high biological replicates [16,17], machine learning methods are deemed to be more reliable in solving data overfitting problems [18]. Feature-selection methods and cross-validation steps carried out during the analysis reportedly perform well at removing noise and outliers in the dataset, while avoiding overfitting caused by the high dimensionality of gene expression data [19].
In most conventional studies, the genes of interest are evaluated through their expression values in a case-control study, where a set of genes with expression that varied in one class, as compared to others, is selected. Numerous statistical models and tests have been developed with the aim of identifying the most significant set of candidates. However, statistical methods only focus on univariate comparison, and the importance of the gene-gene relationship is often neglected. On the contrary, other than predicting outcomes for classes to improve the performance of a model, machine learning can be used to select relevant features by looking into the intrinsic intervariable relationships of the genes.
In AD studies, machine learning methods have been applied to select differential miRNA biomarkers that exhibit similar structural and functional patterns [20]. The recent trend of machine learning in miRNA expression studies in AD mainly focuses on selecting a small set of miRNAs from a group of differentiated miRNAs to obtain more precise and reliable results of association [20,21].
The present case-control study focuses on investigating the differential miRNAs in the peripheral blood of Malaysian AD patients. The population in Malaysia is multi-ethnic and exposed to multicultural environments. Hence, this may result in differences not seen in the findings of studies on monoethnic populations, such as Caucasian, African, and Chinese. The present study started with a data-integration framework that applied statistical and machine learning techniques to identify potential miRNA candidates that demonstrate differential expression in AD patients as compared with controls. Problems caused by the high dimensionality of the dataset were minimized by conducting a two-step machine learning method in which supervised feature-selection and unsupervised clustering were carried out. In addition, the potential roles of the miRNA candidates in AD pathogenesis were correlated with the functions of their respective targeted mRNAs (genes) by carrying out miRNA target gene prediction. The pathways involved with the identified miRNAs and genes, together with their roles in AD, were discussed in an attempt to reach a more complete understanding of AD development.
The remainder of this paper is structured as follows: Section 2 discusses the existing methods used in the study of AD and the research gaps that need to be filled. Section 3 explains the materials used and the integration framework proposed in this study. Section 4 presents the results, and the findings are discussed in Section 5. Finally, conclusions are drawn and the challenges of this study are highlighted in Section 6.

Related Works
Previous studies have proposed the application of integrated statistical and machine learning models for the identification of potential miRNA candidates. Lugli et al. (2015) carried out a series of statistical and machine learning analyses to measure differential miRNAs and successfully identified seven miRNAs that showed significant differences in AD. The study compared the performance of several machine learning algorithms; however, the machine learning approaches were not used as a part of the differential miRNA expression analysis, but rather to evaluate how robust each algorithm is [22]. Furthermore, a study of 14 miRNAs with differential expression in an AD group, as compared to normal controls, was conducted [23]. Similar to aforementioned studies, the proposed methods of statistical and machine learning approaches were applied here to carry out different tasks in our study: statistical methods for differential miRNA expression and machine learning for prediction performance.
The lack of the utilization of machine learning techniques in the analysis of differential miRNA expression data, especially in AD-related fields, represents a research gap that needs to be addressed.

Subjects
A total of 12 subjects were recruited from the Memory Clinic and the Geriatric Clinic, University of Malaya Medical Centre (UMMC), Kuala Lumpur, Malaysia, for the present study. Blood samples were collected from the subjects, including eight patients diagnosed with AD and four normal controls. All of the subjects were over 65 years of age at the time of recruitment and had been assessed by a geriatrician with experience in dementia care. The selection criteria used in recruiting the subjects are included in the Supplementary Materials, Table S1. The subjects' details and the corresponding sample IDs used in this study are included in the Supplementary Materials, Table S2. The study protocol was approved by the Medical Research Ethics Committee, UMMC, with the approval number 2020114-9193.
The study was carried out according to the framework illustrated in Figure 1.
Figure 1. The framework for the differential expression analysis of miRNA in the blood of AD patients and that of normal controls. miRNA sequencing was conducted on the samples after appropriate preparation. The raw count data were analysed using bioinformatics. Differential miRNA expression analysis was carried out using two independent approaches, i.e., statistical and machine learning. Differentially expressed miRNAs (DEMis) were subjected to miRNA target gene prediction, followed by the evaluation of enriched pathways.

Sample Preparation
A quantity of 6 mL of blood was collected from each subject in a BD Vacutainer EDTA blood collection tube. A series of routine investigations, extraction, and centrifuging were conducted, and the samples were stored at −80 °C until further processing. The details of the procedure are listed in the Supplementary Materials. The DNA and other blood contaminants in the samples were eliminated, and the quantity and purity of the RNA samples were measured using Nanodrop.

Small RNA-Sequencing Analysis
Small RNA libraries were constructed from the RNA samples using a NEXTflex Illumina Small RNA-seq Kit v3 (Bioo Scientific, Austin, TX, USA), following the manufacturer's protocol. The libraries were loaded and sequenced on the Illumina NovaSeq 6000, and more than 10 M (1.5 Gb) reads were obtained from each library. The raw reads were first quality-checked, and low-quality bases were trimmed from the 3′ end. Subsequently, the reads were dynamically trimmed for an adapter sequence by using cutadapt [24]. Clean reads were then mapped against the reference genome (H_sapien) using Bowtie [25]. The matched reads were aligned to identify mature miRNAs in miRBase v22. The count data were used for further bioinformatics analysis.

Differential Expression Analysis
(i) Statistical approach: edgeR
Raw counts from the miRNA sequencing dataset were filtered to exclude miRNAs with low expressed counts (<10 counts for every sample). The resulting counts were first scaled according to the library size, followed by normalization using a method known as the trimmed mean of M-values (TMM) [26]. The normalization was based on the log-expression ratio of the read count data [26]. Differential miRNA expression analysis was carried out to compare the AD and control groups, based on a linear model generalized by negative binomial distribution in edgeR [27]. A p-value of <0.05 was considered significant and was applied as the threshold for selecting the top differentially expressed miRNA (DEMi) candidates. DEMi candidates with log2 fold-change (FC) values that were >0.5 were considered as upregulated, and those with log2 FC < −0.5 were considered as downregulated.
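The filtering and thresholding rules described above can be illustrated with a minimal Python sketch. It is not the edgeR implementation and does not perform TMM normalization or negative binomial modelling; the function names, the toy counts, and the label "significant, small change" for significant miRNAs with |log2 FC| ≤ 0.5 are assumptions made for illustration only.

```python
def filter_low_counts(counts, min_count=10):
    """Keep miRNAs that reach min_count in at least one sample.

    Equivalent to excluding miRNAs with <10 counts in every sample.
    `counts` maps a miRNA name to its list of raw counts per sample.
    """
    return {
        mirna: values
        for mirna, values in counts.items()
        if any(v >= min_count for v in values)
    }


def classify(log2_fc, p_value, p_cut=0.05, fc_cut=0.5):
    """Label one miRNA using the p-value and log2 fold-change thresholds."""
    if p_value >= p_cut:
        return "not significant"
    if log2_fc > fc_cut:
        return "upregulated"
    if log2_fc < -fc_cut:
        return "downregulated"
    # Assumed label: the text does not name this intermediate case.
    return "significant, small change"


# Hypothetical toy counts (three samples per miRNA), not study data.
toy_counts = {"miR-a": [0, 3, 9], "miR-b": [2, 15, 4]}
kept = filter_low_counts(toy_counts)
```

In this toy example only `miR-b` is retained, because `miR-a` never reaches 10 counts in any sample.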
(ii) Machine learning approach
Step 1: Hybrid carss-SVMRFE feature-selection
Feature-selection was performed using the normalized miRNA dataset (with low-quality reads filtered out) so as to filter out the uninformative genes and to select a subset of genes with the most relevant features. The input expression data were first normalized and log2-transformed according to the trimmed mean of M-values (TMM) in edgeR, which minimized the difference in the miRNAs with low expression counts, creating a fitted dispersion with a weaker bias effect.
The present study implemented supervised feature-selection using a hybrid filter-wrapper approach based on the absolute correlation-adjusted regression survival scores (carss) and multiple support vector machine recursive feature elimination (MSVM-RFE), using the R packages mlr3 and e1071 [28,29].
First, the filter method carss was used to select informative variables based on the measurements of the correlation between the "decorrelated" variables, while considering the target outcome (AD/normal control). Subsequently, MSVM-RFE was conducted as the wrapper method to select miRNA subsets that could improve the results for subsequent analysis. A sequential backward elimination procedure was applied in MSVM-RFE recursively (k = 5), and feature-ranking scores were calculated in each fold. The average ranking was computed for each feature, and the best feature subset was selected. At the end of this step, the top 50 ranked features (miRNAs) were selected, and those proceeded to the next step.
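The rank-averaging step at the end of this procedure can be sketched as follows. The sketch assumes that each fold has already produced a complete ranking of the same features (best first); it does not implement the SVM fitting or the recursive elimination itself, and the function names and toy rankings are hypothetical.

```python
def average_rank(fold_rankings):
    """Mean rank position of each feature across folds (1 = best)."""
    totals = {}
    for ranking in fold_rankings:
        for position, feature in enumerate(ranking, start=1):
            totals[feature] = totals.get(feature, 0) + position
    n_folds = len(fold_rankings)
    return {feature: total / n_folds for feature, total in totals.items()}


def top_features(fold_rankings, k):
    """Return the k features with the lowest average rank.

    Ties are broken alphabetically so the result is deterministic.
    """
    avg = average_rank(fold_rankings)
    return sorted(avg, key=lambda f: (avg[f], f))[:k]


# Hypothetical rankings from three folds over three features.
folds = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"]]
selected = top_features(folds, k=2)
```

With five folds and 420 miRNAs, the same call with `k=50` would correspond to the selection of the top 50 ranked features described above.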
Step 2: Principal component analysis (PCA) with self-organizing map (SOM)
Next, SOM [30] was performed using the top 50 miRNAs that were selected in Step 1. SOM is an unsupervised clustering method of neural networks which groups and captures the input pattern of the gene expression data in terms of learning rules and then organizes it to reflect the clustering in the final layer [31]. Therefore, the output of a SOM contains clusters, with each cluster containing features of similar characteristics, and the high dimensionality of the data is reduced. In this study, a SOM with a map size of 2 × 2, with hexagonal topology, was applied. Hierarchical clustering (HC) was then applied to the resulting SOM cluster to further define the clusters. The miRNAs were clustered according to their expression values, without the predefined knowledge of the dependent class labels [32]. The outcomes were visualized using PCA to observe the gene expression patterns of the clusters resulting from the SOM. PCA is a method that has the ability to reduce the dimensionality of the data while compressing the complexity of the data [33,34]. This technique was applied in this study to improve the interpretability of the SOM results.
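To make the SOM step more concrete, a minimal pure-Python sketch of a 2 × 2 map is given below. It is an illustration under stated assumptions rather than the configuration used in this study: it uses a rectangular grid instead of the hexagonal topology, a Gaussian neighbourhood, and linearly decaying learning rate and radius. All parameter values, function names, and the toy profiles are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    dists = [sum((w - xi) ** 2 for w, xi in zip(unit, x)) for unit in weights]
    return dists.index(min(dists))


def train_som(data, rows=2, cols=2, epochs=200, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small SOM on a rectangular grid and return its weights."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise each unit from a randomly chosen input vector.
    weights = [list(rng.choice(data)) for _ in grid]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay                   # learning rate shrinks over time
        sigma = max(sigma0 * decay, 0.1)   # neighbourhood radius shrinks too
        for x in rng.sample(data, len(data)):
            br, bc = grid[best_matching_unit(weights, x)]
            for j, (r, c) in enumerate(grid):
                grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_dist2 / (2.0 * sigma ** 2))
                # Move unit j towards x, scaled by its grid distance to the BMU.
                weights[j] = [w + lr * h * (xi - w) for w, xi in zip(weights[j], x)]
    return weights


# Hypothetical two-dimensional expression profiles forming two groups.
profiles = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
            [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
som_weights = train_som(profiles)
assignments = [best_matching_unit(som_weights, p) for p in profiles]
```

Each entry of `assignments` is the neuron (0 to 3) to which a profile is mapped; hierarchical clustering of the four neuron weight vectors could then be applied to merge neurons into larger clusters, as described above.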

miRNA Target Gene Prediction
miRNA target gene prediction was performed using DIANA-microT-CDS v5.0 [35]. The prediction threshold was set to 0.7 (sensitive), and the keyword "Alzheimer" was inserted into the queries to identify potential gene targets that are related to AD.

miRNA Pathway and Gene Ontology (GO) Analysis
By utilizing the target genes predicted in the previous step, DIANA-miRPath v3.0 [36] was used to carry out miRNA pathway analysis to discover the possible pathways involved in AD pathogenesis. The target genes were enriched with KEGG pathway [37] and GO analysis [38]. GO terms, including the biological process, the molecular functions, and the cellular functions, were investigated. The significance threshold of p-value <0.05 was corrected according to the false discovery rate (FDR). Additionally, the species "Human" was specified in the query. Significant and common pathways were selected using gene union tools. Furthermore, networks showing the interactions between the miRNAs and the target genes in specific pathways were depicted using Cytoscape [39].
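The FDR correction mentioned above can be illustrated with the Benjamini-Hochberg procedure. This is an assumption: the text does not state which FDR method was applied, and the sketch below is not the DIANA-miRPath implementation. The function name and the example p-values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four pathways.
raw = [0.01, 0.04, 0.03, 0.20]
adj = bh_adjust(raw)
significant = [p < 0.05 for p in adj]
```

In this toy example only the first pathway remains below 0.05 after adjustment, although three of the four raw p-values were below that threshold.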

Results
Unfortunately, one sample (AD 8) failed during the construction of the small RNA libraries and had to be excluded as analysis of the miRNA concentration of this sample by the Small RNA bioanalyzer produced an inconclusive outcome, thus leaving a total of 11 samples to be entered into the study. As a result, a total of 420 mature miRNAs were included in the analysis after reads of a low quality were removed.

Statistical Approach: edgeR
In the differential miRNA analysis using edgeR, 12 DEMi candidates (5 upregulated and 7 downregulated) were identified between the AD and normal control groups, with a significant threshold p-value of <0.05 (Table 1).

Machine Learning Approach
The original, mature miRNAs (n = 420) were first filtered using a supervised hybrid filter-wrapper approach as the feature-selection method. As a result, the top 50 ranked miRNAs were identified, and an unsupervised machine learning approach using SOM was performed.
The input data were presented in the 2 × 2 feature output space, which consisted of four neurons in total. A total of 50 features of the data were clustered into 4 neurons. The mean distance was calculated based on the position of each neuron. Following that, additional subclustering was carried out, using HC on the 2 × 2 feature output space to split the four neurons into two clusters, as shown in Figure 2. Figure 2 illustrates the subclusters of the four neurons generated in SOM. Of the four neurons, three had higher connectivity with one another (pink), which indicated that these neurons were located in the same cluster. In contrast, the remaining cluster contained only one neuron (black).
Next, the result was visualized using PCA to produce a more interpretable view of the miRNA clustering. Figure 3A shows the distribution of the miRNAs in the two clusters, as extended from Figure 2. Figure 3B shows the distribution pattern of the samples, indicated by the pointing of the arrows that originate from the centre point. All of the AD samples were located on the right side of the plot, indicating higher values in these samples. Corresponding to the pattern of miRNA clustering in Figure 3A, the miRNAs in SOM cluster 1 (red dots) showed a similar distribution to the AD cohorts in Figure 3B. Hence, the member miRNAs in SOM cluster 1 in Figure 3A were identified, and 24 miRNAs were selected as DEMi candidates.

Integrated Bioinformatics and Machine Learning Approach
The DEMi candidates identified using the machine learning approach were compared with the DEMi candidates identified using edgeR so as to identify common DEMis. A Venn diagram of this comparison is shown in Figure 4. Five common DEMis (hsa-miR-6501-5p, hsa-miR-1296-5p, hsa-miR-1307-3p, hsa-miR-4433b-5p, and hsa-miR-143-3p) were identified in this study.
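The comparison of the two candidate lists amounts to a set intersection, sketched below. The five common miRNA names are taken from the text; the remaining entries are hypothetical placeholders standing in for the candidates unique to each approach, which are not listed here. The function name is likewise an assumption.

```python
def common_candidates(stat_hits, ml_hits):
    """Sorted list of miRNAs reported by both approaches."""
    return sorted(set(stat_hits) & set(ml_hits))


shared = [
    "hsa-miR-6501-5p", "hsa-miR-1296-5p", "hsa-miR-1307-3p",
    "hsa-miR-4433b-5p", "hsa-miR-143-3p",
]
# Placeholder names only; the real approach-specific candidates are not listed in the text.
edger_hits = shared + ["placeholder-edgeR-only"]
ml_hits = shared + ["placeholder-ML-only"]

common = common_candidates(edger_hits, ml_hits)
```

Candidates reported by only one approach fall outside the intersection and are not carried forward.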

Target Gene Prediction
Target gene prediction, which was carried out subsequently, identified the mRNAs associated with the five common DEMis. Notably, only three DEMis, hsa-miR-6501-5p, hsa-miR-4433b-5p, and hsa-miR-143-3p, were predicted to be related to AD and thus were selected as the DEMi signatures for this study (Table 2).

miRNA Pathway and Gene Ontology (GO) Analysis
Next, KEGG pathways involved with the identified AD-related target genes were identified (Table 3). Interaction networks of the three identified DEMi signatures with the target genes and their corresponding pathways are illustrated in Figure 5. The results indicate that the significantly enriched pathways of the three DEMi signatures and their respective target genes are involved in six pathways, which are the AD, oxidative phosphorylation, circadian entrainment, amphetamine addiction, long-term potentiation, and oxytocin pathways. In the GO analysis, 12 significant enriched GO terms were identified (Table 4), with the suggestion that the DEMi signatures and target genes were mainly related to the generation of precursor metabolites and energy, mitochondrion, calcium-mediated signalling, and protein metabolic processes. Notably, hsa-miR-4433b-5p and hsa-miR-143-3p showed common enrichment to the terms relating to Aβ, which is one of the important pathological indicators for AD. The interaction networks of the three identified DEMi signatures with the target genes and their corresponding pathways are illustrated in Figure 6.
Table 4. GO terms associated with the miRNAs and the respective target genes.
Although gene GRIN2B was selected as one of the target genes for hsa-miR-143-3p in relation to AD, it was identified as being involved in neither the pathways, nor in the GO analysis. Similarly, a family member of GRIN2B, named GRIN2C, was identified in the KEGG pathway analysis but not in the GO analysis.

Discussion
The present study aimed to provide new insight into AD by studying miRNAs in Malaysians. Although AD is the most common type of dementia and is known to have a strong association with the accumulation of Aβ and phosphorylated tau protein, the mechanisms involved in the pathogenesis of this disease are still uncertain and may be related to environmental, genetic, cultural, and other factors [2,4].
Among these five commonly identified DEMis, only three DEMis (hsa-miR-6501-5p, hsa-miR-4433b-5p, and hsa-miR-143-3p) were predicted to have target genes related to AD (see Table 2). Although the role of hsa-miR-6501-5p in AD is ambiguous, two target genes, ATP5E and PPP3CA, were predicted to be involved in AD-related pathways.
Hsa-miR-4433, of which hsa-miR-4433b-5p is a member, has been identified as regulating glial cells and neuroimmune systems, indicating the participation of this miRNA in neurodegenerative disease [40]. Hsa-miR-4433b-5p has also previously been associated with neurodegenerative diseases such as AD, Parkinson's disease (PD), and frontotemporal dementia (FTD) [41]. It is negatively correlated with lipids, where the formation of Aβ is involved in the cholesterol-metabolism regulation pathway [42]. In relation to AD, GRIN2C, CALM3, and NCSTN are downstream target genes of hsa-miR-4433b-5p.
Hsa-miR-143-3p has been suggested as a possible AD biomarker in review studies [43,44]. In our study, hsa-miR-143-3p was downregulated in the plasma of AD patients, which is consistent with the findings seen in another study using an AD cell-culturing model [45]. The overexpression of hsa-miR-143-3p has been observed to attenuate tau phosphorylation, decrease APP levels, and reduce Aβ accumulation [45]. Another AD cell model, however, found that the inhibition of hsa-miR-143-3p fostered neuronal survival and indirectly slowed down AD progression, which was an upregulated expression in the serum of AD patients [43]. That finding was contradictory to that of the present study, which is probably due to the different sample types used. Several genes that are related to AD, those being GRIN2B, EIF2AK3, NDUFB9, NDUFA9, PSEN1, MAPK1, NDUFS1, RYR3, BACE1, and COX4I1, have been identified as the target genes for hsa-miR-143-3p.
The roles and functions of the target genes in AD pathogenesis are summarized in Table 5.
Table 5. List of target genes and their roles and functions as related to AD.

Gene
Roles and Functions as Related to AD

• OXPHOS dysfunction increases the level of reactive oxygen species (ROS) and oxidative stress, which subsequently leads to neuronal damage in the AD brain [47].

Protein phosphatase 3 catalytic subunit alpha (PPP3CA)
• A catalytic subunit of calcineurin, which is involved in the calcium signalling and inflammatory pathways related to AD [48].
• Dysregulation of PPP3CA was observed in the AD brain through its involvement with oxidative stress and pathological cellular dysfunction losses [48,49].

Glutamate Ionotropic Receptor NMDA Type Subunit 2C (GRIN2C)
• Takes part in glutamate-mediated neurotoxicity, which stimulates the progressive decline of cognitive function in AD patients [50].

Calmodulin 3 (CALM3)
• An indicator for calcium signalling dysfunction, where lower expressions were detected in AD patients as compared to normal controls [51,52].

• Nicastrin is one of the subunits of γ-secretase that plays an important role in the amyloidogenic pathways of AD pathogenesis [53].

Glutamate Ionotropic Receptor NMDA Type Subunit 2B (GRIN2B)
• Expresses in the brain regions that are predominantly affected in AD [55].
• Involved in synaptic functioning, where its dysfunction leads to neuronal damage and cognitive impairment [55].

Eukaryotic translation initiation factor 2 alpha kinase 3 (EIF2AK3)
• Encodes for PERK protein, which is involved in cognitive activities such as learning and memory [56].

• Involved in the oxidative phosphorylation pathway in AD [58].

• Encodes protein presenilin 1, which is one of the subunits of γ-secretase that plays an important role in the amyloidogenic pathway of AD pathogenesis [60,61].

• The deposition of Aβ causes the increase of RYR3 expression [66].
• The upregulation of RYR3 may form a protection for the neurons and against the impact of Aβ in the late stage of AD [67,68].

• Shows high expression in AD patients as compared to normal controls, including in the plasma [70].
• The inhibition of BACE1 serves as the target for the study of AD drug candidates [71].

Cytochrome c oxidase subunit 4I1 (COX4I1)
• Involved in the mitochondrion electron transport chain, a crucial mechanism in cellular metabolism and the electron transport chain [72].
• The cleavage of APOEε4 inhibits the COX gene, leading to mitochondrial dysfunction [73].
The roles and functions of the DEMi signatures and their respective target genes further corroborate the results of the KEGG pathway and GO analysis (see Tables 3 and 4). Pathways related to oxidative phosphorylation, mitochondrial dysfunction, and calcium-mediated signalling are particularly highlighted in the present study. The interaction of the genes is demonstrated in Figure 7. Defects in oxidative phosphorylation, mitochondrial mechanisms, and calcium signalling are interconnected in a cascade sequence and ultimately lead to neurodegeneration in AD. Failure in oxidative phosphorylation causes the deregulation of ATP-synthase activities in mitochondria and contributes to elevated oxidative stress and neuronal cell death [74,75]. Damage to mitochondrial function has been postulated as being the fundamental feature of AD pathogenesis. The alteration of mitochondrial mechanisms causes the impairment of energy metabolism in AD, especially in the brain, which consumes a high level of energy, and eventually leads to neuronal cell death [76,77]. Dysregulation of calcium homeostasis is closely connected to Aβ in AD. Aβ has been reported to trigger intracellular calcium deregulation, which probably elevates reactive oxygen species (ROS), suppresses ATP production in mitochondria, and finally contributes to neurodegeneration in AD [74,78,79]. Hence, it has been proposed that the accumulation of intracellular calcium leads to neuronal death and subsequent learning and memory impairment [80].
The major limitation of this study is the sample size, which was limited by budgetary constraints. Difficulties in persuading patients or their caregivers to consent to the study were also occasionally encountered. Given the limited sample size, further investigation is clearly required, as the findings appear to offer important indications that may, in the future, provide much-needed insight into AD. Nevertheless, the study has addressed technical concerns regarding overfitting in the analysis of a limited sample size through cross-validation in MSVM-RFE.
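To illustrate how cross-validation can guard against overfitting when features are selected from a small sample, a minimal sketch is given here. It is not the MSVM-RFE procedure used in this study: a ridge-regularised least-squares classifier stands in for the linear SVM, the data are synthetic, and all function names, parameters, and sample sizes are hypothetical. The point it shows is that recursive feature elimination is repeated inside each training fold, so the held-out samples never influence which features are kept.

```python
import numpy as np


def fit_linear(X, y, lam=1.0):
    """Ridge-regularised least-squares weights (stand-in for a linear SVM).

    X is assumed to be column-centred; y holds labels in {-1, +1}.
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


def rfe_select(X, y, n_keep, lam=1.0):
    """Recursive feature elimination: repeatedly refit the model and drop
    the feature with the smallest squared weight until n_keep remain."""
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        w = fit_linear(X[:, active], y, lam)
        active.pop(int(np.argmin(w ** 2)))
    return active


def cv_accuracy(X, y, n_keep, n_folds=5, lam=1.0, seed=0):
    """K-fold cross-validated accuracy with selection nested in each fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correct = 0
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Centre using training statistics only, to avoid information leakage.
        mu = X[train].mean(axis=0)
        X_train, X_test = X[train] - mu, X[test] - mu
        keep = rfe_select(X_train, y[train], n_keep, lam)
        w = fit_linear(X_train[:, keep], y[train], lam)
        b = y[train].mean()  # intercept of the least-squares fit on centred data
        pred = np.where(X_test[:, keep] @ w + b >= 0, 1, -1)
        correct += int((pred == y[test]).sum())
    return correct / len(y)


# Synthetic data (hypothetical): 60 samples, 12 features, of which the
# first three carry a class-dependent shift and the rest are pure noise.
rng = np.random.default_rng(42)
n_samples, n_features = 60, 12
y = np.repeat([-1, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_features))
X[:, :3] += 2.0 * y[:, None]

selected = sorted(rfe_select(X - X.mean(axis=0), y, n_keep=3))
acc = cv_accuracy(X, y, n_keep=3)
print("features kept on the full data:", selected)
print("nested cross-validated accuracy:", acc)
```

Running the selection on the full dataset and then cross-validating only the final classifier would give an optimistically biased accuracy, which is why the elimination step is nested inside the loop in this sketch.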

Conclusions
This study presents preliminary findings on differential miRNA expression in AD patients against normal controls in Malaysian subjects, providing some insight into the complex AD pathogenetic pathway. An integrative approach that combined a statistical method, edgeR, with a two-step machine learning framework was used to analyse the data in this study. Three miRNAs, hsa-miR-6501-5p, hsa-miR-4433b-5p, and hsa-miR-143-3p, were identified as being correlated with one another. Their biological roles in AD were indicated by predicting the target mRNAs of each respective miRNA, and pathway analysis suggested their relationships in the disease pathogenesis. Overall, the identified miRNAs, together with the target genes, were found to be involved in pathways related to oxidative phosphorylation, mitochondrial dysfunction, and calcium-mediated signalling. Although the findings are consistent with the literature, they nonetheless represent the miRNA expression changes within a dataset characterized by a small sample size, and thus require further validation. This study provides further insight into AD pathogenesis from the miRNA perspective, based on data collected from the Malaysian population, which may potentially help in improving the diagnosis and treatment of this disease in the future.

Figure 1. The framework for the differential expression analysis of miRNA in the blood of AD patients and that of normal controls. miRNA sequencing was conducted on the samples after appropriate preparation. The raw count data were analysed using bioinformatics. Differential miRNA expression analysis was carried out using two independent approaches, i.e., statistical and machine learning. Differentially expressed miRNAs (DEMis) were subjected to miRNA target gene prediction, followed by the evaluation of enriched pathways.

Figure 4. Venn diagram of the DEMi candidates identified using edgeR and the machine learning approach.

Figure 5. DEMi signatures and target genes enriched in KEGG pathways. Enriched pathways are indicated by the nodes, and the interactions between the pathways and genes are represented by lines. The different colours of the nodes depict the different functional groups of the pathways.

Figure 6. DEMi signatures and target genes enriched with GO terms. Enriched GO terms are indicated by the nodes, and the interactions between the terms and genes are represented using lines. The different colours of the nodes depict the different functional groups of the GO terms.

Figure 7. KEGG Alzheimer's disease pathway (hsa05010): The selected target genes are highlighted by yellow boxes. The genes involved in oxidative phosphorylation, mitochondrial dysfunction, and the calcium signalling pathways are indicated by purple boxes. The gene GRIN2B was not identified as being involved in this pathway and is therefore excluded from the figure.

Table 2. DEMis and the predicted target genes related to AD.

Table 3. KEGG pathways associated with the DEMi signatures and the respective target genes.

Table 4. GO terms associated with the miRNAs and the respective target genes.