Machine Learning and rs-fMRI to Identify Potential Brain Regions Associated with Autism Severity

Igor D. Rodrigues; Emerson A. de Carvalho; Caio P. Santana; Guilherme S. Bastos

doi:10.3390/a15060195

¹ Institute of Systems Engineering and Information Technology, Federal University of Itajubá, Itajubá 37500-903, Brazil

² IFSULDEMINAS, Computer Department, Machado 37750-000, Brazil

* Author to whom correspondence should be addressed.

† Current address: Department of Computer Engineering and Industrial Automation, University of Campinas, Campinas 13083-970, Brazil.

Algorithms 2022, 15(6), 195; https://doi.org/10.3390/a15060195

This article belongs to the Special Issue Algorithms for Biomedical Image Analysis and Processing

Abstract

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized primarily by social impairments that manifest at different severity levels. In recent years, many studies have explored the use of machine learning (ML) and resting-state functional magnetic resonance imaging (rs-fMRI) to investigate the disorder. These approaches use brain oxygen levels as an indirect measure of brain activity and compare typically developing subjects with ASD subjects. However, none of these works has tried to classify subjects into severity groups using ML applied exclusively to rs-fMRI data. Information on ASD severity is frequently available, since some tools used to support ASD diagnosis also report a severity measurement. This is the case for the Autism Diagnostic Observation Schedule (ADOS), which splits the diagnosis into three groups: ‘autism’, ‘autism spectrum’, and ‘non-ASD’. Therefore, this paper aims to use ML and fMRI to identify potential brain regions as biomarkers of ASD severity. We used the ADOS score as the severity measurement standard. The experiment used fMRI data from 202 subjects with an ASD diagnosis, together with their ADOS scores available from the ABIDE I consortium, to determine the correct ASD sub-class for each one. Our results suggest a functional difference between the ASD sub-classes, reaching 73.8% accuracy on cingulum regions. This shows the feasibility of classifying and characterizing ASD using rs-fMRI data and indicates potential areas that could lead to severity biomarkers in further research. However, we highlight the need for more studies to confirm our findings.

Keywords:

ABIDE; ASD; autism spectrum disorder severity classification; fMRI; machine learning

1. Introduction

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized mainly by social impairments, commonly followed by communication challenges or restricted and repetitive patterns of behavior [1]. ASD is a substantially heterogeneous disorder in which two diagnosed subjects may have a completely different set of symptoms. Some researchers estimated that approximately one in 44 children aged eight years are in the spectrum [2]. Despite a possible gender bias regarding diagnosis, ASD seems to be a sex-related disorder, with a male-to-female ratio close to 3–4:1 [2,3,4]. Current research points to ASD as a primarily hereditary disorder. Approximately 80–83% of ASD cases are due to genetic inheritance. Close to 17–20% are due to environmental risk factors, including problems during the gestation period and the parents’ age [5,6,7].

Children and adolescents with an ASD diagnosis have medical expenses up to 6.2 times greater than those with typical development (TD), with general costs from 8.4 to 9.5 times greater than the average [8]. In addition to medical expenses, intensive behavioral interventions needed for ASD treatment have costs from USD 40,000 to USD 60,000 per child per year [9]. Moreover, most ASD individuals live in low- or middle-income countries and receive no proper support from health or social care systems, suffering from the high costs of (1) proprietary tools for diagnosis; (2) evidence-based intervention techniques, and (3) training of parents and professionals to conduct the ASD treatment process [10].

Early diagnosis and proper interventions are critical factors in reversing the impairments generated by ASD in children. Unfortunately, there are no low-cost automated tests to identify the disorder. Instead, the ASD diagnosis is performed through clinical observation, which is challenging to accomplish in young children, especially in the early years of life [11]. Early treatments may result in improved cognitive, behavioral, and social functioning, allowing, for a subset of people, an evolution that may lead to healthy adult life, as well as significant long-term societal cost reductions [12]. However, most technological tools proposed to assist the ASD intervention process showed some common limitations [13].

It is critical to understand the severity of each individual's ASD in order to plan personalized treatments and conduct more effective interventions. Many protocols are currently used to support diagnosis, such as the Autism Diagnostic Observation Schedule (ADOS), the Autism Diagnostic Interview-Revised (ADI-R), and the Social Communication Questionnaire (SCQ). Of these, ADOS is currently one of the most widely used worldwide. ADOS divides the classification into three classes: autism (more severe symptoms), autism spectrum (less severe symptoms), and non-spectrum (those diagnosed outside the spectrum) [14].

An ADOS diagnosis consists of a standard evaluation of three main domains: communication, social relations, and behavior. Each domain has a set of tasks to be evaluated, with different total scores. ADOS comprises four modules, each intended for a specific range of ages and language skills and each with different cut-offs for each of the three classes [15]. Furthermore, current ASD diagnosis is performed by trained professionals with the help of tools such as ADOS, which has both sensitivity and specificity above 80% [16]. It is important to note that the ADOS version mainly used today is the ADOS-2 [14], but due to our available samples, we used the classic version of ADOS.

The last decade was marked by research looking for methods to take advantage of the recent evolution of machine learning (ML) to build automated ASD diagnosis processes [17,18,19,20]. The first works in this field date from mid-2010 [21]. Since then, there has been an increase in the number of papers and improved outcomes. Many of these works used magnetic resonance imaging (MRI) and ML combined, aiming for a positive or negative ASD diagnosis by classifying subjects between ASD and TD [21], as in [18,19,20,22,23,24].

One of MRI’s advantages is that it is a non-invasive procedure, being a prevalent method to scan the brain in living human beings [25]. There are two main uses for brain MRI: (1) the structural scan, which scans brain tissues and assesses their differences; and (2) the functional scan, which tracks the oxygen flow in the brain. This second method is usually called functional MRI (fMRI) and allows the indirect measurement of brain activities in regions of interest (ROIs). From the measured oxygen levels, it is possible to determine which regions are more activated than others [25,26].

Many tasks can be applied to a subject during an fMRI scan, ranging from the resting state to very specific activities, such as watching a video. Resting-state fMRI is usually called rs-fMRI, a name that identifies the condition under which the scan was acquired; the other activities generally have no specific name. The rs-fMRI is the easiest to apply and the easiest to compare across studies, since a resting-state setup is easier to reproduce than any other activity.

Additionally, other neurophysiological data are also combined with ML to diagnose ASD, as is the case with electroencephalograms (EEG), which measure brain activity by recording electrical signals originating from the brain. There are many different setups but, as in fMRI, many papers use resting-state recordings, such as [27,28,29,30]. Other setups, such as recording during the ADOS test [31] or while watching videos [32], have also been used. However, few EEG datasets with ASD diagnosis are publicly available.

Meanwhile, on the fMRI side, some universities have worked together and created the Autism Brain Imaging Data Exchange (ABIDE) [33], an initiative that makes available more than 2000 brain fMRI scans for research purposes. In addition, all fMRI subjects gave consent to use their images. This initiative facilitates autism investigation by providing access to a database that otherwise would not be easily acquired. Moreover, the pre-processed data available on ABIDE I PREPROCESSED also contribute in this sense.

Therefore, we take the following premises into account: (1) early diagnosis and interventions lead to better outcomes for autism treatment, as well as long-term cost reduction; (2) ADOS scores allow a rating of ASD severity; (3) ML techniques have shown promising results in classifying ASD vs. neurotypical subjects using rs-fMRI; and (4) ADOS scores and ASD rs-fMRI data are available at ABIDE. This work aims to investigate the functional differences between autism spectrum and autistic individuals, looking for potential brain regions that may be associated with autism severity. We used ML applied to brain segments from rs-fMRI data to classify individuals from the two groups and thereby identify these regions, selecting the ones with the greatest differences as potential biomarkers that should be more deeply investigated in future works.

The remainder of this paper is structured as follows: Section 2 presents the methodology employed. Section 3 and Section 4 present and discuss our results, while Section 5 concludes this work.

2. Methodology

This section presents this work’s methodology. It starts by describing the materials used in Section 2.1, followed by a presentation of the ADOS sub-classes for ASD classification in Section 2.2 and the region selection process in Section 2.3. Then, we explain both the ML used to classify the samples in Section 2.4 and the validation process in Section 2.5. Finally, we present the final data source in Section 2.6 and the accuracy, sensitivity, and specificity cut-off points in Section 2.7.

2.1. Materials

In this work, we used the rs-fMRI data provided by ABIDE [33]. The ABIDE I consortium currently offers 1100 rs-fMRI scans from subjects with and without ASD diagnosis. Since our work was not an ASD vs. TD classification, all rs-fMRI data of neurotypical subjects were discarded, leaving 505 preprocessed fMRI scans from subjects with ASD diagnosis. From these ASD data, only 202 had information concerning ADOS scores for communication, social interaction, and repetitive behavior, which are essential data in our classification approach. Thus, the final data comprised 202 ASD subjects.

The original data from fMRI are 3D images over time. Therefore, applying an atlas and a preprocessing pipeline is necessary to transform the 3D images into matrices representing the brain regions (columns) and their respective activities over time (rows). The preprocessing pipeline also removes noises and other undesirable artifacts, which allows better results.

2.1.1. Automated Anatomical Labeling (AAL)

An atlas is a brain mapping that allows us to evaluate brain activity through its regions. We used the AAL atlas [34] available at ABIDE, as it is the most used atlas in the literature for ASD classification using fMRI and ML [21], reaching meaningful outcomes in [18,20,35,36,37].

In its third version, AAL segments the human brain into 116 ROIs. A detailed explanation of these regions can be seen in [34]. Table 1 presents the AAL’s labels.

Table 1. Automated anatomical labeling (ID and name).

2.1.2. Preprocessing Pipeline

Different machines across multiple sites acquired the fMRI data available at ABIDE. Moreover, some sites used different total acquisition times. Thus, some rs-fMRI scans have more frames than others.

The ABIDE offers 884 preprocessed rs-fMRI scans in four pipeline options:

Connectome Computation System (CCS);
Configurable Pipeline for the Analysis of Connectomes (CPAC);
Data Processing Assistant for rs-fMRI (DPARSF);
Neuroimaging Analysis Kit (NIAK).

These pipelines have different methods and sequences to manage fMRI data, removing noise such as head motion, skull, and magnetic interference. We only used the DPARSF pipeline in this work [26,38,39]. The criteria used for choosing DPARSF were analogous to those employed in the atlas definition process. Except for works where the authors create their preprocessing pipeline, DPARSF is the prevailing pipeline in a number of papers [21], reaching meaningful outcomes in ASD classification using rs-fMRI and ML [37,40,41,42].

The DPARSF final product is a matrix $(X, Y)$, where $X$ is the number of columns and $Y$ is the number of rows. Each column represents one ROI, according to the chosen atlas, and each row represents one time point of the scan. The number of rows ($Y$) can differ between fMRI scans, even when they use the same atlas. However, the value of $X$ must be the same for all fMRI scans using the same atlas. For example, in a DPARSF matrix, the value at $(X_{i}, Y_{j})$ represents the oxygen level of ROI $i$ at time $j$.

2.2. ADOS Classification

We used the ADOS standard division for ASD diagnosis to investigate any functional differences in the severity of ASD. The ADOS standard division has previously defined cut-off points to classify subjects as autistic, ASD, or non-ASD. Table 2 shows the maximum scores and the ASD and autism cut-off points for each module (ASD score groups according to the individual’s age) and domain areas. For each ADOS module, the first line indicates the maximum value; the second line shows the ASD cut-off point, and the third line indicates the autism cut-off point, according to the domain area.

Table 2. ADOS maximum score and cut off points for ASD [15].

We adopted the cut-off points from [15] to determine into which class a given subject should be classified, based on their scores available on ABIDE. This way, if a subject scored in at least one domain above the “autism cut-off”, they were classified as Class 2 (autism). If the subject did not score above the “autism cut-off” but had at least one domain scoring above the “ASD cut-off”, they were classified as Class 1 (ASD). We classified the remaining subjects as non-ASD, discarding them. Table 3 and Table 4 show the ABIDE subjects’ distribution according to the ADOS class; the complete phenotypes of each subject are available on [33].
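The class-assignment rule described above can be sketched as a small function. This is only an illustration: the cut-off numbers below are placeholders rather than the Table 2 values, the domain keys are hypothetical names, and treating a score equal to the cut-off as meeting it (`inclusive=True`) is an assumption, since the text only says "above".

```python
from typing import Dict

# Placeholder cut-offs for ONE hypothetical ADOS module.
# These are NOT the values from Table 2 [15]; substitute the real ones per module.
EXAMPLE_CUTOFFS: Dict[str, Dict[str, int]] = {
    "communication": {"asd": 2, "autism": 5},
    "social": {"asd": 4, "autism": 8},
    "behavior": {"asd": 1, "autism": 3},
}


def ados_class(scores: Dict[str, int],
               cutoffs: Dict[str, Dict[str, int]],
               inclusive: bool = True) -> int:
    """Return 2 (autism), 1 (ASD), or 0 (non-ASD, discarded in the study).

    `inclusive` controls whether a score equal to the cut-off counts as
    reaching it; this choice is an assumption of the sketch.
    """
    def reaches(score: int, cutoff: int) -> bool:
        return score >= cutoff if inclusive else score > cutoff

    # Class 2: at least one domain reaches the autism cut-off.
    if any(reaches(scores[d], cutoffs[d]["autism"]) for d in cutoffs):
        return 2
    # Class 1: no autism-level domain, but at least one reaches the ASD cut-off.
    if any(reaches(scores[d], cutoffs[d]["asd"]) for d in cutoffs):
        return 1
    return 0
```

For example, with these placeholder cut-offs a subject scoring 6 in communication would fall into Class 2 regardless of the other domains.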

Table 3. ASD subjects group.

Table 4. Autistic subjects group.

Table 5, Table 6 and Table 7 present the phenotype information of the selected subjects.

Table 5. Sex distribution.

Table 6. Age distribution.

Table 7. FIQ distribution.

2.3. Region Selection

We grouped the ROIs from AAL by macro regions, considering the region name. The result was a set of regions (SoRs) (e.g., precentral left and right as one SoRs, angular left and right as one SoRs). This process resulted in 35 SoRs containing the ROIs grouped by brain region. We also included one SoRs with all the ROIs.

Table 8 presents the resulting SoRs, where the set ID is the SoRs’ identification, and the ROI IDs match the ROIs used in Table 1.

Table 8. SoRs IDs and their respective ROI IDs from AAL.

This approach aimed: (1) to simplify the SVC classification; and (2) to give a more generic location of the functional differences between ASD classes in a manner that would allow better comparison between existing studies that use different atlases.
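A grouping of this kind can be sketched as follows. The rule used here, taking the first token of an AAL-style label as the macro-region name, is an assumption made for illustration; the authoritative grouping is the one in Table 8. The label list is a toy subset, so the IDs are positions in that list and not real AAL IDs.

```python
from collections import defaultdict
from typing import Dict, List


def group_rois(labels: List[str]) -> Dict[str, List[int]]:
    """Group 1-based ROI IDs by macro-region name (first token of the label)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for roi_id, label in enumerate(labels, start=1):
        macro = label.split("_")[0]  # e.g. "Cingulum_Ant_L" -> "Cingulum"
        groups[macro].append(roi_id)
    result = dict(groups)
    # One extra set containing every ROI, as described in the text.
    result["All"] = list(range(1, len(labels) + 1))
    return result


# Toy labels in an AAL-like naming style (hypothetical subset).
example_labels = [
    "Precentral_L", "Precentral_R",
    "Cingulum_Ant_L", "Cingulum_Ant_R",
    "Angular_L", "Angular_R",
]
sors = group_rois(example_labels)
```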

2.4. SVC Classifying Algorithm

We used a supervised learning method, support vector machine (SVM), specifically the C-Support Vector Classification (SVC), to check the differences between ASD sub-classes. This method has three steps: training, validation, and test [43,44].

Based on an in-depth systematic review and meta-analysis available in [21], we selected SVM as our ML method. SVM was the most used AI tool for solving ASD classification problems, showing some reliable results when applied in similar situations [18,20,37,45,46]. The second most used method was the artificial neural network (ANN) [21]. Both approaches have similar results in the literature, with SVM slightly better in terms of sensitivity [21]. As our goal was to find potential regions of a biomarker, and due to the complexity of the problem, we decided to adopt SVM given its more direct comparison, facilitating the interpretability of the results. We used the SVM from the scikit-learn library available at [47].

SVM creates a multidimensional plane, where each object (in our case, each subject) will be positioned according to the selected features’ value. First, the sample part used for training will determine a curve to split the plane, as shown in Figure 1, where each area corresponds to one class. Then, the validation sample part will verify the accuracy of the curve, and this process will be repeated until the SVM reaches the best angle given the features, training sample, and validation sample. After this, the test sample is used to measure the SVM generalization.

Figure 1. Classification curve generated by SVM with two features.

We hypothesized that higher accuracy would reflect the existence of an interpretative way to differ each class. In other words, SoRs with higher accuracy potentially contain the regions where classes are more distinct regarding the features used. These findings can highlight the areas to consider for further investigations on functional brain activity and ASD severity.

As the main goal was to find regions where there is a functional brain difference in the ASD severity level, and there is a lack of data about SVM setups in previous works on fMRI related to ASD investigations, as observed in [21], we chose a few educated-guess setups in our experiment. The setup was related to the variables gamma, coef0, kernel, class_weight, degree, and max_iter.

The gamma parameter controls how closely the final classification should fit the training sample: larger values lead to more rigid solutions, and lower values lead to more flexible ones.

The coef0 is the independent term of the kernel function. Meanwhile, the kernel is the mathematical function used to solve the problem; the ones available in [47] include linear, poly, rbf, and sigmoid.

The class_weight option considers the size of each class in the training step, adjusting the weight accordingly. For example, regarding training, if Class 1 has three subjects and Class 2 has nine subjects, Class 1 will weigh three while Class 2 will weigh one. This process is meant to avoid the algorithm taking into account only the dominant class from training, which can jeopardize the SVM’s generalization capacity.
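The weighting in this example can be reproduced with the formula that scikit-learn documents for `class_weight="balanced"`, namely `n_samples / (n_classes * count)`. The sketch below reimplements that formula in plain Python so that it runs without the library; the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Three subjects in Class 1 and nine in Class 2, as in the example above.
weights = balanced_class_weights([1] * 3 + [2] * 9)
ratio = weights[1] / weights[2]  # relative weight of Class 1 versus Class 2
```

The absolute weights depend on the normalization, but the ratio between them is what matches the "three to one" description in the text.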

The degree will define the curve degree of the equation that splits the SVM classification plane. Finally, max_iter is the total training iterations allowed to be used by the algorithm, stopping the training when the value is reached, regardless of the gain.

Here, we used the following values for each variable:

gamma = [2,4],
coef0 = [1.0],
kernel= [poly],
class_weight= [balanced],
degree = [2,3],
max_iter = [400000].
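One way to turn these values into concrete classifier configurations is sketched below. It assumes that every combination of the listed values was tried, which the text does not state explicitly. The helper names are hypothetical; `sklearn.svm.SVC` and its keyword arguments are the real scikit-learn API, imported lazily so that the grid expansion itself runs without the library.

```python
from itertools import product
from typing import Any, Dict, List

# Values listed above, keyed by the corresponding SVC keyword argument.
PARAM_GRID: Dict[str, List[Any]] = {
    "gamma": [2, 4],
    "coef0": [1.0],
    "kernel": ["poly"],
    "class_weight": ["balanced"],
    "degree": [2, 3],
    "max_iter": [400000],
}


def expand_grid(grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one keyword dictionary per combination of parameter values."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


def build_classifiers(grid: Dict[str, List[Any]] = PARAM_GRID) -> list:
    """Instantiate one SVC per configuration (requires scikit-learn)."""
    from sklearn.svm import SVC  # lazy import: only needed when classifiers are built
    return [SVC(**params) for params in expand_grid(grid)]


configs = expand_grid(PARAM_GRID)  # 2 gamma values x 2 degrees = 4 setups
```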

2.5. Validation Process

We performed k-fold cross-validation to validate our process [48,49,50]. We selected k = 10, which is recommended for samples larger than 200 objects. The SVM automatically split the sample into training and test sets; in this case, we used the standard 70% for training and 30% for test. Therefore, nine folds were sent to the SVM, split 7/3 into training and test sets, and the resulting model was then applied to the tenth fold for validation; the process was repeated until each of the 10 folds had been used as the validation sample.

We adopted the following division criteria to avoid bias noise:

Amount of subjects of a specific ADOS subclass in each fold, avoiding any fold having only subjects of the same subclass. For example, a fold without autistic subjects could bias the SVC always to answer ASD due to the lack of autistic subjects on training or validation.

We first divided our sample into two groups, ASD and autistic, one for each ADOS subclass. Then, we ordered them by subject ID and, for each group, assigned one subject at a time to each fold:

$$\{\text{Subject } 1 \to \text{Fold } 1,\ \text{Subject } 2 \to \text{Fold } 2,\ \dots,\ \text{Subject } n \to \text{Fold } (n \bmod 10)\}.$$

Thus, each fold had a balanced subclass distribution at the end of this process. Given our sample’s limitations, this process aimed to produce the most adaptive learning for our SVC.
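The round-robin assignment can be sketched as below. The subject IDs are synthetic, folds are indexed from 0 in the code, and restarting the round-robin at the first fold for each subclass is an assumption; only the group sizes (36 ASD and 166 autistic subjects) come from the paper.

```python
from typing import Dict, List, Tuple


def assign_folds(ids_by_class: Dict[str, List[int]], k: int = 10) -> List[List[Tuple[int, str]]]:
    """Deal the subjects of each subclass into k folds, one at a time, in ID order."""
    folds: List[List[Tuple[int, str]]] = [[] for _ in range(k)]
    for cls, ids in ids_by_class.items():
        for position, subject_id in enumerate(sorted(ids)):
            folds[position % k].append((subject_id, cls))
    return folds


# Synthetic IDs with the group sizes reported in the paper.
groups = {"ASD": list(range(1, 37)), "autism": list(range(101, 267))}
folds = assign_folds(groups)

# Every fold ends up with subjects from both subclasses.
both_present = all({cls for _, cls in fold} == {"ASD", "autism"} for fold in folds)
```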

2.6. Final Data Source

The resultant data were composed of two files for each subject. The first file contained a matrix where each column represented one of the 116 ROIs from the AAL atlas, and each row represented a picture of the brain over time. The second file was a vector with the subject’s phenotype data, including the ADOS score. Since the first row of each fMRI placed the ROI label, we removed it from the file sent to the SVM.
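Reading such a file and dropping its label row might look like the following sketch. The whitespace-separated layout and the `ROI_n` header names are assumptions made for illustration; the actual ABIDE file layout should be checked before reuse.

```python
import io
from typing import List, TextIO, Tuple


def load_roi_matrix(handle: TextIO) -> Tuple[List[str], List[List[float]]]:
    """Return (ROI labels, matrix) where rows are time points and columns are ROIs."""
    lines = [line.strip() for line in handle if line.strip()]
    labels = lines[0].split()  # first row holds the ROI labels
    matrix = [[float(value) for value in line.split()] for line in lines[1:]]
    if any(len(row) != len(labels) for row in matrix):
        raise ValueError("every row must have one value per ROI")
    return labels, matrix


# Tiny in-memory stand-in for one subject's file (3 ROIs, 2 time points).
sample = io.StringIO("ROI_1\tROI_2\tROI_3\n0.5\t1.0\t-0.2\n0.7\t0.9\t0.1\n")
labels, matrix = load_roi_matrix(sample)
```

Only `matrix` would be passed on to the feature extraction; the labels are kept separately for mapping columns back to ROI IDs.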

SVM only accepts vectors as its input. Therefore, we converted the resulting matrix from DPARSF into a vector. We considered two conversion options: (1) construct a vector from the matrix in which the matrix position $(X_{i}, Y_{j})$ is placed at the vector position $Z_{i + i * j}$; and (2) compute the maximum, minimum, median, and average values for each ROI of each SoRs and create a vector $(Z_{a^{\mathrm{max}}}, Z_{a^{\mathrm{min}}}, Z_{a^{\mathrm{med}}}, Z_{a^{\mathrm{avg}}}, \dots, Z_{b^{\mathrm{max}}}, Z_{b^{\mathrm{min}}}, Z_{b^{\mathrm{med}}}, Z_{b^{\mathrm{avg}}})$, where $a$ and $b$ are, respectively, the first and the last ROI ID of a SoRs.

Both conversion options have advantages and drawbacks. The first option has the simplest preprocessing but a more significant need for computer power for the SVC to process all data. On the other hand, the second option has the drawback of a preprocessing pipeline, which will acquire the data from each subject to transform in the four values mentioned above, with loss of information due to transformation. However, due to the size reduction, the SVC requires less computer power to analyze all the data from all subjects. Thus, aiming for better scalability and facilitating human understanding of the results, we chose the second option for this paper.
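The second option can be sketched in a few lines. The function name is hypothetical, and column indices are 0-based in the code, whereas ROI IDs in the paper start at 1.

```python
from statistics import mean, median
from typing import List, Sequence


def sor_feature_vector(matrix: Sequence[Sequence[float]],
                       roi_columns: Sequence[int]) -> List[float]:
    """Concatenate (max, min, median, mean) of each ROI time series in a SoRs."""
    features: List[float] = []
    for col in roi_columns:
        series = [row[col] for row in matrix]  # one ROI over all time points
        features.extend([max(series), min(series), median(series), mean(series)])
    return features


# Toy matrix: 3 time points (rows) x 3 ROIs (columns).
toy = [
    [1.0, 10.0, 5.0],
    [3.0, 30.0, 5.0],
    [2.0, 20.0, 5.0],
]
vector = sor_feature_vector(toy, [0, 1])  # a SoRs made of the first two ROIs
```

The vector length is four times the number of ROIs in the SoRs, independent of how many time points each scan has, which is what makes scans of different durations comparable.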

2.7. Accuracy, Sensitivity, and Specificity Restrictions, and Post-Hoc Tests

We imposed restrictions on the minimum accuracy, sensitivity, and specificity required to consider a functional difference between the two ASD sub-classes. The cut-off point was 60%, based on values achieved by other ASD vs. non-ASD classification studies [22,23,24,51,52,53]. Thus, we discarded results with accuracy (ACC), specificity (SPC), or sensitivity (SNS) less than 60%.
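The three rates and the 60% filter can be written out as below. The sketch takes the autistic class as the positive class (so SNS refers to autistic subjects and SPC to ASD subjects, as stated in Section 3); the function names are hypothetical.

```python
from typing import Tuple


def rates(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float, float]:
    """Return (ACC, SNS, SPC) from confusion-matrix counts.

    tp/fn count autistic subjects (positive class); tn/fp count ASD subjects.
    """
    total = tp + fn + tn + fp
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("both classes must be present")
    acc = (tp + tn) / total
    sns = tp / (tp + fn)
    spc = tn / (tn + fp)
    return acc, sns, spc


def passes_cutoff(tp: int, fn: int, tn: int, fp: int, cutoff: float = 0.60) -> bool:
    """Keep a result only if ACC, SNS, and SPC are all at least the cut-off."""
    return all(value >= cutoff for value in rates(tp, fn, tn, fp))
```

Requiring all three rates matters with unbalanced classes: a classifier that labels almost everyone as autistic can have high ACC and SNS while its SPC falls far below the cut-off.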

Finally, we applied three post-hoc analyses to the features from the SoRs that achieved the cut-off: addition of phenotype data, t-test, and p-value. The addition of phenotype data aimed to investigate the effect of sex, age, and FIQ on SVM accuracy for each SoRs, while the t-test and p-value aimed to investigate the separability of the sample, that is, how the features differed between the two groups.

3. Results

This section presents the results of our ASD vs. autism classification experiments. All SoRs can be seen in Table 8 and each ROI used by these sets can be seen in Table 1. In this paper, we used specificity (SPC) related to the ASD classification and sensitivity (SNS) associated with the autistic classification.

Our experiments worked with a total of 202 subjects, which comprised 36 with ASD and 166 with autism, according to the ADOS scores. Table 9 shows the SoRs with the ACC, SNS, and SPC greater than or equal to 60%.

Table 9. SoRs above the required threshold.

ACC ranged from 60.9% (SoRs 27) to 73.8% (SoRs 11). SNS ranged from 60.8% (SoRs 27) to 76.5% (SoRs 11). SPC ranged from 60.0% (SoRs 27) to 70.8% (SoRs 30). This shows the existence of a non-random separation when considering five brain regions.

The t-test of each feature allows us to understand the difference between the ASD and autistic groups. The t-test result measures the statistical difference between two given groups: positive values mean that the group 1 average is larger than that of group 2, while negative values mean the opposite. Table 10 shows the t-test result for each feature of each SoRs for which the SVM had above-threshold results; here, positive values mean that the ASD group average is larger than the autistic group average for that feature, while negative values mean that the autistic group average is larger.
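The sign convention can be checked with a small t-statistic function. The paper does not say which t-test variant was used, so Welch's unequal-variance form is an assumption here, and the values are toy data. Only the statistic is computed; a p-value additionally needs the t distribution, for which a library routine such as `scipy.stats.ttest_ind` could be used.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def welch_t(group1: Sequence[float], group2: Sequence[float]) -> float:
    """Welch's t statistic: positive when the mean of group1 exceeds that of group2."""
    n1, n2 = len(group1), len(group2)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    standard_error = sqrt(variance(group1) / n1 + variance(group2) / n2)
    return (mean(group1) - mean(group2)) / standard_error


# Toy feature values: group 1 stands for ASD, group 2 for autistic subjects.
t_value = welch_t([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
```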

Table 10. The t-test for features on SoRs with values above required threshold.

Furthermore, reinforcing the t-test result, the p-value (scale [0,1]) of each feature from SoRs above the required threshold is listed in Table 11. The highest p-value was 0.96 for the mean on ROI 4 (Frontal Sup. Orb. Left), the third ROI from SoRs 1, with high values indicating a risk of not being able to distinguish the two groups from each other. On the other hand, lower values indicate a high possibility of discerning the two groups using the feature. The lowest p-value was 0.02 from the min on ROI 72 (Putamen Left), the first ROI from SoRs 27. The SoRs 1 has a mean p-value of 0.45 (0.43 STD), while SoRs 11 has a mean p-value of 0.32 (0.14 STD); for SoRs 23, 27, and 30, the mean p-value is 0.53 (0.51 STD), 0.30 (0.24 STD), and 0.30 (0.24 STD), respectively. Therefore, SoRs 11 has the lowest p-value STD and one of the lowest p-value means, which indicates a high probability of containing the largest set of features to classify ASD severity. It is worth noting that these values reflect only our sample and should not be used as a diagnostic tool, as further research is needed to either confirm or deny our findings.

Table 11. p-values for features on SoRs with values above required threshold.

Moreover, we performed other trials adding phenotype information (age, sex, and full IQ). We used the same features and added the phenotype data to the vector sent to the ML algorithm. We executed the test for the three phenotypes together, one at a time, and all combinations of two phenotypes. We used the same process as for the main experiment; the results that reached the threshold defined in Section 2.7, as well as the ACC gain from using the phenotype data for each SoRs, are shown in Table 12. However, consistent with [21], these features did not show a significant improvement, if any, in the sample.
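The phenotype trials amount to enumerating every non-empty subset of the three variables and appending the chosen values to the feature vector. A minimal sketch follows; the key names and the numeric encoding of sex are assumptions.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

PHENOTYPES: Tuple[str, ...] = ("age", "sex", "fiq")  # hypothetical key names


def phenotype_subsets(names: Sequence[str] = PHENOTYPES) -> List[Tuple[str, ...]]:
    """All non-empty subsets: three singles, three pairs, and the full triple."""
    return [subset
            for size in range(1, len(names) + 1)
            for subset in combinations(names, size)]


def extend_vector(features: Sequence[float],
                  phenotype: Dict[str, float],
                  subset: Sequence[str]) -> List[float]:
    """Append the selected phenotype values to a SoRs feature vector."""
    return list(features) + [phenotype[name] for name in subset]


subsets = phenotype_subsets()
# Toy subject: sex is encoded as a number here purely for illustration.
example = extend_vector([0.4, -0.1], {"age": 16.0, "sex": 1.0, "fiq": 100.0}, ("age", "fiq"))
```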

Table 12. Results adding phenotype data to the SoRs.

Finally, we show the mean result for each of the features with high ACC both for ASD and autistic in Table 13 and Table 14, respectively.

Table 13. Mean values for features on SoRs from ASD sample.

Table 14. Mean values for features on SoRs from autistic sample.

4. Discussion

This paper assessed brain functional differences between ASD and autism using rs-fMRI and SVM classification (SVC). The measure used to distinguish ASD from autism was the ADOS score and cut-off points, as seen in Table 2.

Our results highlight some brain regions that potentially can distinguish functional differences between both groups (ASD vs. autism). The main finding in distinguishing the two ASD sub-classes reached up to 73.8% accuracy (SoRs 11). These results need to be taken with caution due to the limitations mentioned and given its Matthews Correlation Coefficient of 0.31 (scale [−1,1]), which is better than a random selection but still not ideal. However, our results show a promising path to investigate the functional difference between both ASD sub-classes.
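For reference, the Matthews Correlation Coefficient is computed from the four confusion-matrix counts as sketched below. The counts in the usage line are toy values and are not taken from the paper; returning 0.0 when the denominator is zero is a common convention adopted here as an assumption.

```python
from math import sqrt


def mcc(tp: int, fn: int, tn: int, fp: int) -> float:
    """Matthews Correlation Coefficient, in [-1, 1]; 0 is chance level."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Toy counts for an unbalanced two-class problem (illustrative only).
toy_mcc = mcc(tp=40, fn=10, tn=6, fp=4)
```

Unlike accuracy, this coefficient stays near zero for a classifier that mostly predicts the majority class, which is why it is a useful companion metric when one group is much larger than the other.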

The best ACC was reached for SoRs 11, which consists of the cingulate gyrus (cingulum), covering the anterior, median, and posterior portions on both the left and right sides of the brain. We can conjecture that brain regions such as the cingulum (73.8% ACC, 76.5% SNS, 60.8% SPC) and angular (SoRs 23) (66.3% ACC, 67.4% SNS, 60.8% SPC) have the potential to differentiate the severity of ASD subjects, taking into consideration the ACC reached in this experiment. These SoRs, applied together with methods such as ADOS, may in the future allow professionals to classify individuals. The frontal lobe (SoRs 1) (64.9% ACC, 65.7% SNS, 63.3% SPC) should also be considered for further investigations, as it shows reasonable ACC.

Our results support previous studies [54,55,56] that point to functional differences in the cingulum region between ASD and TD. Likewise, [19,40] detected the thalamus as a key region for classifying ASD vs. TD, and [57,58,59,60] pointed to the frontal lobe as a region where ASD and TD can be differentiated. Angular (SoRs 23) [61,62], Heschl (SoRs 30) [63,64], and putamen (SoRs 27) [65,66] have also consistently been linked to ASD.

Since these brain regions are commonly pointed to as an ASD vs. TD differential, we can also suppose, based on our results, that such regions have the potential to describe areas where functional activity may be a biomarker for ASD severity, supporting previous investigations [64]. Therefore, we can presume the potential functional difference between subjects from the ASD group and the autism group using these ROIs.

5. Conclusions

Firstly, and most importantly, the field lacks sample data to strengthen the recent outcomes. We believe that all published studies have insufficient samples to ensure definitive conclusions on ML applied to fMRI for ASD diagnoses. For example, the ADOS used hundreds of thousands of subjects to validate its algorithm, while the sum of all subjects from all published papers regarding ML applied to fMRI (discounting the subjects duplicated for multiple studies) is not even close to this value. Therefore, any claim to solve the issue tends to be premature. Nevertheless, it is mandatory to research possible biomarkers while waiting for more available data to validate the findings.

We investigated the functional brain activity difference between ADOS ASD sub-classes (autism and ASD) using fMRI data from subjects previously diagnosed and available at ABIDE. The differences between each ASD sub-class were the ADOS score and cut-off points. We applied these data to train an ML classification algorithm (SVC) to classify the disorder severity, investigating the existence of functional brain differences across regions between both ASD sub-classes.

Our main contribution was the identification of five SoRs that potentially have discriminating patterns for ASD severity. Additionally, the suggested use of SoRs can help to improve investigations by allowing more clarity in interpreting and comparing the results, aiming to enable physicians to look up the same markers found by the ML. In this same aspect, opting to explore approaches using features more easily observed by human analyses, such as the maximum, minimum, mean, and standard deviation from each ROI, is also another contribution. These contributions can improve further research to give tools for physicians to utilize these signals when evaluating a subject, more than simply finding an ML to aid the ASD evaluation.

Our findings are consistent with previous studies on autism and brain development, bringing a promising approach to evaluating ASD subtypes. A computational aid system could improve medical diagnosis by delivering more tools for physicians’ evaluation, reducing analysis ambiguity. Further research, applied to a younger sample, can allow a computational system to assess individuals early, before the most severe symptoms begin. Distinguishing the severity of a subject can help in intervention selection, and earlier diagnosis can help set proper interventions to improve the individual’s quality of life.

Our study limitations lie mainly in the reduced sample size, which may not generalize our outcomes for all populations. However, we can speculate about these functional differences between the ASD subtypes.

Another limitation of the study is the mean age of the subjects (about 16.5 years in the ASD group and 17.6 years in the autistic group; Table 6), which does not correspond to the age of early diagnosis. Therefore, an additional experiment with younger subjects will be required to improve the reliability of the results.

In future work, a larger pool of subjects, including younger ones, could help to raise the accuracy and to clarify how far our results generalize to other populations. In addition, the research community would benefit from more publicly available fMRI data accompanied by the respective phenotype data (such as ADOS score, age at scan, sex, and FIQ), allowing more accurate investigations.

Author Contributions

Conceptualization, I.D.R., E.A.d.C., C.P.S. and G.S.B.; methodology, I.D.R. and E.A.d.C.; software, I.D.R.; validation, I.D.R., E.A.d.C. and C.P.S.; formal analysis, I.D.R.; investigation, I.D.R. and E.A.d.C.; resources, I.D.R. and G.S.B.; data curation, I.D.R.; writing—original draft preparation, I.D.R.; writing—review and editing, I.D.R., E.A.d.C. and C.P.S.; visualization, I.D.R. and E.A.d.C.; supervision, G.S.B. and E.A.d.C.; project administration, G.S.B.; funding acquisition, G.S.B. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brazil (CAPES)—Finance Code 001, and the Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG) (APQ-01565-18).

Institutional Review Board Statement

Ethical review and approval were waived for this study due to the use of publicly available, previously published data.

Informed Consent Statement

This study used publicly available, previously published data from ABIDE.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. American Psychiatric Association. DSM-5: Diagnostic and Statistical Manual of Mental Disorders; Artmed Editora: Porto Alegre, RS, Brazil, 2014.
2. Maenner, M.J.; Shaw, K.A.; Bakian, A.V.; Bilder, D.A.; Durkin, M.S.; Esler, A.; Furnier, S.M.; Hallas, L.; Hall-Lande, J.; Hudson, A.; et al. Prevalence and characteristics of autism spectrum disorder among children aged 8 years—Autism and developmental disabilities monitoring network, 11 sites, United States, 2018. MMWR Surveill. Summ. 2021, 70, 1–16.
3. Maenner, M.J.; Shaw, K.A.; Baio, J.; Washington, A.; Patrick, M.; DiRienzo, M.; Christensen, D.L.; Wiggins, L.D.; Pettygrove, S.; Andrews, G.; et al. Prevalence of autism spectrum disorder among children aged 8 years—Autism and developmental disabilities monitoring network, 11 sites, United States, 2016. MMWR Surveill. Summ. 2020, 69, 1–12.
4. Loomes, R.; Hull, L.; Mandy, W.P.L. What is the male-to-female ratio in autism spectrum disorder? A systematic review and meta-analysis. J. Am. Acad. Child Adolesc. Psychiatry 2017, 56, 466–474.
5. Bai, D.; Yip, B.H.K.; Windham, G.C.; Sourander, A.; Francis, R.; Yoffe, R.; Glasson, E.; Mahjani, B.; Suominen, A.; Leonard, H.; et al. Association of genetic and environmental factors with autism in a 5-country cohort. JAMA Psychiatry 2019, 76, 1035–1043.
6. Sandin, S.; Lichtenstein, P.; Kuja-Halkola, R.; Hultman, C.; Larsson, H.; Reichenberg, A. The heritability of autism spectrum disorder. JAMA 2017, 318, 1182–1184.
7. Carvalho, E.A.; Santana, C.P.; Rodrigues, I.D.; Lacerda, L.; Bastos, G.S. Hidden Markov Models to Estimate the Probability of Having Autistic Children. IEEE Access 2020, 8, 99540–99551.
8. Shimabukuro, T.T.; Grosse, S.D.; Rice, C. Medical expenditures for children with an autism spectrum disorder in a privately insured population. J. Autism Dev. Disord. 2008, 38, 546–552.
9. Amendah, D.; Grosse, S.; Peacock, G.; Mandell, D. The economic costs of autism: A review. Autism Spectr. Disord. 2011, 168, 1347–1360.
10. Durkin, M.S.; Elsabbagh, M.; Barbaro, J.; Gladstone, M.; Happe, F.; Hoekstra, R.A.; Lee, L.C.; Rattazzi, A.; Stapel-Wax, J.; Stone, W.L.; et al. Autism screening and diagnosis in low resource settings: Challenges and opportunities to enhance research and services worldwide. Autism Res. 2015, 8, 473–476.
11. Brazil’s Ministry of Health. Diretrizes de Atenção à Reabilitação da Pessoa com Transtorno do Espectro Autista (TEA); Brazil’s Ministry of Health: Brasilia, Brazil, 2014.
12. Hazlett, H.C.; Gu, H.; Munsell, B.C.; Kim, S.H.; Styner, M.; Wolff, J.J.; Elison, J.T.; Swanson, M.R.; Zhu, H.; Botteron, K.N.; et al. Early brain development in infants at high risk for autism spectrum disorder. Nature 2017, 542, 348.
13. Alves, F.J.; De Carvalho, E.A.; Aguilar, J.; De Brito, L.L.; Bastos, G.S. Applied behavior analysis for the treatment of autism: A systematic review of assistive technologies. IEEE Access 2020, 8, 118664–118672.
14. McCrimmon, A.; Rostad, K. Test Review: Autism Diagnostic Observation Schedule, (ADOS-2) Manual (Part II): Toddler Module. J. Psychoeduc. Assess. 2014, 32, 88–92.
15. Lord, C.; Risi, S.; Lambrecht, L.; Cook, E.H.; Leventhal, B.L.; DiLavore, P.C.; Pickles, A.; Rutter, M. The Autism Diagnostic Observation Schedule–Generic: A standard measure of social and communication deficits associated with the spectrum of autism. J. Autism Dev. Disord. 2000, 30, 205–223.
16. Falkmer, T.; Anderson, K.; Falkmer, M.; Horlin, C. Diagnostic procedures in autism spectrum disorders: A systematic literature review. Eur. Child Adolesc. Psychiatry 2013, 22, 329–340.
17. Ghiassian, S.; Greiner, R.; Jin, P.; Brown, M. Learning to classify psychiatric disorders based on fMR images: Autism vs healthy and ADHD vs healthy. In Proceedings of the 3rd NIPS Workshop on Machine Learning and Interpretation in NeuroImaging, Chico, CA, USA, 5 December 2013; pp. 9–10.
18. Mahanand, B.S.; Vigneshwaran, S.; Suresh, S.; Sundararajan, N. An enhanced effect-size thresholding method for the diagnosis of Autism Spectrum Disorder using resting state functional MRI. In Proceedings of the 2016 Second International Conference on Cognitive Computing and Information Processing (CCIP), Mysuru, India, 12–13 August 2016; pp. 1–6.
19. Iidaka, T. Resting state functional magnetic resonance imaging and neural network classified autism and control. Cortex 2015, 63, 55–67.
20. Bi, X.A.; Wang, Y.; Shu, Q.; Sun, Q.; Xu, Q. Classification of Autism Spectrum Disorder Using Random Support Vector Machine Cluster. Front. Genet. 2018, 9, 18.
21. Santana, C.P.; Carvalho, E.A.D.; Rodrigues, I.D.; Bastos, G.S.; Souza, A.D.D.; Brito, L.L.D. rs-fMRI and machine learning for ASD diagnosis: A systematic review and meta-analysis. Sci. Rep. 2022, 12, 6030.
22. Chaitra, N.; Vijaya, P.A. Comparing univalent and bivalent brain functional connectivity measures using machine learning. In Proceedings of the 2017 Fourth International Conference on Signal Processing, Communication and Networking (ICSCN), Chennai, India, 16–18 March 2017; pp. 1–5.
23. Abraham, A.; Milham, M.P.; Di Martino, A.; Craddock, R.C.; Samaras, D.; Thirion, B.; Varoquaux, G. Deriving reproducible biomarkers from multi-site resting-state data: An Autism-based example. NeuroImage 2017, 147, 736–745.
24. Zu, C.; Gao, Y.; Munsell, B.; Kim, M.; Peng, Z.; Zhu, Y.; Gao, W.; Zhang, D.; Shen, D.; Wu, G. Identifying High Order Brain Connectome Biomarkers via Learning on Hypergraph. In Proceedings of the Machine Learning in Medical Imaging, Athens, Greece, 17 October 2016; Wang, L., Adeli, E., Wang, Q., Shi, Y., Suk, H.I., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 1–9.
25. Heeger, D.J.; Ress, D. What does fMRI tell us about neuronal activity? Nat. Rev. Neurosci. 2002, 3, 142–151.
26. Yan, C.; Zang, Y. DPARSF: A MATLAB toolbox for “pipeline” data analysis of resting-state fMRI. Front. Syst. Neurosci. 2010, 4, 13.
27. Grossi, E.; Olivieri, C.; Buscema, M. Diagnosis of autism through EEG processed by advanced computational algorithms: A pilot study. Comput. Methods Programs Biomed. 2017, 142, 73–79.
28. Ibrahim, S.; Djemal, R.; Alsuwailem, A. Electroencephalography (EEG) signal processing for epilepsy and autism spectrum disorder diagnosis. Biocybern. Biomed. Eng. 2018, 38, 16–26.
29. Kang, J.; Han, X.; Song, J.; Niu, Z.; Li, X. The identification of children with autism spectrum disorder by SVM approach on EEG and eye-tracking data. Comput. Biol. Med. 2020, 120, 103722.
30. Peya, Z.J.; Akhand, M.; Ferdous Srabonee, J.; Siddique, N. EEG Based Autism Detection Using CNN Through Correlation Based Transformation of Channels’ Data. In Proceedings of the 2020 IEEE Region 10 Symposium (TENSYMP), Dhaka, Bangladesh, 5–7 June 2020; pp. 1278–1281.
31. Jayawardana, Y.; Jaime, M.; Jayarathna, S. Analysis of temporal relationships between ASD and brain activity through EEG and machine learning. In Proceedings of the 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI), Los Angeles, CA, USA, 30 July–1 August 2019; pp. 151–158.
32. Bajestani, G.S.; Behrooz, M.; Khani, A.G.; Nouri-Baygi, M.; Mollaei, A. Diagnosis of autism spectrum disorder based on complex network features. Comput. Methods Programs Biomed. 2019, 177, 277–283.
33. Craddock, C.; Benhajali, Y.; Chu, C.; Chouinard, F.; Evans, A.; Jakab, A.; Khundrakpam, B.S.; Lewis, J.D.; Li, Q.; Milham, M.; et al. The Neuro Bureau Preprocessing Initiative: Open sharing of preprocessed neuroimaging data and derivatives. In Proceedings of the Neuroinformatics 2013, Stockholm, Sweden, 27 August–29 August 2013.
34. Rolls, E.T.; Huang, C.C.; Lin, C.P.; Feng, J.; Joliot, M. Automated anatomical labelling atlas 3. NeuroImage 2019, 206, 116–189.
35. Zhu, Y.; Zhu, X.; Kim, M.; Yan, J.; Wu, G. A Tensor Statistical Model for Quantifying Dynamic Functional Connectivity. In Proceedings of the Information Processing in Medical Imaging, Boone, NC, USA, 25–30 June 2017; Niethammer, M., Styner, M., Aylward, S., Zhu, H., Oguz, I., Yap, P.T., Shen, D., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 398–410.
36. Crimi, A.; Dodero, L.; Murino, V.; Sona, D. Case-control discrimination through effective brain connectivity. In Proceedings of the 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), Melbourne, Australia, 18–21 April 2017; pp. 970–973.
37. Bi, X.A.; Chen, J.; Sun, Q.; Liu, Y.; Wang, Y.; Luo, X. Analysis of Asperger Syndrome Using Genetic-Evolutionary Random Support Vector Machine Cluster. Front. Physiol. 2018, 9, 1646.
38. Ashburner, J. A fast diffeomorphic image registration algorithm. NeuroImage 2007, 38, 95–113.
39. Ashburner, J.; Friston, K.J. Unified segmentation. NeuroImage 2005, 26, 839–851.
40. Subbaraju, V.; Suresh, M.B.; Sundaram, S.; Narasimhan, S. Identifying differences in brain activities and an accurate detection of autism spectrum disorder using resting state functional-magnetic resonance imaging: A spatial filtering approach. Med. Image Anal. 2017, 35, 375–389.
41. Jun, E.; Suk, H.I. Region-Wise Stochastic Pattern Modeling for Autism Spectrum Disorder Identification and Temporal Dynamics Analysis. In Proceedings of the International Workshop on Connectomics in Neuroimaging, Quebec City, QC, Canada, 14 September 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 143–151.
42. Zhu, Y.; Zhu, X.; Zhang, H.; Gao, W.; Shen, D.; Wu, G. Reveal consistent spatial-temporal patterns from dynamic functional connectivity for autism spectrum disorder identification. In Proceedings of the International conference on Medical Image Computing and Computer-Assisted Intervention, Athens, Greece, 17 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 106–114.
43. Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2011, 2, 27.
44. Platt, J.C. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv. Large Margin Classif. 1999, 10, 61–74.
45. Sartipi, S.; Kalbkhani, H.; Shayesteh, M.G. Ripplet II transform and higher order cumulants from R-fMRI data for diagnosis of autism. In Proceedings of the 2017 10th International Conference on Electrical and Electronics Engineering (ELECO), Bursa, Turkey, 30 November–2 December 2017; pp. 557–560.
46. Ren, Y.; Hu, X.; Lv, J.; Quo, L.; Han, J.; Liu, T. Identifying autism biomarkers in default mode network using sparse representation of resting-state fMRI data. In Proceedings of the 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 13–16 April 2016; pp. 1278–1281.
47. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
48. Bengio, Y.; Grandvalet, Y. No unbiased estimator of the variance of k-fold cross-validation. J. Mach. Learn. Res. 2004, 5, 1089–1105.
49. Rodriguez, J.D.; Perez, A.; Lozano, J.A. Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 32, 569–575.
50. Fushiki, T. Estimation of prediction error by using K-fold cross-validation. Stat. Comput. 2011, 21, 137–146.
51. Dodero, L.; Sambataro, F.; Murino, V.; Sona, D. Kernel-Based Analysis of Functional Brain Connectivity on Grassmann Manifold. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 604–611.
52. Dodero, L.; Minh, H.Q.; Biagio, M.S.; Murino, V.; Sona, D. Kernel-based classification for brain connectivity graphs on the Riemannian manifold of positive definite matrices. In Proceedings of the 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), Brooklyn, NY, USA, 16–19 April 2015; pp. 42–45.
53. Bhaumik, R.; Pradhan, A.; Das, S.; Bhaumik, D. Predicting Autism Spectrum Disorder Using Domain-Adaptive Cross-Site Evaluation. Neuroinformatics 2018, 16, 197–205.
54. Hau, J.; Aljawad, S.; Baggett, N.; Fishman, I.; Carper, R.A.; Müller, R.A. The cingulum and cingulate U-fibers in children and adolescents with autism spectrum disorders. Hum. Brain Mapp. 2019, 40, 3153–3164.
55. Ikuta, T.; Shafritz, K.M.; Bregman, J.; Peters, B.D.; Gruner, P.; Malhotra, A.K.; Szeszko, P.R. Abnormal cingulum bundle development in autism: A probabilistic tractography study. Psychiatry Res. Neuroimaging 2014, 221, 63–68.
56. Ameis, S.; Fan, J.; Rockel, C.; Soorya, L.; Wang, A.; Anagnostou, E. Altered cingulum bundle microstructure in autism spectrum disorder. Acta Neuropsychiatr. 2013, 25, 275–282.
57. Sundaram, S.K.; Kumar, A.; Makki, M.I.; Behen, M.E.; Chugani, H.T.; Chugani, D.C. Diffusion Tensor Imaging of Frontal Lobe in Autism Spectrum Disorder. Cereb. Cortex 2008, 18, 2659–2665.
58. Carper, R.A.; Courchesne, E. Localized enlargement of the frontal cortex in early autism. Biol. Psychiatry 2005, 57, 126–133.
59. Zilbovicius, M.; Garreau, B.; Samson, Y.; Remy, P.; Barthélémy, C.; Syrota, A.; Lelord, G. Delayed maturation of the frontal cortex in childhood autism. Am. J. Psychiatry 1995, 152, 248–252.
60. Carper, R.A.; Courchesne, E. Inverse correlation between frontal lobe and cerebellum sizes in children with autism. Brain 2000, 123, 836–844.
61. Long, Z.; Huang, J.; Li, B.; Li, Z.; Li, Z.; Chen, H.; Jing, B. A Comparative Atlas-Based Recognition of Mild Cognitive Impairment With Voxel-Based Morphometry. Front. Neurosci. 2018, 12, 916.
62. Liu, J.; Yao, L.; Zhang, W.; Xiao, Y.; Liu, L.; Gao, X.; Shah, C.; Li, S.; Tao, B.; Gong, Q.; et al. Gray matter abnormalities in pediatric autism spectrum disorder: A meta-analysis with signed differential mapping. Eur. Child Adolesc. Psychiatry 2017, 26, 933–945.
63. Prigge, M.D.; Bigler, E.D.; Fletcher, P.T.; Zielinski, B.A.; Ravichandran, C.; Anderson, J.; Froehlich, A.; Abildskov, T.; Papadopolous, E.; Maasberg, K.; et al. Longitudinal Heschl’s Gyrus Growth During Childhood and Adolescence in Typical Development and Autism. Autism Res. 2013, 6, 78–90.
64. Kaku, S.M.; Jayashankar, A.; Girimaji, S.C.; Bansal, S.; Gohel, S.; Bharath, R.D.; Srinath, S. Early childhood network alterations in severe autism. Asian J. Psychiatry 2019, 39, 114–119.
65. Sato, W.; Kubota, Y.; Kochiyama, T.; Uono, S.; Yoshimura, S.; Sawada, R.; Sakihama, M.; Toichi, M. Increased putamen volume in adults with autism spectrum disorder. Front. Hum. Neurosci. 2014, 8, 957.
66. Hollander, E.; Anagnostou, E.; Chaplin, W.; Esposito, K.; Haznedar, M.M.; Licalzi, E.; Wasserman, S.; Soorya, L.; Buchsbaum, M. Striatal volume on magnetic resonance imaging and repetitive behaviors in autism. Biol. Psychiatry 2005, 58, 226–232.

Figure 1. Classification curve generated by SVM with two features.

Table 1. Automated anatomical labeling (ID and name).

ID	Label Name	ID	Label Name	ID	Label Name
0	Precentral.L	39	ParaHippocampal.R	78	Heschl.L
1	Precentral.R	40	Amygdala.L	79	Heschl.R
2	Frontal.S.L	41	Amygdala.R	80	Temporal.S.L
3	Frontal.S.R	42	Calcarine.L	81	Temporal.S.R
4	Frontal.S.Orb.L	43	Calcarine.R	82	Temporal.Pole.S.L
5	Frontal.S.Orb.R	44	Cuneus.L	83	Temporal.Pole.S.R
6	Frontal.Mid.L	45	Cuneus.R	84	Temporal.Mid.L
7	Frontal.Mid.R	46	Lingual.L	85	Temporal.Mid.R
8	Frontal.Mid.Orb.L	47	Lingual.R	86	Temporal.Pole.Mid.L
9	Frontal.Mid.Orb.R	48	Occipital.S.L	87	Temporal.Pole.Mid.R
10	Frontal.Inf.Oper.L	49	Occipital.S.R	88	Temporal.Inf.L
11	Frontal.Inf.Oper.R	50	Occipital.Mid.L	89	Temporal.Inf.R
12	Frontal.Inf.Tri.L	51	Occipital.Mid.R	90	Cerebelum.Crus1.L
13	Frontal.Inf.Tri.R	52	Occipital.Inf.L	91	Cerebelum.Crus1.R
14	Frontal.Inf.Orb.L	53	Occipital.Inf.R	92	Cerebelum.Crus2.L
15	Frontal.Inf.Orb.R	54	Fusiform.L	93	Cerebelum.Crus2.R
16	Rolandic.Oper.L	55	Fusiform.R	94	Cerebelum.3.L
17	Rolandic.Oper.R	56	Postcentral.L	95	Cerebelum.3.R
18	Sp.Motor.Area.L	57	Postcentral.R	96	Cerebelum.4.5.L
19	Sp.Motor.Area.R	58	Parietal.S.L	97	Cerebelum.4.5.R
20	Olfactory.L	59	Parietal.S.R	98	Cerebelum.6.L
21	Olfactory.R	60	Parietal.Inf.L	99	Cerebelum.6.R
22	Frontal.S.Medial.L	61	Parietal.Inf.R	100	Cerebelum.7b.L
23	Frontal.S.Medial.R	62	SraMarginal.L	101	Cerebelum.7b.R
24	Frontal.Med.Orb.L	63	SraMarginal.R	102	Cerebelum.8.L
25	Frontal.Med.Orb.R	64	Angular.L	103	Cerebelum.8.R
26	Rectus.L	65	Angular.R	104	Cerebelum.9.L
27	Rectus.R	66	Precuneus.L	105	Cerebelum.9.R
28	Insula.L	67	Precuneus.R	106	Cerebelum.10.L
29	Insula.R	68	Paracentral.Lobule.L	107	Cerebelum.10.R
30	Cingulum.Ant.L	69	Paracentral.Lobule.R	108	Vermis.1.2
31	Cingulum.Ant.R	70	Caudate.L	109	Vermis.3
32	Cingulum.Mid.L	71	Caudate.R	110	Vermis.4.5
33	Cingulum.Mid.R	72	Putamen.L	111	Vermis.6
34	Cingulum.Post.L	73	Putamen.R	112	Vermis.7
35	Cingulum.Post.R	74	Pallidum.L	113	Vermis.8
36	Hippocampus.L	75	Pallidum.R	114	Vermis.9
37	Hippocampus.R	76	Thalamus.L	115	Vermis.10
38	ParaHippocampal.L	77	Thalamus.R

Table 2. ADOS maximum score and cut off points for ASD [15].

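		Comm	SI	IS + Comm	RB
Module 1	Maximum score	10	14	24	6
	ASD cut off	2	4	7	-
	Autism cut off	4	7	12	-
Module 2	Maximum score	10	14	24	6
	ASD cut off	3	4	8	-
	Autism cut off	5	6	12	-
Module 3	Maximum score	8	14	22	8
	ASD cut off	2	4	7	-
	Autism cut off	3	6	10	-
Module 4	Maximum score	8	14	22	8
	ASD cut off	2	4	7	-
	Autism cut off	3	6	10	-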

Comm (Communication); SI (Social Interaction); IS + Comm (Communication + Social Interaction); RB (Repetitive Behavior).

Table 3. ASD subjects group.

Subject Index from ABIDE
51457	50145	50995	51470	50152	51007	50803	50056	51011	50499
50960	50182	51019	50976	51211	51026	50983	51229	50142	50991
50993	51461	50146	51001	51471	50025	51008	50958	50057	51034
51018	50967	51210	51021	50981	51224

Table 4. Autistic subjects group.

Subject index from ABIDE
51456	51458	51459	51460	51462	51463	51464	51465	51466	51467
51468	51469	51472	51474	50649	50653	50651	50791	50792	50795
50798	50799	50800	50802	50804	50823	50824	50825	50954	50955
50956	50961	50962	50964	50965	50966	50968	50969	50970	50972
50973	50974	50977	50978	50979	50982	50984	50985	50986	50987
50988	50989	50990	50992	50994	50996	50997	50998	50999	51000
51002	51003	51006	51009	51010	51012	51014	51015	51016	51017
51020	51023	51024	51025	51027	51028	51029	51032	51033	51035
50143	50144	50148	50150	50153	50004	50005	50006	50007	50012
50014	50016	50022	50024	50027	50029	50183	50184	50186	50187
50188	50189	50190	50191	50212	51206	51208	51212	51214	51216
51217	51218	51221	51222	51223	51226	51234	51235	51236	51237
51239	51240	51241	51248	51249	51291	51293	51294	51295	51298
51301	51302	50477	50480	50482	50483	50486	50487	50488	50490
50491	50492	50493	50494	50496	50497	50498	50500	50502	50503
50504	50505	50507	50514	50515	50516	50518	50519	50520	50521
50524	50525	50526	50528	50529	50530

Table 5. Sex distribution.

Group	Total	Male	Female
ASD	36	32	4
Autistic	166	152	14

Table 6. Age distribution.

Group	AVG	MAX	MIN	Standard Deviation
ASD	16.47	38.76	8.0	7.90
Autistic	17.63	55.4	7.13	8.91

Table 7. FIQ distribution.

Group	AVG	MAX	MIN	Standard Deviation
ASD	108.35	132.0	76.0	13.66
Autistic	104.81	148.0	65.0	16.76

Table 8. SoRs IDs and their respective RoIs IDs from AAL.

Set ID	ROIs IDs	Set ID	ROIs IDs	Set ID	ROIs IDs	Set ID	ROIs IDs
0	[0, 1]	9	[26, 27]	18	[48, …, 53]	27	[72, 73]
1	[2, …, 5]	10	[28, 29]	19	[54, 55]	28	[74, 75]
2	[6, …, 9]	11	[30, …, 35]	20	[56, 57]	29	[76, 77]
3	[10, …, 15]	12	[36, 37]	21	[58, …, 61]	30	[78, 79]
4	[16, 17]	13	[38, 39]	22	[62, 63]	31	[80, …, 89]
5	[18, 19]	14	[40, 41]	23	[64, 65]	32	[90, …, 107]
6	[20, 21]	15	[42, 43]	24	[66, 67]	33	[108, …, 115]
7	[22, 23]	16	[44, 45]	25	[68, 69]	34	ALL
8	[24, 25]	17	[46, 47]	26	[70, 71]

[X, …, Y] is a one-to-one incremental sequence where X is the lower limit and Y the superior (e.g., [1, …, 4] is the same as [1, 2, 3, 4]).
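As a small illustration of this notation, the sketch below expands inclusive ranges into explicit ROI ID lists for a few rows of Table 8. The helper name `expand_rois` and the dictionary layout are hypothetical and chosen only for this example.

```python
def expand_rois(spec):
    """Expand an inclusive (first, last) pair into the full list of ROI IDs."""
    first, last = spec
    return list(range(first, last + 1))


# A few rows of Table 8, written as inclusive (first, last) pairs.
SOR_RANGES = {
    1: (2, 5),
    11: (30, 35),
    23: (64, 65),
    27: (72, 73),
    30: (78, 79),
}

sor_to_rois = {sor_id: expand_rois(r) for sor_id, r in SOR_RANGES.items()}
print(sor_to_rois[11])  # [30, 31, 32, 33, 34, 35]
```

With the IDs of Table 1, SoR 11 expands to the six cingulum ROIs (30 to 35), and SoR 1 to the four Frontal.S ROIs (2 to 5).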

Table 9. SoRs above the required threshold.

SoRs ID	ACC	SNS	SPC
11	73.85%	76.50%	60.83%
23	66.28%	67.38%	60.83%
1	64.88%	65.69%	63.33%
30	63.38%	61.47%	70.83%
27	60.90%	60.84%	60.00%
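The three metrics in Table 9 are linked: for fixed class sizes, accuracy is the class-size-weighted average of sensitivity and specificity. The sketch below checks this relation using the group sizes of Table 5. It assumes that the autistic group is the positive class, which is an assumption made here and not stated in this section; the function name is also hypothetical.

```python
# Group sizes from Table 5.
N_AUTISTIC, N_ASD = 166, 36

# (ACC, SNS, SPC) in percent, copied from Table 9.
TABLE_9 = {
    11: (73.85, 76.50, 60.83),
    23: (66.28, 67.38, 60.83),
    1: (64.88, 65.69, 63.33),
    30: (63.38, 61.47, 70.83),
    27: (60.90, 60.84, 60.00),
}


def pooled_accuracy(sns, spc, n_pos=N_AUTISTIC, n_neg=N_ASD):
    """Accuracy implied by sensitivity and specificity for given class sizes.

    Assumption: the autistic group is the positive class.
    """
    return (sns * n_pos + spc * n_neg) / (n_pos + n_neg)


gaps = {}
for sor_id, (acc, sns, spc) in TABLE_9.items():
    implied = pooled_accuracy(sns, spc)
    gaps[sor_id] = abs(implied - acc)
    print(f"SoR {sor_id}: reported {acc:.2f}%, implied {implied:.2f}%")
```

Under this assumption the implied accuracies stay within about half a percentage point of the reported ones. Small gaps of this size would be expected if the reported metrics are averages over cross-validation folds, although that is a guess and not something this check can confirm.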

Table 10. The t-test for features on SoRs with values above required threshold.

Feature	SoR 1	SoR 11	SoR 23	SoR 27	SoR 30
1st ROI max	−2.2285	−1.7574	−1.3936	−2.0293	−1.9078
1st ROI min	2.0665	1.9254	1.7192	2.2895	1.8749
1st ROI mean	−0.2457	−0.3619	−0.1699	0.4010	−0.4500
1st ROI STD	0.2434	0.2758	0.0988	−0.4227	0.4915
2nd ROI max	−1.6051	−1.7618	−1.4630	−1.8003	−1.9181
2nd ROI min	1.9787	1.6766	1.4059	2.0074	1.8686
2nd ROI mean	0.3057	−0.4697	−0.0915	−0.8066	0.8104
2nd ROI STD	−0.2794	0.4066	0.0596	0.8234	−0.6772
3rd ROI max	−1.8155	−1.7295	-	-	-
3rd ROI min	1.7308	1.6808	-	-	-
3rd ROI mean	0.0548	0.4520	-	-	-
3rd ROI STD	−0.2010	−0.5442	-	-	-
4th ROI max	−1.6348	−1.7266	-	-	-
4th ROI min	1.8527	1.9396	-	-	-
4th ROI mean	−0.1745	1.8407	-	-	-
4th ROI STD	0.1780	−1.9850	-	-	-
5th ROI max	-	−1.8367	-	-	-
5th ROI min	-	1.3644	-	-	-
5th ROI mean	-	−0.5116	-	-	-
5th ROI STD	-	0.5581	-	-	-
6th ROI max	-	−1.5904	-	-	-
6th ROI min	-	1.3676	-	-	-
6th ROI mean	-	0.5552	-	-	-
6th ROI STD	-	−0.5744	-	-	-

Table 11. p-values for features on SoRs with values above required threshold.

Feature	SoR 1	SoR 11	SoR 23	SoR 27	SoR 30
1st ROI max	0.02696	0.08038	0.16498	0.04375	0.05785
1st ROI min	0.04007	0.05560	0.08713	0.02309	0.06227
1st ROI mean	0.80617	0.71784	0.86524	0.68888	0.65319
1st ROI STD	0.80794	0.78296	0.92141	0.67295	0.62359
2nd ROI max	0.11005	0.07963	0.14504	0.07332	0.05652
2nd ROI min	0.04922	0.09519	0.16131	0.04605	0.06314
2nd ROI mean	0.76015	0.63907	0.92717	0.42088	0.41870
2nd ROI STD	0.78026	0.68474	0.95254	0.41129	0.49906
3rd ROI max	0.07095	0.08527	-	-	-
3rd ROI min	0.08502	0.09437	-	-	-
3rd ROI mean	0.95639	0.65175	-	-	-
3rd ROI STD	0.84090	0.58691	-	-	-
4th ROI max	0.10366	0.08578	-	-	-
4th ROI min	0.06540	0.05384	-	-	-
4th ROI mean	0.86167	0.06715	-	-	-
4th ROI STD	0.85890	0.04852	-	-	-
5th ROI max	-	0.06773	-	-	-
5th ROI min	-	0.17396	-	-	-
5th ROI mean	-	0.60950	-	-	-
5th ROI STD	-	0.57737	-	-	-
6th ROI max	-	0.11332	-	-	-
6th ROI min	-	0.17297	-	-	-
6th ROI mean	-	0.57936	-	-	-
6th ROI STD	-	0.56635	-	-	-
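The p-values in Table 11 can be related to the t statistics in Table 10 through the Student t distribution. The sketch below computes a two-sided p-value by numerically integrating the t density using only the standard library. The degrees of freedom (36 + 166 - 2 = 200, as for a pooled-variance two-sample test on the groups of Table 5) are an assumption made here, although the tabulated values appear consistent with it. The function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_norm) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson integration of the density on [0, |t|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # probability mass between 0 and |t|
    return 1.0 - 2.0 * central_mass


# Assumption: pooled-variance test, so df = n_ASD + n_autistic - 2.
DF = 36 + 166 - 2

# "1st ROI max" of SoR 1: t = -2.2285 in Table 10, p = 0.02696 in Table 11.
p = two_sided_p(-2.2285, DF)
print(f"{p:.5f}")
```

Applying the same function to other entries, such as t = -1.9850 for the "4th ROI STD" feature of SoR 11, gives values close to the corresponding entries of Table 11.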

Table 12. Results adding phenotype data to the SoRs.

SoRs ID + Phenotype Data	ACC	SNS	SPC	ACC Gain
23 + Sex	68.88%	70.58%	62.5%	2.595%
27 + Sex	62.45%	62.2%	63.3%	1.546%
30 + Age	69.3%	68.74%	71.66%	5.920%
30 + Age and Sex	66.28%	65.69%	69.16%	2.900%

The missing combinations did not reach the cut-offs in at least one of ACC, SNS, or SPC.
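The ACC Gain column can be read as the difference, in percentage points, between the accuracy with phenotype data (Table 12) and the accuracy of the same SoR alone (Table 9). The short check below recomputes it from the rounded table values; the variable names are chosen only for this example.

```python
# Baseline accuracies (percent) of the SoRs alone, from Table 9.
BASELINE_ACC = {23: 66.28, 27: 60.90, 30: 63.38}

# (SoR ID, added phenotype data, ACC in percent, reported gain) from Table 12.
TABLE_12 = [
    (23, "Sex", 68.88, 2.595),
    (27, "Sex", 62.45, 1.546),
    (30, "Age", 69.30, 5.920),
    (30, "Age and Sex", 66.28, 2.900),
]

gains = []
for sor_id, extra, acc, reported in TABLE_12:
    gain = acc - BASELINE_ACC[sor_id]  # percentage points, not relative change
    gains.append((sor_id, extra, gain, reported))
    print(f"SoR {sor_id} + {extra}: recomputed {gain:.2f}, reported {reported:.3f}")
```

The recomputed gains agree with the reported ones up to the rounding of the tabulated accuracies.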

Table 13. Mean values for features on SoRs from ASD sample.

Feature	SoR 1	SoR 11	SoR 23	SoR 27	SoR 30
1st ROI max	0.9605	1.6019	2.6961	1.5552	3.0939
1st ROI min	−1.0676	−1.5430	−2.5440	−1.4001	−3.0682
1st ROI mean	0.0000	0.0009	−0.0004	0.0005	−0.0019
1st ROI STD	−0.0341	−0.4287	0.1747	−0.2329	0.9709
2nd ROI max	1.1802	2.8490	2.5306	1.3805	2.3672
2nd ROI min	−1.0081	−2.8490	−2.5677	−1.3228	−2.3113
2nd ROI mean	0.0000	0.0013	0.0001	−0.0002	0.0000
2nd ROI STD	−0.0025	−0.6258	−0.0893	0.1171	0.0720
3rd ROI max	1.7238	0.9808	-	-	-
3rd ROI min	−1.8015	−1.0159	-	-	-
3rd ROI mean	0.0010	0.0005	-	-	-
3rd ROI STD	−0.5470	−0.2449	-	-	-
4th ROI max	1.7880	1.6496	-	-	-
4th ROI min	−1.6441	−1.5445	-	-	-
4th ROI mean	0.0003	0.0021	-	-	-
4th ROI STD	−0.1568	−1.0049	-	-	-
5th ROI max	-	1.6627	-	-	-
5th ROI min	-	−1.8507	-	-	-
5th ROI mean	-	−0.0001	-	-	-
5th ROI STD	-	0.0597	-	-	-
6th ROI max	-	2.9188	-	-	-
6th ROI min	-	−3.2137	-	-	-
6th ROI mean	-	0.0013	-	-	-
6th ROI STD	-	−0.6678	-	-	-

Table 14. Mean values for features on SoRs from autistic sample.

Feature	SoR 1	SoR 11	SoR 23	SoR 27	SoR 30
1st ROI max	1.7303	2.7176	4.1538	2.8470	5.9894
1st ROI min	−1.7637	−2.7257	−4.4038	−2.7160	−5.8292
1st ROI mean	0.0002	0.0013	−0.0002	0.0001	−0.0008
1st ROI STD	−0.1171	−0.5762	0.1041	−0.0193	0.3691
2nd ROI max	1.9003	4.5360	4.1990	2.3653	4.1346
2nd ROI min	−1.8646	−4.4891	−4.0994	−2.3611	−3.9732
2nd ROI mean	−0.0002	0.0021	0.0003	0.0003	−0.0011
2nd ROI STD	0.0790	−0.9468	−0.1331	−0.1156	0.4724
3rd ROI max	2.8208	1.6436	-	-	-
3rd ROI min	−2.8687	−1.6166	-	-	-
3rd ROI mean	0.0010	0.0003	-	-	-
3rd ROI STD	−0.4340	−0.1223	-	-	-
4th ROI max	2.8618	2.8466	-	-	-
4th ROI min	−2.8870	−2.6779	-	-	-
4th ROI mean	0.0006	0.0001	-	-	-
4th ROI STD	−0.2536	−0.0148	-	-	-
5th ROI max	-	2.6724	-	-	-
5th ROI min	-	−2.6525	-	-	-
5th ROI mean	-	0.0003	-	-	-
5th ROI STD	-	−0.1597	-	-	-
6th ROI max	-	4.6039	-	-	-
6th ROI min	-	−4.9155	-	-	-
6th ROI mean	-	0.0003	-	-	-
6th ROI STD	-	−0.1874	-	-	-

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
