Node Centrality Measures Identify Relevant Structural MRI Features of Subjects with Autism

Autism spectrum disorders (ASDs) are a heterogeneous group of neurodevelopmental conditions characterized by impairments in social interaction and communication and restricted patterns of behavior, interests, and activities. Although the etiopathogenesis of idiopathic ASD has not been fully elucidated, compelling evidence suggests an interaction between genetic liability and environmental factors in producing early alterations of structural and functional brain development that are detectable by magnetic resonance imaging (MRI) at the group level. This work shows the results of a network-based approach to characterize not only variations in the values of the extracted features but also in their mutual relationships that might reflect underlying brain structural differences between autistic subjects and healthy controls. We applied a network-based analysis on sMRI data from the Autism Brain Imaging Data Exchange I (ABIDE-I) database, containing 419 features extracted with FreeSurfer software. Two networks were generated: one from subjects with autistic disorder (AUT) (DSM-IV-TR), and one from typically developing controls (TD), adopting a subsampling strategy to overcome class imbalance (235 AUT, 418 TD). We compared the distribution of several node centrality measures and observed significant inter-class differences in averaged centralities. Moreover, a single-node analysis allowed us to identify the most relevant features that distinguished the groups.


S1. Stratification with respect to age and multi-site data collection
In this supplementary section, we show that the sMRI data of the subset of ABIDE-I subjects analyzed in this work (i.e., right-handed TD and AUT males, as explained in Section 2.1) are well mixed with respect to acquisition site and age, which we therefore did not treat as confounding factors. Acquisition site and age were considered because they are the only variables reported for all subjects. Other information, such as Body Mass Index (BMI), is reported for only a minority of the subjects, which makes a detailed analysis impossible; we therefore did not consider it.
The statistical tests on age distributions included in the main paper (Section 2.1, Section 3.1, and Table 1) were performed to show that selecting right-handed TD and AUT males introduces no age-related bias compared to the overall ABIDE-I dataset. This additional analysis checks that the restriction on subjects does not induce any artifact (such as clusters) in the sMRI brain features with respect to acquisition center and age.
The following figures are visualizations of the selected subjects' brain features produced using t-SNE [1,2] and UMAP [3], two established methods for embedding high-dimensional data into a low-dimensional space to highlight possible stratifications or clusters. Data points (i.e., subjects) are colored by acquisition center or by age. As can be seen, no clusters or stratifications form. We therefore conclude that the brain data of the subset of ABIDE-I subjects considered in this study are well mixed with respect to both acquisition site and age.

S2. Single feature statistical analysis
In this supplementary section, we provide more insight into the distributions of the single brain features of the ABIDE-I subjects considered in this study. All our work is based on second-order statistics (i.e., comparison of relationships between pairs of features), but here we show the results of two statistical analyses performed at the single-feature level.
To compare the control and AUT groups, we performed a parametric test (Student's t-test) and a nonparametric test (Wilcoxon rank-sum test). We show here the results for the most significant features (P < 0.05 after Bonferroni correction for multiple comparisons): 6 features differed significantly in the parametric test and 4 in the nonparametric test, and all 4 of the latter were also significant in the parametric test.
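As an illustration of the nonparametric part of this analysis, the following is a minimal sketch of a two-sided rank-sum test followed by a Bonferroni correction. It is not the code used in this study: the function names are hypothetical, the p-value uses the large-sample normal approximation without a tie correction, and the toy data are invented for the example.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def ranksum_pvalue(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided rank-sum p-value (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                        # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2            # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    return math.erfc(abs(z) / math.sqrt(2))    # equals 2 * (1 - Phi(|z|))


def bonferroni_significant(pvalues: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Flag tests whose Bonferroni-adjusted p-value (p * m, capped at 1) is below alpha."""
    m = len(pvalues)
    return [min(1.0, p * m) < alpha for p in pvalues]


# Toy example with invented values for one feature in two groups.
group_a = [float(v) for v in range(1, 11)]
group_b = [float(v) for v in range(11, 21)]
p_separated = ranksum_pvalue(group_a, group_b)
```

If all 419 extracted features were tested, the Bonferroni correction would amount to comparing each raw p-value against 0.05 / 419, roughly 1.2e-4.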

S3. Results with resampling for both TD and AUT
To test the robustness of our analysis, we repeated the whole-network comparison stage of our pipeline, applying the resampling procedure to both the TD and the AUT subject groups. This introduces variance in the AUT measures too, thus putting effect size and statistical significance to the test.
Specifically:
- for the TD case, we generate Ksuppl = 1000 networks, using Ksuppl random subsamples of Msuppl = 188 TD subjects;
- for the AUT case, we do the same (1000 networks, 188 subjects for each resampling).
Apart from the resampling procedure, all the other steps of the analysis (including visualization) are kept identical to the main experiment, as explained in Section 3. The results are displayed in the figures and in the table below. First, the two cluster maps (left: TD; right: AUT) show that the centrality measures form the same 4 clusters as with the resampling procedure described in the main paper. Second, the distributions of the whole-network average of every centrality show the same behavior as in the main experiment: the distributions are clearly separated, and the differences have the same sign.
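The subject-level resampling described in this section can be sketched as follows. This is an illustrative sketch only: the function name, the subject identifiers, the seeds, and the choice of sampling without replacement within each subsample are assumptions, and the network construction applied to each subsample (described in the main paper) is not reproduced here.

```python
import random
from typing import List, Sequence


def draw_subsamples(
    subject_ids: Sequence[str],
    n_networks: int,
    subsample_size: int,
    seed: int = 0,
) -> List[List[str]]:
    """Draw n_networks random subsamples of subsample_size subjects each.

    Subjects are drawn without replacement within a subsample (an assumption);
    different subsamples are drawn independently and may overlap.
    """
    rng = random.Random(seed)
    pool = list(subject_ids)
    return [rng.sample(pool, subsample_size) for _ in range(n_networks)]


# Hypothetical identifiers; group sizes taken from the abstract (418 TD, 235 AUT).
td_ids = [f"TD_{i:03d}" for i in range(418)]
aut_ids = [f"AUT_{i:03d}" for i in range(235)]

K_SUPPL = 1000  # number of networks per group
M_SUPPL = 188   # subjects per subsample

td_subsamples = draw_subsamples(td_ids, K_SUPPL, M_SUPPL, seed=1)
aut_subsamples = draw_subsamples(aut_ids, K_SUPPL, M_SUPPL, seed=2)
# Each subsample would then be passed to the network-construction step.
```

Using separate seeds for the two groups keeps the sketch reproducible while leaving the TD and AUT draws independent of each other.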
The reported table is analogous to Table 2 of the main paper and reports the z-score (i.e., the difference of the means, normalized by the pooled variance) for the twelve centrality measures considered. The top centrality measure within each cluster of measures is the same as in the main experiment: Betweenness, Clustering Coefficient, and Weighted Spectral Centrality are the top-separating measures in Clusters I, II, and IV, respectively, whereas the Inverse Participation Ratio constitutes a cluster on its own, as in the main experiment.
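For concreteness, one possible reading of this z-score is sketched below. The exact normalization is an assumption: the sketch divides the difference of the means by the square root of the pooled sample variance, and the sign convention (first group minus second group) is also assumed. The function name and the toy values are hypothetical.

```python
import math
import statistics
from typing import Sequence


def pooled_z_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Difference of the means divided by the pooled standard deviation.

    Assumption: "normalized by the pooled variance" is read as division by
    the square root of the pooled sample variance of the two groups.
    """
    n_a, n_b = len(a), len(b)
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled_var)


# Toy values standing in for whole-network averages of one centrality measure.
z_toy = pooled_z_score([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

In this setting, each input sequence would hold the whole-network average of one centrality measure across the 1000 resampled networks of a group.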
Thus, these supplementary results confirm the robustness of our network-based approach, independently of the subsampling strategy adopted.