Support Vector Machine-Based Schizophrenia Classification Using Morphological Information from Amygdaloid and Hippocampal Subregions

Structural changes in the hippocampus and amygdala have been demonstrated in schizophrenia patients. However, whether morphological information from these subcortical regions could be used by machine learning algorithms for schizophrenia classification was unknown. The aim of this study was to use the volumes of the amygdaloid and hippocampal subregions for schizophrenia classification. The dataset consisted of 57 patients with schizophrenia and 69 healthy controls. The volumes of 26 hippocampal and 20 amygdaloid subregions were extracted from T1 structural MRI images. A sequential backward elimination (SBE) algorithm was used for feature selection, and a linear support vector machine (SVM) classifier was configured to explore the feasibility of using hippocampal and amygdaloid subregions in the classification of schizophrenia. The proposed SBE-SVM model achieved a classification accuracy of 81.75% on 57 patients and 69 healthy controls, with a sensitivity of 84.21% and a specificity of 81.16%. The AUC was 0.8241 (p < 0.001, tested with 1000 permutations). The results demonstrated evidence of hippocampal and amygdaloid structural changes in schizophrenia patients, and also suggested that morphological features from the amygdaloid and hippocampal subregions could be used by machine learning algorithms for the classification of schizophrenia.


Introduction
Schizophrenia is a complicated mental disorder characterized by auditory hallucinations, paranoid or bizarre delusions, and disorganized speech and thinking [1]. Siblings and children of schizophrenia patients have a higher genetic risk of the disorder and often show attentional lapses and memory impairments [2]. Although it is not completely clear what causes schizophrenia, it is believed that schizophrenia is related to genetic factors, environmental factors, and brain alterations [3].
Neuroimaging studies have revealed structural and functional alterations in the brains of schizophrenia patients as compared with healthy controls [4]. Among neuroimaging techniques, magnetic resonance imaging (MRI) is a widely used noninvasive technique, which provides structural information on cell loss and metabolic changes [5]. Compared with fMRI, structural MRI (sMRI) is less sensitive to noise and can be acquired with higher spatial resolution [6]. Previous studies have shown that sMRI has played an important role in the analysis of neurological diseases [7]. In terms of schizophrenia, sMRI studies have shown cortical and subcortical alterations, which are related to language impairments both in schizophrenia patients and in individuals at high genetic risk of developing schizophrenia [8,9]. The volumes of limbic structures, including the hippocampus, amygdala, and parahippocampal gyrus, could serve as indicators of schizophrenia, as studies have shown subcortical structural changes in schizophrenia and high-risk individuals, and have attributed the structural changes to the impending development of schizophrenia [8,9]. Both the amygdala and hippocampus are important nuclei in the limbic system [8,10]. Okada et al. have demonstrated that the volume of the bilateral amygdala and hippocampus of schizophrenia patients was smaller than that of healthy controls [11]. Zheng et al. have shown reduced volume in the left hippocampal tail and Cornu Ammonis 1 (CA1) in schizophrenia patients [10]. Current diagnosis of schizophrenia is based on observed behaviors and psychiatric symptoms [12]. However, in many cases, psychiatrists are divided on the diagnosis of schizophrenia due to the lack of quantitative measures such as biomarkers [13]. Therefore, it is of clinical importance to find biomarkers to diagnose schizophrenia.
Machine learning based on MRI measures, or the so-called multivariate pattern analysis technique, provides a promising way to find biomarkers for schizophrenia classification [12][13][14]. Xiao et al. created a support vector machine (SVM) model which could classify schizophrenia patients and normal subjects based on whole brain gray matter densities from sMRI [15]. Cao et al. achieved classification of schizophrenia patients with combined analysis of single nucleotide polymorphisms and fMRI data based on sparse representation [16]. Recent studies have reviewed MRI-based machine learning methods for classification of schizophrenia and have found that the performance of machine learning methods for schizophrenia classification varied from 0.54 to 0.95 in terms of area under the curve (AUC) [17,18]. The machine learning algorithms used for classification included SVM, neural network, k-nearest neighbors (KNN), random forest (RF), etc. [17,18].
The hippocampus and amygdala are important subcortical nuclei which have strong associations with the severity and progression of schizophrenia [9,11], but they have not been used by machine learning algorithms for schizophrenia classification. Therefore, the purpose of this study was to explore whether morphological information from the hippocampus and amygdala could be used for schizophrenia classification based on machine learning techniques.

Material and Methods
Figure 1 demonstrates the schematic diagram of our proposed classification framework, consisting of image preprocessing, feature selection, and classification. We give a detailed description of each step in this section.

Participants and sMRI Acquisition
sMRI data were acquired from the Center for Biomedical Research Excellence database (http://fcon_1000.project.nitrc.org/indi/retro/cobre.html). The dataset received full approval of local ethics committees in accordance with the Declaration of Helsinki. All subjects gave informed consent and their anonymity was preserved in the dataset. A total of 147 samples were obtained, including 72 patients with schizophrenia and 75 control subjects. The diagnosis of schizophrenia patients was based on the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV SCID). The subjects were all right-handed. Specifically, 15 patients with schizophrenia and 6 healthy controls were excluded during data preprocessing, leaving a total of 126 participants, including 57 patients with schizophrenia and 69 healthy controls. Among the 57 schizophrenia patients, there were 3 disorganized type (295.1), 37 paranoid type (295.3), 9 residual type (295.6), 4 schizoaffective type (295.7), and 4 unspecified type (295.9).

Imaging Processing
In this study, the reconstruction of the cortical surface was carried out on T1-weighted images using FreeSurfer 6.0 [19]. Image preprocessing included the following steps: motion correction, brain extraction, Talairach transformation, intensity correction, and brain tissue segmentation [10]. After preprocessing, quality control was performed by a certified neuroradiologist; 15 patients with schizophrenia and 6 healthy controls were excluded because of errors in skull stripping or segmentation.
The hippocampal and amygdaloid subregions were segmented using the FreeSurfer development version [20] as demonstrated in Figure 2 [21,22]. We selected 46 structural features including 26 hippocampal features and 20 amygdaloid features. The 26 hippocampal features were: mean volume of the hippocampal tail, subiculum, CA1, CA3, CA4, hippocampal fissure, presubiculum, parasubiculum, molecular layer, granule cell layer of the dentate gyrus (GC-DG), fimbria, hippocampal-amygdala-transition-area (HATA), and whole hippocampus in the bilateral hemispheres. The 20 amygdaloid features were: mean volume of the lateral nucleus, basal nucleus, accessory basal nucleus, anterior amygdaloid area (AAA), central nucleus, medial nucleus, cortical nucleus, corticoamygdaloid-transition area (CAT), paralaminar nucleus, and whole amygdala in the bilateral hemispheres. Then, the sMRI features were normalized to a range between 0 and 1 by linear scaling between the minimal and maximal values of each feature. A binary label with 1 for schizophrenia patients and −1 for healthy controls was used.
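The linear scaling described above can be sketched as follows. This is a minimal illustration rather than the study's MATLAB code: the function names are hypothetical, and the handling of a constant feature (mapped to 0) is an assumption, since the text does not say how that case was treated.

```python
def min_max_scale(column):
    """Linearly rescale one feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature has no spread, so map it to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def scale_features(rows):
    """Scale every column of a subjects-by-features table independently."""
    columns = [min_max_scale(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# Toy example: 3 subjects, 2 volume features (values are made up).
scaled = scale_features([[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]])
```

Each feature is scaled with its own minimum and maximum, so subregions of very different absolute volume end up on a comparable scale.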

Feature Selection
In machine learning, some features are irrelevant for classification, so excluding them can not only reduce computational complexity but also improve classification accuracy [14,15]. The sequential backward elimination (SBE) algorithm has been applied by several state-of-the-art studies and has achieved better results than other feature selection algorithms [23,24]. Therefore, in this study, SBE was adopted for feature selection, and the classification error rate of a linear SVM was configured as the criterion function that SBE used to eliminate features and to determine when to terminate.
Starting from the full feature set, SBE created candidate feature subsets in each backward step by removing, one at a time, each feature that had not yet been eliminated. Then, for each candidate feature subset, SBE performed leave-one-out cross-validation by repeatedly calling the criterion function with different training and test sets. SBE used the training set to train the linear SVM, predicted values for the test set using that model, and output the classification error rate. If eliminating a certain feature reduced the classification error rate, the feature was eliminated; otherwise, it was retained. The SBE algorithm repeated backward steps and stopped when there was no further decrease in the classification error rate.
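The backward steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the greedy rule (remove the single feature whose removal gives the lowest error, stop when no removal lowers it) is one common reading of SBE, and `nearest_centroid` is a simple stand-in classifier used only to keep the example self-contained; the study used a linear SVM.

```python
def loo_error_rate(rows, labels, features, fit_predict):
    """Leave-one-out error rate using only the given feature indices."""
    errors = 0
    for i in range(len(rows)):
        train_x = [[r[f] for f in features] for j, r in enumerate(rows) if j != i]
        train_y = [y for j, y in enumerate(labels) if j != i]
        test_x = [rows[i][f] for f in features]
        if fit_predict(train_x, train_y, test_x) != labels[i]:
            errors += 1
    return errors / len(rows)


def sequential_backward_elimination(n_features, criterion):
    """Greedy SBE: drop one feature per step while the error strictly decreases."""
    selected = list(range(n_features))
    best_error = criterion(selected)
    while len(selected) > 1:
        # Score every candidate subset obtained by removing one remaining feature.
        candidates = [(criterion([f for f in selected if f != drop]), drop)
                      for drop in selected]
        error, drop = min(candidates)
        if error >= best_error:
            break  # no removal lowers the error rate, so stop
        selected.remove(drop)
        best_error = error
    return selected, best_error


def nearest_centroid(train_x, train_y, test_x):
    """Hypothetical stand-in classifier (NOT the linear SVM used in the study)."""
    def centroid(label):
        pts = [x for x, y in zip(train_x, train_y) if y == label]
        return [sum(col) / len(pts) for col in zip(*pts)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min((1, -1), key=lambda lab: dist2(test_x, centroid(lab)))


# Usage with real data (rows: subjects x features, labels: +1 / -1):
# criterion = lambda feats: loo_error_rate(rows, labels, feats, nearest_centroid)
# selected, err = sequential_backward_elimination(len(rows[0]), criterion)
```

Passing the criterion as a function keeps the search independent of the classifier, which mirrors the description of SBE repeatedly calling a criterion function.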
We also adopted several competing feature selection approaches for the comparison with the proposed SBE algorithm.

Sequential Selection and Its Variants
Three other sequential selection algorithms were adopted, i.e., sequential forward selection (SFS), sequential forward floating selection (SFFS), and sequential backward floating selection (SBFS).
Unlike SBE, SFS started from an empty set and created candidate feature subsets by sequentially adding each feature that had not yet been selected. The criterion function of SFS was the same as that of SBE.
SFFS also started with an empty set and created a candidate feature subset by sequentially selecting each feature in each forward step. Different from SFS algorithm, SFFS performed backward steps as long as the criterion function increased. SBFS started with the full feature set and performed each backward step by sequentially eliminating features from the full set. SBFS performed forward steps as long as the criterion function increased. In the current study, the criterion functions of SFFS and SBFS were the same as SBE.

T-Test
Each feature was compared between schizophrenia patients and healthy controls with a two-sample t-test to determine whether its mean value differed significantly between the two groups. In this study, we used a threshold of |t| > 1.04 (p < 0.3) to exclude features and improve classification performance. The threshold value was determined by grid search.
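A t-statistic filter of this kind might look like the sketch below. Two points are assumptions, not statements from the text: the pooled-variance (Student) form of the two-sample test is used, and features with |t| above the threshold are treated as the ones retained. The function names are hypothetical.

```python
from math import sqrt


def two_sample_t(a, b):
    """Student's t statistic for two independent samples (pooled variance)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean_a - mean_b) / sqrt(pooled_var * (1 / na + 1 / nb))


def t_filter(patients, controls, threshold=1.04):
    """Return indices of features whose |t| exceeds the threshold.

    Assumption: features above the threshold are kept, the rest are excluded.
    """
    keep = []
    for i, (p_col, c_col) in enumerate(zip(zip(*patients), zip(*controls))):
        if abs(two_sample_t(p_col, c_col)) > threshold:
            keep.append(i)
    return keep
```

Called as `t_filter(patient_rows, control_rows)`, it returns the column indices to pass on to the classifier.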

F-Score
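The F-score is a simple and effective metric for evaluating how well a feature discriminates between the two classes of a binary classification problem [25]. Given the number of schizophrenia patients ($n_+$) and healthy controls ($n_-$), the F-score of the ith feature is defined as follows [25]:

$$
F(i) = \frac{\left(\bar{x}_i^{(+)} - \bar{x}_i\right)^2 + \left(\bar{x}_i^{(-)} - \bar{x}_i\right)^2}{\frac{1}{n_+ - 1}\sum_{k=1}^{n_+}\left(x_{k,i}^{(+)} - \bar{x}_i^{(+)}\right)^2 + \frac{1}{n_- - 1}\sum_{k=1}^{n_-}\left(x_{k,i}^{(-)} - \bar{x}_i^{(-)}\right)^2}
$$

where $\bar{x}_i$, $\bar{x}_i^{(+)}$, and $\bar{x}_i^{(-)}$ are the mean values of the ith feature over the whole, positive, and negative datasets, respectively, and $x_{k,i}^{(+)}$ and $x_{k,i}^{(-)}$ are the ith feature of the kth positive and negative instance, respectively. In our study, the threshold was set at F-score < 0.009 determined by grid search.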
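As a worked illustration, the sketch below computes a per-feature F-score, assuming the commonly used form of this metric: the squared distances of the two group means from the overall mean, divided by the sum of the two within-group sample variances. The function name is hypothetical.

```python
def f_score(pos, neg):
    """F-score of one feature from its values in the positive and negative groups."""
    n_pos, n_neg = len(pos), len(neg)
    mean_pos = sum(pos) / n_pos
    mean_neg = sum(neg) / n_neg
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)
    # Numerator: separation of each group mean from the overall mean.
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    # Denominator: sum of the within-group sample variances.
    var_pos = sum((v - mean_pos) ** 2 for v in pos) / (n_pos - 1)
    var_neg = sum((v - mean_neg) ** 2 for v in neg) / (n_neg - 1)
    return between / (var_pos + var_neg)
```

For example, `f_score([1, 2, 3], [3, 4, 5])` gives a between-group term of 2 and within-group variances of 1 each, so the score is 1.0; a feature whose group means coincide scores 0.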

Random Forest
The Gini Impurity Index from the RF model is often used to evaluate the importance of features in machine learning [26]. In the current study, we borrowed this metric for feature selection and kept the features whose Gini Impurity Index was not equal to 0.
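For readers unfamiliar with the metric, the sketch below shows how Gini impurity and the impurity decrease of a single split are computed. It assumes the usual definition (one minus the sum of squared class proportions) and is not the study's RF implementation; the function names are hypothetical. In a forest, a feature's importance is typically accumulated from such decreases over all splits that use it, so a feature never chosen for a split ends up with an importance of 0.

```python
def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def impurity_decrease(parent, left, right):
    """Weighted drop in Gini impurity when parent is split into left and right."""
    n = len(parent)
    weighted = ((len(left) / n) * gini_impurity(left)
                + (len(right) / n) * gini_impurity(right))
    return gini_impurity(parent) - weighted
```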
There were only 46 sMRI features in this study, so in order to evaluate the effect of feature selection, we also used the full feature set to train the linear SVM classifier.

Linear SVM
SVM is a supervised learning method that is widely used in the machine learning field [15]. In this study, the LIBSVM toolbox on the MATLAB platform was used to implement the SVM classifier [27,28]. Linear SVM has been widely applied in multivariate pattern analysis due to its high accuracy, generalization, and interpretability [15,17,18]. Therefore, a linear kernel was selected in this study. The hyperparameter C of the SVM classifier, which controls the balance between classification error and model generalization, was set to 112 after a coarse grid search for the best performance. A weight of 1.3 was applied to the schizophrenia group, setting the parameter C for the patient class to 1.3×C to maintain a balance between the two classes. To obtain a reliable performance estimate and to avoid overfitting, a leave-one-out cross-validation strategy was used. In each cross-validation fold, 125 subjects were used for training and the remaining subject was used to test the model. The procedure was repeated 126 times. To represent the relative contribution of different features to schizophrenia classification, we accumulated the absolute value of each feature's weight across all cross-validation folds.
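Assuming the class weight acts as a per-class penalty in the standard soft-margin formulation, the optimization problem and the cumulative absolute weight can be written as below. The slack variables $\xi_i$, the bias $b$, the fold index $f$, and the symbol $W_j$ are notation introduced here for illustration; they do not appear in the original text.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2}
  + C_{+}\sum_{i:\,y_i=+1}\xi_i
  + C_{-}\sum_{i:\,y_i=-1}\xi_i \\
\text{s.t.}\quad & y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0, \\
& C_{-} = C = 112,\qquad C_{+} = 1.3\,C = 145.6, \\[4pt]
W_j &= \sum_{f=1}^{126}\bigl\lvert w_j^{(f)}\bigr\rvert
\end{aligned}
```

Here $w_j^{(f)}$ is the weight of feature $j$ in the model trained in leave-one-out fold $f$, so $W_j$ is large for features that carry a consistently large weight across folds.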

Competing Algorithms
In this study, three competing algorithms were used to classify schizophrenia patients, namely, KNN, RF, and feedforward neural network (FNN).
A KNN model was configured with Euclidean distance as the distance measure. The value of k was set to 5. In the RF model, 100 trees were used. An FNN with three layers was configured with 46 neurons in the input layer, 15 neurons in the hidden layer, and 1 neuron in the output layer. The tansig function was defined as the activation function for the hidden and output layers. Mean-square error was defined as the cost function, and the Levenberg-Marquardt algorithm was used as the weight update algorithm for the FNN. All three classifiers were implemented using built-in functions of MATLAB.
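As an illustration of the simplest of these competing classifiers, a KNN prediction with Euclidean distance and k = 5 can be sketched as follows. This is not the MATLAB built-in used in the study; the function name is hypothetical, and tie-breaking details of the original implementation are not reproduced.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training points closest to query (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

With two classes and an odd k, the vote can never tie, which is one practical reason for choosing k = 5.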

Evaluating Metrics
Accuracy, sensitivity, and specificity were used to evaluate the performance of the classifiers. The three metrics were computed in each cross-validation fold and were finally averaged to get the mean values. In addition, receiver operating characteristic (ROC) analysis was also used to evaluate the performance of the classifiers. Area under the curve (AUC) calculated from the ROC curve was used as an indicator of classification performance. Permutation test was applied to explore whether the AUC obtained through the proposed model was significantly higher than AUC of a random guess by randomly permuting the labels of the training data 1000 times prior to the training step and then followed by the entire classification process. The AUCs were obtained across all permutations and the p value was calculated as the proportion of AUCs that were equal to or greater than the AUC obtained by the proposed methods. Statistical significance was set at p < 0.05.
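The metrics and the permutation p value can be sketched as follows. The function names are hypothetical, the AUC is computed through its rank interpretation (the probability that a patient receives a higher score than a control) rather than by tracing the ROC curve, and the retraining on permuted labels is left out; `permuted_aucs` stands for the AUCs collected from those runs.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity with patients coded +1 and controls -1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == -1 and p == -1 for t, p in zip(y_true, y_pred))
    n_pos = sum(t == 1 for t in y_true)
    n_neg = sum(t == -1 for t in y_true)
    return tp / n_pos, tn / n_neg


def auc(y_true, scores):
    """AUC as the probability that a patient outscores a control (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == -1]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(observed_auc, permuted_aucs):
    """Proportion of permutation AUCs equal to or greater than the observed AUC."""
    return sum(a >= observed_auc for a in permuted_aucs) / len(permuted_aucs)
```

Under this definition, a result reported as p < 0.001 with 1000 permutations corresponds to none of the permuted AUCs reaching the observed value.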

Post Hoc Analysis
According to the cumulative absolute weight, the eight features with the highest cumulative absolute weights in the linear SVM classifier were selected. A general linear model was applied to explore whether the features were discriminative between schizophrenia patients and healthy controls, with age and gender as nuisance covariates. False discovery rate (FDR) correction was used to control false positives, and p < 0.05 was considered statistically significant. Pearson correlation analysis was performed to explore the relationship between demographic information and the eight features in the schizophrenia group. Results with p < 0.05 after FDR correction were considered statistically significant. Furthermore, an independent t-test was used to explore whether the features were discriminative between paranoid type and other types of schizophrenia.

Results
Table 1 illustrates the demographic information for the 57 schizophrenia patients and 69 healthy controls. The differences in age and gender between the two groups were assessed by independent t-test and chi-square test, respectively. There were no significant differences in age and gender between schizophrenia patients and healthy controls.
Table 2 shows the results of different feature selection approaches. The SBE approach combined with a linear SVM classifier was superior to the other feature selection approaches in terms of accuracy, sensitivity, specificity, AUC, and p value obtained from the permutation test. Other feature selection approaches could also improve classification accuracy by 2-20% compared with the full feature set, which indicated that the feature selection step in our study was useful for improving classification performance. However, it is also worth mentioning that several approaches, namely, F-score, t-test, Gini Index, and SBFS, decreased the specificity compared with the full feature set.
In addition, the number of features selected by the SBE algorithm was larger than that of the other approaches except for SBFS. The ROC curves of different feature selection approaches associated with linear SVM classifiers are shown in Figure 3a. SBE-SVM outperformed the full feature set and the other competing feature selection approaches associated with SVM classifiers. Table 3 and Figure 3b show classification results of different classifiers. The linear SVM classifier outperformed competing classification algorithms in terms of accuracy, sensitivity, specificity, and AUC.

Post Hoc Analysis Results
Figure 4 displays the results of post hoc analysis. Post hoc results showed statistically significant differences in the volume of the left hippocampal tail (p = 0.004, FDR corrected), left CA1 (p = 0.04, FDR corrected), left basal nucleus (p = 0.011, FDR corrected), and left AAA (p = 0.004, FDR corrected) between schizophrenia patients and healthy controls. However, the volume of the other 4 hippocampal and amygdaloid subregions did not show significant differences between the two groups. In terms of correlation analysis, mean volume of the left AAA, right accessory basal nucleus, and right cortical nucleus showed negative correlations with age in the schizophrenia group. Figure 5 shows the results of the independent t-test. As shown in the figure, none of the 8 features were significantly discriminative between paranoid type and other types of schizophrenia.
Figure 4. Results of post hoc analysis: (a) histogram of features with top cumulative absolute weight between the two groups. **, p < 0.05 (false discovery rate (FDR) corrected). (b) correlation analysis between mean volume of the left anterior amygdaloid area (AAA) and age; (c) correlation analysis between mean volume of the right accessory basal nucleus and age; and (d) correlation analysis between mean volume of the right cortical nucleus and age. The red area represents the 95% confidence interval.

Discussion
Diagnosis of schizophrenia is clinically dependent on psychiatric examinations since biomarkers that could accurately classify schizophrenia remain unknown [15,29]. Machine learning algorithms associated with neuroimaging features provide a promising way for schizophrenia diagnosis [18]. To date, machine learning algorithms including SVM, RF, KNN, FNN, and deep learning algorithms associated with fMRI and sMRI features have been used in schizophrenia diagnosis [17,18]. The performance of machine learning algorithms varied from 70% to 90% in terms of accuracy and from 0.54 to 0.95 in terms of AUC [17]. The high performance demonstrated that machine learning algorithms were useful in recognizing and detecting schizophrenia patients at an early stage.
In addition to schizophrenia classification, machine learning algorithms were useful for identifying biomarkers in schizophrenia. Indeed, previous studies have reported several brain regions with potential diagnostic values including the occipital and frontal gyrus [15,16,29]. Raymond et al. have used gray matter, white matter features, and cortical thickness to distinguish schizophrenia patients from healthy subjects, and they achieved a maximum accuracy of 77% [30]. Castellani et al. have chosen the dorsolateral prefrontal cortex as a feature for machine learning, and have achieved a maximum accuracy of up to 84.09% [31]. In the current study, we focused on only two key subcortical nuclei, the amygdala and hippocampus, and explored whether the features extracted from these two subcortical nuclei could be used by machine learning algorithms to classify schizophrenia. The SVM classifier based on morphological features from the amygdala and hippocampus had a relatively high accuracy in schizophrenia diagnosis compared with previous studies [17,18]. In addition, the SVM identified several amygdaloid and hippocampal subregions with potential value for schizophrenia classification. The results demonstrated that the proposed approach could be used in assisting clinical schizophrenia diagnosis and also revealed that the hippocampus and amygdala were closely involved in the pathology of schizophrenia, which was consistent with previous studies [8][9][10][11][32].
Among the subregions with top contributions to the classification, group analysis showed that there were differences in the volume of the left hippocampal tail, left CA1, left basal nucleus, and left AAA between the two groups, which indicated that these features were discriminative between schizophrenia patients and healthy subjects. Zheng et al. have demonstrated volumetric decline of several subregions in the hippocampus and amygdala in schizophrenia and have speculated on a key role of these regions in the pathophysiology of schizophrenia [10]. In line with the previous study [10], our results might not only indicate the importance of these features in the pathophysiology of schizophrenia but also support the effectiveness of feature selection and cumulative absolute weight in schizophrenia classification.
There was evidence that the activation of postsynaptic 5-HT1A receptors in the hippocampal tail was related to stress adaptation and could prevent learned helplessness [33]. The decreased volume in the hippocampal tail indicated that schizophrenia patients might experience decreased learning and memory ability [10]. A study by Kesner et al. showed that CA1 could code the time sequence of events and that CA1 was associated with intermediate-term memory [34]. In addition, Schobel et al. found that the CA1 subregion of the hippocampus was differentially targeted by schizophrenia and related psychotic disorders [35]. Several studies attributed differential changes in the CA1 to basal hypermetabolic activity, and demonstrated that dysfunction in the CA1 was a selective defect that predicted progression of schizophrenia [35,36]. Therefore, the contribution of CA1 in classification of schizophrenia might indicate that the left CA1 is closely related to the pathophysiology of schizophrenia.
In our study, we found that the lateral nucleus and CAT in the left amygdala could be used to classify schizophrenia. A previous study speculated that the left amygdala was more closely related to emotion processing in schizophrenia [10]. The lateral nucleus is the largest nucleus, and it is the first structure to appear in the anterior portion of the amygdala [10]. The lateral nucleus could be divided into dorsal, ventral intermediate, and ventral subdivisions [37]. Previous findings showed that the lateral nucleus was an important sensory interface that sends projections to the CAT (a zone of confluence of the medial parvicellular basal nucleus, paralaminar nucleus, and sulcal periamygdaloid cortex) [38]. Several studies have demonstrated decreased volume in the lateral nucleus of the amygdala in schizophrenia patients compared with normal controls and have concluded that reduced volume of the lateral nucleus could play an important role in the pathological process of schizophrenia [10,39]. The CAT is located in the medial border of the amygdala with connections to many regions such as the hippocampus and temporal cortex [37,40]. The CAT is also a zone of confluence in the amygdalohippocampal area [37,38]. Brown et al. demonstrated volumetric association of the bilateral CAT with neurological disorders such as major depressive disorder [41]. The contributions of the left lateral nucleus and left CAT in the classification of schizophrenia suggested that the two subregions might play more important roles in the pathophysiology of schizophrenia.
The present study showed that the accessory basal nucleus and cortical nucleus in the right amygdala could be used to classify schizophrenia. A neuroimaging study has found that volume of all subregions in the right amygdala was decreased in schizophrenia when compared with psychotic bipolar disorder [42]. The basal nucleus and accessory basal nucleus are the main input sources to the ventral striatum [43]. The ventral striatum and amygdala are two mesolimbic structures associated with psychosis in schizophrenia [44]. Therefore, the results may reflect a potential role of the accessory basal nucleus in psychosis in schizophrenia. The cortical nucleus is a small circular nucleus, which borders the accessory basal nucleus [10]. In addition, in a study with elderly schizophrenia patients and control group, Prestia et al. found that outer soft tissue of the cortical nucleus was lost in schizophrenia patients [45].
Correlation analysis demonstrated negative correlations between age and the volume of the left AAA, right accessory basal nucleus, and right cortical nucleus in the schizophrenia group. Although there was no significant difference in age between schizophrenia patients and healthy subjects in the present study, previous longitudinal studies have demonstrated that schizophrenia patients have significantly smaller volume in several subcortical nuclei compared with controls at baseline, and that the volumetric changes of these subcortical nuclei with age in schizophrenia patients follow trajectories similar to those of healthy controls [46,47]. According to previous studies [46,47], the contributions of these three subregions in schizophrenia classification may be more related to the pathophysiology of schizophrenia than to ageing. Nevertheless, the negative correlations could be explained by ageing-related neurodegeneration.
None of the eight subregions showed volumetric differences among different subtypes of schizophrenia, suggesting that the hippocampal and amygdaloid subregions may not differ neurobiologically among subtypes of schizophrenia. In a recent study, Lutz et al. found no large neurobiological differences between paranoid and nonparanoid schizophrenia [48]. In line with the previous study, our results indicated that subtypes of schizophrenia characterized by clinical phenomenology might have difficulty in resolving neurobiological heterogeneity in schizophrenia due to overlapping symptomatology and longitudinal instability [48], while machine learning may stand a chance of investigating neurobiological heterogeneity in schizophrenia [49].
There were several limitations that need to be addressed. First, this study lacked clinical information such as the positive and negative syndrome scale (PANSS), which limited clinical interpretation of the machine learning results. Future studies will focus on a larger study sample with complete clinical information. Second, we only separated the dataset into a training set and a test set due to the relatively small study sample, so the generalization of the proposed approach should be further validated using a larger study sample. Third, the proposed approach was only used to separate schizophrenia patients from healthy controls; other disorders with structural changes in the amygdala and hippocampus were not included in the current study.

Conclusions
In this study, we used the volume of the hippocampal and amygdaloid subregions extracted from sMRI data as features and used the SBE-SVM for the classification of schizophrenia. The classifier outperformed competing algorithms with a classification accuracy of 81.75%, and identified several hippocampal and amygdaloid subregions that had potential diagnostic value for schizophrenia. The proposed algorithm showed potential for assisting clinical diagnosis of schizophrenia. The neurobiological mechanisms of the hippocampal and amygdaloid subregions in schizophrenia need further investigation.