De Novo Variants Found in Three Distinct Schizophrenia Populations Hit a Common Core Gene Network Related to Microtubule and Actin Cytoskeleton Gene Ontology Classes

Schizophrenia (SZ) is a heterogeneous and debilitating psychiatric disorder with a strong genetic component. To elucidate functional networks perturbed in schizophrenia, we analysed a large dataset combining whole-genome studies that identified SNVs and CNVs with a multi-stage schizophrenia genome-wide association study. Our analysis identified three interrelated subclusters with small overlaps: GO:0007017~Microtubule-Based Process, GO:0015629~Actin Cytoskeleton, and GO:0007268~Synaptic Transmission. We next analysed three distinct trio cohorts of 75 Algerian, 45 French, and 61 Japanese SZ patients. We performed Illumina HiSeq whole-exome sequencing and identified de novo mutations using a Bayesian approach. We validated 88 de novo mutations by Sanger sequencing: 35 in French, 21 in Algerian, and 32 in Japanese SZ patients. These 88 de novo mutations exhibited an enrichment in genes encoding proteins related to GO:0051015~actin filament binding (p = 0.0011) using DAVID, and enrichments in GO:0003774~transport (p = 0.019) and GO:0003729~mRNA binding (p = 0.010) using AmiGO. One of these de novo variants was found in the CORO1C coding sequence. We studied Coro1c haploinsufficiency in a Coro1c+/− mouse and found defects in the corpus callosum. These results could motivate future studies of the mechanisms surrounding genes encoding proteins involved in transport and the cytoskeleton, with the goal of developing therapeutic intervention strategies for a subset of SZ cases.


Introduction
Schizophrenia (SZ) is a devastating mental illness characterized by positive symptoms (such as hallucinations and delusions), negative symptoms (such as social withdrawal, avolition, anhedonia, and self-neglect), and cognitive deficits (including impairments in executive function and attention) [1]. This heterogeneous and complex psychiatric disorder, characterized by severe impairments in cognition, emotion, and social functioning [2], affects 1% of the global population. Although the aetiology of schizophrenia remains elusive, it is known to involve, in addition to environmental factors, a strong genetic component. However, the genetic explanation of the majority of schizophrenia cases remains unresolved. Current theories emphasize the contribution of large numbers of common genetic variants of small effect, combined with rare variants of larger effect. Intensive research has revealed a number of candidate genes that may be involved in SZ; however, the degree of genetic association tends to be inconsistent.
Numerous genetic association studies have made great steps towards the identification of either common or rare variants associated with SZ [3][4][5][6]. Mutations that confer substantial risk for SZ have been identified at several loci, most of which have also been implicated in other neurodevelopmental disorders, including autism spectrum disorder [7]. Studies involving next-generation sequencing technology have provided preliminary evidence that de novo single-nucleotide mutations might also increase the risk of SZ. However, these studies are very small in scale [8]. Exome sequencing is a powerful tool for identifying mutations in specific genes, predicting the consequences of such coding region mutations on gene function, and predicting the effect of rare or de novo mutations on SZ risk [9]. Exome sequencing performed in patients with SZ indicated that a variety of distinct genes influence the disease. For instance, 15 de novo mutations were identified in 14 patients [10], and 40 de novo mutations were identified in 27 patients, affecting 40 genes [11]. Furthermore, exome sequencing and whole-exome sequencing (WES) have been performed for 134 and 32 patients, respectively, identifying 5155 variants [12]. These results indicate that SZ risk is unlikely to be predominantly influenced by variants just outside the range detectable by genome-wide association studies (GWASs). Rather, multiple rarer genetic variants must contribute substantially to the predisposition to SZ, suggesting that both very large sample sizes and gene-based association tests will be required for securely identifying genetic risk factors. However, with the exception of very rare variants, each variant makes a relatively small contribution to disease risk, and the biological effect is understood for only a few of them. Thus, neuropathophysiology and biological mechanisms remain largely unknown. Our study supports the idea that the genetic architecture of neuropsychiatric disorders includes a constellation of rare mutations in many different genes. This "common disease-rare alleles" hypothesis [13] is also supported by findings in human genomics [14].
Here, we applied next-generation deep sequencing to SZ family trios with the goal of identifying de novo mutations associated with SZ and novel rare disease-predisposing variants. Using bioinformatics analysis methods, we first predicted the most damaging de novo mutations, and second, we identified the gene network involved. Finally, we compared our biological pathway and gene network linked to SZ with those obtained by performing a meta-analysis on published GWASs, copy number variation (CNV), and single nucleotide variant (SNV) mutations.

BioInformatics Analysis
To identify functional gene networks involved in schizophrenia, we applied the NETwork-Based Analysis of Genomic variations (NETBAG+) computational method described in [15,16]. This approach enables the processing of diverse types of genetic variations. We included GWAS loci (as CNV events), genes carrying de novo mutations from a recent study, and recurrent CNVs highlighted in a meta-analysis. We selected the best network according to the minimal adjusted p-value and maximal z-score. To characterize the NETBAG+ networks, we then performed functional analysis of the gene nodes. We used the functional annotation chart of the DAVID Bioinformatics resource with a default Human background. We used the BioGRID database [17] (with all interactions of the protein UBC removed) and Cytoscape software 3.2.1 to represent our networks. We kept genes with GO annotations "microtubule-based process", "actin cytoskeleton", and "synaptic transmission" or "synapse" and their direct interactors as part of the network. We performed FATIGO analysis with the Babelomics 4 web server. We used a dataset of genes of three diseases (SZ, ID, and ASD) reported in a recent study [18]. We used all RefSeq genes mapped to the Human genome (hg19) as a background. We built the Venn diagram according to these analyses with the R package "VennDiagram".

Clinical Samples and DNA Preparation
Clinical samples consisted of three family-based populations of 91 Algerian, 54 French, and 74 Japanese patients with SZ, with their biological parents. In total, 219 patients were evaluated by trained psychiatrists using the Diagnostic Interview for Genetic Studies [19], a semi-structured interview that assesses the lifetime Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria of schizophrenia and other psychiatric diseases. In addition, the Scales for the Assessment of Positive and Negative Symptoms (SAPS and SANS) were used to characterize the predominant clinical symptoms of the patients [20,21]. Descriptions of the Algerian and French samples have been reported elsewhere [22,23], as well as their demographic and clinical characteristics [24]. This study was approved by the National Ethics Committees of Algeria, France, and Japan, and all individuals, probands and parents, gave written informed consent for their participation. Genomic DNA was extracted from peripheral blood samples from all probands and parents.

De Novo Mutation Identification
De novo mutation identification was performed using a Bayesian approach. We first filtered out false positives using an in-house Bayesian method applied in three successive steps: (i) definition of two classes using DNA chip and exome sequencing data; (ii) arrangement of the two classes according to their mean coverage per site; and (iii) separation of the heterozygotes and homozygotes. Once these three filtering steps had been applied, we built the empirical distributions of the GATK parameters and combined them.

Sanger Sequencing
Validation of de novo mutations was performed by Sanger sequencing. Sequence alignment and SNP detection were performed using Genalys software (GenalysWin2.8.3b) [25].

Polymorphism Phenotyping
We used PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/, accessed on 12 February 2023), which predicts the possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations.

NETwork-Based Analysis of Genomic Variations
The NETBAG phenotype network finds relationships between genes by using a naïve Bayesian network. NETBAG scores the predicted likelihood that two human genes share the same phenotype. In doing so, NETBAG can uncover disease risk genes among a list of mutations observed in probands. The algorithm searches for cohesive clusters of genes perturbed by disease-associated genetic variations. Among a list of provided genes, NETBAG will search for a strongly interconnected subset of genes. Starting with each input gene as a seed node, a greedy search algorithm will choose the most strongly connected gene to add to the candidate set. Candidate networks are assigned a score based on a weighted sum of their edges, representing the likelihood that the respective genes participate in the same genetic phenotype. Network significance is then determined by comparing this score to a distribution of scores obtained by applying the same search algorithm to sets of random genes (http://innovation.columbia.edu/technologies/cu14362_network-based-analysis-of-genomic-variations, accessed on 12 February 2023).
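The greedy search and the comparison with random gene sets described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the NETBAG+ implementation: the function names (`greedy_cluster`, `empirical_p_value`), the fixed cluster size, the dictionary of pairwise edge weights, and the scoring by a plain sum of pairwise weights are all hypothetical choices made for the example.

```python
import itertools
import random


def pair_weight(weights, a, b):
    """Look up the symmetric edge weight between two genes (0 if absent)."""
    return weights.get((a, b), weights.get((b, a), 0.0))


def cluster_score(weights, genes):
    """Score a cluster as the sum of its pairwise edge weights."""
    return sum(pair_weight(weights, a, b)
               for a, b in itertools.combinations(sorted(genes), 2))


def greedy_cluster(weights, candidates, size):
    """Grow a cluster from every seed gene and keep the best-scoring one."""
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    best_genes, best_score = set(), float("-inf")
    for seed in unique:
        cluster = {seed}
        while len(cluster) < min(size, len(unique)):
            remaining = [g for g in unique if g not in cluster]
            # Add the gene most strongly connected to the current cluster.
            nxt = max(remaining,
                      key=lambda g: sum(pair_weight(weights, g, m) for m in cluster))
            cluster.add(nxt)
        score = cluster_score(weights, cluster)
        if score > best_score:
            best_genes, best_score = cluster, score
    return best_genes, best_score


def empirical_p_value(weights, all_genes, input_genes, size, n_random=200, seed=0):
    """Compare the observed score with scores obtained from random gene sets."""
    rng = random.Random(seed)
    _, observed = greedy_cluster(weights, input_genes, size)
    hits = 0
    for _ in range(n_random):
        random_set = rng.sample(all_genes, len(input_genes))
        _, null_score = greedy_cluster(weights, random_set, size)
        if null_score >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)
```

In this sketch the null distribution is built by rerunning the same greedy search on random gene sets of the same size as the input list, which mirrors the significance step described in the text; the number of random sets is an arbitrary illustrative value.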

Coro1c +/− Mouse Brain Analysis
Mouse brains were removed following anaesthesia and decapitation. In all steps of these neuroanatomical studies, the experimenters were blinded to the animals' genotypes. Standard operating procedures are described in more detail elsewhere [26]. Mouse brain samples were immersion-fixed in 10% buffered formalin for 48 h, before paraffin embedding and sectioning at 5 µm thickness using a sliding microtome (Leica RM 2145; Leica Microsystems GmbH, Wetzlar, Germany).
Coronal sections were collected at Bregma +0.98 mm and −1.34 mm, according to the Allen Mouse Brain Atlas [27].
Brain sections were double-stained using luxol fast blue for myelin and cresyl violet for neurons, and scanned at cell-level resolution using the Nanozoomer whole-slide scanner (Hamamatsu Photonics, Shizuoka, Japan). Co-variates, such as sample processing dates and usernames, were collected at every step of the procedure and used to identify data drifts. Using in-house ImageJ plugins, an image analysis pipeline was used to standardize measurements of areas and lengths. Each image was quality-controlled for the accuracy of sectioning relative to the reference atlas and controlled for asymmetries and histological artefacts.
At Bregma +0.98 mm, the brain structures assessed were as follows: the total brain area; the lateral ventricles; the cingulate cortex; the genu of the corpus callosum; the caudate putamen; the anterior commissure; the piriform cortex; the primary motor cortex; and the secondary somatosensory cortex. At Bregma −1.34 mm, a maximum of 18 brain structures were assessed: the total brain area; the lateral and third ventricles; the retrosplenial granular cortex; the corpus callosum; the dorsal hippocampal commissure; the amygdala; the piriform cortex; the internal capsule; the optic tract; the mammillothalamic tract; the fimbria; the habenular nucleus; the hippocampus; the primary motor cortex; the secondary somatosensory cortex; the hypothalamus nucleus; the arcuate nucleus; and the 3rd ventricle ventral part.
All samples were also systematically assessed for cellular ectopia (misplaced neurons). Statistical comparison was performed using t-tests.
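As an illustration of the statistical comparison, the sketch below computes a two-sample t statistic for one neuroanatomical parameter. The text does not state which t-test variant was used; Welch's unequal-variance form is assumed here, the function name `welch_t` is hypothetical, and the measurements are invented placeholders rather than data from the study. Converting the statistic to a p-value requires the t distribution, which is left to a statistics package.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    se_a, se_b = var_a / n_a, var_b / n_b  # squared standard errors of the means
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical area measurements (arbitrary units), NOT data from the study.
wt = [1.00, 1.05, 0.98, 1.02, 0.95]
het = [1.20, 1.25, 1.18]
t_stat, dof = welch_t(het, wt)
```

With group sizes as small as those used for mouse neuroanatomy, the degrees of freedom are low, so a large t statistic is needed before a difference reaches significance.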

Over-Representation of the Microtubule-Based Process Gene Ontology in the Schizophrenia NETBAG+ Cluster
To elucidate functional networks perturbed in SZ, we applied NETBAG+ to a set of genes affected by de novo SNVs (n = 609) [18] and CNVs (n = 58) [28]. Furthermore, we used GWAS data from a multi-stage SZ GWAS involving up to 36,989 cases and 113,075 controls, and identified 346 genes from the 108 loci found associated with SZ [29] (Supplementary Table S1). All the mutations or loci associations used as input for our analyses were obtained from datasets generated by genome-wide methodologies, so as to avoid any bias linked to pre-existing hypotheses.
Types of variants (SNV, CNV, or GWAS) are reported in brackets after each gene. Interestingly, the Synaptic Transmission GO category is over-represented in the SZ GWAS repertoire (6 genes out of 24 in the list; p = 0.021), which is not the case for the MBP GO and Actin Cytoskeleton GO categories (1/25 and 1/23, respectively).
From this 1014-gene repertoire, we found that the largest network comprised 559 genes (p = 0.012) (Figure 1). We used DAVID to identify Gene Ontology (GO) terms that were significantly enriched among network gene annotations (Tables 1 and 2). This analysis identified three subclusters that are inter-related and have small overlaps: Microtubule-Based Process (MBP), Actin Cytoskeleton, and Synaptic Transmission. The microtubule subcluster (GO:0007017; MBP; Table 2) contained genes that are associated with trafficking, such as MARK4, which phosphorylates microtubule-associated proteins; KIF1A, KIF13B, and KIF20B, which are members of the kinesin family; DNAH1, DNAH3, and DNAH9, which are members of the dynein family; and DCTN3, a member of the DCTN protein family essential for dynein activity in vivo. Here, 7 KIF genes were part of the 559-gene network, indicating a statistically significant enrichment (p < 0.0001) for the KIF superfamily, which includes 45 genes (http://www.genenames.org/genefamilies/KIF, accessed on 12 February 2023). The actin subcluster (GO:0015629; Actin Cytoskeleton) contained actin-related proteins (ACTA2, CAPZA1, CTNNA2) and myosins (MYO15A, MYO18A, or MYO18B). The synapse subcluster (GO:0007268; Synaptic Transmission) contained synaptic adhesion molecules such as neurexin (NRXN1); components of the presynapse (RIMS1); receptors of glutamatergic synapses (GRM3, GRIN2A, and GRID2); AKAP9, which directly interacts with NMDA receptors; and voltage-gated calcium channels (CACNA1C and CACNB2). It also included the dopaminergic receptor (DRD2) that is the target of all effective antipsychotic drugs. Interestingly, the GO Synaptic Transmission category showed an over-contribution (6 out of 24; p = 0.021) of GWAS genes, namely, GRM3, GRIN2A, CACNA1C, CACNB2, PTPRF, and DRD2, whereas GWAS genes contributed only 1/25 and 1/23 of the GO MBP and GO Actin Cytoskeleton categories, respectively (Table 2). Therefore, to explore the respective contribution of common variants versus rare variants, we separately analysed the SZ GWAS set and the SZ SNVs + CNVs set. For SZ GWAS genes, we only obtained an over-representation of the presynaptic membrane GO category (GO:0042734; padj = 0.03) using Babelomics v4.2, and no significant GO category was obtained using DAVID.
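A gene-family enrichment such as the KIF result above is commonly assessed with a hypergeometric upper-tail probability. The sketch below is only an illustration of that calculation: the text does not state which test or which background gene set produced the reported p-value, so the function name `hypergeom_upper_tail` and, in particular, the background size of 20,000 genes are assumptions. The other three numbers (7 KIF genes, a 45-gene family, a 559-gene network) come from the text.

```python
from math import comb


def hypergeom_upper_tail(k, family_size, network_size, background_size):
    """P(X >= k) when network_size genes are drawn from the background
    and family_size of the background genes belong to the family."""
    total = comb(background_size, network_size)
    upper = min(family_size, network_size)
    favourable = sum(
        comb(family_size, i) * comb(background_size - family_size, network_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# 7 KIF genes among 559 network genes, 45 KIF genes in the family (from the text);
# the background of 20,000 genes is a hypothetical value chosen for illustration.
p_kif = hypergeom_upper_tail(7, 45, 559, 20_000)
```

The result depends strongly on the chosen background: a smaller background, such as the 1014-gene input repertoire, would give a less extreme probability than a genome-wide one.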
Table 1. Genes of the network associated with the three first GO categories, namely, Microtubule-Based Process (GO:0007017), Synaptic Transmission (GO:0007268), and Actin Cytoskeleton (GO:0015629).

We then performed NETBAG+ searches using only the set of 702 genes that included de novo SNVs (n = 609) and CNVs (n = 58). For this repertoire, the largest network comprised 572 genes (p = 0.0028) (Figures 1 and 2, Supplementary Figure S1). The analysis identified three similar subclusters: MBP, Actin Cytoskeleton, and Synapse Part. Altogether, these results provide evidence that the different sources of genetic variations reinforce each other, as previously reported in [16].

WES from Three Different Ethnical Cohorts and De Novo Variant Identification
We performed WES using Illumina technology in three distinct SZ cohorts: 91 Algerian, 54 French, and 74 Japanese SZ patients, with their two parents, leading to a total of 657 sequenced DNA samples. All the DNA samples were controlled for quality and quantity before being loaded on sequencers. After sequence alignment and calling, performed with the GATK suite, only 75 Algerian, 45 French, and 61 Japanese trio families could finally be analysed and used for de novo variant identification (Table 3). De novo mutation identification was performed using a Bayesian approach. We estimated the Bayes factor (BF) for each single sequence (Figure 3), which enabled us to apply a new filter step on the BF: log(BF) < 3 and read depth < 20.
From the 181 SZ trios, we identified 386 de novo mutations. We then validated 88 by Sanger sequencing: 35 in French, 21 in Algerian, and 32 in Japanese SZ patients (Table 3). These 88 de novo mutations were identified in 72 SZ patients (30 French, 17 Algerian, and 25 Japanese) localized on 22 different chromosomes, and only chromosomes carrying no de novo mutations.
We then looked into the distribution in percentages of the de novo mutations according to (i) the number of patients (Table 4) and (ii) the number of chromosomes (Table 5) carrying de novo mutation(s). No difference was observed for the first distribution: whatever the cohort, almost 40% of patients carried mutation(s), with only one mutation for the majority of them. A slight difference was shown for the second distribution. There was still the same trend for the total percentage of chromosomes carrying mutation(s) (61%, 70%, and 78% in the French, Japanese, and Algerian cohorts, respectively), with the majority of chromosomes carrying only one de novo mutation; however, there was a contrasting distribution for the percentage of chromosomes carrying several mutations. In fact, whereas the French cohort showed 17% with two mutations and only 4% with three or more, the two other cohorts, Algerian and Japanese, both displayed 4% with two mutations, and 22% and 26%, respectively, with three or more. From these results, we decided to combine all de novo mutations from the three cohorts and analysed them together.
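The per-patient and per-chromosome distributions summarized above amount to a simple tally of how many units carry one, two, or three or more validated mutations. The sketch below shows one way to compute such percentages; the function name `carrier_distribution`, the example labels, and the denominator of 23 chromosomes (autosomes 1 to 22 plus X) are assumptions for illustration and are not taken from Tables 4 and 5.

```python
from collections import Counter


def carrier_distribution(mutation_units, n_units):
    """Percentage of units (patients or chromosomes) carrying 1, 2, or 3+ mutations.

    mutation_units holds one label per validated mutation; n_units is the
    total number of units in the cohort, including those with no mutation.
    """
    per_unit = Counter(mutation_units)                      # mutations per unit
    bins = Counter(min(count, 3) for count in per_unit.values())
    return {
        "one": 100 * bins[1] / n_units,
        "two": 100 * bins[2] / n_units,
        "three_or_more": 100 * bins[3] / n_units,
        "any": 100 * len(per_unit) / n_units,
    }


# Hypothetical chromosome labels, one per validated mutation (not study data).
example = ["1", "1", "2", "7", "7", "7", "X"]
dist = carrier_distribution(example, n_units=23)
```

The same function applies to patients by passing one patient identifier per mutation and the number of analysed trios as `n_units`.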

SZ De Novo Variant Analysis: Damaging Impact Prediction and Gene Network Identification
We ran two complementary bioinformatics analyses to characterize our de novo SZ mutations. First, we used the polymorphism phenotyping PolyPhen-2 (PP2) software tool (http://genetics.bwh.harvard.edu/pph2/) to predict the possible impact of an amino acid substitution on the structure and function of a human protein. We were then able to rank the de novo variants according to the PP2 score, establishing three classes of mutations: (i) probably damaging, (ii) possibly damaging, and (iii) benign. Whether the three cohorts were considered as distinct populations or taken together (Table 6), the majority of the de novo mutations were ranked as probably damaging, except for the Japanese cohort (only 37% of the de novo variants were ranked probably damaging). The entire list of de novo variants with their PP2 score is given in Table 7. Second, we used GO tools (DAVID and AmiGO/Panther) (Supplementary Table S2) to identify enrichment in a given GO class. For the Algerian cohort, we found an enrichment in genes encoding proteins related to GO:0051015~actin filament binding (p = 0.0011) using DAVID. For the sum of the three cohorts, we detected an enrichment for the same class GO:0051015~actin filament binding (p = 0.0020) using DAVID. Panther analysis identified an enrichment in GO:0003774~transport (p = 0.019) and GO:0003729~mRNA binding (p = 0.010). We thus identified gene networks based on our de novo variants for classes similar to the biological pathways identified in the meta-analysis (Figures 1 and 2, Supplementary Figure S1). The de novo variants are located in conserved domain regions of the proteins, as illustrated by alignments of proteins from rodents to primates. We show this alignment for ten proteins: CORO1C, SYNE1, MYO1B, MYH11, TTN, RBM14, PUM1, SECIPBP2, DNAH6, and DNAH10 (Figure 4A-E,J-N). Note the localized expression of the CORO1C and MYO1B mouse orthologs in the hippocampus, the brain region involved in memory and cognition (Figure 4F-I). Furthermore, we validated these 10 de novo variants for each trio (mother, father, and patient) (Figure 5). We identified a possibly damaging CORO1C mutation in one patient of our cohorts (Figure 4A). The CORO1C gene is part of the GO:0051015~actin filament binding category and encodes a WD40-repeat (WDR) protein [30].
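The ranking of variants into the three PolyPhen-2 classes mentioned above can be sketched as a thresholding of the PP2 score. The cut-offs used here (0.957 and 0.453) are the values commonly cited for the HumDiv model and are an assumption, since the text does not state which model or thresholds were applied; the function names and the example variants are likewise hypothetical and do not come from Table 7.

```python
def pp2_class(score, probably_cutoff=0.957, possibly_cutoff=0.453):
    """Map a PolyPhen-2 score in [0, 1] to one of the three classes."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("PolyPhen-2 scores lie between 0 and 1")
    if score >= probably_cutoff:
        return "probably damaging"
    if score >= possibly_cutoff:
        return "possibly damaging"
    return "benign"


def rank_variants(scored_variants):
    """Sort (variant, score) pairs from most to least damaging and label them."""
    ranked = sorted(scored_variants, key=lambda item: item[1], reverse=True)
    return [(name, score, pp2_class(score)) for name, score in ranked]


# Hypothetical variant labels and scores, not values from Table 7.
example = [("variant_a", 0.12), ("variant_b", 0.998), ("variant_c", 0.61)]
ranked = rank_variants(example)
```

Keeping the cut-offs as parameters makes it easy to switch to the thresholds of another PolyPhen-2 model if that is what the analysis used.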
The neuroanatomical study was carried out using eight male mice: five WT mice and three HET mice. These mice were 16 weeks of age and bred on a B6N background. A full list of parameters studied in this manuscript is provided as Supplementary Table S3. A number of parameters were significantly decreased or increased compared with WT, and are listed in Supplementary Table S4: the corpus callosum (cc) in section Br −1.34, piriform cortex, right (Pir_R) in section Br +0.98, total brain area (2_TBA) in section Br −1.34, lateral ventricles, right (2_LV_R) in section Br −1.34, dorsal hippocampal commissure (2_dhc) in section Br −1.34, and the arcuate nucleus (2_Arc) in section Br −1.34. Significant parameters were mapped onto the two neuroanatomical sections performed in this study (Figure 6A), and the full spectra of parameters studied in this report are expressed as relative increases or decreases compared with controls in bar graphs (Figure 6A, below). Significant neuroanatomical phenotypes at Br. +0.98 mm (right piriform cortex and right secondary somatosensory cortex) and −1.34 mm (total brain area and measurements pertaining to ventricles and the corpus callosum) are shown on representative images of wild-type and mutant animals, together with box plots (Figure 6B,C). The most significant neuroanatomical finding pertained to the corpus callosum, which was increased in size by about 20% (p = 0.004) (Figure 6B,C). Statistically significant unilateral neuroanatomical phenotypes (oriens layer of the hippocampus, piriform cortex, and secondary somatosensory cortex) should be considered with due caution. The raw data are provided in Supplementary Table S5.
Life 2024, 14, 244
[...] indicates the significance threshold: white when not significant; grey when not computable. A list of neuroanatomical parameters and corresponding numbers is provided as the X-axis (a full description of the parameters is also provided in Supplementary Table S3).

Discussion
Using an analysis of large-scale data from whole-genome SNV, CNV, and GWAS studies of SZ patients, we identified a protein network enriched in products of genes encoding microtubule-associated proteins (MAPs), actin-interacting proteins, and synaptic proteins. These three groups of proteins interact in the organization of the neuronal cytoskeleton, regulate cellular development, and maintain axons, dendrites, and synapses [30][31][32][33][34][35]. This result fits into a larger theme of cytoskeletal dysregulation in SZ. Such abnormalities in cytoskeletal organization networks have become increasingly implicated in SZ pathogenesis through the evidence of large-scale genomic studies, proteomic analyses, and immunohistochemical assays [18,32,36].
Currently, large-scale genetic studies are dominated by European-descent samples. This restricted population can fail to capture the level of diversity that exists globally [37,38]. Due to differential genetic architectures, the universality of genetic findings between populations is generally limited. Therefore, this imbalance limits our understanding of the genetic architecture of complex diseases in non-European populations. Here, we analysed three distinct trio cohorts from France, Algeria, and Japan. From these three distinct populations, we observed a similar enrichment in genes that encode products related to GO:0003774~transport, GO:0051015~actin filament binding, and GO:0003729~mRNA binding. One can expect a functional interaction between these three groups of proteins. GO:0003729~mRNA binding proteins are expected to bind mRNAs that are localized either in dendritic spines or axonal boutons and involved in synaptic plasticity [39,40].
The limitations of our study need to be considered. Genomic studies of SZ have revealed a condition that is highly polygenic, with risk conferred by probably thousands of risk alleles, each of small effect [41], pointing to the importance of studying cohorts as large as possible. Our work was based on the analyses of GWAS data that identified 346 genes from the 108 genetic loci found associated with SZ [28] and our 181 SZ trios. To date, GWASs have identified 287 loci associated with SZ [42]. De novo mutations have been validated from 1695 SCZ-affected trios and 1077 published SCZ-affected trios [43]. Inclusion of these data in a meta-analysis is expected to refine the analysis of gene networks found in this study.
Neuroimaging reports have found corpus callosum deficits in patients with SZ. Examining neurocircuitry, diffusion-weighted imaging studies have identified an altered structural integrity of white matter in frontal and temporal brain regions and tracts, such as the corpus callosum, associated with the illness [44]. One meta-analysis identified differences in the fractional anisotropy of the corpus callosum between patients and controls [45]. Furthermore, multivariate associations among white matter, neurocognition, and social cognition were quantified across individuals with SZ spectrum disorders [46].
Studies of patient-derived neurons also indicate defects in microtubules. Deficits in microtubule organization and stability were shown using olfactory neuroepithelial cells from patients with SZ [47,48]. These neurons are expected to express a glutamatergic phenotype [49]. In contrast, no deficits were reported for patient-derived dopaminergic neurons [50]. These results suggest that SZ affects microtubule function, depending on the neurotransmitter phenotype of human neurons.
From these results and the data reported in our manuscript, one can propose that SZ induces defects in microtubule function, impacting long-range projecting glutamatergic neurons such as those implicated in the development of the corpus callosum. This hypothesis should be studied further in the future.
Altogether, our results emphasize the importance of gene networks involved in neuronal cytoskeleton transport and their deregulation linked to de novo mutations in SZ.

Figure 1. Enrichment of Gene Ontology GO:0007017; "Microtubule-Based Process" category in schizophrenia (SZ) and in autism spectrum disorder (ASD). (A) The network implicated by NETBAG+ based on SZ-associated 613 SNV, 63 CNV, and 360 GWAS from 108 loci identified a principal network (network comprises 559 genes, p = 0.012) displaying (i) genes belonging to the three highest-connected GO classes: Microtubule-Based Process (blue), Actin Cytoskeleton (pink), and Synaptic Transmission (orange); and (ii) direct interacting genes (white) from BioGRID direct protein-protein interactions (PPIs). Diamond symbols represent genes from GWASs; triangles represent genes from CNVs; rectangles are associated with genes from SNVs; octagons indicate genes from both SNVs and CNVs; and parallelograms pertain to genes from both GWAS and SNV mutation types. (B) Overlap of genes bearing de novo mutations in SZ, ASD, and intellectual disability (ID) disorders with genes of the GO:0007017; "Microtubule-Based Process" category. The overlap area in the Venn diagram shows the number of genes between/among different disorders and MBP. (C) Bar plot showing the enrichment of MBP in SZ and ASD (SZ: p = 0.00013 and ASD: p = 0.00049). *** p < 0.001.


Figure 3 .
Figure 3. Bayesian factor. Proportions of validated mutations according to score (BF). The curves were computed on sites where we had both sequencing information and microarray calls. In red, discordant sites (i.e., not the same call with the two technologies, "false positives"); in black, concordant sites ("true positives"). We fixed the parameter in order to have ~5% of false positives.
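
The calibration described in this caption can be illustrated with a minimal sketch. It assumes that each site with both a sequencing call and a microarray call is summarized as a (score, concordant) pair, and that the "~5% of false positives" target means the fraction of discordant sites among all sites retained at or above the cutoff. The function name `calibrate_bf_threshold`, the data layout, and the toy scores are hypothetical and are not taken from the study.

```python
from typing import List, Optional, Tuple


def calibrate_bf_threshold(
    sites: List[Tuple[float, bool]],
    max_false_positive_rate: float = 0.05,
) -> Optional[float]:
    """Return the lowest score cutoff whose retained sites stay within the
    allowed discordant ("false positive") fraction, or None if no cutoff does.

    sites: (score, concordant) pairs, where concordant is True when the
    sequencing call and the microarray call agree at that site.
    """
    # Walk from the highest score downwards, so that at each step the
    # running counts describe all sites with score >= the current cutoff.
    ordered = sorted(sites, key=lambda site: site[0], reverse=True)
    best_cutoff: Optional[float] = None
    kept = 0
    discordant = 0
    i = 0
    while i < len(ordered):
        cutoff = ordered[i][0]
        # Consume every site tied at this score before evaluating the rate.
        while i < len(ordered) and ordered[i][0] == cutoff:
            kept += 1
            if not ordered[i][1]:
                discordant += 1
            i += 1
        if discordant / kept <= max_false_positive_rate:
            best_cutoff = cutoff
    return best_cutoff


# Toy data (hypothetical): 19 concordant sites at score 10, one discordant
# site at score 9, and five discordant sites at score 5.
toy_sites = [(10.0, True)] * 19 + [(9.0, False)] + [(5.0, False)] * 5
chosen_cutoff = calibrate_bf_threshold(toy_sites)
```

With the toy data, keeping sites scoring 9 or more retains 20 sites of which one is discordant, which meets the 5% target, whereas lowering the cutoff to 5 would raise the discordant fraction to 6 of 25.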

Figure 4. Multiple protein sequence alignment from rodents to primates indicating the impact of de novo mutations on conserved protein domains. We show alignments for ten proteins (CORO1C, SYNE1, MYO1B, MYH11, TTN, RBM14, PUM1, SECIPBP2, DNAH6, and DNAH10) with the impact of the mutations in conserved protein domains (A-E,J-N). Alignments were performed using the Clustal suite. Note the localized expression of the CORO1C and MYO1B mouse orthologs in the hippocampus, a brain region involved in memory and cognition (data from the mouse Allen Brain Atlas) (F-I).

3.4. Functional Analysis of Mouse Coro1c Haploinsufficiency in the Brain: Coro1c tm1a Mice Show Mild Neuroanatomical Defects Pertaining to the Corpus Callosum

Figure 6. Coro1c tm1a mice show mild neuroanatomical defects pertaining to the corpus callosum. (A) Coronal planes at Br. +0.98 and −1.34 mm with numbered measurements. Values shown on the histograms are expressed as percentages of the corresponding wild-type (WT) brain structures. Positive and negative values correspond to increased and decreased measurements relative to WT, respectively. The colour indicates the significance threshold: white when not significant; grey when not computable. A list of neuroanatomical parameters and corresponding numbers is provided along the X-axis (a full description of the parameters is also provided in Supplementary Table S3). (B) Nissl-stained coronal brain sections from a Coro1c tm1a mouse (right) against WT (left). Corresponding scale bars are shown on each panel. (C) Box plots focusing on corpus callosum measurements at Br. +0.98 and −1.34 mm. Statistics indicated are t-tests.

Table 2. Gene Ontology classification after DAVID analysis using the NETBAG+ network on SNV and CNV classes.
Life 2024, 14, x FOR PEER REVIEW 7 of 19

Table 3. Distribution of the de novo mutations per cohort.

Table 4. Distribution of the de novo mutations per SZ patient.

Table 5. Distribution of the de novo mutations per chromosome.

Table 6. Distribution of the de novo mutations per PolyPhen-2 (PP2) ranking.

Table 7. List of de novo mutations validated by Sanger sequencing.