Integrated microRNA–mRNA Expression Profiling Identifies Novel Targets and Networks Associated with Autism

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder, with mutations in hundreds of genes contributing to its risk. Herein, we studied lymphoblastoid cell lines (LCLs) from children diagnosed with autistic disorder (n = 10) and controls (n = 7) using RNA and miRNA sequencing profiles. The sequencing analysis identified 1700 genes and 102 miRNAs differentially expressed between the ASD and control LCLs (p ≤ 0.05). The top upregulated genes were GABRA4, AUTS2, and IL27, and the top upregulated miRNAs were hsa-miR-6813-3p, hsa-miR-221-5p, and hsa-miR-21-5p. The RT-qPCR analysis confirmed the sequencing results for randomly selected candidates: AUTS2, FMR1, PTEN, hsa-miR-15a-5p, hsa-miR-92a-3p, and hsa-miR-125b-5p. The functional enrichment analysis showed pathways involved in ASD control proliferation of neuronal cells, cell death of immune cells, epilepsy or neurodevelopmental disorders, WNT and PTEN signaling, apoptosis, and cancer. The integration of mRNA and miRNA sequencing profiles by miRWalk2.0 identified correlated changes in miRNAs and their targets’ expression. The integration analysis found significantly dysregulated miRNA–gene pairs in ASD. Overall, these findings suggest that mRNA and miRNA expression profiles in ASD are greatly altered in LCLs and reveal numerous miRNA–gene interactions that regulate critical pathways involved in the proliferation of neuronal cells, cell death of immune cells, and neuronal development.


Introduction
Autism spectrum disorder (ASD) is an enigmatic neurodevelopment disorder that affects approximately 2% of children [1]. The etiology is not known in most cases, with studies suggesting that it involves multiple organs and physiological systems outside of the brain [2] and the deregulation of diverse subcellular molecular pathways [3]. For example, abnormalities in immune regulation; metabolism, including folate and energy metabolism; and redox regulations and methylation have been documented in multiple studies, but the origin of these abnormalities, how they are connected, and their effects on ASD symptoms remain unclear [4]. Molecular pathways including mTOR, PTEN, AKT, and β-catenin have also been found to be deregulated in ASD [3].
Lymphoblastoid cell lines (LCLs) provide a valuable tool to study underlying molecular regulation in ASD [5]. Biobanked LCLs have been utilized to study structural genetic changes, such as copy number variations and gene mutations; changes in cellular regulation, including mRNA [6] and microRNA (miRNA) expression [7]; metabolic alteration, including alterations in mitochondrial function [8], redox [9,10], and methylation metabolism [11]; and changes in immune regulation [12].
MicroRNAs (miRNAs) are small non-coding RNAs with tissue-specific expression that regulate post-transcriptional gene expression [13]. Given the diverse pathways that have been found to be dysregulated in ASD, studying miRNA expression may be very promising, as miRNAs can regulate many seemingly unrelated molecular pathways and may result in diverse changes in gene expression depending on the tissue. Some initial studies have found alternations in miRNAs in ASD using LCLs [7,14,15]. Despite promising results, studies remain with inconsistent findings. Since miRNAs regulate mRNAs, one approach is to integrate analysis of changes in both mRNA and miRNA expression to explore regulatory networks.
This study presents the mRNA and miRNA sequencing profile of control and ASD LCLs and an integrative analysis of miRNA-mRNA expression using miRWalk 2.0. The findings from the bioinformatic analysis show that miRNAs potentially regulate mRNAs involved in the proliferation of neuronal cells, the cell death of immune cells, epilepsy or neurodevelopmental disorders, and the WNT/β-catenin/PTEN signaling pathways.

Cell Lines and Tissue Culture Conditions
Samples from children with autistic disorder were obtained from the Autism Genetics Resource Exchange (AGRE), a publicly available biomaterials repository located in Los Angeles, CA (AGRE; Los Angeles, CA, USA). AGRE collects samples from non-idiopathic cases of autism spectrum disorders. The procedures for screening children for collection are provided on the program website (https://www.autismspeaks.org/agre-program, accessed on 17 May 2022). Age and gender-matched controls with no documented behavioral or neurological disorder or a first-degree relative with a medical disorder were obtained from Coriell Cell Repository (Camden, NJ, USA).
LCLs were maintained in RPMI 1640 culture medium as previously described [7,8,10] in a humidified incubator at 37 • C with 5% CO 2 . These LCLs were used in a previous study examining bioenergetics [10] and miRNA in ASD [7]. In this manuscript, LCLs from subjects with autistic disorder (ICD-9: 299.0; ICD-10: F84.0) are designated as ASD and those from controls are designated as CNT (Table 1). Total RNA was extracted using Trizol reagent (ThermoFisher, Carlsbad, CA, USA) following the manufacturer s procedure. The total RNA quantity and purity were analyzed by the Bioanalyzer 2100 and RNA 6000 Nano LabChip Kit (Agilent, Santa Clara, CA, USA) with an RIN number of >7.0. Approximately 10 ug of total RNA was subjected to isolated Poly (A) mRNA with poly-T oligo attached magnetic beads (ThermoFisher, Carlsbad, CA, USA). Following purification, the poly(A)-or poly(A)+ RNA fractions were fragmented into small pieces using divalent cations under an elevated temperature. The cleaved RNA fragments were reverse transcribed to create the final cDNA library in accordance with the protocol for the mRNA-seq sample preparation kit (Illumina, San Diego, CA, USA) and the average insert size for the paired-end libraries was 300 bp (±50 bp). Finally, paired-end sequencing was performed on an Illumina Hiseq 4000 following the vendor s recommended protocol.

LCLs Small RNA Sequencing (miRNA-Seq)
Total RNA was extracted using Trizol reagent (ThermoFisher, Carlsbad, CA, USA) following the manufacturer s procedure. The total RNA quantity and purity were assessed with Bioanalyzer 2100 (Agilent, Santa Clara, CA, USA) with an RIN number of >7.0. Small RNA enrichment was performed by the excision of the 15 to 50 nt fraction from a polyacrylamide gel. Approximately 1 ug of enriched RNA was used to prepare the small RNA library, according to the protocol of the TruSeq Small RNA Sample Prep Kits (Illumina, San Diego, CA, USA). Finally, single-end sequencing (36 bp or 50 bp) was performed on an Illumina Hiseq 2500 at LC Sciences (Houston, TX, USA) following the vendor s recommended protocol.

Bioinformatics Analysis for RNA-Seq
The raw reads of both sequencing profiles were aligned using Bowtie [16] against the hg19 version of the human genome, and RSEM v1.2.12 software [17] was used to estimate raw read counts using Ensemble v84 gene information. DESeq2 [18] was utilized to identify differentially expressed genes between sample groups. Samples with poor alignment rates were identified as outliers in quality control analysis and therefore removed before carrying out the differential expression analysis. A false-positive rate of α = 0.05 with false-discovery rate (FDR) correction was used as the level of significance. Only a handful of genes were found to satisfy the FDR < 5% cut-off, which was not sufficient for functional enrichment analysis. Therefore, it was decided to consider a p < 0.05 threshold to select differentially expressed genes. These genes were then subjected to functional enrichment analysis.

Functional Annotation of Differentially Expressed Genes
Pathway and functional enrichment analyses were tested on genes that passed a significant p-value of <0.05 using QIAGEN's Ingenuity ® Pathway Analysis (IPA ® , QIAGEN Redwood City, CA, USA) software. The most relevant (p < 0.05) pathways and functions associated with autism were selected to generate dot plots using a customized R script.
2.6. miRNA Target Predictions using miRWalk2.0 The miRWalk2.0 database was used to gather putative miRNA binding sites within the DEGs [19,20]. miRWalk3.0 is a new publicly available version of miRWalk2.0. However, miRWalk3.0 is not very useful for carrying out miRNA-target predictions, as it fails to offer users the flexibility to choose the prediction algorithm of their choice, and at the same time, it entirely misses all the key features that are of utmost importance to the scientific community (e.g., a meta-analysis of targets by 13 different prediction datasets). On the other hand, the miRWalk2.0 database offers a meta-analysis of putative miRNA binding sites by collecting 13 prediction datasets from existing miRNA-target resources, which can help reduce the number of false-positive targets [21,22].

Integrated Analysis of RNA-and miRNA-Sequencing Data
After the initial analysis of removing bad quality reads, the DESeq2 package [18] was utilized to normalize the raw counts and fit models to identify differentially expressed genes (DEGs) and miRNAs (DEMs)between ASD and control samples. The level of significance was set to p < 0.05. The meta-analysis platform of the miRWalk2.0 database [19,21] was employed for this integrated analysis to collect putative interactions between significant genes and miRNAs. The interactions among significant genes and miRNAs predicted with at least 2 algorithms were compiled into a list (only DEGs) and were uploaded to IPA for enrichment analysis and identification of associated networks ( Figure 1).

Quantitative Reverse-Transcriptase-Polymerase Chain Reaction (qRT-PCR) Validation
LCLs were used to isolate total RNA and microRNA as previously described [7]. Briefly, RNA isolated with miRNeasy Mini Kit (Qiagen, Valencia, CA, USA) and miRNA isolated with RNeasy MinElute Cleanup Kit (Qiagen, Valencia, CA, USA) were used to perform cDNA synthesis with the miScript II RT kit (Qiagen, Valencia, CA, USA). The qRT-PCR was run in triplicate on a QuantStudio™ 6 Flex Real-Time PCR System (ThermoFisher Scientific, Carlsbad, CA, USA).
Negative controls and no template controls (NTC) were run with each assay. Relative quantitation for miRNA and mRNA was calculated using the 2 -∆∆Ct method.

Statistical Analysis
All experimental data for qRT-PCR are presented as means ± SEM, and differences between the two groups were examined using Student s t-test (2-tailed). p-Values of less than 0.05 were considered significant.

Identification of Differentially Expressed Genes and miRNAs
The top candidate genes and miRNAs are shown in Figures 2 and 3. The samples with poor alignment rates were dropped from the subsequent analysis. These results demonstrated that genes and miRNAs were differentially expressed between ASD and control samples. A false-positive rate of α = 0.05 with false discovery rate (FDR) correction was used as the level of significance.

Differential Gene Expression in ASD LCLs
The RNA-seq analysis showed 1700 significantly (p < 0.05) differentially expressed mRNAs (DEGs) between the ASD and control groups ( Figure 2A). The list of deregulated genes shows that 629 were downregulated and 1071 were upregulated (Table S3).
The top 45 (25 most upregulated and 20 most downregulated) differentially regulated genes (FDR < 5%) in ASD LCLs compared with control LCLs are depicted in Figure 2B. ASD LCLs showed highly upregulated expression for GABRA4, with a fold change of 607 (p ≤ 0.05), followed by IL27, with a fold change of 27.7 (p ≤ 0.05), compared with control LCLs. By contrast, the downregulated genes (p < 0.05) were from immunoglobulin light chain and heavy chain members in ASD LCLs, e.g., IGKV1-6, with a fold change of -4127.6, followed by IGLV1-40, with a fold change of -3750 (p ≤ 0.05).

qRT-PCR Validation of Differentially Expressed miRNAs and Genes
To verify the findings of both sequencing results, we randomly selected three differentially expressed genes and three differentially expressed miRNAs for qRT-PCR validation.

Gene Expression Validation by qRT-PCR
To validate the RNA-seq dataset, the expression levels of the three randomly selected genes AUTS2, FMR1, and PTEN were determined by qRT-PCR analysis. The expression levels of AUTS2 ( Figure 4A) were upregulated 3.7-fold in ASD LCLs compared with the control LCLs (p ≤ 0.05). The other two mRNAs, FMR1 ( Figure 4B) and PTEN ( Figure 4C), were significantly downregulated in ASD LCLs compared with the control LCLs (p ≤ 0.05).

Pathway Analysis of Differentially Expressed Genes
A total of 1700 DEGs and 102 DEMs with p < 0.05 were selected to test for pathway enrichment analysis using IPA software. The significantly expressed genes can participate in a network of diverse signaling pathways, such as apoptosis, cell death of immune cells, cell cycle progression, epilepsy, nNOS, and proliferation of neuronal cells ( Figure 6). Z-scores are represented by red or blue colors for the signaling pathways. The activated pathways (shown in red) were related to cell death of immune cells, inflammatory response, the proliferation of neuronal cells, central nervous system solid tumor, epilepsy, the endocannabinoid neuronal synapse pathway, and synaptic long-term potentiation. Inhibited pathways (shown in blue) controlled apoptosis, inflammation of the organs, movement disorders, motor dysfunction, PI3K/AKT, ATM, and PTEN (Supplementary Tables S5 and S6).

Functional Enrichment Analysis of Differentially Expressed Genes
Subsequently, the 1700 DEGs and 102 DEMs (p < 0.05) were further selected to test for functional enrichment analysis using IPA software. Interestingly, these genes are also associated with diverse functions related to cancer, apoptosis, cellular homeostasis, movement disorders, and proliferation of neuronal cells (Figure 7). The genes with activated functions (shown in red) regulate cancer, cell survival, cell death of immune cells, neuronal cell death, the endocannabinoid neuronal synapse pathway, and synaptic long-term potentiation. By contrast, genes with inhibited functions (shown in blue) were involved in apoptosis, movement disorders, motor dysfunction, seizure, colon cancer, ATM signaling, PI3K/AKT signaling, and neuronal cell death. For details on various functions associated with ASD and control LCLs, see Supplementary Tables S5 and S6.

Integrated Analysis of miRNA and mRNA Expression
The mRNA-miRNA interaction analysis using miRWalk2.0 showed 267 genes in significant pathways that were predicted to be targeted by deregulated miRNAs (Supplementary  Table S7). These pathways control cell death of immune cells, the proliferation of neuronal cells, epilepsy or neurodevelopmental disorders, and Wnt/β-catenin and PTEN signaling (Table S7). The activated pathways are shown in red, whereas the inhibited pathways are shown in blue.
The ASD LCLs showed upregulation of GABRA4, IL27, and PTEN; the downregulated genes were FOXP1, NTN1, and NCAM2 ( Figure 8 and Table S7). For example, GABRA4 was upregulated 606-fold in ASD LCLs and regulated pathways of epilepsy and neurodevelopmental disorders (Figure 8, Table S7). GABRA4 was predicted to be targeted by 16 upregulated miRNAs and 10 downregulated miRNAs (Figure 8, Table S7). By contrast, only miR-3529-3p (Table S7) was validated to be the target of GABRA4. Table S7 also shows that miR-3529-3p was upregulated twofold (red) in ASD LCLs, and this has been validated by two publications. PTEN regulates pathways of cell death of immune cells and proliferation of neuronal cells. PTEN was targeted by miR-21-5p, which was upregulated fourfold in ASD LCLs, and this has been validated by 62 publications (Table S7). miR-3529-3p also targets NCAM2, which was upregulated 6.06-fold (Figure 8). The sequencing results showed that miR-3529-3p was upregulated twofold in ASD LCLs, and only one publication validates this result (Table S7).

Dysregulated Expression at Transcriptomic and Epigenetic Levels in ASD
Previous transcriptomic [6,14,15,[23][24][25][26][27][28] and epigenetic [14,15,23] studies on LCLs in ASD help us understand the genes and their functional pathways involved in the risk of developing ASD. For example, Talebizadeh et al. 2014 [24] looked at alternate splicing in ASD using LCLs and identified CYFIP1, ZMYM6, and TRAP150 as potential ASD candidate genes. The RNA-seq dataset from the current study could not detect TRAP150 and showed no significant change for CYFIP1 and ZMYM6. Hu and colleagues [25] showed that the NFKB1 and MBD2 to be a strong candidate for genes for ASD, and the RNA-seq data on ASD LCLs did not detect expression of MBD2 or NFKB1. Mutations in the fragile X mental retardation 1 gene (FMR1) are associated with the inherited form of ASD, and Nishimura and colleagues [26] found downregulation of FMR1 in ASD vs. control samples, and our study confirmed this with qRT-PCR (p < 0.05) (Table S3 and Figure 5). However, for the JAKMIP1 and GPR155 genes, which were differentially expressed in their study [26], our RNA-Seq ASD LCLs data show no significant change for JAKMIP1 and GPR15 (Table S3).
ASD LCLs showed significant upregulation of GABRA4 (Figure 2). An association involving GABRA4 has been reported in ASD patients [27,28], and this is the major inhibitory neurotransmitter in the mammalian brain. GABRA4 is expressed in the thalamus, striatum, cerebral cortex, dentate gyrus (DG), and CA1 region of the hippocampus [29]. Another well-known gene in autism is AUTS2, and this study reported upregulation of AUTS2 expression in ASD LCLs compared with control LCLs (Figure 2, Figure 4A). Most studies have reported intragenic de novo deletions of AUTS2 [30][31][32] and have rarely described disease point mutations [33].
Many studies using LCLs as model cell type to understand autism, including this one, have many limitations, particularly the modest sample sizes in the context of the heterogeneous nature of ASD. Indeed, ASD is defined by a collection of various symptoms, which can be different from patient to patient. Although some categories are used as modifiers of the diagnosis, such as with and without language impairment or intellectual disability, the variation in symptoms has been difficult to easily describe or categorize. Thus, without well-defined subgroups, it may be difficult to find consistency with modest sample sizes. Clearly, larger studies on well-defined groups of individuals with ASD will be needed in the future. However, despite the heterogeneous nature of ASD, this study was consistent with some previous studies.

Immune Abnormalities and ASD
Immunoglobulin polymorphisms are known for infection and auto-immune disease susceptibility. Dysfunction of the immune system appears to be associated with ASD. A recent meta-analysis demonstrated evidence for immunological dysregulation in ASD with a reduction in total IgG and an elevation in the IgG4 subset [34]. In addition, studies have demonstrated that total IgM and IgG concentrations correlate with aberrant behavior with lower immunoglobulin concentrations being associated with worse behavior [35]. Additionally, a meta-analysis also demonstrated that intravenous immunoglobulin improved aberrant behavior, specifically irritability, hyperactivity, and social withdrawal [34]. Thus, this clinical data is consistent with the findings of decreased expression of genes involved in the production of IgG.
IL-27 is a pleiotropic cytokine involved in infection, cellular stress, neurological disease, and cancer that has complex activating and inhibitory properties in both innate and acquired immunity [36]. It is secreted from and binds to microglia, macrophages, and neurons and promotes neuronal survival by regulating cytokines, neuroinflammation, oxidative stress, apoptosis, autophagy, and epigenetics [36].
IL-27 gene expression was found to be markedly elevated in ASD LCLs. In the BTBR mouse model of ASD, studies have demonstrated that IL-27 agonist normalized neuroimmune dysfunction [37] yet other studies show that methylmercury [38] and tyrosine kinase inhibitor tyrphostin AG126 [39] decrease IL-27 in the BTBR mouse model. Another study demonstrated decreased IL-27 production by CD14+ cells derived from ASD individuals following in-vitro immunological challenge [40].
The integrated analysis of miRNA-mRNA showed that 29 miRNAs (18 upregulated and 11 downregulated) and 267 genes formed miRNA-target gene pairs, which may have a role in complex network to regulate the proliferation of neuronal cells, cell death of immune cells, epilepsy or neurodevelopmental disorders, and Wnt/β-catenin and PTEN signaling ( Figure 8). Table S7 represents new mediators of abnormal gene expression that could be potential targets for further exploration and therapeutic interventions in the proliferation of neuronal cells, epilepsy or neurodevelopmental disorders, and Wnt/β-catenin and PTEN signaling. However, functional validation is needed to test our miRNA-mRNA integration findings in ASD LCLs.

Autism and Cancer Genes Link?
Some have observed an overlap between the molecular pathways in ASD and cancer [48]. In this study, we found many cancer genes that demonstrated changes in both mRNA and miRNA expression in the ASD LCLs. The ASD LCLs demonstrated the downregulation of mRNA of several tumor suppressor genes, including KREMEN1 [49], ST5 [50], ILDR1 [51], and FOXP4 [52].

Conclusions
RNA-Seq analysis identified top 45 genes and 29 miRNAs that were differentially expressed in ASD LCLs compared to control LCLs. An integrated miRNA-mRNA analysis showed that the genes in significant pathways that are predicted to be targeted by deregulated miRNAs control apoptosis, cell death of immune cells, cell cycle progression, epilepsy, and proliferation of neuronal cells. Our results reinforce findings from other groups in regard to important underlying molecular pathways and processes, such as the regulation of cell growth, the role of organs besides the brain in ASD (e.g., the immune system), and the overlap with other non-neurodevelopmental diseases, such as cancer. These findings refine the landscape of ASD genes using LCLs and show the importance of studying different cell types to understand the regulatory networks in ASD.