Association Study between the CD157/BST1 Gene and Autism Spectrum Disorders in a Japanese Population

CD157, also referred to as bone marrow stromal cell antigen-1 (BST-1), is a glycosylphosphatidylinositol-anchored molecule that promotes pre-B-cell growth. Previous studies have reported associations between single-nucleotide polymorphisms (SNPs) of the CD157/BST1 gene and Parkinson’s disease. To determine whether SNPs or haplotypes in the CD157/BST1 gene are associated with other brain disorders, we performed a case-control study of 147 autism spectrum disorder (ASD) patients at Kanazawa University Hospital in Japan and 150 unselected Japanese volunteers, using the sequence-specific primer-polymerase chain reaction method combined with fluorescence correlation spectroscopy. Of 93 SNPs examined, two SNPs showed significantly higher allele frequencies in cases with ASDs than in unaffected controls (rs4301112, OR = 6.4, 95% CI = 1.9 to 22, p = 0.0007; and rs28532698, OR = 6.2, 95% CI = 1.8 to 21, p = 0.0012; Fisher’s exact test; p < 0.002 was considered significant after multiple testing correction). In addition, the CT genotype in rs10001565 was observed more frequently in the ASD group than in the control group (OR = 15, 95% CI = 2.0 to 117, p = 0.0007; Fisher’s exact test). The present data indicate that genetic variation of the CD157/BST1 gene might confer susceptibility to ASDs.


Subjects
We recruited 147 ASD subjects (113 males, 34 females; 15.6 ± 0.6 years) from the outpatient psychiatry department of the Kanazawa University Hospital as previously described [3,33]. All subjects fulfilled the DSM-IV criteria for pervasive developmental disorder. The diagnoses were made by two experienced child psychiatrists through interviews and clinical record reviews, as described previously [3], and the subjects had no apparent physical anomalies. The two experienced child psychiatrists independently confirmed the diagnosis of ASD for all patients by semi-structured behavior observations and interviews with the subjects and their parents. At the interviews with the parents, which were helpful in the evaluation of autism-specific behaviors and symptoms, the examiner used one of the following methods: the Asperger Syndrome Diagnostic Interview [34], Autism Diagnostic Interview-Revised (ADI-R) [35], Pervasive Developmental Disorders Autism Society Japan Rating Scale [36], Diagnostic Interview for Social and Communication Disorders [37], or Tokyo Autistic Behavior Scale [38]. The 150 controls (115 males, 35 females; 23.8 ± 0.3 years) were unselected Japanese volunteers. All patients and controls were Japanese with no non-Japanese parents or grandparents. This study was approved by the ethics committees of Kanazawa University School of Medicine. All examinations were performed after informed consent according to the Declaration of Helsinki.

Genotyping
Genomic DNA was extracted as previously described [33] from venous blood samples using the Wizard Genomic DNA Purification kit (Promega, Madison, WI, USA), or from nails using the ISOHAIR DNA extraction kit (Nippon Gene, Tokyo, Japan). In some instances, genomic DNA samples were subjected to whole-genome amplification (REPLI-g kit; Qiagen, Hilden, Germany). SNPs were then genotyped at Kurabo Industries Ltd. (Osaka, Japan) by the sequence-specific primer (SSP)-PCR method combined with fluorescence correlation spectroscopy, as described by Nishida et al. [39]. Most of the SNPs selected for genotyping had a minor allele frequency (MAF) >0.1, as indicated by the dbSNP database [40], HapMap genome browser (release 27) [41], and 1000 Genomes Project database [42,43] in the JPT (Japanese in Tokyo, Japan), CHB (Han Chinese in Beijing, China) plus JPT, and global populations (Supplementary Table S1). These SNPs were located in a region covering the CD157/BST1 gene (chr4:15704573-15733796, based on the human genome assembly GRCh37/hg19 at the UCSC Genome Bioinformatics Site [44]). The most upstream and downstream SNPs were rs112044965 at chr4:15704603 and rs11934811 at chr4:15738253, respectively. The inter-SNP distance was less than 2 kb. Linkage disequilibrium (LD) blocks in our sample were analyzed with HaploView 4.2 [45].

Statistical Analysis
Genotype and allele frequencies were analyzed using a contingency table and the Fisher exact test (GraphPad Prism 6; GraphPad Software Inc., San Diego, CA, USA), and p-values smaller than 0.05 were considered to be statistically significant. Multiple-testing correction was performed after controlling for LD between the selected SNPs by the method of Nyholt [46,47]. The estimated effective number for independent loci was 23 and α was estimated to be equal to 0.002. p-Values below 0.002 were thus considered significant for single SNP association analysis.
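The corrected threshold quoted above can be checked with simple arithmetic. The sketch below divides the nominal level by the effective number of independent loci (a Bonferroni-style form) and also shows the Šidák-style form; which of the two was applied is not stated here, so both are given as assumptions, and the function names are illustrative only. Both round to 0.002.

```python
# Per-test significance threshold given an effective number of
# independent loci. M_EFF and ALPHA are the values stated in the text;
# the choice between the two correction forms is an assumption.

def bonferroni_alpha(alpha: float, m_eff: float) -> float:
    """Per-test threshold under a Bonferroni-style correction."""
    return alpha / m_eff


def sidak_alpha(alpha: float, m_eff: float) -> float:
    """Per-test threshold under a Sidak-style correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m_eff)


M_EFF = 23    # effective number of independent loci (from the text)
ALPHA = 0.05  # nominal significance level (from the text)

alpha_bonf = bonferroni_alpha(ALPHA, M_EFF)
alpha_sidak = sidak_alpha(ALPHA, M_EFF)

print(f"Bonferroni-style threshold: {alpha_bonf:.4f}")
print(f"Sidak-style threshold:      {alpha_sidak:.4f}")
```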
Statistical power was calculated using the Genetic Power Calculator [50,51]; calculations were undertaken assuming a population prevalence of 0.015 for ASD [52], a false-positive rate (α) of 0.05, and a D′ value of 1 between the marker and disease. Alternatively, a chi-squared power calculation was done using the statistical package R; effect sizes were calculated following the method described by Chinn [53].

Results
Of the SNPs genotyped (Supplementary Table S1), 93 with a high success rate (>95%) were further subjected to statistical analysis. Among them, three SNPs showed significantly higher allele frequencies in cases with ASDs than in unaffected controls (rs4301112, OR = 6.4, 95% CI = 1.9 to 22, p = 0.0007; rs28532698, OR = 6.2, 95% CI = 1.8 to 21, p = 0.0012; and rs10001565, OR = 5.5, 95% CI = 1.6 to 19, p = 0.0038; Fisher's exact test; Table 1). rs4301112, rs28532698, and rs10001565 are located in introns 4, 6, and 7, respectively (Figure 1). After multiple testing correction for the effective total number of SNPs, the allele frequency remained significantly higher for rs4301112 and rs28532698, but not for rs10001565 (Table 1). For rs10001565, only the C/T genotype was significantly more frequent in the ASD group than in the unaffected control group (OR = 15, 95% CI = 2.0 to 117, p = 0.0007; Fisher's exact test; Table 1).
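The allele and genotype comparisons above rest on Fisher's exact test applied to 2 × 2 contingency tables. A minimal Python sketch of a two-sided test and the cross-product odds ratio is given here for orientation. The function names and the allele counts are hypothetical illustrations and are not the study's data; the analysis itself was run in GraphPad Prism, whose exact options are not described here.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Uses one common definition: sum the hypergeometric probabilities of
    all tables with the same margins that are no more probable than the
    observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Cross-product odds ratio; raises ZeroDivisionError if b * c == 0."""
    return (a * d) / (b * c)


# HYPOTHETICAL allele counts (minor, major), for illustration only.
cases = (12, 278)
controls = (2, 280)

p_value = fisher_exact_two_sided(*cases, *controls)
or_estimate = odds_ratio(*cases, *controls)
print(f"OR = {or_estimate:.1f}, two-sided p = {p_value:.4f}")
```

A p-value obtained this way would then be compared with the corrected threshold of 0.002 rather than the nominal 0.05.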
We then analyzed the data based on three different genetic models. The three SNPs (rs4301112, rs28532698, and rs10001565) showed significant associations with ASD in a recessive model, but not in additive or dominant models (Supplementary Tables S3 and S4). In addition, we assessed Hardy-Weinberg equilibrium (HWE) in the three genetic models by the likelihood ratio test [49]. Higher p-values (>0.05) were obtained in the recessive model (p = 0.0870 for rs4301112, p = 0.0876 for rs28532698, and p = 0.0993 for rs10001565; Supplementary Tables S4-S6).
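To make the genetic-model comparison concrete, the sketch below shows one conventional way of collapsing genotype counts into 2 × 2 tables under dominant and recessive models, plus an allele-count table that is often used as a simple stand-in for an additive model. The class, the function names, the coding of the "tested allele", and all counts are hypothetical; the coding actually used in this study is not specified here.

```python
from typing import NamedTuple, Tuple


class GenotypeCounts(NamedTuple):
    hom_ref: int  # individuals with zero copies of the tested allele
    het: int      # individuals with one copy
    hom_alt: int  # individuals with two copies


Table = Tuple[Tuple[int, int], Tuple[int, int]]


def dominant_table(cases: GenotypeCounts, controls: GenotypeCounts) -> Table:
    """At least one copy of the tested allele versus none."""
    return ((cases.het + cases.hom_alt, cases.hom_ref),
            (controls.het + controls.hom_alt, controls.hom_ref))


def recessive_table(cases: GenotypeCounts, controls: GenotypeCounts) -> Table:
    """Two copies of the tested allele versus one or none."""
    return ((cases.hom_alt, cases.hom_ref + cases.het),
            (controls.hom_alt, controls.hom_ref + controls.het))


def allele_table(cases: GenotypeCounts, controls: GenotypeCounts) -> Table:
    """Allele counts (two alleles per person): tested versus other allele."""
    return ((2 * cases.hom_alt + cases.het, 2 * cases.hom_ref + cases.het),
            (2 * controls.hom_alt + controls.het,
             2 * controls.hom_ref + controls.het))


# HYPOTHETICAL genotype counts, for illustration only.
case_counts = GenotypeCounts(hom_ref=120, het=20, hom_alt=5)
control_counts = GenotypeCounts(hom_ref=130, het=10, hom_alt=1)

print("dominant :", dominant_table(case_counts, control_counts))
print("recessive:", recessive_table(case_counts, control_counts))
print("allelic  :", allele_table(case_counts, control_counts))
```

Each resulting table can be passed to a 2 × 2 test such as Fisher's exact test.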
Using the Genetic Power Calculator [50,51], the power of a significance test (type I error rate of 0.05, Table 1) was calculated to be 1.0 for the three SNPs. In the analysis of the three genetic models, the highest statistical power was 1.0 under a recessive model for the three SNPs, with the lowest value being 0.05 for rs10001565 under a dominant model (Supplementary Table S3).

[Figure 1 caption, beginning missing: ... rs12502586 was not tested (plain and italicized). Red lettering represents SNPs that showed significant association with ASD in allele and/or genotype frequencies in the present study; asterisks indicate those previously reported as Parkinson's disease-associated markers [21,23-25,27-30]. The locations of the SNPs on human chromosome 4 (chr4) are indicated in parentheses; numbers after colons represent genomic positions based on the human genome assembly GRCh37/hg19 at the UCSC Genome Bioinformatics Site [44].]

Haplotype analysis revealed that 13 cases (9.0%, n = 145) carried all the minor alleles of the three SNPs (AG/AG/CT for rs4301112-rs28532698-rs10001565), whereas only one control (0.7%, n = 141) did (OR = 14.2, 95% CI = 1.4 to 110; Table 2). LD analysis of these SNPs identified two haplotype blocks: a 5-kb block comprising the ASD-associated rs4301112, rs28532698, and rs10001565 (Block 1; Figure 2), and a 12-kb block including the SNPs associated with Parkinson's disease (Block 2; Figure 2).
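For readers less familiar with the LD measures behind such haplotype blocks, the following sketch computes the standard pairwise statistics D, |D′|, and r² from haplotype and allele frequencies. It illustrates the textbook definitions only and is not HaploView's block-partitioning algorithm; the function name and the frequencies are hypothetical.

```python
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float, float]:
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# HYPOTHETICAL frequencies: two rare alleles that always occur together.
d, d_prime, r2 = ld_stats(p_ab=0.03, p_a=0.03, p_b=0.03)
print(f"D = {d:.4f}, |D'| = {d_prime:.2f}, r^2 = {r2:.2f}")
```

When two alleles always travel on the same haplotype, as in this toy input, both |D′| and r² equal 1, which is the pattern expected for SNPs that fall in a single tight block.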

Discussion
In this study, we performed a case-control study in a Japanese population to test for genetic association between 93 SNPs in the CD157/BST1 gene and ASDs. Our results show three possible risk SNPs for ASDs. As these SNPs are in high LD, it is likely that the results represent only one effect.
Additionally, in the UCSC (GRCh37/hg19) track "Transcription Factor ChIP-seq (161 factors) from ENCODE [59,60] with Factorbook Motifs", LD block 1, between rs4301112 and rs10001565 (chr4:15717226-15722573), includes predicted binding sites for c-Jun, STAT3 (signal transducer and activator of transcription 3), FOXP2 (forkhead box protein P2), PolR2a (RNA polymerase II polypeptide A), Elf-1 (E74-like factor 1), HNF4G (hepatocyte nuclear factor 4 gamma), HNF4A (hepatocyte nuclear factor 4 alpha), JunD, and C/EBPβ (CCAAT/enhancer binding protein beta). These sites also overlap with a peak in the H3K27Ac Mark track; acetylation of lysine 27 of the H3 histone protein is thought to enhance transcription [61] and possibly to regulate brain development [61,62]. Of these transcription factors, FOXP2 is of particular interest, because its genetic abnormalities have been implicated in speech and language disorders [63,64]. A chromosomal translocation disrupting the FOXP2 gene and a point mutation causing an amino-acid substitution in its forkhead domain have been identified in patients with severe developmental disorders of speech and language [63]. FOXP2 mRNA is expressed in the developing human brain, in good concordance with anomalous sites identified by brain imaging in adult speech and language disorders [64]. In this study, the ASD-associated SNPs were located separately from the Parkinson's disease-associated ones. It is tempting to postulate that, during early brain development, CD157/BST-1 expression is under FOXP2-mediated transcriptional control, which may not involve the region containing the Parkinson's disease-associated SNPs. Future studies will be directed at exploring these possibilities experimentally.
A limitation of this study is its small sample size. In particular, the numbers of heterozygotes observed were small in both the case and control groups, which resulted in deviation from HWE and limits the reliability and usefulness of the three SNPs as biomarkers. Although our results favor a recessive model, the effect size of the CD157/BST1 genetic variants should be estimated carefully. We tested HWE in unselected Japanese populations deposited in the HapMap [41], 1000 Genomes Project database [42], and human genome variation database [65], but did not detect deviation in any of the seven available entries (one for rs4301112, one for rs28532698, and five for rs10001565; Supplementary Table S7). The reason for this discrepancy remains unknown: we have not recognized population stratification, admixture, or cryptic relatedness among the subjects in this study. Future studies with larger sample sizes and/or family-based association testing are needed. Additionally, there are ethnic differences in allele frequencies; global MAFs for rs4301112, rs28532698, and rs10001565 are nearly 16%, whereas those in unselected Japanese populations are as low as 3% (the 1000 Genomes Project database [42,43], Supplementary Table S1). Therefore, replication in independent populations with various ethnic backgrounds is necessary.

Conclusions
We report an association between SNPs located in the CD157/BST1 gene (rs4301112, rs28532698, and rs10001565) and ASD. Our results warrant further analysis of CD157/BST1 variants in ASD patients.