Brazilian Multiethnic Association Study of Genetic Variant Interactions among FOS, CASP8, MMP2 and CRISPLD2 in the Risk of Nonsyndromic Cleft Lip with or without Cleft Palate

Associations of CRISPLD2 (cysteine-rich secretory protein LCCL domain containing 2) and genes belonging to its activation pathway, including FOS (Fos proto-oncogene), CASP8 (caspase 8) and MMP2 (matrix metalloproteinase 2), with nonsyndromic orofacial cleft risk, have been reported, but the results are yet unclear. The aim of this study was to evaluate single nucleotide polymorphisms (SNPs) in FOS, CASP8 and MMP2 and to determine their SNP-SNP interactions with CRISPLD2 variants in the risk of nonsyndromic cleft lip with or without cleft palate (NSCL±P) in the Brazilian population. The SNPs rs1046117 (FOS), rs3769825 (CASP8) and rs243836 (MMP2) were genotyped using TaqMan allelic discrimination assays in a case-control sample containing 801 NSCL±P patients (233 nonsyndromic cleft lip only (NSCLO) and 568 nonsyndromic cleft lip and palate (NSCLP)) and 881 healthy controls via logistic regression analysis adjusted for the effects of sex and genomic ancestry proportions with a multiple comparison p value set at ≤0.01. SNP-SNP interactions with rs1546124, rs8061351, rs2326398 and rs4783099 in CRISPLD2 were performed with the model-based multifactor dimensionality reduction test complemented with a 1000 permutation-based strategy. Although the association between FOS rs1046117 and risk of NSCL±P reached only nominal p values, NSCLO risk was significantly higher in carriers of the FOS rs1046117 C allele (OR: 1.28, 95% CI: 1.10–1.64, p = 0.004), TC heterozygous genotype (OR: 1.59, 95% CI: 1.16–2.18, p = 0.003), and in the dominant model (OR: 1.50, 95% CI: 1.10–2.02, p = 0.007). Individually, no significant associations between cleft risk and the SNPs in CASP8 and MMP2 were observed. SNP-SNP interactions involving CRISPLD2 variants and rs1046117 (FOS), rs3769825 (CASP8) and rs243836 (MMP2) yielded several significant p values, mostly driven by FOS rs1046117 and CASP8 rs3769825 in NSCL±P, FOS rs1046117 in NSCLO and CRISPLD2 rs8061351 in NSCLP. Our study is the first in the Brazilian population to reveal the association of FOS rs1046117 with NSCLO risk, and to support that CRISPLD2, CASP8, FOS and MMP2 interactions may be related to the pathogenesis of this common craniofacial malformation.


Introduction
Nonsyndromic cleft lip only (NSCLO) and nonsyndromic cleft lip and palate (NSCLP), that combined are denominated nonsyndromic cleft lip with or without cleft palate (NSCL±P), are the most common craniofacial defect in the world [1]. Around 1 out of 700 live births worldwide are affected, but large variations, depending on population origin, are observed [1]. Asians and Native Americans have the highest occurrence (1:500), followed by Europeans (1:1000), while Africans have the lowest incidence (1:2500) [2,3]. The incidence fluctuates between 1:650 and 1:2700 live births in Brazil due to the population's ethnic diversity [4,5]. Despite the fact that NSCL±P has a multifactorial and complex genesis, our knowledge about the genetic and environmental players is still incomplete [6]. As a complex polygenic disorder with clear influence of the ancestral origin, it is essential to characterize the critical genetic variants in development pathways that interfere with normal lip and palate embryogenesis, defining common and population-specific risk variants and understanding the contribution for the etiology of malformation.
In an original genome-wide scan, Chiquet et al. revealed a strong linkage between CRISPLD2 and multiplex families with NSCL±P [7]. Follow up studies confirmed that Crispld2 is expressed in the lateral palatine processes during palatogenesis, and singlenucleotide polymorphisms (SNPs) within the CRISPLD2 gene are associated with NSCL±P risk [8,9]. In zebrafish, Crispld2 silencing impaired the neural crest cell migration, resulting in both jaw and palatal abnormalities in a dose-dependent manner [10], which was associated, among other putative candidates, with dysregulated expression of FOS (Fos proto-oncogene, which encodes a leucine zipper protein that can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1), CASP8 (caspase 8) and MMP2 (matrix metalloproteinase 2) [11]. Although putative roles for those genes are expected during orofacial development, only MMP2 has been described as essential for normal palatogenesis [12]. In our previous study, associations of CRISPLD2 rs4783099 and rs8061351 variants with NSCL±P were detected, with rs8061351 association being driven by participants with high African genomic ancestry [13]. Indeed, the influence of the intense admixed ancestry of the Brazilian population, mainly from three different roots (Amerindians, Europeans and Africans), in NSCL±P susceptibility has been previously reported [14][15][16].
In this study, we investigated the association of SNPs in FOS (rs1046117), CASP8 (rs3769825) and MMP2 (rs243836) with NSCL±P risk in a Brazilian population using an ancestry-structured case-control approach. The SNP-SNP epistatic interactions of CRISPLD2 variants (rs1546124, rs8061351, rs2326398 and rs4783099), which were previously studied by us [13], with those SNPs on NSCL±P susceptibility were also assessed.

Ethics
The study was approved by the ethics review board of each of the centers affiliated with the collaborative study (approval number 08452819.0.0000.5418 on 29 April 2019). Written informed consent was obtained from the parents or guardians and/or the participants in compliance with the World Medical Association Declaration of Helsinki, Ethical Principles for Medical Research Involving Human Subjects.

Samples
This case-control study was conducted with samples from 233 unrelated patients with NSCLO and 568 with NSCLP, totaling 801 patients with NSCL±P, and 881 healthy control individuals. Patients with NSCL±P were carefully examined and screened for the presence of associated anomalies or syndromes by the specialized team of the four associated centers for rehabilitation of craniofacial anomalies in the Brazilian Associação de Portadores de Fissura Lábio Palatal (APOFILAB, Cascavel-PR), the Centro de Atendimento Integral ao Fissurado Labiopalatal (CAIF, Curitiba-PR), both located in the South of Brazil, the Centro Pró-Sorriso (Hospital Alzira Velano, UNIFENAS, Alfenas-MG), located in Southeastern Brazil, and the Centro de Reabilitação de Anomalias Crânio Faciais (Hospital Santo Antônio, Salvador-BA), located in Northeastern Brazil. The control group consisted of samples from the same geographic areas and it was composed of healthy individuals with no physical illness, psychiatric, birth defects or a family history of orofacial clefts.

Genotyping and Assessment of Genomic Ancestry
Genomic DNA was extracted from oral mucosa cells obtained by mouthwash with a 3% sucrose solution or by scraping the oral mucosa using a salting-out protocol [17]. PCRbased genotyping of rs1046117 (C_3269911_20), rs3769825 (C_1226568_10), and rs243836 (C_3225973_10) were performed on the StepOnePlus Real-Time PCR platform (Applied Biosystems, Foster City, CA, USA) using TaqMan 5 -exonuclease allelic discrimination assays (Assay-on-Demand service, Applied Biosystems). For genotyping validation, 10% of the sample was randomly selected and the reactions were repeated, revealing a reproducibility of 100%. The genotyping of rs1546124, rs8061351, rs2326398 and rs4783099 in CRISPLD2 was previously reported [13].
Each sample was independently genotyped for 40 biallelic short insertion-deletion polymorphisms (INDELs), which were previously validated as ancestry informative markers of the Brazilian population [13]. The genomic ancestry of each individual in the case-control study was determined with the Structure software version 2.3.4 [18], applying the model of K = 3 for parental populations based on the tri-hybrid origin of the Brazilian population.

Statistical Analysis
The differences between groups were analyzed by chi-square test (for sex) or Kruskal-Wallis test (for genomic ancestry proportions). Genotype distributions were assessed for the Hardy-Weinberg equilibrium (HWE) in the control group by the chi-square test. The multiple logistic regression analysis under unrestricted, dominant and recessive genetic models was performed with the SNPassoc package in the R software, considering sex and ancestry proportions as potential confounders. For those analyses, a Bonferroni-adjusted p value threshold of ≤0.01 was applied.
Pairwise SNP-SNP interactions among FOS, CASP8, MMP2 and CRISPLD2 (rs1546124, rs8061351, rs2326398 and rs4783099) were performed using the model-based multifactor dimensionality reduction (mbmdr) test (mbmdr package for R) adjusted by sex and genomic ancestry and applying cross-validation and 1000 permutation strategies to reject falsepositive interactions [19]. STRING, the protein-protein interaction network software, was applied to investigate the functional interactions among the examined genes (https:// string-db.org/, accessed on 13 December 2022).

Results
Individual variations in the genetic ancestry proportions were detected, but the groups were not statistically different ( Figure 1). All groups showed a higher prevalence of European ancestry compared to African and Amerindian. Regarding sex, the frequency of males was significantly higher in NSCL±P (n = 451, 56.3%, p < 0.0005) and in NSCLP (n = 326, 57.4%, p < 0.0005) than in the control group (n = 421, 47.8%). No significant difference between the control and NSCLO (n = 125, 53.6%) was observed. The genotype call rate ranged from 96.7% to 99.7%, and the genotype distributions had no derivation from Hardy-Weinberg equilibrium in the control group for all SNPs (Table 1).  The FOS rs1046117 polymorphism revealed significant associations ( Table 2). The frequency of the C allele was significantly higher in patients with NSCLO than in individuals of the control group (23.5% vs. 19.6%, p = 0.004), with the OR in heterozygotes of 1.59 (95% CI: 1.16-2.18, p = 0.003) and dominant model of 1.50 (95% CI: 1.10-2.02, p = 0.007). The variant C allele was also more frequent in NSCL±P than in controls (22.5% vs. 19.6%), yielding an OR of 1.19 (95% CI: 1.00-1.41) and a nominal p value of 0.04, which did not resist to correction for multiple comparison tests. This same trend was observed for the TC heterozygotes, with an OR of 1.25 (95% CI: 1.01-1.55, p = 0.04), and the dominant genetic model, with an OR of 1.25 (95% CI: 1.02-1.53, p = 0.03), that were more frequent in NSCL±P compared to controls, but the significance did not remain after application of Bonferroni correction for multiple tests. There was no evidence of allelic or genotypic associations of CASP8 rs3769825 (Table 3) and MMP2 rs243836 (Table 4) with the susceptibility to NSCL±P, NSCLO and NSCLP. The stratified analysis of the samples by genomic ancestry (patients with high European ancestry and patients with high African ancestry) showed no significant results at a Bonferroni threshold (Supplementary Tables S1-S3).  The FOS rs1046117 polymorphism revealed significant associations ( Table 2). The frequency of the C allele was significantly higher in patients with NSCLO than in individuals of the control group (23.5% vs. 19.6%, p = 0.004), with the OR in heterozygotes of 1.59 (95% CI: 1.16-2.18, p = 0.003) and dominant model of 1.50 (95% CI: 1.10-2.02, p = 0.007). The variant C allele was also more frequent in NSCL±P than in controls (22.5% vs. 19.6%), yielding an OR of 1.19 (95% CI: 1.00-1.41) and a nominal p value of 0.04, which did not resist to correction for multiple comparison tests. This same trend was observed for the TC heterozygotes, with an OR of 1.25 (95% CI: 1.01-1.55, p = 0.04), and the dominant genetic model, with an OR of 1.25 (95% CI: 1.02-1.53, p = 0.03), that were more frequent in NSCL±P compared to controls, but the significance did not remain after application of Bonferroni correction for multiple tests. There was no evidence of allelic or genotypic associations of CASP8 rs3769825 (Table 3) and MMP2 rs243836 (Table 4) with the susceptibility to NSCL±P, NSCLO and NSCLP. The stratified analysis of the samples by genomic ancestry (patients with high European ancestry and patients with high African ancestry) showed no significant results at a Bonferroni threshold (Supplementary Tables S1-S3).   As the SNPs rs1546124, rs8061351, rs2326398 and rs4783099 in CRISPLD2 were previously analyzed in this same case-control sample [13], we sought to verify whether SNP-SNP interactions among variants in CRISPLD2 and genes of its pathway could increase the prediction risk for nonsyndromic orofacial clefts. All possible combinations of pairs were analyzed, but only those with nominal p values are depicted in Table 5 (p < 0.05). For NSCL±P, the interactions containing CASP8 rs3769825 and FOS rs104617 (p perm = 0.01) and CASP8 rs3769825 and CRISPLD2 rs8061351 p perm = 0.02) were found to be significant after permutation tests. The FOS rs1046117 interactions with MMP2 rs243836 (p perm = 0.03) and with CASP8 rs3769825 (p perm = 0.05) potentially increased the risk of NSCLO, whereas the interactions between rs8061351 in CRISPLD2 with CASP8 rs3769825 (p perm = 0.02) or with MMP2 rs243836 (p perm = 0.04) increased the risk for NSCLP.

Discussion
Evidence has suggested that CRISPLD2, and genes in its pathway, may play important roles during craniofacial development, and the polymorphic variants in these genes may influence their function, contributing to nonsyndromic orofacial cleft susceptibility [7][8][9][10][11][12][13]. We have demonstrated, in a previous study, the association of CRISPLD2 variants and NSCL±P risk, with a clear influence on the individual genomic ancestry [13]. This influence was observed in other studies, with CRISPLD2 representing a candidate gene for Caucasian, Hispanic, African and Chinese populations [7][8][9]20,21], but not for individuals of Italian or Indian ancestry [22,23]. The current study explored whether polymorphic variants in CRISPLD2-pathway genes such as FOS, CASP8 and MMP2 individually or interacting with CRISPLD2 contribute to NSCL±P susceptibility in the Brazilian population. Although in CASP8 and MMP2, SNPs were not associated with the risk of NSCL±P, a clear tendency between the risk allele of FOS rs1046117 and NSCL±P was observed. Further stratified analysis revealed that the FOS rs1046117 C allele, TC heterozygous genotype and TC/CC genotype, representing the dominant model, significantly increased the risk of NSCLO, but not of NSCLP.
Chiquet et al. demonstrated that Fos is abundantly expressed in the orofacial region during zebrafish development, and the FOS rs1046117 C allele is significantly associated with an increased risk of NSCL±P in non-hispanic white families [11]. In a dimeric form with JUN, ATF or MAF, FOS forms the AP-1 transcription complex, which is implicated in the control of proliferation, differentiation, apoptotic cell death and many other important events associated with both normal development and tumorigenesis [24,25]. The rs1046117, characterized by a T to C transition at nucleotide position 252, represents a synonymous genetic variant with no alteration on the amino-acid sequence of the protein (Reference SNP Report, https://www.ncbi.nlm.nih.gov/snp/?term=rs1046117, accessed on 13 December 2022). However, the predicting effect of rs104117 on protein function was verified in the sorting intolerant from tolerant (SIFT) [26] and combined annotation-dependent depletion (CADD) [27], and both software revealed damaging scores (0.122 for SIFT and 0.867 for CADD) for the protein function in the presence of the variant allele. Consequently, a covered impact of polymorphism on the gene function, including translation efficiency and RNA stability, or affecting the AP-1 complex structure and subsequently its activity, is possible. It still is possible that this SNP belongs to a region within the gene that acts as a cis-regulatory element regulating the transcription of neighboring genes. On the other hand, as rs1046117 belongs to a large linkage disequilibrium block, its association detected in this study may potentially rely on a causative variant in this block.
In the past, genetic studies in nonsyndromic orofacial clefts have mostly focused on the analysis of individual SNPs. Although this approach has allowed the discovery of important candidates for nonsyndromic orofacial cleft risk, it clearly does not uncover potential interactions among them. SNP-SNP interaction analysis applied in this study, representing epistasis, revealed important interactions that predicted the risk of nonsyndromic orofacial clefts. The significant interactions containing FOS rs1046117 with CASP8 rs3769825 in both NSCL±P and NSCLO or with MMP2 rs243836 in NSCLO classified correctly high-risk and low-risk genotypes. Although MMP2 is expressed during craniofacial development, including normal palate fusion process, and MMP2 knockout mice display many craniofacial defects due to dysregulated osteoblast and osteoclast differentiation [28], the specific expression of MMP2 during lip development has never been described. However, due to its importance to extracellular matrix remodeling, facilitating angiogenesis and cellular migration [12,29,30], the expression of MMP2 during lip development is expected. MMP2 expression levels are known to be regulated by the AP-1 transcription factor, which is composed by association of Fos and Jun family members, and a large variety of cytokines and growth factors can trigger cell signaling culminating in MMP2 promoter activation by the convergence at the AP-1 [31]. Although evidence from both animal and human studies support a role for MMP2 as a candidate gene in the occurrence of nonsyndromic orofacial clefts, only one study has explored its genetic variants in NSCL±P risk [32]. During lip and palate morphogenesis, processes such as apoptosis play important roles in different periods, and some craniofacial abnormalities have been attributed to irregular activation of the apoptotic cascade [33,34]. After activation, caspase 8 activates effector caspases, inducing the apoptotic caspase cascade which is essential to medial edge epithelium degradation and subsequent palatal fusion [35]. Collectively, the results suggested that FOS rs1046117 may affect NSCL±P and NSCLO susceptibility by interacting with MMP2 and CASP8.
For NSCLP, interactions with significant p values after correction with 1000 permutations involved CRISPLD2 rs8061351 with CASP8 rs3769825 and with MMP2 rs243836, but not with FOS. CRISPLD2 overexpression promoted apoptosis of lung fibroblasts after activation of multiple proapoptotic genes and caspase activities, and also regulated migration and extracellular matrix genes that modulate lung development and repair, including MMP [36]. The loss-of-function strategy using morpholino targeting Crispld2 revealed a direct control of both Casp8 and Mmp2, supporting a regulatory effect of CRISPLD2 in events dependent on interactions between CRISPLD2-CASP8 and CRISPLD2-MMP2 [11]. Together, our findings show that variants in the CRISPLD2 pathway may influence the risk of NSCLP through potential epistatic interaction.
The study has strengths and limitations. Among the strengths we can highlight its multicenter design, enrolling samples from distinct regions of Brazil, which brings a better representation of the Brazilian population, and the use of robust statistical approaches with control for confounding effects including sex and ancestry proportions and application of correction for multiple comparison tests such as Bonferroni threshold and 1000 permutation, which reduce spurious results. The limitations include the test of only one SNP in each of the candidate genes, the lack of characterization of the impact of SNPs on function of the encoded proteins, the limited power in the stratification analyses due to modest sample size, though the effect of FOS SNP was highlighted when NSCLO was separated from NSCLP, and the absence of environmental factors, which could exert important roles under gene-environment interactions.

Conclusions
Our results demonstrated the association of the genetic variant rs1046117 in FOS with NSCLO in the Brazilian population. Based on the collective information and biological plausibility (Figure 2), genes in the CRISPLD2 pathway are likely to be involved in the occurrence of NSCL±P and they must be studied further in large and independent datasets, providing more valid results for clinical decision-making.
representation of the Brazilian population, and the use of robust statistical approaches with control for confounding effects including sex and ancestry proportions and application of correction for multiple comparison tests such as Bonferroni threshold and 1000 permutation, which reduce spurious results. The limitations include the test of only one SNP in each of the candidate genes, the lack of characterization of the impact of SNPs on function of the encoded proteins, the limited power in the stratification analyses due to modest sample size, though the effect of FOS SNP was highlighted when NSCLO was separated from NSCLP, and the absence of environmental factors, which could exert important roles under gene-environment interactions.

Conclusions
Our results demonstrated the association of the genetic variant rs1046117 in FOS with NSCLO in the Brazilian population. Based on the collective information and biological plausibility (Figure 2), genes in the CRISPLD2 pathway are likely to be involved in the occurrence of NSCL±P and they must be studied further in large and independent datasets, providing more valid results for clinical decision-making.