Human SMAD4 Genomic Variants Identified in Individuals with Heritable and Early-Onset Thoracic Aortic Disease
Cardiogenetics 2021, 11

Abstract: Thoracic aortic aneurysms (TAAs) that progress to acute thoracic aortic dissections (TADs) are life-threatening vascular events that have been associated with altered transforming growth factor β (TGFβ) signaling. In addition to TAA, multiple genetic vascular disorders, including hereditary hemorrhagic telangiectasia (HHT), involve altered TGFβ signaling and vascular malformations. Due to the importance of TGFβ, genomic variant databases have been curated for activin receptor-like kinase 1 (ALK1) and endoglin (ENG). This case report details seven variants in SMAD4 that are associated with either heritable or early-onset aortic dissections and compares them to pathogenic exon variants in gnomAD v2.1.1. The TAA and TAD variants were identified through whole exome sequencing of 346 unrelated families with heritable thoracic aortic disease (HTAD) and 355 individuals with early-onset (age ≤ 56 years) thoracic aortic dissection (ESTAD). Variants were retained if they had an allele frequency below 0.05% in the Genome Aggregation Database (gnomAD exome v2.1.1) and a combined annotation-dependent depletion (CADD) score greater than 20. These seven variants also have a REVEL score above 0.2, indicating pathogenic potential. Further in vivo and in vitro analysis is needed to evaluate how these variants affect SMAD4 mRNA stability and protein activity in association with thoracic aortic disease.


Introduction
Transforming growth factor β (TGFβ) plays a critical role in vascular development. Many vascular disorders, such as hereditary hemorrhagic telangiectasia (HHT), Marfan syndrome, and thoracic aortic aneurysm and dissection (TAA/TAD), have been associated with disruption of the TGFβ signaling axis. Pathogenic variants have been identified in many genes encoding proteins of this pathway. For example, pathogenic variants that underlie HHT are found in multiple genes, including endoglin (ENG) associated with HHT1, activin receptor-like kinase 1 (ALK1) associated with HHT2, mothers against decapentaplegic homolog 4 (SMAD4) associated with HHT3, and bone morphogenetic protein 9 (BMP9) associated with HHT5. The majority of HHT cases result from variants in ENG or ALK1, and genomic variant databases that catalogue numerous ENG and ALK1 genetic variants have been established and maintained [1]. Such databases are critical for curating genetic variants associated with a disorder and can assist early clinical diagnoses. To date, no such resource exists for SMAD4 or BMP9 variants. Herein, we report seven confirmed SMAD4 variants associated with heritable or early-onset TAA.
Aortic aneurysms are enlargements of the aorta. Thoracic aortic aneurysms are typically asymptomatic and may lead to sudden death due to acute aortic dissections. TAAs are less prevalent than abdominal aortic aneurysms and occur in younger patients. TAAs can result from single-gene pathogenic variants that confer a high risk of TAA, termed heritable thoracic aortic disease (HTAD) [2,3]. Early detection and clinical management of TAAs are critical to prevent deaths due to TAD. SMAD4 protein is a central molecule in signal transduction through the canonical arm of the TGFβ signaling pathway. The recognition of the SMAD4 variants reported herein was facilitated by GeneMatcher and MyGene2, nodes of the MatchMaker Exchange platform, which connects investigators who share an interest in a registered gene and provides an interface for clinicians and scientists that supports the discovery of genes underlying rare diseases [4]. The SMAD4 variants reported here were identified through a screen of whole exome sequencing data from individuals with HTAD or TAD, performed in the Milewicz lab [5,6]. This report increases the accessibility of the identified SMAD4 variants and may be utilized in future efforts that aim to build a SMAD4 variant database for genetic disorders.

Study Population
Whole exome sequencing data were obtained for affected probands and family members from 346 unrelated families with heritable thoracic aortic disease (HTAD) and for 355 individuals with early-onset (age ≤ 56 years) thoracic aortic dissection (ESTAD), enrolled from 2000 to 2019. Blood or saliva samples were collected after obtaining approval from the Institutional Review Board at the University of Texas Health Science Center at Houston. Informed consent was obtained from all participants.

Whole Exome Sequencing
Exome sequences were captured with SeqCap EZ Exome probes version 2.0 (Roche) and recovered according to the manufacturer's directions. Enriched libraries were then sequenced on an Illumina GAIIx using manufacturer protocols. Reads were mapped to the reference human genome (UCSC hg19) with BWA (Burrows-Wheeler Aligner), and variant detection and genotyping were performed using the UnifiedGenotyper (UG) tool from GATK. Variants were annotated using the SeattleSeq server (http://gvs.gs.washington.edu/SeattleSeqAnnotation, SeattleSeq v.151, accessed on 17 August 2018) and Annovar (https://annovar.openbioinformatics.org/, accessed on 17 August 2018). Sanger sequencing was performed to validate the SMAD4 variants identified by exome sequencing.
To identify potentially pathogenic or likely pathogenic SMAD4 (NM_005359.6) variants, whole exome sequencing data were filtered based on three criteria: (1) variants that altered amino acids, including nonsynonymous, stop-loss, stop-gain, coding indel, frameshift, and splice site variants; (2) variants with a minor allele frequency of less than 0.05% in the Genome Aggregation Database (gnomAD exome v2.1.1) and a combined annotation-dependent depletion (CADD) score greater than 20; and (3) variants that segregated with thoracic aortic disease in HTAD families. The Sorting Intolerant from Tolerant for Genome (SIFT4G) database was used to predict possibly damaging variants.
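The three filtering criteria above can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the `Variant` record, its field names, the consequence labels, and the toy records are all hypothetical and chosen only for illustration. The sketch also assumes that a variant absent from gnomAD counts as rare, and that the segregation criterion is applied only when family data are available.

```python
from dataclasses import dataclass
from typing import Optional

# Criterion (1): protein-altering consequence classes (labels are illustrative).
PROTEIN_ALTERING = {
    "nonsynonymous", "stop-loss", "stop-gain",
    "coding-indel", "frameshift", "splice-site",
}
MAX_GNOMAD_AF = 0.0005  # criterion (2): minor allele frequency below 0.05%
MIN_CADD = 20.0         # criterion (2): CADD score greater than 20


@dataclass
class Variant:
    """Hypothetical annotated-variant record; not a real file format."""
    name: str
    consequence: str
    gnomad_af: Optional[float]   # None = not observed in gnomAD (assumed rare)
    cadd: float
    segregates: Optional[bool]   # None = no family data to assess segregation


def passes_filter(v: Variant) -> bool:
    """Return True if the variant meets all three criteria."""
    if v.consequence not in PROTEIN_ALTERING:
        return False
    allele_freq = 0.0 if v.gnomad_af is None else v.gnomad_af
    if allele_freq >= MAX_GNOMAD_AF:
        return False
    if v.cadd <= MIN_CADD:
        return False
    # Criterion (3): reject only when family data show non-segregation.
    if v.segregates is False:
        return False
    return True


# Toy records with made-up values; none of these is a variant from the study.
TOY_VARIANTS = [
    Variant("toy_rare_missense", "nonsynonymous", 0.00001, 28.0, True),
    Variant("toy_common_missense", "nonsynonymous", 0.01, 25.0, None),
    Variant("toy_synonymous", "synonymous", None, 5.0, None),
    Variant("toy_low_cadd", "nonsynonymous", None, 12.0, None),
    Variant("toy_absent_frameshift", "frameshift", None, 35.0, None),
    Variant("toy_nonsegregating", "stop-gain", None, 40.0, False),
]

if __name__ == "__main__":
    for variant in TOY_VARIANTS:
        print(variant.name, passes_filter(variant))
```

Treating a missing segregation value as "not assessable" rather than as a failure is a design choice of this sketch; a stricter reading of criterion (3) would require positive segregation evidence for every HTAD family variant.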

Mapping of Genomic Variants
SMAD4 variants in the gnomAD structural variant (SV) v2.1 dataset were examined and compared to TAA-associated genes. gnomAD is based on more than 141,000 exomes and genomes from unrelated individuals sequenced as part of various disease-specific and population genetic studies and is aligned against the GRCh37 reference. The referenced data are derived from a whole exome sequencing database described by Karczewski et al. and Collins et al. [7,8].

Results
Exome sequencing analysis identified seven rare variants in SMAD4 with a CADD score of more than 20 that are predicted to result in amino acid substitutions. The SMAD4 protein comprises two functional domains, MH1 and MH2, which are joined by a linker. The MH1 domain complexes with DNA [9-11], while the MH2 domain, located at the C-terminus, interacts with other proteins, including other SMAD proteins. Of these variants, two are located in each of the MH1 and MH2 domains, while the remaining three are located in the linker domain (Figure 1A). The variants M24V, R97L, and P246V, indicated in the figure by white arrows, have been previously reported [5]. The R97L variant in the MH1 domain had the highest CADD score (32) and a high Rare Exome Variant Ensemble Learner (REVEL) score (0.938), suggesting a more severe phenotype. Further analysis of these variants using the SIFT4G database classified them as possibly damaging (D) or tolerated (T) (Table 1). All variants were validated by Sanger sequencing (Figure 1B). Subsequent analysis showed that R97L is characterized by decreased SMAD4 stability and reduced TGFβ signaling [5]. Two other variants, M24V and P246T, were also associated with TAD [5]. R97L and I525V were identified in unrelated HTAD families; R97L segregated with disease, and I525V was shared by two affected cousins. With the exception of I525V, which was found in two unrelated ESTAD families, the remaining variants were identified in only one ESTAD family. Three additional variants, R445X, R496C, and I500V, were identified in the gnomAD v2.1.1 database (black arrows). Taken together, we report seven novel variants of SMAD4 identified in individuals with either early-onset or familial TAA.

Discussion
Many TAAs are asymptomatic and are diagnosed only when a life-threatening TAD occurs. Several parameters are used for early diagnosis of TAA, so that monitoring and clinical interventions can be pursued to prevent TAD. These include a family history of TAA or TAD, genetic testing, and familial screening. There are over 20 genes with evidence that variants within the gene predispose to TAA [12], and there is sufficient evidence of disease causation for 11 of these genes [13]. Genetic variants that predispose to TAA lead to decreased smooth muscle cell (SMC) contraction and survival, altered extracellular matrix (ECM) integrity, and decreased canonical TGFβ signaling [2,14-16]. Thus, variants in genes involved in TGFβ signaling, which regulates many processes associated with vascular development and repair, are associated with TAA [12,17,18].
SMAD4 protein plays a critical role in TGFβ signal transduction [12,17,18]. It binds to phosphorylated (i.e., activated) SMAD2/3, and this complex in turn translocates to the nucleus, where it alters target gene expression in concert with other transcription factors [19]. We report rare, predicted damaging SMAD4 variants identified through exome sequencing of a large cohort of patients with TAA. Exome sequencing identified seven variants, distinct from the three variants reported in gnomAD v2.1.1, with potentially pathogenic or likely pathogenic consequences. These variants are located within regions of SMAD4 that encode the MH1, MH2, and linker domains of the SMAD4 protein. It remains unclear whether specific SMAD4 variants are associated with the formation of arteriovenous malformations (AVMs), as seen in HHT, underscoring the need for a SMAD4 variant database. Further analysis of individual variants is necessary to establish disease specificity. Together with available data, the identification of these variants contributes to our understanding of how SMAD4 may protect against vascular disorders.
Research with the global Smad4 knockout mouse model has revealed critical roles in embryogenesis and in the development of cancer [20-22]. Smad4 plays a key role in angiogenesis, and establishing its importance in vascular development remains a very active area of research. To understand the importance of Smad4 in embryonic blood vessel development and disease, studies have been conducted with animal models of inducible, tissue-specific Smad4 loss. Total loss of Smad4 in mice is embryonically lethal due to the inability of the embryo to complete gastrulation, supporting a critical role for Smad4 in embryonic development [20]. Endothelial cell-specific deletion is also embryonically lethal due to angiogenesis failure and impaired recruitment of smooth muscle cells [21]. Inducible Smad4 conditional knockout models have been used in neonatal and adult mice to understand the role of Smad4 after embryonic development. Inducible endothelial cell-specific Smad4 knockout mice develop AVMs and vascular defects following tamoxifen injection at postnatal day 1 (PN1), recapitulating, in part, the AVMs that form in patients with HHT [23]. Importantly, endothelial cell-specific inducible deletion of Smad4 has revealed that AVM formation is associated with reduced signaling through TEK due to increased expression of its antagonistic ligand angiopoietin-2 [24]. This phenotype can be rescued through inhibition of angiopoietin-2, supporting the development of new therapeutic approaches to treat AVMs [24]. While complete loss of Smad4 results in early embryonic lethality, the variants identified here are compatible with human development but are associated with early-onset or familial TAA. These data raise the possibility that specific SMAD4 variants may partially disrupt vascular development and/or exacerbate angiogenic mechanisms that contribute to thoracic aortic disease risk.
In addition to the critical roles of Smad4 in embryonic development in mice, human SMAD4 variants have been identified and are associated with several human diseases. Variants in the SMAD4 gene have been associated with juvenile polyposis syndrome (JPS), HHT, and Myhre syndrome. More specifically, variants in the region encoding the MH2 domain are associated with JPS and HHT and result in decreased activity of SMAD4 protein due to increased ubiquitination-mediated degradation [25]. Other variants in the MH2 domain that increase SMAD4 activity are associated with a different disorder, Myhre syndrome [26]. Vascular malformations are a common factor amongst HHT, Myhre syndrome, and TAA, yet only a limited number of specific TAA/TAD-associated SMAD4 variants have been identified. A report by Lifei Wu suggests that an S271N mutation in non-MH regions of the protein may not be directly causal to TAA but may contribute to TAA in combination with other risk factors [27]. More recently, an R97L mutation in SMAD4 has also been associated with TAA in the absence of HHT or JPS [5]. Herein, we report results from exome sequencing of a cohort of thoracic aortic disease patients without HHT or JPS and the identification of novel rare and damaging SMAD4 variants, in addition to the previously reported R97L [5]. Together with available data, the knowledge of these variants contributes to our understanding of how SMAD4 variants may contribute to vascular diseases.
With the exception of R97L [5], further investigation and mechanistic validation are needed to determine the physiological relevance of each variant and to examine the mechanistic relationship between SMAD4 and vascular diseases, including thoracic aortic disease, HHT, and Myhre syndrome. Additional research is needed to determine how these variants affect SMAD4 expression and activity, for example through a reduction of SMAD4 mRNA levels, altered protein structure, or allosteric inhibition of the endogenous, normal SMAD4 protein. Finally, the comparison of variants identified through exome sequencing of individuals with TAA with potentially pathogenic and likely pathogenic variants in gnomAD highlights the importance of exon sequences encoding the MH2 domain. Taken together, the SMAD4 variants identified in ESTAD and HTAD patients in this report may have utility in future work aimed at generating a detailed database of human SMAD4 variants with possible pathogenic potential. Future in vivo and in vitro investigation of the reported variants is now needed to define their molecular functions and advance our understanding of the interaction between the reported SMAD4 variants and vascular health.

Informed Consent Statement:
Informed consent was obtained from all the participants involved in the study.

Data Availability Statement:
The availability of exome data depends on individuals' consent. For individuals who signed an agreement to deposit their exome data in the NCBI database of Genotypes and Phenotypes (dbGaP), the exome data are available in dbGaP under accession phs000693.

Conflicts of Interest:
The authors declare no conflict of interest.