Computational and Experimental Analyses for Pathogenicity Prediction of ACVRL1 Missense Variants in Hereditary Hemorrhagic Telangiectasia

Hereditary hemorrhagic telangiectasia (HHT) is a vascular disease caused by the defects of ALK1/ACVRL1 receptor signaling. In this study, we evaluated 25 recently identified ACVRL1 missense variants using multiple computational pathogenicity classifiers and experimentally characterized their signal transduction capacity. Three extracellular residue variants showed no detectable cell surface expression and impairment of bone morphogenetic protein 9 (BMP9) responsiveness of SMAD-dependent transcription in luciferase assays. Four variants with amino acid replacement in the motifs essential for the intracellular kinase function lost SMAD-dependent signaling. Most of other variations in the kinase domain also caused marked downregulation of signaling; however, two variants behaved as the wild-type ACVRL1 did, while computational classifiers predicted their functional abnormalities. Three-dimensional structure prediction using the ColabFold program supported the significance of the L45 loop and NANDOR domain of ACVRL1 for its association with SMAD1 and BMPR2, respectively, and the variations in these motifs resulted in the reduction of SMAD signaling. On the other hand, two of the GS domain variants maintained high signal transduction capacity, which did not accord with their computational pathogenicity prediction. These results affirm the requirement of a combinatory approach using computational and experimental analyses to accurately predict the pathogenicity of ACVRL1 missense variants in the HHT patients.


Introduction
Hereditary hemorrhagic telangiectasia (HHT), or Osler-Weber-Rendu disease, is a vascular disorder of autosomal dominant inheritance. The Curaçao criteria is widely used for the diagnosis based on its features: spontaneous and recurrent epistaxis; mucocutaneus telangiectasia; visceral lesions, such as gastrointestinal telangiectasia and pulmonary/hepatic/cerebral/spinal arteriovenous malformations; and family history [1]. A definitive diagnosis of HHT is made if three or more criteria are satisfied. Defects in vascular ALK1 signaling are implicated in its pathogenesis, and heterozygous variations of the genes encoding endoglin (ENG), activin A receptor like type 1 (ACVRL1), SMAD family member 4 (SMAD4) and bone morphogenetic protein 9 (GDF2) have been identified to be causative in HHT [2].
The ACVRL1 receptor, also called ALK1, forms a complex with bone morphogenetic protein (BMP) receptor type 2 (BMPR2) and endoglin, which is activated upon the BMP9/BMP10 ligand binding and provokes SMAD-dependent transcriptional regulation of downstream genes [3,4]. ALK1 signaling is essential for normal vascular formation during embryonic development, and the mice and zebrafish null for Acvrl1/acvrl1 show embryonic lethality due to severe vascular abnormalities [5,6]. Variations in the causative genes were found in most of the patients with a definite HHT diagnosis, and approximately half of them are in ACVRL1. The haploinsufficiency of ACVRL1 and other disease genes is accepted as a pathogenic mechanism of HHT while protein products of particular missense variants may remain and show diverse modes of functional deficits in patient cells.
The American College of Medical Genetics and Genomics (ACMG) standards and guidelines have been used to evaluate the clinical significance of novel variants [7]. The ACMG guideline includes the utilization of in silico pathogenicity prediction programs. The computational pathogenicity classification is, however, not regarded as decisive (evidence of pathogenicity is supportive: PP3) in the guideline, and the results are sometimes inconsistent among different programs. The experimental studies of gene product functions can provide stronger evidence, PS3, to support the presence of a damaging effect by a certain missense variation. Since previous studies reported functional analyses of ACVRL1 variant proteins [8][9][10], numerous different missense variations have been published or registered in the public databases, such as those by the Associated Regional and University Pathologists (ARUP) laboratories and ClinVar archive, which are present in the ACVRL1 exons that encode the extracellular region, intracellular kinase domain and functional motifs conserved among ALK family receptors.
In the present study, we examined recently identified ACVRL1 missense variants using multiple computational pathogenicity classifiers and characterized their signal transduction capacity in cell culture-based experiments. Our results showed that the evaluation with computational and experimental analyses was not consistent in several variants, supporting a notion that a combinatory approach was necessary to accurately estimate the pathogenicity of ACVRL1 missense variations detected in the HHT patients.

Patient and Computational Pathogenicity Prediction
To obtain the information of ACVRL1 missense variants found in the HHT patients, we utilized the original papers and reviews, public databases and unpublished clinical cases genetically analyzed at our institute. Variant detection of the patients listed in Table 1 was performed with Sanger sequencing of the coding regions and the exon-intron boundaries of ACVRL1 and ENG.

ACVRL1 and Endoglin Expression Plasmids
The expression plasmid constructs for wild-type human ACVRL1 and its variants were prepared with site-directed mutagenesis PCR of the pcDNA3-HASL-ALK1(WT) plasmid provided by Drs. Kohei Miyazono and Tetsuro Watabe [15,16]. Inverse PCR method was employed using the PrimeSTAR MAX enzyme (Takara Bio, Kusatsu, Japan) and oligonucleotide primers with designed mutations. The expression plasmid for human endoglin, pDEF3-Endoglin, was provided by Dr. Fumiko Itoh [17]. HA-tag and FLAG-tag at the C-terminus were used for ACVRL1 and endoglin plasmids, respectively.

Computational Structure Prediction
The three-dimensional structures of the ACVRL1-BMPR2-SMAD1 trimer as well as the ACVRL1-BMPR2 and ACVRL1-SMAD1 dimers were predicted by using ColabFold (AlphaFold2 using MMseqs2) (https://colab.research.google.com/github/sokrypton/ ColabFold/blob/main/AlphaFold2.ipynb, accessed on 22 July 2022) [18]. The amino acid sequences of the receptor intracellular kinase domains, ACVRL1 [amino acid no. (aa) 195-503] and BMPR2 (aa 197-512), and the SMAD1 MH2 domain (aa 271-465) were analyzed with the default settings and heterooligomer option. Predicted local distance difference test (pLDDT) and predicted alignment error (pAE) graphs of five models were generated to select the one with the best prediction quality. The images of protein structures were generated using the PyMOL Molecular Graphics System (Schrödinger, New York, NY, USA).

Statistical Analysis
Tukey's or Dunnett's test was performed using the Prism 9 software (GraphPad Software, Boston, MA, USA). The method used in each experiment was described in the figure legends.
Some of them were detected in unpublished clinical cases at our institute, including a novel variant, c.703G>T (D235Y) ( Table 1). The first patient with a D235Y variation was a 67-year-old woman. She had presented recurrent epistaxis for approximately 10 years. To reduce nasal bleeding, the sphenopalatine artery ligation, internal maxillary artery embolization and nasal mucosa cauterization were performed when she was 65 and 67 years old. She had characteristic telangiectasias on the lip, tongue and fingers. Enhanced computed tomography revealed tortuous and dilated hepatic arteries and multiple hepatic arteriovenous shunts, but pulmonary and cerebral vascular malformations were not detected. Her father also presented recurrent epistaxis, while he did not have an opportunity of the evaluation for the HHT diagnosis. The second D235Y patient was a 61-year-old man. He experienced recurrent epistaxis and had telangiectasias on the fingers. Computed tomography demonstrated multiple arteriovenous shunts in the liver and pancreas but not in the lung. His uncle and aunt were clinically diagnosed as HHT. These patients satisfied the Curaçao criteria for the definite HHT diagnosis. They belonged to unrelated families but lived in the vicinity, and the genetic analysis at our institute was requested by the same hospital. In addition, the genetic analyses at our institute identified the patients with the D176Y, P424T, D437G, R479P or R484L variation, each of which was previously reported in only one case [24,28,35,36,38]. Clinical features of those patients were described in Table 1 and Supplemental Information. families but lived in the vicinity, and the genetic analysis at our institute was reques by the same hospital. In addition, the genetic analyses at our institute identified the tients with the D176Y, P424T, D437G, R479P or R484L variation, each of which was pre ously reported in only one case [24,28,35,36,38]. Clinical features of those patients w described in Table 1 and Supplemental Information.  [19, T52P [21], C90W [20], D176Y [24], L193P [22], T195I [25], R200G [19,20], V205G [19,20], K229R [ D263G [37], T265P [24], H280R [19,20,28,30], L300P [20], H328Y [19,20,27,29], D330Y [10,31, D348E [20], D437G [38], P433S [33,34], P424T [35], C443R [20], P449A [20], C471Y [20], R479P [ R484L [36].
We first performed computational pathogenicity prediction of these variants using four systems, PolyPhen-2, SIFT, PROVEAN and PANTHER, which mainly relied on evolutional conservation and structural/chemical properties of amino acid residues [11,13,41,42]. Being consistent with the fact that they were detected in the patients, many of these variants were predicted to have significant pathogenicity by all the programs. However, three variants (V32E, T52P and T265P) were given incongruous results and were categorized into non-pathogenic groups, such as "benign", "tolerated" or "neutral", by more than one classifier ( Table 2).

Subcellular Localization and Signal Transduction Capacity of Extracellular Residue Variants
To examine how accurately these classifiers can predict functional alteration of ACVRL1 missense variant proteins, we examined their capacity to transduce intracellular signaling in cultured cells. According to the previous studies [8,9,43], we transfected mouse NIH-3T3 cells, which expressed a low level of endogenous Acvrl1 protein, with an expression plasmid for wild-type or variant ACVRL1. Immunocytochemistry using an antibody recognizing the extracellular region of ACVRL1 revealed that three extracellular residue variants (V32E, T52P and C90W) did not show detectable signals on the cell surface while wild-type ACVRL1 was localized on the cell surface ( Figure 2A). Significant amounts of V32E, T52P and C90W variants were detected in the cytoplasm when the staining was performed after the membrane permeabilization with Triton X-100 ( Figure 2B), suggesting that the mutation of V32, T52, or C90 residue affected the cell surface presentation of ACVRL1 receptor. We then studied the effects of these three variations upon BMP9-induced SMAD-mediated transcriptional activity. As expected, NIH-3T3 cells expressing one of these extracellular residue variants showed an apparent reduction in the BMP9-induced activation of BRE-luciferase, which was a well-established reporter for the SMAD-dependent transcriptional regulation ( Figure 2C) [15]. Signal transduction mediated by the V32E variant was markedly downregulated but still remained significantly, suggesting that an undetectable amount of V32E receptor was presented on the cell surface. It is noteworthy that V32E and T52P variants were categorized into non-pathogenic groups by multiple computational classifiers (Table 2) in contrast to their abnormal signal transduction capacity in the experimental analyses. ated by the V32E variant was markedly downregulated but still remained significantly, suggesting that an undetectable amount of V32E receptor was presented on the cell surface. It is noteworthy that V32E and T52P variants were categorized into non-pathogenic groups by multiple computational classifiers (Table 2) in contrast to their abnormal signal transduction capacity in the experimental analyses.

Importance of Functional Motifs and Conserved Residues in Intracellular Kinase Domain
We next examined the function of 14 missense variants of the serine/threonine kinase domain that encompassed most of the ACVRL1 intracellular structure. ACVRL1 phosphorylates SMAD1/5/9 as a major downstream event of ACVRL1-BMPR2 receptor complex activation, and it contains several essential motifs for the kinase activity [44]. K229 and D348 are in the AxK and DLG motifs, respectively, which are essential for the interaction with ATP and magnesium ion, while the HRD motif (aa 328-330) is required for the association with the substrate serine residue [45]. Indeed, four variants with the alteration of conserved residues in those motifs, K229R, H328Y, D330Y and D348E, failed to mediate the BRE-luciferase activation in response to BMP9, although they were correctly expressed on the cell surface ( Figure 3A,B).
We also selected 10 missense variants that had an alteration of the amino acid residues highly conserved among species (Table 2) but not known to be indispensable for the

Importance of Functional Motifs and Conserved Residues in Intracellular Kinase Domain
We next examined the function of 14 missense variants of the serine/threonine kinase domain that encompassed most of the ACVRL1 intracellular structure. ACVRL1 phosphorylates SMAD1/5/9 as a major downstream event of ACVRL1-BMPR2 receptor complex activation, and it contains several essential motifs for the kinase activity [44]. K229 and D348 are in the AxK and DLG motifs, respectively, which are essential for the interaction with ATP and magnesium ion, while the HRD motif (aa 328-330) is required for the association with the substrate serine residue [45]. Indeed, four variants with the alteration of conserved residues in those motifs, K229R, H328Y, D330Y and D348E, failed to mediate the BRE-luciferase activation in response to BMP9, although they were correctly expressed on the cell surface ( Figure 3A,B).
We also selected 10 missense variants that had an alteration of the amino acid residues highly conserved among species (Table 2) but not known to be indispensable for the kinase activity. All of them were localized on the cell surface ( Figure 3A), and the cells expressing eight variant proteins showed the absence or marked reduction of downstream SMAD signaling ( Figure 3B), which got along well with the importance of ACVRL1 kinase function. In contrast, V205G and D235Y variants behaved as the wild-type ACVRL1 did in the luciferase analysis ( Figure 3B), although computational classifiers suggested their functional abnormalities (Table 2). expressing eight variant proteins showed the absence or marked reduction of down SMAD signaling ( Figure 3B), which got along well with the importance of ACVRL1 function. In contrast, V205G and D235Y variants behaved as the wild-type ACVR in the luciferase analysis ( Figure 3B), although computational classifiers suggeste functional abnormalities (

Structural Prediction of ACVRL1-SMAD1 Interaction and Impact of Missense Varia L45 Loop upon Signal Transduction Activity
ACVRL1/ALK1 belongs to the ALK receptor family that is compo ACVR1/ALK2, BMPR1A/ALK3, ACVR1B/ALK4, TGFBR1/ALK5, BMPR1B/ALK ACVR1C/ALK7 [46]. Preceding investigations on different ALK family recepto cated conserved functional domains that also existed in ACVRL1 [46]. For exam L45 loop of TGFBR1/ALK5 was shown to physically associate with SMAD2 [47,48 it was untested whether ACVRL1 also interacted with SMAD proteins via its L45 l to mediate BMP9-induced transcriptional activation, while the D437G variant showed weak but statistically significant signal transduction in BRE-luciferase assays. In contrast, V205G and D235Y variants kept high BMP9-induced signaling activity comparable to that of wild-type ACVRL1. Results are expressed as fold induction over the value obtained for the cells expressing wild-type (WT) ACVRL1 with the vehicle treatment. The data for wild-type ACVRL1 are same as those in Figure 2C. *** p < 0.001 (Tukey's test).

Structural Prediction of ACVRL1-SMAD1 Interaction and Impact of Missense Variations in L45 Loop upon Signal Transduction Activity
ACVRL1/ALK1 belongs to the ALK receptor family that is composed of ACVR1/ALK2, BMPR1A/ALK3, ACVR1B/ALK4, TGFBR1/ALK5, BMPR1B/ALK6 and ACVR1C/ALK7 [46]. Preceding investigations on different ALK family receptors indicated conserved functional domains that also existed in ACVRL1 [46]. For example, the L45 loop of TGFBR1/ALK5 was shown to physically associate with SMAD2 [47,48] while it was untested whether ACVRL1 also interacted with SMAD proteins via its L45 loop (aa 262-270). We then employed ColabFold, a newly-developed computational system based on the AlphaFold2 program [18], for a three-dimensional prediction of the interaction between ACVRL1 and SMAD1. ColabFold predicted stable trimer formation among the ACVRL1 kinase domain (aa 195-503), BMPR2 kinase domain (aa 197-512) and SMAD1 MH2 domain (aa 271-465) ( Figure 4A). In addition to a sufficiently high pLDDT score and a sufficiently low pAE value at the contacts, the contact surfaces of the trimer are free of major collisions, and the electrostatic and shape complementarity makes it a very plausible complex structure. The analysis using the ACVRL1-SMAD1 dimer indicated their interaction via the L45 loop ( Figure 4B). Especially, it was predicted that D263 of ACVRL1 was located in the β sheet at the ACVRL1-SMAD1 interphase and was in close contact with N280 of SMAD1 ( Figure 4B).  Figure 4A). In addition to a sufficiently high pLDDT score and a sufficiently low pAE value at the contacts, the contact surfaces of the trimer are free o major collisions, and the electrostatic and shape complementarity makes it a very plausi ble complex structure. The analysis using the ACVRL1-SMAD1 dimer indicated their in teraction via the L45 loop ( Figure 4B). Especially, it was predicted that D263 of ACVRL1 was located in the β sheet at the ACVRL1-SMAD1 interphase and was in close contac with N280 of SMAD1 ( Figure 4B). Consistently, a D263G variation in the L45 loop resulted in the loss of BMP9-induced BRE-luciferase reporter activity while the variant protein was properly expressed on the cell surface ( Figure 5A,B). We also tested the effect of the T265P variation, which resulted In the ACVRL1-SMAD1 interface shown on the right, the interaction between ACVRL1 D263 in the L45 loop and SMAD1 N280 was predicted. Intramolecular interaction between ACVRL1 D263 and A199 was also indicated. D263 and T265 of ACVRL1 are localized in the same β sheet. (C) ACVRL1-BMPR2 dimer structure (left) with an enlarged view of the dotted box (middle). The figure on the right depicts the image in the middle figure viewed from another direction. In the ACVRL1-BMPR2 interface shown on the right, the interaction between ACVRL1 R484 in the NANDOR domain with BMPR2 D482 and D485 was predicted. On the other hand, ACVRL1 R479 did not show direct interaction with BMPR2. Carbon, oxygen and nitrogen atoms are represented by gray, red and blue, respectively, and hydrogen atoms are omitted for clarity. Dotted lines indicate possible amino acid interaction with the distance less than 4 Å.
Consistently, a D263G variation in the L45 loop resulted in the loss of BMP9-induced BRE-luciferase reporter activity while the variant protein was properly expressed on the cell surface ( Figure 5A,B). We also tested the effect of the T265P variation, which resulted in an amino acid replacement in the same β sheet of the L45 loop. Increase in the BRE-luciferase reporter activity by the BMP9 treatment was apparently low but still statistically significant in the cells expressing the T265P variant ( Figure 5B), suggesting that the alteration of T265 to proline might have a milder impact on the interaction between ACVRL1 and SMAD1 via the L45 loop.
lin. Med. 2023, 12, x FOR PEER REVIEW luciferase reporter activity by the BMP9 treatment was apparently low but stil significant in the cells expressing the T265P variant ( Figure 5B), suggesting th ation of T265 to proline might have a milder impact on the interaction betwe and SMAD1 via the L45 loop.  assays while that mediated by T265P and R484L variants still had statistical significance. In contrast, the cells expressing D176Y and R200G variants showed high transcriptional activation in response to BMP9. Results are expressed as fold induction over the value obtained for the cells expressing wild-type (WT) ACVRL1 with the vehicle treatment. The data for wild-type ACVRL1 are same as those in Figure 2C. ** p < 0.01; *** p < 0.001 (Tukey's test).

Structural Prediction of ACVRL1-BMPR2 Interaction and Influence of Missense Variations in NANDOR and GS Domains
On the other hand, the NANDOR domain (aa 479-489) is located adjacent to the carboxy-terminus of ALK family receptors and plays a central role for the type I-type II receptor complex formation [8,49,50]. A ColabFold prediction indicated the interaction interface between the ACVRL1 kinase domain (aa 195-503) and BMPR2 kinase domain (aa 197-512), in which the R484 residue of ACVRL1 appeared to form salt bridges and close contact with D482 and D485 of BMPR2 ( Figure 4A,C). Mutations to this R484 residue would therefore be associated with ACVRL1 molecular dysfunction. Numerous different variations in R484 have been deposited in the ARUP database, such as those resulting in amino acid change to glutamine, glycine, leucine, proline and tryptophan, and Ricard et al. reported functional deficits of R484Q and R484W variants [8]. In this study, the R484L variant showed downregulation of BMP9-induced signal transduction activity while it kept statistical significance ( Figure 5A,B). In addition, we found that the R479P variant was completely defective in mediating the BMP9 action ( Figure 5A,B). R479 is thoroughly conserved among ACVRL1 proteins in various species (Table 2) as well as other ALK family receptors. We detected the R479P variation in an HHT family in addition to one published case [28] (Table 1), and two other variants, R479L and R479Q, were identified in HHT patients [8,26]. Although ColabFold did not predict that R479 was in direct contact with BMPR2, R479 likely plays an essential role in the ACVRL1 function through a different mechanism.
The other functional motif conserved in ALK family receptors is the GS domain (aa 173-201), which is phosphorylated by BMPR2 upon the BMP ligand binding and acts as a hub for the interaction with SMAD as well as FKBP12 [46]. Unlike the other two functional motifs, ColabFold was not able to show a stable structure of the GS domain when analyzed with BMPR2, SMAD1 and/or FKBP12, preventing us from estimating its importance. Concerning four GS domain variants tested in this paper, the mutations of the T197-equivalent residue (T200) in TGFBR1/ALK5 receptor eliminated TGFβ-induced signal transduction in cell culture experiments [51], but it was not known whether the other three residues were absolutely required for the function of ACVRL1 and other family receptors. Analysis using BRE-luciferase revealed that L193P and T197I variants entirely lacked the signal transduction capacity while D176Y and R200G kept it despite of their pathogenicity predicted by the computational classifiers ( Figure 5A,B, Table 2).

Detailed Characterization of Variants without Functional Defects and Responsiveness to Endoglin Co-Expression
As for D176Y and R200G variants (GS domain) as well as V205G and D235Y variants (kinase domain), we confirmed that they responded to smaller amounts of BMP9 and displayed a normal pattern of dose-dependency in BRE-luciferase assays ( Figure 6A). Their function was kept unaffected when different SMAD-dependent luciferase reporters were used, in contrast to the defects of other variants such as H280R ( Figure 6B,C), and SMAD1 phosphorylation in response to the BMP9 treatment was equivalent in the cells expressing wild-type ACVRL1 and these four variants ( Figure 6D). As previously described [43], the coexpression of endoglin with wild-type ACVRL1 significantly increased the responsiveness to BMP9 ( Figure 7A). In this experimental condition, D176Y, R200G, V205G and D235Y variants mediated normal up regulation of BMP9-induced BRE-luciferase activation by the endoglin co-expression ( Figure 7B). It is therefore suggested that these ACVRL1 variants can normally mediate SMAD-dependent signaling at least in our experimental systems.
It is worthwhile to mention that the endoglin co-expression significantly increased BMP9-induced BRE-luciferase activity mediated by some functionally defective variants (K229R, T265P and D348E) but decreased that mediated by the R484L variant ( Figure 7C), suggesting that ACVRL1 missense variants might have diverse characteristics concerning the endoglin effect. ness to BMP9 ( Figure 7A). In this experimental condition, D176Y, R200G, V205G D235Y variants mediated normal up regulation of BMP9-induced BRE-luciferase ac tion by the endoglin co-expression ( Figure 7B). It is therefore suggested that t ACVRL1 variants can normally mediate SMAD-dependent signaling at least in our ex imental systems.

Discussion
While previous studies analyzed molecular functions of ACVRL1 missense variants in cell culture experiments [8][9][10], they did not utilize in silico pathogenicity classifiers to compare the evaluation results. In this study, we performed computational and experimental analyses for the pathogenicity prediction of ACVRL1 variants that were recently reported for HHT patients ( Figure 1, Table 1). We used four in silico pathogenicity classifiers that evaluated the impact of amino acid replacement by the evolutionary conservation of protein sequence and the structural and chemical characteristics of amino acid residues involved in the substitution (Table 2) [7,11,13,41,42]. The extracellular region of ACVRL1 is less conserved across species and compared to other ALK family receptors, which may explain, at least in part, why multiple classifiers miscategorized V32E and T52P as benign, tolerated or neutral. Immunocytochemistry of these proteins clearly showed their pathogenicity due to the impairment of cell surface localization, indicating its usefulness for the variant classification (Figure 2A,B). The analyses using SMAD-dependent luciferase reporters confirmed their functional defects ( Figure 2C), providing strong evidence, PS3, in the ACMG criteria classification. Since the predictive accuracy of individual classifiers is relatively low, it is recommended to utilize a combination of in silico tools or an ensemble method, such as M-CAP and REVEL, to obtain improved performance [7,[52][53][54]. For example, both V32E and T52P were evaluated to be pathogenic by M-CAP. M-CAP was developed using public database information including patient profiles and results of variant functional analyses to optimize the accuracy of pathogenicity prediction, while it incorporates the evolutionary and structural data from the systems such as SIFT and PolyPhen-2 with its own ways of weighting [53]. On the premise of its accuracy, the usage of clinical and experimental information in in silico classifiers should help better pathogenicity prediction for particular variants when coupled with the evolutional and structural characteristics.
It was unexpected that D176Y, R200G, V205G and D235Y variants maintained high SMAD-dependent signal transduction capacity ( Figures 3B, 5B and 6). These amino acid residues are localized in the functional domains of ACVRL1 and are well conserved among species (Figure 1, Table 2). Four pathogenicity classifiers mostly evaluated them to be pathogenic (Table 2), and M-CAP did as well. All those variants have been observed in more than two independent HHT cases. The D176Y variant was reported by a German group [24] and was additionally found in our genetic analysis of a Japanese patient (Table 1). R200G and V205G were identified in the patients from three and two Japanese families, respectively [20]. While D235Y has not been described until this study, we detected it in two probands with a definite HHT diagnosis and confirmed the absence of ENG exonic variations by Sanger sequencing (H. Morisaki, unpublished observation). It is therefore suggested that these variations are important as a cause of disease. Although the lack of abnormalities in our experimental systems do not inevitably negate the involvement of SMAD signaling, it will be necessary to analyze other mechanisms of variant protein dysfunction including that related to non-SMADdependent protein kinase signaling [55][56][57]. Furthermore, it is important to note that the exonic variations may also lead to splicing defects [9,58]. An in silico analysis of 25 variants using MaxEntScan [59] suggested that the c.703G>T (D235Y) substitution could create an aberrant 5'-splice site in the middle of ACVRL1 exon 6. Future studies may well have to test such a possibility by performing minigene splicing experiments as well as characterizing the ACVRL1 mRNA splicing in patient tissues and cells.
Additionally, V32E, T265P, D437G and R484L variants kept low but statistically significant levels of SMAD-dependent signal transduction capacity while it was markedly down regulated compared to that of wild-type protein ( Figures 2C, 3B and 5B). The patients carrying one of these variations satisfied the definite Curaçao criteria [19,24,35,36] (Table 1), and we assume that these variations are pathogenic despite such residual activity. Remaining signaling activity in the variants does not necessarily corelate with the positive effects by the endoglin co-expression (Figure 7). Among these four variants, only T265P possessed the endoglin responsiveness and R484L even showed decreased SMAD-dependent signaling in the condition with endoglin co-expression. In contrast, K229R and D348E, which did not mediate BMP9-induced SMAD signal activation without the endoglin co-expression ( Figure 3B), displayed a significant response when co-expressed with endoglin. On the assumption that at least particular variant proteins may remain in patient cells without degradation, understanding diverse properties of different variants may help clarify fine pathophysiology in each patient.
ACVRL1 forms the receptor complex with BMPR2 and endoglin and also interacts with the ligands, functional modulators and effector proteins [2]. Although the monomer structure of ACVRL1 intracellular kinase domain was predicted by the homology modeling and determined by crystallography [8,60,61], how ACVRL1 interacts with its partners has not been analyzed. The ColabFold-based structural prediction confirmed the importance of the L45 loop and NANDOR domain for the association of ACVRL1 with SMAD1 and BMPR2, respectively ( Figure 4) [49,62]. Missense variants of these motifs showed apparent deficiency in SMAD-dependent signaling ( Figure 5B), which suggested the compromised protein interaction as a mechanism of abnormalities in the patients with those variations. In this study, however, we were not able to detect significant changes of protein complex structures when variant ACVRL1 proteins were used for the ColabFold analyses. Current in silico pathogenicity classifiers do not provide insight into the mechanisms of variant protein dysfunction, and so, it will be highly beneficial if we can utilize the computational prediction to estimate the impact of variations upon the single chain protein structure as well as the protein complex formation.
The present study evaluated the influence of ACVRL1 missense variations in its functional domains and motifs using computational pathogenicity classification, threedimensional structure prediction and experimental studies for cellular localization and signal transduction. The results in different analyses were sometimes inconsistent with each other, suggesting that a combinatory approach was all the more important to precisely estimate the pathogenicity of ACVRL1 missense variants found in HHT patients.  Informed Consent Statement: Written informed consent was obtained from all the patients described in Table 1.
Data Availability Statement: Information concerning the experimental data, materials and methods presented in this study are available on request from the corresponding author. The clinical information is not publicly available due to the ethical reason.