Whole Exome Sequence Analysis Provides Novel Insights into the Genetic Framework of Childhood-Onset Pulmonary Arterial Hypertension

Pulmonary arterial hypertension (PAH) describes a rare, progressive vascular disease caused by the obstruction of pulmonary arterioles, typically resulting in right heart failure. Whilst PAH most often manifests in adulthood, paediatric disease is considered to be a distinct entity with increased morbidity and often an unexplained resistance to current therapies. Recent genetic studies have substantially increased our understanding of PAH pathogenesis, providing opportunities for molecular diagnosis and presymptomatic genetic testing in families. However, the genetic architecture of childhood-onset PAH remains relatively poorly characterised. We sought to investigate a previously unsolved paediatric cohort (n = 18) using whole exome sequencing to improve the molecular diagnosis of childhood-onset PAH. Through a targeted investigation of 26 candidate genes, we applied a rigorous variant filtering methodology to enrich for rare, likely pathogenic variants. This analysis led to the detection of novel PAH risk alleles in five genes, including the first identification of a heterozygous ATP13A3 mutation in childhood-onset disease. In addition, we provide the first independent validation of BMP10 and PDGFD as genetic risk factors for PAH. These data provide a molecular diagnosis in 28% of paediatric cases, reflecting the increased genetic burden in childhood-onset disease and highlighting the importance of next-generation sequencing approaches to diagnostic surveillance.


Introduction
Pulmonary arterial hypertension (PAH) is an uncommon vascular disorder that remains incurable despite significant advances in current treatment regimens. PAH is characterised by obstruction and occlusion of the pulmonary arterioles leading to progressive pulmonary arterial pressure overload, right ventricular hypertrophy and failure of the right side of the heart [1]. Typical histopathological features of PAH include marked vascular remodelling of the pulmonary arterioles as a consequence of exuberant proliferation of pulmonary artery endothelial (PAEC) and smooth muscle (PASMC) cells. In families, PAH predominantly segregates as an autosomal dominant trait displaying features of complex disease, namely variable expressivity, reduced penetrance and a gender bias favouring females [2]. Despite major advances in delineating PAH aetiology, the associated morbidity and early mortality burden of this disease remains perniciously high, with an average life expectancy of 3-5 years from diagnosis [3].
Whole exome sequencing (WES) has been instrumental in the accelerated gene discovery in PAH, notably in genetic risk unrelated to the canonical BMP pathway. Moreover, these findings both expanded the mutation spectrum in cPAH and shed light on the epidemiology of this discrete disease entity. For example, mutations within the transcription factor genes T-box 4 (TBX4) and SRY-box 17 (SOX17) show enriched contribution to paediatric PAH compared to adult-onset, explaining up to 7.7% and 3.3% of paediatric cases, respectively [8,21,22]. Of note, SOX17 mutations show a strong correlation to PAH associated with congenital heart disease in both adult and paediatric disease [22,23]. More recent findings include loss-of-function variants in the ATP-binding cassette subfamily C member 8 (ABCC8) gene and in the endothelial cell-specific ligand BMP10 [24,25], whilst genome-wide gene burden approaches in predominantly adult cohorts identified the candidate risk genes ATPase 13A3 (ATP13A3), fibulin 2 (FBLN2) and platelet-derived growth factor D (PDGFD), among others [10,14]. These novel gene discoveries demonstrated the benefit of a granular approach to the genomic interrogation of discrete case cohorts.
While these data further developed the molecular architecture of PAH, the mutational landscape remains relatively poorly understood in paediatric cohorts, particularly for more recently reported genes [10,12]. Here, we sought to investigate a previously unsolved paediatric cohort, first, to better define the mutational background of cPAH, and second, to augment the existing body of data underlying this disease. Taken together, this study highlights the hitherto undefined role of ATP13A3 in the pathogenesis of paediatric PAH and, importantly, provides independent validation of putative gene-disease associations in BMP10 and PDGFD.

Patient Cohort and Clinical Assessment
Eighteen unrelated patients and their relatives were recruited from specialist PAH centres in the UK and the Netherlands. Ethical approval for the study was received from local ethics review boards and written informed consent was obtained from all participants or their legal guardian. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the St Thomas' Hospital Research Ethics Committee (08/H0802/32).
Clinical diagnoses were established by PAH specialists in accordance with the classification of the World Symposium on Pulmonary Hypertension [26]. All patients previously underwent conventional genetic screening for BMPR2, ACVRL1 and ENG, and were found to be mutation-negative [16,27]. Following pedigree analysis and clinical examination, patients were classified as having either idiopathic PAH (IPAH; n = 8), PAH associated with congenital heart disease (APAH-CHD; n = 7) or heritable PAH with one or more affected relatives (HPAH; n = 3).

Exome Sequencing
Genomic DNA was extracted from peripheral blood using standard methodologies. Whole exome sequencing for all probands was performed by Novogene Ltd. Briefly, exome libraries were captured using the SureSelect Human All Exon kit (Agilent Technologies, Santa Clara, CA, USA), and enriched fragments underwent paired-end sequencing on an Illumina platform. Quality control filtration of raw FASTQ files discarded read pairs containing adapter contamination and those with <80% of base calls achieving a Phred score of Q30. Sequence alignment to human reference genome assembly GRCh37 was undertaken using Burrows-Wheeler Aligner [28] and Picard MarkDuplicates (Genome Analysis Tool Kit (GATK), Broad Institute, Cambridge, MA, USA) was used to flag PCR duplicate reads for removal. GATK (v3.8) software was used for single nucleotide variant (SNV) and insertion and deletion (indel) detection [29]. Variant annotation was performed using ANNOtate VARiation (ANNOVAR) [30].
Candidate variants were visualised and inspected for artefacts using Integrative Genomics Viewer (IGV) to exclude false positives [35,36]. All identified variants were validated in an independent sample by Sanger sequencing (Eurofins Genomics, Ebersberg, Germany) and tested for familial co-segregation where DNA samples were available.

WES Analysis of the cPAH Cohort
Overall, 92% of base calls achieved a mean quality score of Q30. A mean depth of coverage of 65x (range: 54x-78x) was achieved for all samples, with an average of 91% (range: 86-94%) of bases achieving >20x coverage. Manual inspection of all PAH risk genes using IGV confirmed sufficient coverage of ≥10x across all exonic regions and intron-exon boundaries.
A targeted variant analysis of previously reported risk genes in our panel of childhood-onset IPAH, APAH and HPAH patients revealed putatively pathogenic missense variants in ABCC8, ATP13A3, BMP10, PDGFD and SMAD9 (Table 1). All variants were validated by Sanger sequencing. Based on our exclusion criteria, no other likely causal variants in the reported PAH risk genes were identified in these patients.

Novel Association of ATP13A3 with Paediatric APAH-CHD
We identified a novel missense variant in ATP13A3 (c.1148C>A, p.Thr383Lys) in a male child with APAH-CHD. To the best of our knowledge, this represents the first finding of a molecular defect in this risk gene in a paediatric case. The patient was diagnosed with PAH at nine years of age, associated with a secundum atrial septal defect ( Figure 1A). All in silico predictions were consistent with a high probability of pathogenicity (Table 1). Parental samples were not available to conduct familial segregation analysis.

Novel Association of ATP13A3 with Paediatric APAH-CHD
We identified a novel missense variant in ATP13A3 (c.1148C>A, p.Thr383Lys) in a male child with APAH-CHD. To the best of our knowledge, this represents the first finding of a molecular defect in this risk gene in a paediatric case. The patient was diagnosed with PAH at nine years of age, associated with a secundum atrial septal defect ( Figure 1A). All in silico predictions were consistent with a high probability of pathogenicity (Table 1). Parental samples were not available to conduct familial segregation analysis. While annotated as a missense variant, further analysis of the ATP13A3 c.1148C>A variant revealed that the mutation is located 3 bp from the intron-exon boundary and one nucleotide upstream of the highly conserved donor splicing motif of exon 12 (CAG|gtagga). An independent interrogation of in silico data using the NNSplice tool in MutationTaster revealed a potential disruption of the donor splice site, likely resulting in the formation of an aberrant gene transcript or premature stop codon leading to nonsense-mediated decay of the messenger RNA. However, further work is required on the cDNA level to experimentally validate this predicted deleterious effect.

Novel Mutations of BMP10, PDGFD and ABCC8 in Childhood IPAH
In the IPAH cohort we detected a BMP10 variant (c.247G>A, p.Glu83Lys) in a female Dutch patient diagnosed with severe disease at 36 months. The patient was stably treated on oral dual therapy (bosentan and sildenafil) but died at 18 years of age. This mutation, which is observed in 1/109,376 control individuals in gnomAD and was not previously associated with disease in a PAH clinical setting, provides the first independent confirmation of BMP10 as a PAH risk gene [25]. All While annotated as a missense variant, further analysis of the ATP13A3 c.1148C>A variant revealed that the mutation is located 3 bp from the intron-exon boundary and one nucleotide upstream of the highly conserved donor splicing motif of exon 12 (CAG|gtagga). An independent interrogation of in silico data using the NNSplice tool in MutationTaster revealed a potential disruption of the donor splice site, likely resulting in the formation of an aberrant gene transcript or premature stop codon leading to nonsense-mediated decay of the messenger RNA. However, further work is required on the cDNA level to experimentally validate this predicted deleterious effect.

Novel Mutations of BMP10, PDGFD and ABCC8 in Childhood IPAH
In the IPAH cohort we detected a BMP10 variant (c.247G>A, p.Glu83Lys) in a female Dutch patient diagnosed with severe disease at 36 months. The patient was stably treated on oral dual therapy (bosentan and sildenafil) but died at 18 years of age. This mutation, which is observed in 1/109,376 control individuals in gnomAD and was not previously associated with disease in a PAH clinical setting, provides the first independent confirmation of BMP10 as a PAH risk gene [25]. All bioinformatic predictions were consistent with a high likelihood of pathogenicity (Table 1). Familial co-segregation analysis confirmed the unaffected mother was wild-type but, as no DNA was available from the father, it was not possible to determine whether this variant had arisen de novo in the proband ( Figure 1B).
PDGFD is a newly identified risk gene in PAH [14]; this analysis offers the first independent validation of its causal association with PAH, as well as confirming a specific role in the development of paediatric disease through the identification of a rare missense variant (c.550G>A, p.Glu184Lys) in a female cPAH patient ( Figure 1C). The variant, which is not present in population control databases, gave a CADD score of 28.2 and was predicted to be deleterious by all three missense prediction algorithms (Table 1).
A heterozygous missense variant in ABCC8 (c.1069G>A, p.Val357Ile), novel in the context of PAH, was observed in a male patient diagnosed at 19 months of age ( Figure 1D). While the CADD score and MutationTaster prediction were strongly indicative of a deleterious impact, the conclusions of pathogenicity based on other in silico markers were ambiguous, likely due to the interchangeability of valine and isoleucine residues at this position across related proteins in lower organisms ( Table 1). Unavailability of parental DNA samples precluded familial segregation analysis of the PDGFD and ABCC8 variants.

Analysis of HPAH Samples Identifies a Potential Molecular Defect of SMAD9
An examination of WES data from the three HPAH probands revealed a missense variant in SMAD9 (c.1117G>A, p.Val373Ile) in a single family. Following a diagnosis of PAH, the proband died at five years of age. The affected amino acid residue displays high conservation across species, reflected in high deleterious scores across all in silico prediction algorithms employed (Table 1). Co-segregation was confirmed in two affected family members and an obligate carrier ( Figure 1E). In addition, the candidacy of this variant as a causal defect in PAH was underpinned by the same finding in an HPAH patient reported in the ClinVar database (SCV000384082). However, as an observed allele frequency of >0.0001 in European controls exceeded our threshold for inclusion, this was resultantly considered to be a variant of unknown significance (VUS) in the absence of compelling functional data.

Discussion
This study applied a targeted 26 gene analysis to WES data in a European paediatric cohort, elucidating candidate molecular mechanisms in five patients to further illuminate the genetic architecture of early-onset PAH. It must be noted that all candidate variants presented in this analysis were categorised as variants of unknown significance in accordance with American College of Medical Genetics and Genomics (ACMG) guidelines [37], including those that met our criteria as being highly deleterious. Evidence of variant pathogenicity derived from population frequency, bioinformatic predictions and the biological relevance of these genes to PAH pathology is considered herein.
While ATP13A3 was recently identified and functionally characterised as being causative in adult PAH [10], this study is the first to implicate heterozygous variation of ATP13A3 in paediatric-onset cases. The gene encodes a P-type ATPase for which the substrate specificity and biological role remain poorly defined [38], but is a known transmembrane protein that localises to endosomal compartments and is likely involved in polyamine transport, key molecules for cell growth and proliferation [39,40]. Recent functional work demonstrated that ATP13A3 mRNA is expressed in PASMCs, whilst loss of ATP13A3 inhibits endothelial cell proliferation and increases apoptosis, consistent with disease initiation models in PAH [10]. The novel c.1148C>A (p.Thr383Lys) variant identified in this study alters a highly conserved amino acid residue located within the actuator domain, which functions to dephosphorylate the catalytic phosphorylation domain (Figure 2). The variant is predicted to disrupt splicing and result in nonsense-mediated decay. Although molecular defects in the ATP13A3 gene clearly underlie susceptibility to PAH [10,12,20,41], the mechanism of pathogenicity remains to be defined. Further functional analyses are therefore warranted to elucidate whether ATP13A3 loss-of-function represents a molecular mechanism independent to the BMP signalling pathway, and thus a new therapeutic target.
Genes 2020, 11, x FOR PEER REVIEW 7 of 12 whether ATP13A3 loss-of-function represents a molecular mechanism independent to the BMP signalling pathway, and thus a new therapeutic target. and published reports [38,40]. Likely pathogenic PAH mutations (CADD ≥ 15) reported in the literature are indicated by the filled grey circles [10,12,20,41]. The c.1148C>A (p.T383K) mutation identified in this study is highlighted by the gold star and is located adjacent to a previously reported splice-region variant [10]. The WES analysis workflow in this study identified a novel BMP10 c.247G>A (p.Glu83Lys) missense variant with consistent in silico markers of pathogenicity. The residue is highly conserved across species and is located within the conserved TGF-beta propeptide domain, where a BMP10 nonsense mutation was previously identified in cPAH [25]. This observation supports accumulating evidence implicating BMP10 loss-of-function in PAH pathogenesis [25,42]. BMP10 encodes a high affinity ligand that activates the ALK1 and BMPR2 endothelial cell surface receptor within the BMP signalling pathway [43]. The gene product shares 65% amino acid sequence identity with its BMP9 paralogue, encoded by the GDF2 gene. GDF2 variant enrichment was recently reported in three large PAH cohorts, strongly implicating the BMP9/BMP10/ALK1/BMPR2 signalling pathway in PAH pathobiology [10,12,20]. Both in vitro and in vivo studies propose a model where BMP9 and BMP10 act interchangeably and with compensatory functionality via ALK1 to regulate endothelial cell migration and proliferation, maintaining endothelial quiescence [44,45]. Based on evidence of sequence homology, mature peptide structural similarities and reported functional redundancy of these gene products, it is proposed that the BMP10 missense variant reported here may have a similar pathogenic mechanism to those observed in GDF2; further exploration on both the genetic and functional level is now warranted.
PDGFD missense variants were recently associated with PAH pathogenesis in a case-control analysis of a large, combined cohort [14]. The identification of a novel c.550G>A (p.Glu184Lys) variant of this gene in the cPAH panel analysed here serves to establish PDGFD as a contributing factor to childhood disease and provides independent validation of the discovery study. PDGF family signalling pathways play important roles in vascular pathologies, including mediation of remodelling processes that are known to occur in PAH [46]. The plausibility of this gene contributing to PAH risk is further enhanced by cell-type-specific gene expression studies using single cell RNAseq data, demonstrating high expression in endothelial cells within heart and lung tissue, a key site of disease potentiation in PAH [14]. Of interest, PDGFD presents an exciting drug-targeting candidate, with studies of the tyrosine kinase inhibitor imatinib, a known pharmacological inhibitor  [38,40]. Likely pathogenic PAH mutations (CADD ≥ 15) reported in the literature are indicated by the filled grey circles [10,12,20,41]. The c.1148C>A (p.T383K) mutation identified in this study is highlighted by the gold star and is located adjacent to a previously reported splice-region variant [10]. Numbers indicate amino acid positions at each end of the 10 transmembrane domains. The red triangles denote essential asparagine residues (D498: active catalytic site; D883, D887: Mg 2+ binding sites). A domain: actuator domain; C: carboxyl terminus; N domain: nucleotide binding domain; NH 2 : amino terminus; P domain: phosphorylation domain.
The WES analysis workflow in this study identified a novel BMP10 c.247G>A (p.Glu83Lys) missense variant with consistent in silico markers of pathogenicity. The residue is highly conserved across species and is located within the conserved TGF-beta propeptide domain, where a BMP10 nonsense mutation was previously identified in cPAH [25]. This observation supports accumulating evidence implicating BMP10 loss-of-function in PAH pathogenesis [25,42]. BMP10 encodes a high affinity ligand that activates the ALK1 and BMPR2 endothelial cell surface receptor within the BMP signalling pathway [43]. The gene product shares 65% amino acid sequence identity with its BMP9 paralogue, encoded by the GDF2 gene. GDF2 variant enrichment was recently reported in three large PAH cohorts, strongly implicating the BMP9/BMP10/ALK1/BMPR2 signalling pathway in PAH pathobiology [10,12,20]. Both in vitro and in vivo studies propose a model where BMP9 and BMP10 act interchangeably and with compensatory functionality via ALK1 to regulate endothelial cell migration and proliferation, maintaining endothelial quiescence [44,45]. Based on evidence of sequence homology, mature peptide structural similarities and reported functional redundancy of these gene products, it is proposed that the BMP10 missense variant reported here may have a similar pathogenic mechanism to those observed in GDF2; further exploration on both the genetic and functional level is now warranted.
PDGFD missense variants were recently associated with PAH pathogenesis in a case-control analysis of a large, combined cohort [14]. The identification of a novel c.550G>A (p.Glu184Lys) variant of this gene in the cPAH panel analysed here serves to establish PDGFD as a contributing factor to childhood disease and provides independent validation of the discovery study. PDGF family signalling pathways play important roles in vascular pathologies, including mediation of remodelling processes that are known to occur in PAH [46]. The plausibility of this gene contributing to PAH risk is further enhanced by cell-type-specific gene expression studies using single cell RNA-seq data, demonstrating high expression in endothelial cells within heart and lung tissue, a key site of disease potentiation in PAH [14]. Of interest, PDGFD presents an exciting drug-targeting candidate, with studies of the tyrosine kinase inhibitor imatinib, a known pharmacological inhibitor of PDGF signalling, demonstrating 15-fold reductions in PDGFD expression in cardiac tissue [47] and suppression of medial thickening via receptor inhibition in patient-derived in vitro PASMC models [48].
The novel c.1069G>A (p.Val357Ile) missense variant identified in ABCC8 describes an amino acid substitution in the highly conserved transmembrane domain 1 of the mature and functionally important SUR1 product. This finding supports previous investigations that indicate pathogenic variation in ABCC8 underlies 2-5% of PAH cases, including 10 previous reports of paediatric-onset cases [8,12,24]. The gene encodes the SUR1 subunit of K ATP potassium channels located in cardiac smooth muscle cells. Of note, the variant identified in this study was previously reported to cause hyperinsulinism in compound heterozygosity with a second ABCC8 allele, although without any clinical signs of PAH [49]. Whole cell electrophysiology and radium efflux assays of reported missense variants have demonstrated loss of K ATP channel activity [24]. Taken together, these data suggest that loss of SUR1-dependent channel function represents a pathogenic mechanism in PAH, although the primary physiological role remains unknown. These findings are of clinical pertinence, as ABCC8 is considered a possible therapeutic target wherein impaired K ATP activity could be pharmacologically rescued in vitro using diazoxide, a selective potassium-channel opening drug [24]. This may have further implications for the development of diazoxide-induced pulmonary hypertension in children with hyperinsulinaemic hypoglycaemia [50,51].
The SMAD9 c.1117G>A (p.Val373Ile) variant detected in an HPAH case in this study is predicted to be deleterious by in silico analysis, further supported by familial co-segregation and its location with the C-terminal Mad Homology 2 (MH2) domain, a region of extensive evolutionary conservation. SMAD1, SMAD4 and SMAD9 encode signalling intermediaries that act as downstream mediators of the BMPR2 pathway. Variants in SMAD9 were previously identified in three patients [17,52,53], and were recently validated in both adult-onset and paediatric PAH [12,20,22]. Functional analyses of the impact of SMAD9 variants on SMAD-mediated signalling demonstrated impaired responses to ligand stimulation and reduced transcriptional activation of the downstream BMP target gene, ID2 [17,52]. Although this variant was deemed a VUS based on our minor allele frequency threshold boundary, it is noteworthy that it was previously detected in an independent patient with PAH. This speaks to the challenges of setting an appropriate MAF cut-off when investigating rare diseases, especially in the context of notable reduced penetrance. Whereas most studies consider an approximation of disease allele frequency equivalent to population disease prevalence, inheritance-based calculations derived from Hardy-Weinberg equilibrium with adjustment for nonpenetrant alleles may be a more appropriate method to avoid false negative findings.

Conclusions
Taken together, these findings provide data supporting wider locus heterogeneity of paediatric PAH than previously reported and expand the allelic series of variation in known cPAH genetic risk factors. This study supports the inclusion of these genes in screening panels in future work for diagnostic surveillance of paediatric cases. In particular, we provide independent validation of the recently identified risk factors BMP10 and PDGFD, and the first report of an ATP13A3 variant in a paediatric case, warranting further analysis of understudied cellular pathways key to PAH pathogenesis. Notably, although novel gene detection was beyond the scope of this study in the context of extensive locus heterogeneity, this report emphasises the importance of the rigorous analysis of well-defined case cohorts to delineate the molecular basis of cPAH as a framework for future gene discovery initiatives.

Conflicts of Interest:
The University Medical Center Groningen contracts with Actelion, GSK and Lilly for steering committee and advisory board activities of Berger. All other authors declare no conflicts of interest. The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript or in the decision to publish the results.