Contribution of LRP1 in Human Congenital Heart Disease Correlates with Its Roles in the Outflow Tract and Atrioventricular Cushion Development

Due to the prevalence of congenital heart disease (CHD) in the human population, determining the role of variants in CHD can give a better understanding of the cause of the disorder. A homozygous missense mutation in the LDL receptor-related protein 1 (Lrp1) in mice was shown to cause congenital heart defects, including atrioventricular septal defect (AVSD) and double outlet right ventricle (DORV). Integrative analysis of publicly available single-cell RNA sequencing (scRNA-seq) datasets and spatial transcriptomics of human and mouse hearts indicated that LRP1 is predominantly expressed in mesenchymal cells and mainly located in the developing outflow tract and atrioventricular cushion. Gene burden analysis of 1922 CHD individuals versus 2602 controls with whole-exome sequencing showed a significant excess of rare damaging LRP1 mutations in CHD (odds ratio (OR) = 2.22, p = 1.92 × 10−4), especially in conotruncal defect with OR of 2.37 (p = 1.77 × 10−3) and atrioventricular septal defect with OR of 3.14 (p = 0.0194). Interestingly, there is a significant relationship between those variants that have an allele frequency below 0.01% and atrioventricular septal defect, which is the phenotype observed previously in a homozygous N-ethyl-N-nitrosourea (ENU)-induced Lrp1 mutant mouse line.


Introduction
LDL receptor-related protein 1 (LRP1) is a member of the LDL receptor family and is a multifunctional receptor that binds to multiple ligands; LRP1 plays important roles in orchestrating different cellular and molecular functions. We previously identified a homozygous missense mutation in Lrp1 in mice that causes congenital heart defects (CHDs), including atrioventricular septal defect (AVSD) and double outlet right ventricle (DORV) [1]. We provided evidence that LRP1 function in cardiac neural crest cells is required for normal outflow tract (OFT) alignment and atrioventricular cushion (AVC) development. To explore the potential mechanism by which LRP1 mediates cardiovascular diseases, expression patterns in adult human tissues, human embryos, and mouse embryos were investigated.
To investigate the role of LRP1 in human CHD, a large patient cohort of 1922 CHD cases from the Pediatric Cardiac Genomics Consortium (PCGC) [2] and the UPMC Children's Hospital of Pittsburgh (CHP), plus 2602 controls from the Alzheimer's Disease Sequencing Project (ADSP) [3], were analyzed. We accessed whole-exome sequencing data and conducted gene-burden analysis to study the association of rare putative damaging variants (PDVs) (resulting in nonsense, start-loss, splice-site, frameshift indel, non-frameshift indel, and missense disruption) in LRP1 with CHD phenotypes. A missense variant was classified as putative damaging (D_Mis) if it was called likely damaging by at least 4 of 9 prediction algorithms. Published single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics data were employed to uncover the spatially resolved and cellular expression of LRP1 in mouse [4] and human embryonic hearts [5].

Analysis of Publicly Available Single-Cell RNA Sequencing Data
Publicly available raw single-cell RNA sequencing (scRNA-seq) data of mouse embryonic hearts from E8.5 to E10.5 were downloaded from the NCBI GEO database under accession number GSE76118 [4]. Expression levels were quantified using RSEM v1.3.3. We generated a cell × gene transcript per million (TPM) matrix at the gene level after aggregating the expression of all cells together. Expression of all genes in the LRP1 cluster was combined to generate the expression of LRP1. Downstream analyses were performed as described previously [5,6].
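RSEM reports TPM values directly, so the following is only a minimal sketch of what a TPM value represents and of how a cell × gene matrix could be assembled from per-cell counts. The function names, the input layout (one list of counts per cell, one gene length in kilobases per gene), and the handling of empty cells are assumptions made for illustration and are not taken from the study pipeline.

```python
def counts_to_tpm(counts, lengths_kb):
    """Convert per-gene read counts for one cell into TPM values.

    counts:     read (or expected) counts, one per gene
    lengths_kb: gene lengths in kilobases, same order as counts
    """
    if len(counts) != len(lengths_kb):
        raise ValueError("counts and lengths_kb must have the same length")
    # Step 1: normalize each gene by its length (reads per kilobase).
    rpk = [c / l for c, l in zip(counts, lengths_kb)]
    total = sum(rpk)
    if total == 0:
        # Assumption: a cell with no reads gets an all-zero profile.
        return [0.0] * len(rpk)
    # Step 2: rescale so that the values of one cell sum to one million.
    return [r / total * 1e6 for r in rpk]


def tpm_matrix(counts_by_cell, lengths_kb):
    """Build a cell x gene TPM matrix as a dict: cell id -> list of TPM values."""
    return {cell: counts_to_tpm(counts, lengths_kb)
            for cell, counts in counts_by_cell.items()}
```

Because each cell is rescaled to the same total, TPM values are comparable between cells for the same gene, which is what allows expression to be aggregated across cells.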

Human Study Participants
All data access requests and human studies were approved by the Institutional Review Board of the University of Pittsburgh School of Medicine and the UPMC Children's Hospital of Pittsburgh (CHP). All consenting participants were approved by the relevant review committees. The personal identities of the study participants were encrypted and secured in accordance with approved guidelines and regulations. This research was partly supported by the University of Pittsburgh Center for Research Computing through computing resources provided. We analyzed whole-exome sequencing (WES) data from 471 CHD patients from the UPMC Children's Hospital of Pittsburgh (Pitt), 1451 CHD patients from the Pediatric Cardiac Genomics Consortium (PCGC) [2], and 2602 controls of European ancestry from the Alzheimer's Disease Sequencing Project (ADSP) [3]. To investigate phenotype-specific effects, patients were grouped into those with conotruncal defect (CTD), left ventricular outflow tract obstruction (LVOTO), and atrioventricular septal defect (AVSD).

Recovery of Rare Predicted Pathogenic LRP1 Variants
For Pitt subjects, whole-exome sequencing (WES) was carried out on Illumina HiSeq2000 (BGI genomics, Cambridge, USA) with 100 bp paired-end reads at 80-100× coverage using Agilent V4 or V5 exome capture kit (Agilent, Santa Clara, CA, USA). For samples obtained from the PCGC (dbGaP phs001194.v2.p2) and healthy control samples obtained from the ADSP (NG00067.v2), SRA files were downloaded from the NCBI SRA database and converted to FASTQ files using SRA-toolkit (BIOWULF, Bethesda, MD, USA). BWA-MEM [7] was used to align reads in FASTQ files to the human reference genome GRCh38. BAM files were further processed using GATK4 Best Practices workflows [8]. The intersection of the WES capture kit intervals used to sequence each cohort was taken, and single nucleotide variants (SNVs) and small indels (InDels) were detected individually using GATK HaplotypeCaller (BIOWULF, Bethesda, MD, USA) and jointly called using GATK GenotypeGVCFs (BIOWULF, Bethesda, MD, USA). Further quality filtering was applied using bcftools 1.9 [9] and qctool 2.0.6. High-quality variants were recovered that: (1) have excess heterozygosity p-value > 3.4 × 10−6; (2) passed GATK Variant Quality Score Recalibration (VQSR) with 99.95% sensitivity; (3) have SNV or indel genotype quality ≥ 20 or ≥ 60, respectively; (4) are SNVs or InDels not within 10 bp or 5 bp of an indel, respectively; (5) have missing rate < 10% and differential missingness p-value > 10−6; and (6) have control HWE p-value > 10−6. Variants were annotated using Ensembl VEP v102 [10] with variant identifiers, gene symbol in NCBI RefSeq v109 [11], the variant consequence for the most severely affected transcript, allele frequency in gnomAD exomes v2.1.1 [12], ClinVar [13] significance, and variant deleteriousness predictors such as SIFT [14] and PolyPhen [15]. Phred-scaled CADD scores [16] were obtained from CADD v1.6. Only variants identified in LRP1 were used for this analysis.
Samples with a FREEMIX score [17] greater than 0.075 were considered contaminated and removed before filtering. Samples with missingness greater than 10% and outliers in the number of variants present were removed before analysis. To remove pairs with cryptic relatedness, one sample was removed for each pair found to be related by pedigree or KING kinship analysis [18] (PLINK, cutoff = 0.09375 for second-degree relatives), and samples with 5 or more relationships were removed. Principal component analysis (PCA) was performed using genotypes of common variants with AF > 0.05 in PLINK 1.9 [19] to identify samples in the PCGC and ADSP cohorts with European ancestry similar to CHP. A total of 471 Pitt cases, 1451 PCGC cases, and 2602 ADSP controls passed sample-level filtering. In addition, only protein-altering variants were retained for analysis, including predicted loss-of-function (LoF) mutations (nonsense, canonical splice-site, frameshift indels, and start loss), inframe indels, and predicted damaging missense mutations (D_Mis). As many missense variants are tolerated and would affect the degree of enrichment of pathogenic variants in a case cohort, only predicted D_Mis called likely pathogenic by at least 4 of 9 prediction algorithms (SIFT, Polyphen2_HDIV, LRT, MutationTaster, MutationAssessor, FATHMM, PROVEAN, MetaSVM, M_CAP) were kept for downstream analyses (Supplementary Table S1).
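The six variant-level criteria listed above can be read as a single pass/fail predicate. The sketch below is illustrative only: the `VariantQC` record and its field names are hypothetical, the thresholds are copied from the text, and "not within N bp" is assumed to mean a distance strictly greater than N. In the study itself these filters were applied with bcftools and qctool, not with this code.

```python
from dataclasses import dataclass


@dataclass
class VariantQC:
    """Hypothetical per-variant summary; field names are illustrative."""
    is_indel: bool
    excess_het_p: float            # excess heterozygosity p-value
    passed_vqsr: bool              # passed VQSR at the chosen sensitivity tranche
    genotype_quality: float        # genotype quality used for criterion (3)
    dist_to_nearest_indel_bp: int  # distance to the nearest (other) indel
    missing_rate: float            # fraction of samples with a missing genotype
    diff_missing_p: float          # case/control differential missingness p-value
    control_hwe_p: float           # Hardy-Weinberg p-value in controls


def passes_qc(v: VariantQC) -> bool:
    """Return True if the variant meets criteria (1) to (6) from the text."""
    min_gq = 60 if v.is_indel else 20    # criterion (3)
    min_dist = 5 if v.is_indel else 10   # criterion (4), assumed strict
    return (
        v.excess_het_p > 3.4e-6                   # (1)
        and v.passed_vqsr                         # (2)
        and v.genotype_quality >= min_gq          # (3)
        and v.dist_to_nearest_indel_bp > min_dist # (4)
        and v.missing_rate < 0.10                 # (5a)
        and v.diff_missing_p > 1e-6               # (5b)
        and v.control_hwe_p > 1e-6                # (6)
    )
```

Expressing the filters as one predicate makes the differing SNV and indel thresholds explicit, which is the part of the criteria most easily misread in prose.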
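The relatedness step can be sketched as a simple pruning procedure. This is an assumed reading of the text rather than the pipeline actually used: kinship coefficients are taken as already computed (for example by KING via PLINK), a pair is treated as related when its coefficient is at or above the stated cutoff, and the choice of which member of a pair to drop (here, the second one listed) is arbitrary because the text does not specify it. The function name and input format are hypothetical.

```python
from collections import defaultdict

KINSHIP_CUTOFF = 0.09375  # cutoff for second-degree relatives, from the text


def prune_related(samples, kinship_pairs, max_relationships=5):
    """Return the samples that remain after relatedness pruning.

    samples:       list of sample ids
    kinship_pairs: iterable of (sample_a, sample_b, kinship_coefficient)
    """
    related = [(a, b) for a, b, k in kinship_pairs if k >= KINSHIP_CUTOFF]

    # Count how many related partners each sample has.
    degree = defaultdict(int)
    for a, b in related:
        degree[a] += 1
        degree[b] += 1

    # Samples with 5 or more relationships are removed outright.
    removed = {s for s, d in degree.items() if d >= max_relationships}

    # For every related pair still intact, drop one member.
    for a, b in related:
        if a in removed or b in removed:
            continue
        removed.add(b)  # assumption: which member is dropped is arbitrary

    return [s for s in samples if s not in removed]
```

Removing highly connected samples first tends to preserve more of the cohort, since one removal resolves several related pairs at once.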

Gene-Based Burden Testing
The Genome Aggregation Database (gnomAD) exome v2.1.1 (125,748 exomes) or ExAC (60,706 exomes) databases integrating large-scale exome sequencing projects with variable ancestry backgrounds were used as controls for gene burden analysis, similar to our previously published studies [6]. We tested whether there is a significant excess of LRP1 rare damaging variants in a case cohort compared to the control cohort (gnomAD exome v2.1.1) using only high-confidence pathogenic variants as described above. For the control cohort (gnomAD exome v2.1.1), only LRP1 variants with high-quality calls (PASS filter value) and with coverage at >10× in >90% of samples were retained for downstream analyses. The rare predicted pathogenic LRP1 variants were extracted as described above. The total number of alleles evaluated in LRP1 was taken as the median of the allele numbers recovered for all rare damaging LRP1 variants as previously described [20,21]. Fisher's exact test was used to estimate the p-value and the odds ratio (OR) with 95% confidence intervals, and significance was assessed against the Bonferroni-corrected threshold [10,22]. Similar burden analyses were conducted for rare synonymous variants in LRP1, which are not expected to be disease-related. This showed a significant increase in burden in cases vs. controls (Supplementary Table S2).
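As an illustration of the burden test, the sketch below implements a two-sided Fisher's exact test on a 2 × 2 table of allele counts, together with the median allele number used as the denominator. It is written against the Python standard library only and is not the code used in the study. Two points are assumptions: the odds ratio is the simple sample estimate (ad/bc), and the 95% interval is the Woolf (log-normal) approximation, whereas statistical packages often report a conditional estimate and exact interval instead.

```python
from math import exp, lgamma, log, sqrt
from statistics import median


def total_alleles(allele_numbers):
    """Median allele number across the qualifying variants of a gene."""
    return int(median(allele_numbers))


def _log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    a, b: case alleles with / without a qualifying variant
    c, d: control alleles with / without a qualifying variant
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denominator = _log_choose(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x qualifying alleles among cases.
        return exp(_log_choose(row1, x) + _log_choose(row2, col1 - x)
                   - log_denominator)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(p for p in map(prob, range(lowest, highest + 1))
                  if p <= observed * (1 + 1e-7))
    return min(1.0, p_value)


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Sample odds ratio with an approximate (Woolf) 95% interval."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)
```

With several phenotype groups tested, a Bonferroni-corrected threshold would simply be the nominal level divided by the number of tests, for example 0.05 / 4 for four groups; the number of tests applied in the study is not stated in this section.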

LRP1 Is Expressed in the Developing Cardiac OFT and AVC
Analyzing published single-cell RNA data in mouse embryos [4] and LRP1 immunostaining (Figure 1A) showed that Lrp1 is predominantly expressed in OFT and AVC (Figure 1B), consistent with our previous discoveries that Lrp1 is expressed in the developing AVC, developing ventricle, atria, and OFT in the E10.5 mouse heart [1]. By analyzing two independent published single-cell RNA datasets, we found that LRP1 is highly expressed in developing human [5] hearts. Spatial transcriptomics of the developing human heart demonstrated LRP1 is expressed in mesenchymal cells beginning at post-conception week (PCW) 4.5 to 5 and highly expressed in the developing outflow tract (OFT) at PCW 6.5, spanning the time of great arteries development [5] (Figure 2A,B).

LRP1 Is Highly Expressed in the Human Aorta
We used the GTEx database to investigate the transcriptional expression of LRP1 in a wide range of human adult tissues, including the heart (aorta, ventricle, atria) [23,24]. Examining all the tissues from European-American individuals in the GTEx database demonstrated the highest expression of LRP1 in the human aorta (Figure 3A). This finding is consistent with the LRP1 expression in the outflow tract of human [5] and mouse hearts [1,4]. Analysis of scRNA-seq data of human adult aorta [25] demonstrated that LRP1 is highly expressed in fibroblasts, mesothelial cells, and vascular smooth muscle cells (VSMCs) (Figure 3B), consistent with the previous discovery that LRP1 maintains arterial integrity [26].

Nonsynonymous, Rare, and Putative Damaging Variants in LRP1 Are Significantly Associated with CHD
A total of 1922 subjects with CHD (comprising subjects from PCGC [2] and CHP) and 2602 control subjects from the Alzheimer's Disease Sequencing Project (ADSP) (NIAGADS) [3] as a control for background population variation were analyzed. Healthy controls from the ADSP were treated as the control group in our study. The gnomAD database also employs whole-exome sequencing of healthy samples from ADSP as a gnomAD control subset [3]. In ADSP, cognitively healthy controls were selected with the goal of identifying alleles associated with the increased risk of or protection from late-onset Alzheimer's disease. All potential controls were at least 60 years old and were either judged to be cognitively normal or did not meet pathological criteria for Alzheimer's disease following brain autopsy [27]. Further, human exome data from the Exome Aggregation Consortium (ExAC) database of >60,000 individuals [10] with and without CHD showed that the pLI score (indicating the likelihood that a gene is intolerant to a loss-of-function mutation) and the Z score for missense mutation of LRP1 are 1 and 8.25, respectively. This demonstrates that LRP1 is highly intolerant to loss-of-function and missense mutations, confirming that this gene is essential for human viability. This shows that rare nonsynonymous variants in LRP1 affect cardiac formation and provides important mechanistic insights into gene function and protein domains.
The GTEx database contains RNA-seq datasets of 54 adult human tissues, organs, or cell lines. In Figure 2C, we removed three cell lines and kept only 51 human tissues/organs.

Rare Potentially Pathogenic Variants (PPVs) Are Enriched in CHD
As many missense variants are neutral and would affect the degree of enrichment of pathogenic variants, we filtered variants in all CHD cohorts (PCGC [2] and CHP) and the NIAGADS [3] control cohort using the following criteria: (1) MAF in gnomAD v2.1.1 exomes < 0.0001; (2) missense variants had to be predicted deleterious by at least 4 of 9 prediction algorithms (SIFT, Polyphen2_HDIV, LRT, MutationTaster, MutationAssessor, FATHMM, PROVEAN, MetaSVM, M_CAP) to be retained for downstream analyses; and (3) loss-of-function variants were kept for downstream analyses. PPVs in LRP1 were identified in 58 unrelated individuals with various CHDs. These patients harbor rare PDVs in LRP1 with a minor allele frequency <0.01% (Figure 4, Supplemental Table S3).
Gene burden analysis showed that rare potential damaging missense variants in LRP1 are the main contributor to CHD rather than loss of function (LoF) (Supplemental Table S2). We observed a significant excess of rare damaging LRP1 variants (missense variants + loss of function) in all CHD (OR = 2.22, p = 0.000192), conotruncal defects (CTDs) (OR = 2.37, p = 0.00177), left ventricular outflow tract obstructions (LVOTO, OR = 1.86, p = 0.0307), and AVSD (OR = 3.14, p = 0.0194) compared with controls (Figure 5A,B, Supplemental Table S4). Similar association results of rare LRP1 D_Mis variants in CHD were found. We observed that conotruncal defects have the highest odds ratios in relation to rare LRP1 variants compared with controls. These observations held when we analyzed the rare potentially damaging missense variants in LRP1 associated with CHD (p = 0.000226), CTD (p = 0.00309), LVOTO (p = 0.0341), and AVSD (p = 0.0121) (Supplementary Table S5). The results are similar to those for rare damaging LRP1 variants combining missense and loss-of-function variants (Figure 5, Supplementary Tables S4 and S5). Gene burden analysis also demonstrated a significant excess of rare LRP1 PDVs. The highest OR (3.24) was observed in Tetralogy of Fallot, in line with the observation that LRP1 is highly expressed in adult aorta across 51 human adult tissue sites from the Genotype-Tissue Expression (GTEx) database [23,24] as well as scRNA-seq data [25] (Figures 2 and 3). The statistical analysis compared LRP1 rare potential damaging variants across different congenital heart defects, using the normal controls from ADSP/NIAGADS [28] as the control reference.
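The three filtering criteria above amount to a short decision rule per variant, sketched below. The record layout (a dict with `gnomad_af`, `consequence`, and `predictions` keys), the consequence labels, and the function names are hypothetical and chosen only for illustration. Treating a variant that is absent from gnomAD as having frequency 0 is also an assumption. Predictor names follow the list in the text.

```python
PREDICTORS = (
    "SIFT", "Polyphen2_HDIV", "LRT", "MutationTaster", "MutationAssessor",
    "FATHMM", "PROVEAN", "MetaSVM", "M_CAP",
)

# Loss-of-function consequence labels (hypothetical vocabulary).
LOF_CONSEQUENCES = {"nonsense", "splice_site", "frameshift", "start_loss"}


def is_damaging_missense(predictions, min_votes=4):
    """True if at least min_votes of the 9 predictors call the variant damaging.

    predictions: dict mapping predictor name -> True (damaging) / False / None
    """
    votes = sum(1 for name in PREDICTORS if predictions.get(name) is True)
    return votes >= min_votes


def is_rare_damaging(variant, max_af=1e-4):
    """Apply criteria (1) to (3): rare, and either LoF or damaging missense."""
    # Assumption: a variant not observed in gnomAD is treated as AF = 0.
    allele_frequency = variant.get("gnomad_af") or 0.0
    if allele_frequency >= max_af:
        return False
    consequence = variant["consequence"]
    if consequence in LOF_CONSEQUENCES:
        return True
    if consequence == "missense":
        return is_damaging_missense(variant.get("predictions", {}))
    return False
```

Requiring agreement from several predictors, rather than any single one, is what keeps tolerated missense variants from diluting the case-versus-control comparison.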

Distribution of Rare Damaging Variants in LRP1
Distinct rare putative damaging variants (PDVs) from CHD cohorts mapped along the protein sequence identify putative hotspots of pathogenic damaging mutations. Most PDVs are in the extracellular domain, rarely in the transmembrane or cytoplasmic domains (Figure 4, Supplementary Table S3). These damaging variants include one in a calcium-binding domain (R3014L) and one at an N-glycosylation site (L4074F). N-glycosylation occurs when sugars are added to the nascent polypeptide chain in the endoplasmic reticulum. After cleavage of glucose and mannose residues, the LRP1 protein is transferred to the Golgi apparatus. Two stop mutations, E2920X and E3802X, are of interest as they remove the C-terminal domain with the NPxY motif essential for clathrin-mediated internalization, which would disrupt the endocytic recycling of surface receptors [29]. We note that most of the missense mutations in the human genome, including those in LRP1, are heterozygous, indicating they are likely dominant mutations causing gain of function or dominant-negative loss of function. We also identified five patients with compound heterozygous mutations if we used allele frequency <0.01 (Supplementary Table S6). In addition, a diverse array of CHD phenotypes is observed with the LRP1 mutations (Figure 4, Supplementary Table S3), likely a reflection of the modifying effects of the genetic background of each patient.

LRP1 Variants in Patients with AVSD Are Significantly Associated with Their Cases
Previous work has shown that a homozygous missense mutation in Lrp1 in mice results in AVSD and DORV [1]; we sought to determine if a significant relationship exists between LRP1 variants and AVSD and DORV in the CHP and PCGC populations. Of the 1922 patients with CHD, 142 have AVSD. The missense variants in these patients are significantly associated with AVSD (p = 0.0194). Of the 1922 patients, 111 have DORV. There is no significant relationship between variants with an allele frequency under 0.01% and DORV.

Discussion
Data from projects such as the ExAC provide evidence that rare protein-altering variation is far more common in the general population than was previously appreciated. By using the 2602 healthy controls from the Alzheimer's Disease Sequencing Project (ADSP) (NIAGADS) as a control for background population variation, we demonstrated a significant association of rare, potentially damaging LRP1 variants with CHD, especially CTD. The significant enrichment of rare LRP1 variation in the CHD cohort constitutes evidence of pathogenicity. LRP1 is expressed in the OFT and AVC in developing humans and mice and is highly expressed in the human aorta. The association of rare damaging LRP1 variants with CHD, especially with conotruncal anomalies, is consistent with the observations of the expression of LRP1 in the heart at single-cell resolution.
LRP1 is an endocytic trafficking protein. LRP1 is expressed as a 600 kDa precursor cleaved by furin, resulting in a 515 kDa extracellular ligand binding α-chain and a noncovalently bound 85 kDa membrane-bound cytoplasmic β-chain [30]. LRP1 is recognized as a multifunctional receptor that binds to multiple ligands and plays important roles in orchestrating different cellular and molecular processes: (1) as a scavenger receptor, it internalizes multiple extracellular ligands; (2) as a regulatory receptor, it regulates cellular signaling in response to extracellular stimuli; and (3) as a scaffold receptor, it can partner with and modulate the activity of other membrane proteins such as integrins, bone morphogenetic protein 4 (BMP4), and receptor tyrosine kinases [31]. The importance of Lrp1 is demonstrated by the early lethality of Lrp1 gene deletion, as it arrests mouse embryo development at an early stage [32]. The endocytic function and signaling properties confer a major role to LRP1 in the pathophysiology of numerous diseases such as hepatic steatosis, pulmonary hypertension, kidney fibrosis, acute respiratory distress syndrome, Alzheimer's disease, atherosclerosis, and left ventricular modulation after acute myocardial infarction [31]. We have identified a role of LRP1 in the pathophysiology of CHD by uncovering a mutant mouse line, 1554 (MGI 96828), that results from a missense (C4232R) mutation in the region encoding the epidermal growth factor (EGF) repeat domain located in the β-chain [1,33]. In this study, we observe an association between rare, potentially damaging LRP1 variants and human CHDs. These variants are located in different domains/regions of LRP1 protein with potentially damaging interaction with the known pathways associated with CHD pathogenesis, such as BMP4 [34], Notch [35,36,37], and WNT [38,39,40] pathways.

Conclusions
We reported rare protein-altering variants in LRP1 implicated in CHD in a large cohort and identified a significant association with different subtypes of CHD, including CTD, LVOTO, and AVSD. The contribution of rare damaging LRP1 variants to CHD may improve the diagnostic yield of sequencing by indicating whether an uncharacterized LRP1 variant identified in an individual with CHD is pathogenic, and it is informative regarding the clinical heritability of variation in LRP1.

Limitations
The major limitation of this study is that it focuses only on variation in or close to protein-coding regions; therefore, we do not fully characterize other variant classes in LRP1, such as non-coding, epigenetic, and large structural variants.