A 69 kb Deletion in chr19q13.42 including PRPF31 Gene in a Chinese Family Affected with Autosomal Dominant Retinitis Pigmentosa

We aimed to identify the genetic cause of autosomal dominant retinitis pigmentosa (adRP) and characterize the underlying molecular mechanisms of incomplete penetrance in a Chinese family affected with adRP. All enrolled family members underwent ophthalmic examinations. Whole-genome sequencing (WGS), multiplex ligation-dependent probe amplification (MLPA), linkage analysis and haplotype construction were performed in all participants. RNA-seq was performed to analyze the regulating mechanism of incomplete penetrance among affected patients, mutation carriers and healthy controls. In the studied family, 14 individuals carried a novel heterozygous large deletion of 69 kilobase (kb) in 19q13.42 encompassing exon 1 of the PRPF31 gene and five upstream genes: TFPT, OSCAR, NDUFA3, TARM1, and VSTM1. Three family members were sequenced and diagnosed as non-penetrant carriers (NPCs). RNA-seq showed significant differential expression of genes in deletion between mutation carriers and healthy control. The RP11 pedigree in this study was the largest pedigree compared to other reported RP11 pedigrees with large deletions. Early onset in all affected members in this pedigree was considered to be a special phenotype and was firstly reported in a RP11 family for the first time. Differential expression of PRPF31 between affected and unaffected subjects indicates a haploinsufficiency to cause the disease in the family. The other genes with significant differential expression might play a cooperative effect on the penetrance of RP11.


Introduction
Retinitis pigmentosa (RP) is one of the most common forms of inherited retinal degenerations (IRDs) and affects 1:4000 of the population worldwide [1]. This disease affects predominantly rod photoreceptor cells with later involvement of cone photoreceptor cells. Phenotypic and genotypic heterogeneity are common. Night blindness is usually the first symptom and eventually leads to tunnel vision, even total blindness [1]. The inheritance modes of RP include autosomal dominant RP (adRP), autosomal recessive RP (arRP) and X-linked RP (xlRP) [2]. Non-mendelian inheritance patterns including digenic inheritance and maternal (mitochondrial) inheritance are also found in RP [3][4][5]. Another type of RP, non-syndromic RP, is mainly related to those genes expressed specifically in retina and/or playing an important role in the functions of retinal pigment epitheliums (RPEs) or photoreceptors (PRCs). [2] To date, more than 300 genes have been identified for RP [2,6]. The subtype of RP caused by PRPF31 has been identified and named to be RP11. Until now, there is not known DNA fragments with length in the range 100-300 bp were size-selected using AMpure XP Beads (Beckman Coulter, Indianapolis, IN, USA). The selected DNA fragments were subject to paired-end adaptor ligation and underwent end-repairing, phosphorylation, and A-tailing reactions, then followed by purification and amplification by PCR. The PCR products were heat-denatured to obtain single-strand DNAs, followed by circularization with DNA ligase, and the remaining linear molecule was digested with the exonuclease. After the formation of the DNA nanoballs, sequencing was performed on a BGISEQ-500M (BGI, Shenzhen, China) platform with read length of paired-end 100 bp, and the average sequencing coverage for each sample was~40-fold.
Sequence reads were mapped to the human reference genome (hg38/GRCh38) using Burrows-Wheeler Aligner (BWA, version 0.7.17, http://bio-bwa.sourceforge.net/ (accessed on 11 December 2018)) with default parameters. Standard next generation sequencing (NGS) analysis of the raw data included sequence alignment and variant calling and annotation. SNVs and INDELs calling were performed with the software Genome Analysis Toolkit (GATK v3.70, https://gatk.broadinstitute.org/ (accessed on 13 June 2017)). AN-NOVAR (https://annovar.openbioinformatics.org/ (accessed on 1 February 2020)) was used to annotate SNPs and insertions/deletions. The Integrative Genomics Viewer (IGV, https://igv.org/ (accessed on 1 June 2020)) was used to visualize the coverage and the quality of the reads and for the visualization of structural variation (SV). After the quality control (QC) of variant and sample, variants from family members were filtered for subsequent analyses as rules: (1) located in exonic and splicing regions; (2) the variant frequency < 0.05 in UCSC, dbSNP, 1000 genomes project, and ExAC database. The results of the SNV/INDEL, CNV, and SV analyses were integrated and reviewed for the interpretation of pathogenicity. Sanger sequencing was conducted to verify the potential pathogenic variants. The variant classification was evaluated according to the guidelines of the American College of Medical Genetics and Genomics (ACMG) standards in association with clinical phenotypes.

Multiplex Ligation-Dependent Probe Amplification (MLPA)
Genomic DNA was isolated from peripheral blood obtained from participants with/without the pathogenic variant by QIAamp ® DNA Mini Kit (Qiagen, Hilden, Germany). We first utilized the application of SALSA MLPA EK1 reagent kit (MRC Holland, Amsterdam, The Netherlands) for the detection of deletions/duplications of four RP-related genes (RHO, IMPDH1, RP1, and PRPF31). MLPA (MRC-Holland P235-025R) was performed with the following steps: denaturation, hybridization, ligation, and amplification, according to the manufacturer's protocol. PCR products were separated with a 50 cm capillary electrophoresis system. Results were analyzed with Genemapper4.0 (ABI) and Coffalyser.Net (MRC Holland, Amsterdam, The Netherlands) were used for identification of copy number variation.

Parametric Linkage Analysis and Haplotype Construction
DNA were amplified with fluorescent PCR primers and DNA polymerase in standard multiplex reaction conditions. All DNA samples were diluted with 1× TE buffer to 50-ng/uL concentration, and 200 ng of DNA was used for SNP genotyping. PCR products were pooled and diluted then run on an Illumina ASACH chip (ver: ASACH_20022517X351729_A1, Illumina, CA, USA). After genome-wide SNP genotyping, genetic map positions were generated for all SNPs on the autosomal chromosomes. Based on the deletion region found by WGS, we designed a marker panel including 6827 TagSNPs in 1-22 chromosomes to search for and determine if there was a candidate region existing linkage with phenotype. These genetic markers were filtered based on frequency, Mendelian genetic law and Hardy-Weinberg proportions. Additionally, the nocall SNPs were filtered. There were 2 SNPs per CM. Finally, linkage analysis was performed with 6827 SNP markers flanking each of the known TagSNP loci in the family. The SNPs were then pruned for linkage disequilibrium (LD). SNPs with residual LD were clustered into further research. MERLIN (version 1.1.2, University of Michigan, Ann Arbor, MI, USA) was used to perform linkage analysis and construct haplotypes in the pedigree. Haplotype analysis was performed using informative genotype data of the TagSNPs. Phenotype was analyzed as an autosomal dominant trait with complete penetrance (100%) and a frequency of 0.0001 for the affected allele.

RNA-Sequencing and Data Analysis
RNA was extracted from the whole blood samples using the PAXgene Blood RNA System Kit employing an amended version of the manufacturer's guidelines. The product was measured using a NanoDrop ND-1000 UV-visible spectrophotometer (Labtech International, Ringmer, UK). RNA integrity was additionally assessed using the Agilent 2100 Bioanalyser (Agilent Technologies, Santa Clara, CA, USA). Samples were loaded on to the Eukaryote total RNA nano chip/the Eukaryote total RNA pico chip. Then, the enriched mRNA was fragmented into short fragments and reverse transcribed into cDNA with random primers. After synthesis of second-strand cDNA, fragments were then purified, end repaired, polyadenylated, and ligated to Illumina sequencing adapters. Ligation products were size selected by agarose gel electrophoresis, PCR amplified, and sequenced using HiSeq 4000 SBS Kit (Illumina Inc., San Diego, CA, USA). Clean reads were further filtered by STAR (v2.4.2a). We used RNA-Seq by Expectation-Maximization (RSEM, V1.2.29, http://deweylab.github.io/RSEM/ (accessed on 1 April 2016)) to identify RNA differential expression between two different groups. The transcripts (the parameter of false discovery rate (FDR) < 0.05 and absolute fold change ≥ 2) were considered as differentially expressed gene. Corrected p-values (padj) and logarithm based on 2 (log2FC), (i.e., the logarithm of the fold) were used for screening genes expressing significantly differently, (i.e., padj < 0.05 and |log2FC| ≥ 1). Gene Oncology (GO, http://geneontology.org/ (accessed on 4 November 2021)) and Kyoto Encyclopedia of Genes and Genome (KEGG, https://www.kegg.jp/ (accessed on 5 November 2021)) enrichment analysis were then used to identify the significantly enriched metabolic or signal transduction pathways in differentially expressed genes (DEGs) comparing with the whole-genome background.

Ocular Examinations
We recruited a large adRP pedigree of 18 affected individuals (13 alive, 11 enrolled) and 55 unaffected individuals (51 alive, 16 enrolled) ( Figure 1). All patients in the family manifested night blindness at an early age (about 3 years old) as their onset symptom. The symptoms of RP developed slowly in this pedigree. The fundus exams revealed the typical RP characteristics in patients ( Figure 2, Table 1). In the youngest patient (V-3, 2 years old), who already presented with night blindness, we found bull's-eye maculopathy in fundus autofluorescence, characteristic RP fundus changes in OCT ( Figure 2). The IOP of proband (III-6) tested by Tonopen was 54.2 mmHg (right eye) and 25.2 (left eye). He was diagnosed with RP and POAG, supported by evidence of fundus photography (cup/disk = 1.0 and pale optic disc) and open angle confirmed by slit lamp.

Ocular Examinations
We recruited a large adRP pedigree of 18 affected individuals (13 alive, 11 enrolled) and 55 unaffected individuals (51 alive, 16 enrolled) ( Figure 1). All patients in the family manifested night blindness at an early age (about 3 years old) as their onset symptom. The symptoms of RP developed slowly in this pedigree. The fundus exams revealed the typical RP characteristics in patients ( Figure 2, Table 1). In the youngest patient (V-3, 2 years old), who already presented with night blindness, we found bull's-eye maculopathy in fundus autofluorescence, characteristic RP fundus changes in OCT ( Figure 2). The IOP of proband (III-6) tested by Tonopen was 54.2 mmHg (right eye) and 25.2 (left eye). He was diagnosed with RP and POAG, supported by evidence of fundus photography (cup/disk = 1.0 and pale optic disc) and open angle confirmed by slit lamp.
We then performed the MLPA to further check if there was a large deletion of PRPF31 exon 1 in affected and unaffected members from this RP family. Through MLPA analysis, we proved that the heterozygous deletion in PRPF31 exon1 existed in all affected members and three NPCs with a CNV dosage quality (DQ) value in 0.40-0.65 (Figures 4 and S1). Additionally, we excluded three possible genes related to RP (including RHO, IMPDH and RP1) out of latent CNV mutation. with PCR product of 240 bp ( Figure 3). These findings indicated that there was a heterozygous deletion in patients. The deletion position is consistent with the results of WGS. We then performed the MLPA to further check if there was a large deletion of PRPF31 exon 1 in affected and unaffected members from this RP family. Through MLPA analysis, we proved that the heterozygous deletion in PRPF31 exon1 existed in all affected members and three NPCs with a CNV dosage quality (DQ) value in 0.40-0.65 (Figure 4, Figure  S1). Additionally, we excluded three possible genes related to RP (including RHO, IMPDH and RP1) out of latent CNV mutation.  After the parametric linkage analysis in, only one region indicated a LOD score > 3.0 suggesting linkage with 24 markers 14 family members, which was located on chr19: 99.03-110.49 (Table 3, Figure S2). Haplotype analysis was then performed in the candidate region. A consistent haplotype was identified in the affected individuals but not in the After the parametric linkage analysis in, only one region indicated a LOD score > 3.0 suggesting linkage with 24 markers 14 family members, which was located on chr19: 99.03-110.49 (Table 3, Figure S2). Haplotype analysis was then performed in the candidate region. A consistent haplotype was identified in the affected individuals but not in the unaffected individuals ( Figure 5).

RNA-Sequencing and Data Analysis
To analyze transcriptional changes among the patient, NPC, and control subject, we performed RNA-sequencing analysis in the family members (III-6, III-7, IV-5, and IV-7). Compared to the healthy control (III-7), RNA sequencing revealed 532 up-regulated genes and 588 down-regulated genes in the patient (III-6) and 24 up-regulated genes and 29 down-regulated genes in NPCs (IV-5 and IV-7). While compared to NPCs, 40 up-regulated genes and 42 down-regulated genes in the patient (III-6) (Table 4, Figure 6). The expression levels of genes in the deletion region and linkage region were listed in Table S3. We found the expression of PRPF31 was significantly different between mutation carriers and healthy individual (padj < 0.05), but showed no difference between patient and NPCs (padj > 0.05). GO and KEGG analysis suggested that the main gene location and signal pathway pointed to immunological response and inflammatory ( Figure S3).   expression levels of genes in the deletion region and linkage region were listed in Table  S3. We found the expression of PRPF31 was significantly different between mutation carriers and healthy individual (padj < 0.05), but showed no difference between patient and NPCs (padj > 0.05). GO and KEGG analysis suggested that the main gene location and signal pathway pointed to immunological response and inflammatory ( Figure S3). Patient, III-6; healthy control, III-7; NPCs, IV-5 and IV-7.

Figure 6.
Volcano plot of the significantly differential expressive genes (DEGs) in the core family. xaxis, log2FC; y-axis, the negative log10FDR (adjusted p-value). Only the top 10 differentially expressed genes are shown in the legend.

Discussion
In this study, we recruited 11 patients and 16 healthy controls from the largest RP11 pedigree with large PRPF31 deletion. As a classical inherited retinal dystrophy, RP is characterized with striking phenotypic and genetic heterogeneity. Only about 50-70% of RP patients have confirmed genetic defect [1,21]. This is the first time to report the largest adRP pedigree with a large deletion in PRPF31, which will provide more genotype-phenotype information for further investigation of RP11.
The patients of this pedigree presented with characteristic RP symptoms, such as night blindness at early ages, visual changes accompanied with characteristic RP-fundus appearance. Though the youngest patient (V-3) with night blindness did not show a significant change in pigment, loss of peripheral PRC outer segment and RPE atrophy were identified in his macular OCT. These finding provide strong evidence for his RP diagnosis. All patients in this pedigree manifested night blindness at about 3 years old. The early onset in some PRPF31-related RP patients has not yet been well explained. Figure 6. Volcano plot of the significantly differential expressive genes (DEGs) in the core family.
x-axis, log2FC; y-axis, the negative log10FDR (adjusted p-value). Only the top 10 differentially expressed genes are shown in the legend.

Discussion
In this study, we recruited 11 patients and 16 healthy controls from the largest RP11 pedigree with large PRPF31 deletion. As a classical inherited retinal dystrophy, RP is characterized with striking phenotypic and genetic heterogeneity. Only about 50-70% of RP patients have confirmed genetic defect [1,21]. This is the first time to report the largest adRP pedigree with a large deletion in PRPF31, which will provide more genotype-phenotype information for further investigation of RP11.
The patients of this pedigree presented with characteristic RP symptoms, such as night blindness at early ages, visual changes accompanied with characteristic RP-fundus appearance. Though the youngest patient (V-3) with night blindness did not show a significant change in pigment, loss of peripheral PRC outer segment and RPE atrophy were identified in his macular OCT. These finding provide strong evidence for his RP diagnosis. All patients in this pedigree manifested night blindness at about 3 years old. The early onset in some PRPF31-related RP patients has not yet been well explained.

Discovery of Pathogenic Mutation in the Pedigree
Due to the limitation of NGS, the pathogenic variant was not detected through panelbased NGS in an initial screening even including PRPF31 gene. One possible reason is that the length of exon 1 in PRPF31 (more than 300 bp) is out of range for the pathogenic variant to be read based on NGS; the other possibility is that the deletion is located in an intron region where the primers of panel-based NGS do not capture.
Our WGS analysis identified a large region locus at chr19: 54048499-54118055 with a high LOD score in linkage analysis, harboring PRPF31, VSTM1, TARM1, OSCAR, NDUFA3 and TFPT. Furthermore, the deletion positions point at the intron 1 of VSTM1 and intron 1 of PRPF31. The region showed a high LOD score value in linkage analysis.
The reason why large sequence adjacent to PRPF31 seems more likely to be deleted is that there is a high density of homologous Alu repeats in chr19 [22]. Chromosome 19 is the richest area in Alu repeats associated with the density of GC, which is biologically active [22]. Alu repeats were found to be related to the deleted position [23]. Alu is defined as a class of retroelements termed SINEs (short interspersed elements). It is the main reason for polyadenylation, splicing, and adenosine deaminase that acts on RNA editing [24]. With the evolution of Alu elements in genomic variants, different subfamilies of Alu elements were discovered and classified. The deletion breakpoints identified in this study are both within Alu repeats, adding novel evidence to the literature [25,26]. The nearest Alu repeats adjacent to the deletion are AluJb (chr19: 54048108-54048414) and AluSx1 (chr19: 54117565-54117872). J subfamily is the earliest Alu elements and S subfamily is a really active one [27].
In this pedigree, PRPF31 is the most likely candidate gene to explain the phenotype of adRP among the 6 genes. Additionally, members with mutated PRPF31 showed typical incomplete penetrance phenotype, which is supported by the haploinsufficiency theory. Several modifier sequences have been reported to influence expression of PRPF31, such as CNOT3 and MSR1 [16,[28][29][30][31][32][33]. CNOT3 encodes a subunit of the Carbon Catabolite Repression-Negative On TATA-less (CCR4-NOT) transcriptional complex, and MSR1 is minisatellite element 1 in the promoter of PRPF31. Whether other potential regulators could influence expression of PRPF31 expression in this pedigree is worthy of further exploration.

Gene Expression of PRPF31 and Potential Mechanisms of RP
As reported by previous studies, the PRPF31 mRNA expression level measured from peripheral blood sample indicated that the patient had a higher level of PRPF31 expression than healthy control [34]. Our findings suggest that pathogenic variant in PRPF31 exon 1 could decrease PRPF31 expression, supporting the haploinsufficiency hypothesis. The PRPF31 mRNA level from peripheral blood mononuclear cells (PBMCs) of NPC showed no significant difference from the patient in this study, consistent with results in previous studies on leukocytes, fibroblasts and retina organoids [32,34]. Additionally, the expression of PRPF31 from retina and RPE has been found to be a little higher than the expression from fibroblast [18,32], which shows comparability and variability of the PRPF31. This difference in PRPF31 expression may be influenced by regulators, age, and specific expression level in different tissues. However, RP11 RPE displayed a slightly lower mRNA level and more RP-related shortening of primary cilia compared with NPCs, which indicates there may be a threshold in retina for manifestation of RP phenotypes [32]. Further investigation on expression level of PRPF31 in retina is necessary to explain the incomplete penetrance.
In addition, it is worth determining if other genes in the deletion region will result in the acceleration of RP development and finally induce the early onset in patients in this pedigree. So far, there is no study about the level of transcriptome of genes adjacent to PRPF31. We provide meaningful evidence to show the mRNA expression of genes in a large deletion region adjacent to PRPF31 by RNA-seq. The expression of genes in this region except for TARM1 showed a reduction in NPCs compared to patient and healthy control, especially for VSTM1, NDUFA3, OSCAR and TFPT. The non-penetrance of the variant in carriers may be related to impairing the energy metabolism, inflammatory and apoptosis.
TFPT (FB1), the nearest one adjacent to PRPF31, is reported to be related to development of leukemia [35,36], acting as a molecular fusion partner of TCF3 (E2A). TFPT was discovered to be connected with the cell cycle [37], induction of programmed cell death (PCD) and proliferation by promoting caspase 9-dependent apoptosis [38]. Furthermore, there was a study on a rat model proving TFPT had an important role in modulating cerebral apoptosis [39]. TCF3 is a kind of transcription factor joining at cellular apoptosis by up-regulating p21 expression and promoting the downstream response of p53 [40]. TFPT shares the exon 1 with PRPF31, suggesting the existence of a head-to-head structure on DNA strands. So, the heterozygous non-allelic mutation of TFPT may induce the apoptosis in retina by influencing the cell cycle and co-regulation with TCF3 (E2A). TFPT is related to caspase (3, 7 and 9) apoptosis and PCD. Caspase (3, 7 and 9)-dependent apoptosis, one of PCD, is considered to correlate with rapid photoreceptor degeneration [41]. An RP patient has been reported to carry mitochondrial mutation and show isolated complex I deficiency [42].
NDUFA3, the gene in the upstream of TFPT, codes for ubiquinone oxidoreductase (complex I), compounding the first enzyme in the electron transport chain of mitochondria [43]. NDUFA3 was reported to be linked with telomere length and aging [44], and it may be a necessary component for stability of transferring Q module in the peripheral arm of the complex I [45]. Variants related to complex I such as NDUFS8, NDUFS7, and NDUFA2 are found to be explanations for Leigh syndrome, a neurodegeneration showing degenerative foci in the central neural system [46]. Down regulation of the family of NDUF in cortex can promote the energy metabolism disorder in rat brain [47]. With regard to the electron transport function related to NDUFA3, it has been believed that mitochondrial dysfunction and metabolism can play distinct-and indispensable-roles in RP development [48,49].
Except for the apoptosis and metabolism stated above, inflammation response may be involved in RP as well. The neural parenchyma of the retina is populated almost solely by microglia, and other immune cells seem almost not to exist in the retina. The recruitment of PBMCs, phagocytes, and microglia in retina can arouse inflammation and phagocytosis so that the development of RP and AMD may be promoted [50,51]. Furthermore, the peripheral inflammatory monocytes are effector cells that mediate cone cell death in RP [52]. Both systemic inflammation and local inflammation are involved in progression of the degenerative retinal diseases [41]. VSTM-1, TARM-1, and OSCAR are related to inflammation in PBMCs. VSTM1 encodes V-set and transmembrane domain-containing protein 1, a signal inhibitory receptor on leukocytes 1 (SIRL-1), which can negatively regulate oxidative burst in phagocytes [53,54]. VSTM1 was recognized to be related to inflammation, apoptosis, and necrosis in PBMCs [55,56]. TARM1 encodes T cell-interacting activating receptor in myeloid cells 1, an arginine residue-containing protein that primarily expressed by monocytes and neutrophils [57]. TARM1 can be up-regulated in monocytes, macrophages, activated neutrophils, and dendritic cells (DCs) in some inflammatory conditions, and activate the receptor of osteoclast-associated receptor (OSCAR) [57]. TARM1 was located close to VSTM1, so they may be a pair of co-receptors that acquired antithetical functions in terms of cellular activation. As for OSCAR, it encodes osteoclast-associated receptor, which was a member of the leukocyte receptor complex family. The cell line which it is differentiated from is the same as that of monocytes and DCs. OSCAR also can strengthen the inflammatory response and release of cytokine [58,59]. OSCAR and TARM1 share sequence homology and both are activated in the presence of TLR ligands.
In addition, the reduction in PRPF31 can alter exon usage of genes involved in pre-mRNA splicing (PRPF3, PRPF8, PRPF4, and PRPF19) [60]. In order to explore whether there exists an influence of other pre-mRNAs splicing genes, we performed RNA-seq but found the genes related to the process of pre-mRNA splicing (SNRPD3, PHF5A, PRPF6, SF3A2, PRPF3, LSM8, SNRPA1, TXNL4A, SNRNP200, SART1) or the components of U4/U6 small nuclear ribonucleoprotein (snRNP) subunits of the spliceosome (SNU13, PRPF3, SNRNP27, PRPF4, USP39, SART1) showed no significantly different expression, except for SART1, LSM8, SF3A2. This may indicate that another potential regulating pathway of RP or different expression pattern in blood sample compared to retina.

Incomplete Penetrance of RP
We sought to elucidate the influential factors of penetrance RP11 in this pedigree using an RNA-seq analysis. Genes in this linkage region (TNNT1, LILRB3, RFPL4A, KIR2DL1, KIR2DL3, KIR3DL1, KIR3DL2, CACNG6, TMC4, ZNF761, ZNF550) were down-regulated or up-regulated in patient compared to the NPCs or healthy control. TNNT1 is recognized as a novel marker of immortalized RPE [61], which is known as skeletal muscle-specific troponin. LILRB3 is associated with the leukocyte immunoglobulin-like receptor. RFPL4A encodes the ret finger protein, which is predicted to enable ubiquitin-protein transferase activity and regulation of transcription. KIR2DL1, KIR2DL3, KIR3DL1, and KIR3DL2 encode killer cell immunoglobulin-like receptor. CACNG6 is engaged in forming alpha-1 subunits of voltage-gated calcium channel. TMC4 encodes transmembrane channel like 4. ZNF761 and ZNF550 are zinc finger proteins, known for combining with target structure to regulate the expression of gene and cell differentiation. The functions of genes in the linkage region are related to immunology, transmembrane channel, or protein involved in regulation of gene expression. Among them, genes engaged in regulation of transcription were found to be down-regulated in the patient by RNA-seq. It indicated splicing process influenced by PRPF31 may results in transcription pathways these genes participated in. Based on the GO and KEGG analysis of RNA-seq data, we can figure out the function enrichment is mainly focused on the immunology systems and inflammatory response. Compared to the patient, the immune response and inflammatory apoptotic process were differently regulated in NPCs and healthy control. It provides an indication to guide us to explore the potential mechanism and influence factors from aspects stated above.
For this reason, genes associated with pre-mRNA splicing process, immunology response and inflammation were selected from the differential expression gene list, and those considered as the most relative genes with our study were selected for further research in order to explain the mechanism of incomplete penetrance (Table S4). We additionally chose TNNT1, GSTM1, GSTM3, SLIT1, IRF7, MARCKS, and ERAP2 as the potentially relevant gene because of their significantly differential expression and some selection criteria. In conclusion, those genes screened out for further study were based on the principle of its close association with immunology, oxidative stress, ophthalmological disease, and metabolism. GSTM1 and GSTM3 are related to glutathione S-transferase Mu, acting as an important factor in redox metabolism [62,63]. SLPI is involved in Th2 immune responses and allergic diseases [64]. IRF7 is a potent regulator of interferon, which shows an effect on antiviral immunity [65]. ERAP2 is associated with Birdshot uveitis by influencing the immunopeptidome across HLA class I allotypes [66]. MARCKS served as a marker of diacylglycerol-activation and expressed highly in developmental processes of retina [67,68].
There is an interesting finding that the proband was diagnosed with RP and POAG, which prompted a correlation with these two diseases. Some findings were proposed relating to the potential pathogenic gene. GSTM1 has been reported as the potential gene involved in open-angle glaucoma [69]. Additionally, both PRPF8 and PRPF31 are the members of pre-mRNA processing factor related to RP. PRPF8 was found to be genotypephenotype correlation with POAG [70]. It has been hypothesized that the mutation at the C-terminus was associated with RP and the mutation at the N-terminus was associated with POAG. The manifestation of POAG in the proband may be explained by the higher gene expression related to POAG or shown as another phenotype of PRPF31, which may lead to POAG and adRP by influencing splicing process in the eye like PRPF8.
In summary, we have identified a large deletion region (69 kb) in a five-generation adRP pedigree, in which PRPF31 was considered as the pathogenic gene. However, we were not able to investigate this gene at the proteome level for further explanation on regulating way of incomplete penetrance in RP. Now we are working on designing a cell model for further study to answer these questions. Based on the iPSC technology, we can establish a stem cell model with a large deletion, which is difficult if using the traditional gene editing method. In addition, we can preserve the intracellular environment of the derived model, so that it will be possible to comprehensively and exactly revalidate the differently expressive gene at the proteomic and transcriptomic levels. Inducing iPSC differentiation into retinal cell will be helpful to elucidate the correlations between the six genes identified in this study and RP phenotype in specific cells types, and to reveal the potential mechanism of incomplete penetrance in RP.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/jcm11226682/s1. Figure S1: the results of multiplex ligationdependent probe amplification (MLPA); Figure S2: the parametric analysis of autosomal 1-22; Figure S3: gene oncology (GO) term and Kyoto encyclopedia of genes and genomes (KEGG) function enrichment of differential expressive genes (DEGs) analysis; Table S1: panel of next generation sequencing for fundus diseases; Table S2: the number of mutations in PRPF31 and related phenotypes; Table S3: expression level of genes in deletion region and linkage region; Table S4: expression level of genes related to pre-mRNA splicing and function of genes in linkage region.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.