The Genetics of Primary Ciliary Dyskinesia in Puerto Rico

Primary ciliary dyskinesia (PCD) has been linked to more than 50 genes and causes a spectrum of clinical symptoms, including newborn respiratory distress, sinopulmonary infections, and laterality abnormalities. Although the RSPH4A (c.921+3_6delAAGT) pathogenic variant has been associated with Hispanic groups of Puerto Rican ancestry, it is uncertain how frequently other PCD-implicated genes are present on the island. A retrospective chart review of n = 127 genetic reports from Puerto Rican subjects who underwent genetic screening for PCD variants between 2018 and 2022 was conducted. Of 127 subjects, 29.1% presented PCD pathogenic variants, and 13.4% were homozygous for the RSPH4A (c.921+3_6delAAGT) founder mutation. The most common pathogenic variants were in the RSPH4A and ZMYND10 genes. A description of the frequency and geographic distribution of implicated PCD pathogenic variants in Puerto Rico is presented. Our findings reconfirm that PCD in Puerto Rico is predominantly due to a founder pathogenic variant at the RSPH4A (c.921+3_6delAAGT) splice site. Understanding the frequency of PCD genetic variants in Puerto Rico is essential to map a future genotype-phenotype PCD spectrum in Puerto Rican Hispanics with a heterogeneous ancestry.


Introduction
Primary ciliary dyskinesia (PCD) is a rare genetic disease in which variants in more than 50 implicated genes result in ciliary dysmotility and a spectrum of oto-sinopulmonary symptoms [1,2]. PCD can lead to chronic bacterial colonization of the upper and lower airways, recurrent otitis media, and in some cases, infertility and laterality defects [3]. Per the American Thoracic Society (ATS) clinical guidelines, the presence of at least two of four cardinal features (organ laterality defects; unexplained neonatal respiratory distress; year-round, daily, wet cough; and year-round nasal congestion) increases the likelihood of a PCD diagnosis [4,5]. Recently, the global prevalence of PCD has been estimated to be one in 7554 individuals, and in Hispanics, the prevalence is one in 16,309 individuals [6]. Genetic testing plays an essential role in evaluating PCD in countries where nasal nitric oxide (nNO) and transmission electron microscopy (TEM) are limited or not available [7]. A founder mutation in RSPH4A (c.921+3_6delAAGT) was confirmed in individuals of Puerto Rican ancestry living in the United States and Puerto Rico [7,8].
Due to genetic drift and the subsequent population expansion, the island of Puerto Rico has a heterogeneous genetic composition [9]. The topography of an island that is 100 miles long by 35 miles wide and divided by a central mountain range promotes the isolation of communities, and the risk of consanguinity, in specific regions of Puerto Rico. In the past, other autosomal recessive disorders have been more frequent in the northwest region of the island [10]. Today, the geographic distribution and frequencies of PCD-related variants are still unknown. A previous study suggested that the RSPH4A (c.921+3_6delAAGT) founder mutation was brought by European conquistadors and distributed across the island [11]. Similarly, other founder pathogenic mutations may be present in our population. The frequency of variants of unknown significance (VUS) and their clinical role in PCD diagnosis remain debatable [12]. Identifying VUS frequency in ciliary-related genes across different populations may help to understand their role in PCD as a spectrum disorder [13]. Although the prevalence of PCD in Puerto Rico is still unknown, an analysis of the geographic distribution of pathogenic variants across the healthcare regions could provide a clearer picture of how PCD genetic variants are distributed across the 78 municipalities. Our study presents the frequency and geographic distribution of PCD-implicated genetic variants in Puerto Rico in a cohort of subjects screened for PCD. In addition, the demographics and clinical characteristics of 17 patients homozygous for the RSPH4A (c.921+3_6delAAGT) founder mutation are presented.

Materials and Methods
A descriptive retrospective chart review of n = 127 Puerto Rican subjects previously screened for PCD by genetic testing was completed. Fifteen subjects were siblings from four different families. The retrospective study was conducted between September 2018 and February 2022 in an outpatient community clinic with expertise in rare pulmonary disorders, recently accredited as a PCD Center by the PCD Foundation (Figure 1). Genetic reports were analyzed for subjects who had previously been evaluated by a physician following the suggested diagnostic algorithm for PCD, and in whom PCD was part of the differential diagnosis [4]. Subjects with pathogenic variants or VUS in PCD genes were included in the study. Subjects with genetic variants related to other ciliopathies or cystic fibrosis were excluded. Saliva or buccal swab samples were analyzed by a commercial genetic testing laboratory (Invitae Corporation, San Francisco, CA, USA). Complete genetic sequence analysis for 42 PCD-related genes, including deletions and duplications associated with the CFTR gene, was conducted (Table A1). Only pathogenic variants or VUS were considered in our cohort, and all such variants in PCD genes were included in the study analysis. A VUS was defined as a genetic sequence variant whose association with disease is currently unclear due to the lack of information. A pathogenic variant was defined as a disease-causing variant previously reported in PCD. In-silico computational analysis as presented by Invitae's Sherloc variant interpretation framework was included. The Sherloc variant interpretation framework relies on a point-based evidence scoring system built upon the joint consensus guidelines from the American College of Medical Genetics and Genomics and the Association for Molecular Pathology [14].
Descriptive mapping of the frequency of PCD pathogenic variants was compiled and segregated across the seven healthcare regions defined by the Puerto Rico Department of Health (Metro, Caguas, Bayamón, Arecibo, Fajardo, Ponce, and Mayagüez). The Metro and Bayamón regions have more access to specialized healthcare services due to their proximity to the capital, San Juan, and its urban areas. The Arecibo, Fajardo, Ponce, and Mayagüez regions are composed mainly of rural municipalities with limited access to specialized health care. The genetic variants identified were systematically organized according to the geographic region of the subject's home address, using the zip codes in the genetic reports for municipality allocation. The data collection and analysis were approved, for the protection of human subjects, by the institutional review board (IRB) of the University of Puerto Rico, Medical Sciences Campus in San Juan, Puerto Rico (B1730120).

Results
Of 127 subjects, 37 (29.1%) presented PCD pathogenic variants, and 17 (13.4%) were homozygous for the RSPH4A (c.921+3_6delAAGT) founder mutation. Thirty-four subjects were excluded due to negative genetic test results or because genetic variants were reported for other non-PCD ciliopathies or cystic fibrosis. A benign variant in DNAH5, c.1250C > G (p.Thr417Ser), was excluded.

Distribution of PCD Genetic Variants in Puerto Rico
Out of the 37 cases of pathogenic genetic variants associated with PCD within the seven designated health regions in Puerto Rico, the Mayagüez region had the most, with eleven (29.7%) cases. It was followed by Bayamón and Arecibo with eight subjects (21.6%) each, Metro with six (16.2%), Caguas with five (13.5%), and Fajardo with one (2.7%). The Ponce region showed no PCD pathogenic variants (Figure 2). The genetic distribution showed a markedly high frequency of the RSPH4A (c.921+3_6delAAGT) pathogenic variant. Alleles for the RSPH4A (c.921+3_6delAAGT) pathogenic variant were found 27 times throughout the seven healthcare regions in Puerto Rico (Figure 3). Mayagüez was the region with the highest frequency of the RSPH4A (c.921+3_6delAAGT) pathogenic variant, which accounted for above 90% of the total pathogenic alleles found there. The frequency of RSPH4A pathogenic alleles among the total pathogenic alleles in other regions was as follows: Bayamón (75%), Arecibo (50%), Metro (83%), and Fajardo (100%). The Ponce region had 0% frequency for the RSPH4A (c.921+3_6delAAGT) pathogenic variant or any other variant associated with PCD.
[Figure caption fragment: two cases presented two different pathogenic variants in the same subject. * PCD clinic location.]
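As a quick check on the regional percentages reported above, the following minimal sketch recomputes them from the per-region counts stated in the text, using the 37 subjects with pathogenic variants as the denominator. The variable and function names are illustrative only and are not part of the study's analysis.

```python
# Per-region counts of cases with PCD pathogenic variants, as stated in the text.
REGION_COUNTS = {
    "Mayagüez": 11,
    "Bayamón": 8,
    "Arecibo": 8,
    "Metro": 6,
    "Caguas": 5,
    "Fajardo": 1,
    "Ponce": 0,
}
TOTAL_PATHOGENIC_CASES = 37  # denominator used in the text


def region_percentages(counts, total):
    """Return each region's share of `total` as a percentage (one decimal)."""
    return {region: round(100 * n / total, 1) for region, n in counts.items()}


percentages = region_percentages(REGION_COUNTS, TOTAL_PATHOGENIC_CASES)
for region, pct in percentages.items():
    print(f"{region}: {REGION_COUNTS[region]} ({pct}%)")
```

The per-region counts sum to 39 rather than 37, so the percentages total slightly more than 100%. This may be related to the two cases that each presented two different pathogenic variants, although the text does not state this explicitly.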

Frequency Percentage by Zygosity
To better understand the distribution of PCD-related variants, we analyzed the total frequency of each individual genetic variant, counting homozygous, heterozygous, and compound heterozygous occurrences.

Pathogenic Frequency in PCD Implicated Genes
Regarding the frequency of PCD-implicated pathogenic variants, the RSPH4A (c.921+3_6delAAGT) variant was the most frequent in our cohort, at 66.7%. If the RSPH4A (c.1103T > G (p.Val368Gly)) variant is taken into consideration (2.6%), the RSPH4A gene accounts for 69.3% of all pathogenic variants implicated in PCD in our Puerto Rican cohort. The ZMYND10 (c.85T > C (p.Ser29Pro)) pathogenic variant was the second most frequent, with 7.7%. The CCNO (c.875_897del (p.Asp292Alafs*71)), DNAH1 (c.10468_10471del (p.Arg3490Glnfs*4)), and DNAH9 (c.308dup (p.Leu104Profs*45)) pathogenic variants each had a 5.1% frequency in our cohort. Heterozygous pathogenic variants in DNAH11, DNAH5, and DNAI1 each represented 2.6%. As a reference, the allele frequency of PCD pathogenic variants found in Puerto Rico compared with Latinos and the general population is presented in Table 1. Likewise, the allele frequency of the most common PCD VUS found in Puerto Rico compared with Latinos and the general population is presented in Table 2. A complete list of all VUS is presented in Table A2.
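The per-gene totals implied by these figures can be tallied with a short sketch. It uses only the percentages reported in the paragraph above; the underlying allele counts are not given in the text and are not assumed here. Where the text does not name the specific variant (DNAH11, DNAH5, DNAI1), the variant field is left as `None`, and all identifiers in the code are illustrative.

```python
from collections import defaultdict

# (gene, variant, reported frequency in %) as listed in the text.
# A variant of None means the text does not name the specific variant.
REPORTED_FREQUENCIES = [
    ("RSPH4A", "c.921+3_6delAAGT", 66.7),
    ("RSPH4A", "c.1103T>G (p.Val368Gly)", 2.6),
    ("ZMYND10", "c.85T>C (p.Ser29Pro)", 7.7),
    ("CCNO", "c.875_897del (p.Asp292Alafs*71)", 5.1),
    ("DNAH1", "c.10468_10471del (p.Arg3490Glnfs*4)", 5.1),
    ("DNAH9", "c.308dup (p.Leu104Profs*45)", 5.1),
    ("DNAH11", None, 2.6),
    ("DNAH5", None, 2.6),
    ("DNAI1", None, 2.6),
]


def per_gene_totals(rows):
    """Sum the reported variant percentages for each gene (one decimal)."""
    totals = defaultdict(float)
    for gene, _variant, pct in rows:
        totals[gene] += pct
    return {gene: round(total, 1) for gene, total in totals.items()}


gene_totals = per_gene_totals(REPORTED_FREQUENCIES)
overall = round(sum(pct for _, _, pct in REPORTED_FREQUENCIES), 1)

# List genes from most to least frequent.
for gene, total in sorted(gene_totals.items(), key=lambda kv: -kv[1]):
    print(f"{gene}: {total}%")
print(f"Sum of reported percentages: {overall}%")
```

The reported percentages sum to just over 100%, which is consistent with each value having been rounded to one decimal place.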

Homozygous Subjects with the RSPH4A (c.921+3_6delAAGT) Founder Mutation
Seventeen subjects were homozygous for the RSPH4A (c.921+3_6delAAGT) founder mutation in our cohort. Bronchiectasis was present in 15 (88%) subjects and was more frequent in adults (100%) than in children (71%). No laterality defects were identified in these subjects (0%). Forced expiratory volume in 1 s (FEV1) was lower in adults (44 ± 8.8) than in children (67 ± 25.2). Year-round wet cough and daily nasal congestion were found in all subjects (100%). The clinical characteristics and demographics of homozygous subjects for the RSPH4A (c.921+3_6delAAGT) founder mutation are summarized in Table 3.
Table 3. Clinical characteristics and demographics of homozygous subjects for the RSPH4A (c.921+3_6delAAGT) founder mutation.

Discussion
In Hispanics, the genetic information available regarding PCD is limited. The frequency of the PCD genes and the associated genetic variants in Puerto Rico was largely unknown. We studied a large cohort of 127 subjects who completed genetic screening for PCD as part of the differential diagnosis, considering patient history per the ATS clinical guidelines [5]. Genetic characterization and the geographic distribution of subjects with suspected PCD were retrospectively analyzed to understand the frequency and extent of PCD across Puerto Rico. Descriptive clinical and demographic data for 17 subjects with the RSPH4A (c.921+3_6delAAGT) founder mutation are presented.
Analysis of the distribution of PCD pathogenic variants in our cohort found the RSPH4A (c.921+3_6delAAGT) variant across almost all healthcare regions in Puerto Rico. Interestingly, there were 11 subjects with pathogenic variants in the Mayagüez region, a region well known for an increased frequency of autosomal recessive disorders due to the consanguinity of families. Of the fifteen municipalities in this region, only five had subjects with a pathogenic variant for PCD. The same pattern was identified in the Bayamón region, where five of the eleven municipalities that compose the region had subjects with PCD pathogenic variants. We could argue for the presence of areas of increased incidence, or hotspots, for PCD in the Mayagüez and Bayamón regions, but additional research is needed to confirm our observations. No pathogenic variants were identified in the Ponce region. A hypothesis for this observation is that, in recent years, access to medical services and referrals in this region has been limited following the local impact of earthquakes, the pandemic, and hurricanes, which also reduced the region's population. A larger cohort may be required to address this observation.
More than 50% of PCD cases in Europe and North America are due to mutations in the DNAH5, DNAH11, CCDC39, and CCDC40 genes [16]. Previous studies showed that in Hispanics, the most common pathogenic or likely pathogenic variants were associated with DNAAF4, DNAH11, DNAH5, DNAAF3, and ODAD1 [6]. Comparing the PCD pathogenic variants identified within our cohort with those published by Hannah et al. shows that the Puerto Rican PCD genetic pool may not be similar to what was previously reported for Latinos [6]. Apart from the RSPH4A gene, the pathogenic variants reported were in the following PCD genes: ZMYND10, CCNO, DNAH1, DNAH9, DNAI1, DNAH11, and DNAH5. The CCDC40 and CCDC39 genes have been linked with a clinically severe PCD phenotype in previous publications [17]. In our cohort, only 1.5% of the pathogenic variants were related to CCDC40, which is reassuring given the more severe spectrum of pulmonary involvement associated with this gene [17]. In common with Europe and North America, pathogenic variants in two of the most common genes (DNAH5 and DNAH11) were present in our cohort. These findings may suggest the importance of PCD genetic testing in specific geographic locations, apart from the general ethnicity of the subject. Additional studies to explore and compare the geographic frequency of PCD pathogenic variants among different islands in the Caribbean should be conducted. Compared with the allele frequencies published in the gnomAD database, our frequencies were higher, as expected. Due to the presence of the RSPH4A (c.921+3_6delAAGT) founder mutation in Puerto Rico, a higher frequency (0.667) was expected as compared with the general (0.00003204) and Latino (0.0001738) allele frequencies. Frequencies higher than those in gnomAD were also identified for other PCD genes. A reason for this observation is that we reviewed the genetic results of a cohort screened for PCD as part of the differential diagnosis, which inflates the frequency of PCD variants in our sample relative to the general population. A large study screening the general population for the RSPH4A (c.921+3_6delAAGT) pathogenic variant is needed to know the actual prevalence of this founder mutation in Puerto Rico.
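To make the size of this gap concrete, the sketch below divides the cohort frequency by each gnomAD reference frequency quoted in the text. The function and variable names are illustrative. Because the cohort was selected on clinical suspicion of PCD, the ratios describe this sample only and should not be read as an estimate of enrichment in the general Puerto Rican population.

```python
# Frequencies of the RSPH4A (c.921+3_6delAAGT) variant as quoted in the text.
COHORT_FREQUENCY = 0.667            # among pathogenic variants in this cohort
GNOMAD_GENERAL_AF = 0.00003204      # gnomAD, general population
GNOMAD_LATINO_AF = 0.0001738        # gnomAD, Latino population


def fold_difference(cohort_frequency, reference_frequency):
    """Return how many times larger the cohort frequency is than the reference."""
    if reference_frequency <= 0:
        raise ValueError("reference frequency must be positive")
    return cohort_frequency / reference_frequency


fold_vs_general = fold_difference(COHORT_FREQUENCY, GNOMAD_GENERAL_AF)
fold_vs_latino = fold_difference(COHORT_FREQUENCY, GNOMAD_LATINO_AF)

print(f"Cohort vs. gnomAD general: about {fold_vs_general:,.0f}-fold")
print(f"Cohort vs. gnomAD Latino:  about {fold_vs_latino:,.0f}-fold")
```

Both ratios are in the thousands, which is the scale of difference expected when a clinically selected cohort is compared with population databases.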
Our Puerto Rican heritage is an admixture of European, African, and Taino native ancestry [18]. A previous publication identified the presence of the RSPH4A (c.921+3_6delAAGT) founder mutation in Puerto Rico [7]. Excluding VUS, the reported carrier frequency of RSPH4A mutations was 1 in 632 individuals. Extrapolating this carrier frequency to the 2021 Puerto Rican population of 2.8 million people living on the island, we can anticipate approximately 4430 carriers of RSPH4A mutations. Our data support the elevated frequency of the RSPH4A (c.921+3_6delAAGT) pathogenic variant in Puerto Rico. Additional RSPH4A genetic variants were also present. We found one compound heterozygous subject who carried the RSPH4A (c.1103T > G (p.Val368Gly)) pathogenic variant together with the RSPH4A (c.921+3_6delAAGT) pathogenic variant. Combinations of different RSPH4A heterozygous variants are therefore present in Puerto Rican subjects and contribute to a spectrum of PCD in Puerto Rico. This is consistent with the previously reported trend that mutations in the RSPH4A gene predominate within our cohort and the PCD community on the island.
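The extrapolation above is a single division, sketched here using only the two figures quoted in the text (a population of 2.8 million and a carrier frequency of 1 in 632). The names are illustrative, and the calculation assumes the reported carrier frequency applies uniformly across the island.

```python
# Figures quoted in the text.
ISLAND_POPULATION = 2_800_000   # 2021 population living on the island
CARRIER_ONE_IN = 632            # reported RSPH4A carrier frequency: 1 in 632


def expected_carriers(population, one_in):
    """Expected number of carriers, assuming a uniform carrier frequency."""
    return population / one_in


carriers = expected_carriers(ISLAND_POPULATION, CARRIER_ONE_IN)
print(f"Expected RSPH4A carriers: about {round(carriers)}")
```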
The presence of VUS in PCD genes in Puerto Rico was also analyzed. Previous studies identified a larger number of PCD genes with VUS [6]. The authors discussed the importance of more extensive databases to explore the role of VUS in different ethnicities and geographic locations. Puerto Rico is not an exception. Due to a heterogeneous genetic pool, 59.8% of the subjects in our cohort presented VUS. Analysis of additional subjects is needed to understand genetic variants in genes such as DNAH8, which are linked to infertility but whose role in the respiratory motile cilia is not clear from current data [19].
Our study has limitations. The study retrospectively analyzed 127 individuals screened for PCD at a single PCD clinic. Additional pulmonary clinics in Puerto Rico with PCD patients were not included, and data from multiple clinics may expand or alter the genetic results of this study. However, our clinic serves as the only PCD referral center for other pulmonary clinics on the island, for both adult and pediatric patients, with the capability to perform nNO using a chemiluminescence analyzer. Furthermore, genetic variants in more than 50 genes are linked with PCD, and we analyzed only 42 genes, including the CFTR gene. Genotype-phenotype studies are needed to understand the role of pathogenic variants and implicated VUS in the clinical setting or PCD diagnosis. Reanalysis of new data in the future may help to unmask new PCD genetic variants and classify VUS as disease-causing for the Puerto Rican population. Finally, VUS identified in this study may be reclassified as likely benign or likely pathogenic as more cases are identified in the near future.

Conclusions
Our study provides an overview of the actual geographic distribution of PCD pathogenic variants in Puerto Rico as well as the clinical characteristics of the homozygous subjects with the RSPH4A (c.921+3_6delAAGT) founder mutation. Although the information presented may change in the future due to the increasing availability of genetic testing, we reconfirm the presence of the RSPH4A (c.921+3_6delAAGT) founder mutation as the primary variant responsible for PCD cases across the island. Our work provides a better understanding of

Informed Consent Statement:
Patient consent was waived due to the retrospective nature of this study.

Data Availability Statement:
All data analyzed in this study are included in this published article.

Conflicts of Interest:
The authors declare no conflict of interest.

Testing conducted by Invitae Corporation. * Gene limitations. NOTCH2: deletion/duplication and sequencing analysis is not offered for exons 1-4. ARMC4: deletion/duplication and sequencing analysis was not conducted on exon nine. RPGR: only the transcript associated with X-linked PCD was analyzed. CFTR testing was conducted: sequencing analysis for exon seven includes only the 288 coding sequences of the gene + 10 base pairs.