Robust Detection of Somatic Mosaicism and Repeat Interruptions by Long-Read Targeted Sequencing in Myotonic Dystrophy Type 1

Myotonic dystrophy type 1 (DM1) is the most complex and variable trinucleotide repeat disorder caused by an unstable CTG repeat expansion, reaching up to 4000 CTG in the most severe cases. The genetic and clinical variability of DM1 depends on the sex and age of the transmitting parent, but also on the CTG repeat number, the presence of repeat interruptions and/or the degree of somatic instability. Currently, it is difficult to simultaneously and accurately determine these contributing factors in DM1 patients due to the limitations of gold standard methods used in molecular diagnostics and research laboratories. Our study showed the efficiency of the latest PacBio long-read sequencing technology to sequence large CTG trinucleotide repeats, detect multiple and single repeat interruptions and estimate the levels of somatic mosaicism in DM1 patients carrying complex CTG repeat expansions inaccessible to most methods. Using this innovative approach, we revealed the existence of de novo CCG interruptions associated with CTG stabilization/contraction across generations in a new DM1 family. We also demonstrated that our method is suitable to sequence the DM1 locus and measure somatic mosaicism in DM1 families carrying more than 1000 pure CTG repeats. Better characterization of expanded alleles in DM1 patients can significantly improve prognosis and genetic counseling, not only in DM1 but also for other tandem DNA repeat disorders.


Introduction
More than 40 different human disorders are caused by tri-, tetra-, penta- or hexanucleotide repeat expansions localized either in coding or non-coding regions of the target gene [1]. The pathogenic mechanisms for repeat diseases involve either a loss of protein function or a gain of function at the RNA or protein level, depending on the type and location of the repeat [1]. Among the trinucleotide repeat (TNR) diseases, Fragile X syndrome (FXS (MIM: 300624)), Huntington's disease (HD (MIM: 143100)), several spinocerebellar ataxias (SCAs) and myotonic dystrophy type 1 (DM1 (MIM: 160900)) have been reported. DM1 is a highly multisystemic disorder caused by an unstable CTG repeat expansion within the 3′-untranslated region (UTR) of the myotonic dystrophy protein kinase (DMPK) gene that usually increases across generations and in tissues [2,3]. DM1 is mainly characterized by a broad clinical spectrum of symptoms such as myotonia, muscle weakness, cardiac conduction defect, respiratory insufficiency, dysphagia, gastrointestinal symptoms, somnolence or cataracts [4]. Several parameters such as the CTG repeat length and gender contribute to the phenotypic variability of DM1, resulting in five distinct clinical forms from late-onset to congenital cases, the latter often being associated with the largest inherited disease-associated alleles [4,5]. Facial dysmorphisms, muscle weakness and cognitive impairment are more frequent in earlier-onset forms, while cardiac defects and cataracts are seen more often in DM1 patients with later-onset forms of the disease. Interestingly, gastrointestinal problems or dysphagia and insomnia are found in all five forms of DM1 described in a large French DM1 cohort [4]. However, many DM1 patients develop unusual DM1 symptoms or remain asymptomatic despite the presence of a large CTG repeat expansion in their cells.
This suggests that contributing factors such as somatic mosaicism, gene modifiers and environmental factors may affect the evolution of the clinical and mutation aspects [3,6,7]. The somatic mosaicism observed in blood is strongly biased towards expansions and contributes not only to the progressive nature of the different symptoms in several DM1 ethnic groups, but also to the variation in the age of onset [6,8–10]. Two studies also revealed that single-nucleotide polymorphisms in the MutS Homolog 3 (MSH3) DNA mismatch repair gene may reduce somatic mosaicism levels but also delay onset in DM1 patients [11,12].
Accurate estimation of the size of the inherited CTG repeat expansion, of somatic mosaicism and identification of interrupted alleles are crucial to better characterize the genotype-phenotype correlation in DM1. Inherited CTG repeat expansion size and the level of somatic mosaicism are traditionally evaluated by Southern blot, polymerase chain reaction (PCR) and small pool PCR [33]. These methods do not provide any information on the sequence of the CTG repeat expansion. Only triplet-primed PCR may detect the presence of interruptions at the 5′ and 3′ ends of the CTG repeat expansion [34]. Interruptions may also be identified by short-read sequencing or by enzymatic digestion [17,34]. However, the information obtained is limited to the ends of the sequence and gives no information about the middle of the sequence.
Long-read sequencing has recently been successfully applied in HD and Fragile X patients and also in DM1 patients [18,35–38]. The Monckton group analyzed the CTG repeat expansions in different DM1 patients with less than 400 CTG repeats using a new long-read technology, single molecule real-time (SMRT) sequencing by Pacific Biosciences (PacBio), and the penultimate PacBio RSII System [18]. They showed that CCG interruptions are exclusively localized at the ends of the sequence and are associated with milder symptoms in DM1 patients with <400 CTG repeat tracts. To date, no method has been described that simultaneously assesses the size of large CTG repeats, the presence of interruptions and somatic mosaicism in DM1.
The present study is an extension of previously reported results in DM1 patients using the latest generation of long-read sequencing developed by PacBio and amplicons as resources. We have shown that the new PacBio technology can sequence at least 1000 CTG repeats, detect a single CAG and multiple CCG interruptions and estimate somatic mosaicism at the same time with sufficient depth. In this study, we described a paternal de novo CCG interruption in a new DM1 family. We also characterized the CTG repeat sequence and its haplotype in seven individuals of a large family (three generations) carrying a DM1 intermediate allele (37 repeats). We have identified the stable interrupted hexamer allele (CCGCTG) associated with the European DM1 haplotype A. More complete characterization of the expanded allele in DM1 patients will improve our knowledge on the genotype-phenotype correlation in DM1 as well as the prognosis and genetic counseling in this disease and other TNR disorders.

De Novo CCG Interruption in New DM1 Family Identified with the Sequel II System
Family E was recruited for prenatal genetic counseling. E2.1 was identified as a DM1 patient carrying an interrupted CTG repeat expansion with atypical symptoms. In order to improve the genetic counseling in this family, we mainly characterized the DMPK mutation in DM1 members of this family. PCR amplification of the CTG repeat tracts revealed a decrease in CTG repeat size across two paternal transmissions (Figure 1). Bidirectional triplet-primed polymerase chain reaction (TP-PCR) showed several interruptions at the 3′ end of the CTG repeat in individuals E2.1 and E3. However, the TP-PCR trace is different between E2.1 and E3 (Figure 2). Two distinct gaps were found in both TP-PCR traces, whereas a third gap was only observed in individual E3, indicating the presence of additional interruptions in the fetus E3. Interestingly, the 3′ end of the repeat appeared free of interruption in individual E1 (Figure 2). By cloning and sequencing the CTG repeat tracts, we identified a majority of expanded DMPK alleles with two and six non-consecutive CCG interruptions at the 3′ end of the CTG expansion of E2.1 and E3, respectively, whereas no interruption was identified in individual E1 at the ends of the CTG repeat expansion (Figure 3). These first results suggest a de novo mutation in this family. However, the TP-PCR method and cloning sequencing did not allow the characterization of the middle of the expanded repeats in this family, particularly in individual E1 with the largest expanded allele. In order to definitively exclude the presence of interruptions in the CTG repeat region of E1 left unsequenced by conventional methods, we analyzed around 10,000 molecules of DMPK alleles from the blood of family E using SMRT sequencing on the Sequel II System. First, we sequenced the full repeat expansion for each patient and accurately estimated the size of the CTG expansion.
The mode of the CTG repeat size frequency distribution was 447, 383 and 173/215 CTG repeats in E1, E2.1 and E3 individuals, respectively (Table 1). In the E3 fetus, the CTG repeat length distribution was bimodal with modes at ~173 and ~215, compared to the individuals E1 and E2.1, which show a unimodal distribution. We identified CCG interrupted alleles exclusively in individuals E2.1 and E3 whereas the individual E1 carried a pure expanded allele (Figure 4 and Table 1). Interestingly, the SMRT sequencing revealed cells carrying a majority of expanded DMPK alleles with two or three CCG interruptions in individual E2.1, suggesting that the number of interruptions varies between cells in blood (Table 1 and data not shown). The results of SMRT sequencing confirm that de novo interruptions occur during the E1 and E2.1 paternal transmission. Using data generated by the Sequel II System, we were able to sequence the CTG repeat expanded allele in individuals E1, E2.1 and E3 and then accurately estimate the size of the repeat and the number of interruptions. Interestingly, SMRT sequencing also allowed for quantifying the degree of somatic mosaicism in these DM1 patients. The level of somatic mosaicism is higher in individual E1 than in E3 (Figure 5). To support the efficiency of SMRT sequencing in DM1 and to strengthen our data, we also analyzed the CTG repeat expansions in patients A4.1 (single CAG interruption) and B2 (three CCG interruptions) and two DM1 patients (1201 and 5289) with pure CTG repeat expansions published in Tomé et al. (Table 1 and [16]). As previously described, we identified a single CAG repeat interruption in individual A4.1 and three CCG interruptions in individual B2 (Figure 6a and Table 1). In addition, somatic mosaicism was observed in the four DM1 patients, with lower somatic mosaicism in individuals A4.1 and B2 compared to the respective DM1 patients with pure repeats as reported in Tomé et al. (Figure 6b).
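The distinction between unimodal and bimodal repeat length distributions described above can be illustrated with a minimal sketch. It assumes that per-read CTG repeat counts are already available as a list of integers; the function name `distribution_modes`, the bin width and the `min_fraction` threshold are illustrative choices and are not part of the analysis pipeline used in this study. The example values are synthetic and are not patient data.

```python
from collections import Counter

def distribution_modes(repeat_lengths, bin_width=5, min_fraction=0.5):
    """Return the start of each histogram bin that is a local maximum.

    repeat_lengths: iterable of per-read CTG repeat counts (integers).
    bin_width:      width of the histogram bins, in repeats (assumed value).
    min_fraction:   a local maximum is reported only if its count reaches
                    this fraction of the tallest bin (assumed value).
    """
    counts = Counter((n // bin_width) * bin_width for n in repeat_lengths)
    tallest = max(counts.values())
    modes = []
    for start in sorted(counts):
        here = counts[start]
        left = counts.get(start - bin_width, 0)
        right = counts.get(start + bin_width, 0)
        # Strict on the left, non-strict on the right, so that a plateau
        # of equal bins is reported only once.
        if here > left and here >= right and here >= min_fraction * tallest:
            modes.append(start)
    return modes

# Synthetic read-level repeat counts (not patient data).
bimodal = [170] * 3 + [173] * 10 + [176] * 4 + [190] + [212] * 3 + [215] * 8 + [218] * 2
unimodal = [440] * 2 + [447] * 10 + [450] * 3

print(distribution_modes(bimodal))   # two bins reported
print(distribution_modes(unimodal))  # one bin reported
```

The returned values are bin starts rather than exact repeat numbers, so the resolution of the reported mode is limited by `bin_width`.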
For the first time, we also succeeded in sequencing the DM1 locus and estimated somatic mosaicism in the DM1 family (patients L2 and L3) carrying more than 1000 pure CTG repeats using the Sequel II System (Figure 7 and Table 1). The mode of the CTG repeat size frequency distribution was 957 and 1156 repeats in individuals L2 and L3, respectively (Table 1). No interruption was identified (Figure 7a) and a high somatic mosaicism was observed in these two patients where the largest expanded allele contained 2138 CTG repeats (Figure 7b). We have shown that the Sequel II System successfully analyzes the sequence of CTG repeats in DM1 patients carrying more than 1000 CTG repeats, the degree of somatic mosaicism and the variant in a single analysis.

Figure 7.
Long-read sequencing results in individuals L2 and L3 carrying more than 1000 CTG repeats. (a) Waterfall plots outline the repeat structure of the normal and expanded alleles. The y-axis shows the number of CCS reads, whereas the x-axis shows the length of the CTG repeat expansion in base pairs. The CTG repeat is represented in blue. The highest peaks at the far left of the distribution represent the normal allele. (b) CTG repeat size distribution in DM1 patients with more than 1000 CTG repeats estimated at diagnosis. The y-axis shows the number of CCS reads in the solid blue distribution, whereas the x-axis shows the length of the CTG repeat expansion. The grey line represents a kernel density estimation of the underlying blue distribution of CCS reads.

Stable CCG-Interrupted Allele with 37 Repeats Is Associated to DM1 Haplotype in a Large Family
DM1 disease was suspected during maternal and fetal monitoring in the individual G3.2 from the G37 family (Figure 8a). By classic PCR and Sanger sequencing, we identified a 37-CTG repeat allele in the individuals G3.1 and G3.2 as well as in the other members of the family, excluding the presence of DM1 in this family (Figure 8). This allele is stably transmitted across successive generations as expected. In order to better characterize the nature of the CTG repeat locus in this family, we utilized TP-PCR at the 3′ end of the CTG repeat in blood samples from different members of the G37 family. The 3′ TP-PCR experiment revealed an unexpected pattern of electrophoretic peaks with a large gap, suggesting the presence of several interruptions in the largest CTG repeat allele (Figure 8b). By direct sequencing, we identified (CCGCTG) hexamer interruptions in the repeat. All members of this family carry a stable allele 5′-(CTG)6(CCGCTG)13(CTG)5-3′ (Figure 8c and data not shown). In order to understand the origin of interruption in DM1 disease, we genotyped different polymorphic markers in the DM1 locus. First, we showed that the (CTG)6(CCGCTG)13(CTG)5 interrupted allele is associated with the Alu insertion polymorphism in the G37 family (Table 2 and data not shown). We completed our haplotype analysis by genotyping other polymorphisms in the DM1 locus (Table 2). Our results showed that the interrupted 37 CTG repeat alleles are associated with haplotype A, which is shared by the majority of pure and interrupted DM1 alleles [16,17,29,39–41].
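The arithmetic behind the 37-repeat allele can be checked with a short sketch. It assumes the compact allele notation used in the text, with each block written as a parenthesized unit followed by its copy number; the helper names `expand_allele` and `split_triplets` are hypothetical and serve only as an illustration.

```python
import re

def expand_allele(notation: str) -> str:
    """Expand a block notation such as '(CTG)6(CCGCTG)13(CTG)5' into DNA."""
    blocks = re.findall(r"\(([ACGT]+)\)(\d+)", notation)
    if not blocks:
        raise ValueError("no (UNIT)n block found in notation")
    return "".join(unit * int(copies) for unit, copies in blocks)

def split_triplets(sequence: str) -> list:
    """Split a repeat tract into consecutive, non-overlapping triplets."""
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]

allele = expand_allele("(CTG)6(CCGCTG)13(CTG)5")
triplets = split_triplets(allele)

print(len(triplets))          # total number of triplets in the tract
print(triplets.count("CCG"))  # number of CCG interruptions
print(triplets.count("CTG"))  # number of CTG units
```

Each CCGCTG hexamer contributes two triplets, so the tract contains 6 + 2 × 13 + 5 = 37 triplets, of which 13 are CCG interruptions.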

Discussion
Genetic counseling for DM1 is very complex due to the highly variable clinical presentation and technical difficulties in determining the size and variant repeat interruptions of the large CTG repeat expansions. For several years, the size of the repeat expansions, the degree of somatic mosaicism and DMPK-interrupted alleles have been established as genetic modifiers of DM1 symptoms [3]. A decrease in somatic mosaicism and CTG repeat length is usually associated with reduced severity and a later age of onset of DM1 symptoms [6,8]. The interruptions are associated with a stabilization of the repeat and a modification in the progression of the DM1 symptoms [16–19,21–23,32]. It is, therefore, crucial to develop a simple and rapid analysis of large CTG repeat expanded sequences to significantly improve our knowledge of DM1. Recently, another study analyzed CTG repeat expansions in different DM1 patients using the penultimate PacBio RSII System. They showed that CCG interruptions are exclusively localized at the end of the sequence in DM1 patients with less than 400 repeats [18]. However, no data were obtained in DM1 patients with larger repeats. Here, we have analyzed, for the first time, two DM1 families carrying CTG repeats ranging from 170 to over 1000 CTG repeats using the Sequel II System. The Sequel II System generates longer reads, enabling higher CCS accuracy, and has higher throughput than the Sequel and RSII Systems (data not shown and [18,42,43]). Here, the Sequel II System and bioinformatic tools give us the ability to simultaneously measure repeat numbers with high resolution, to resolve the complete sequence of complex repeat expansions and to measure the degree of somatic mosaicism. We have shown that the Sequel II System allows for sequencing repeat expansions as large as 2000 CTG repeats in DM1 patients of family L with more accuracy than conventional PCR (data not shown).
In family E, we identified a major allele with two de novo CCG interruptions occurring across the E1 and E2.1 paternal transmission by the Sequel II System. Interestingly, the number of interruptions increased from one generation to the next in family E, as previously described in several analyses (Table 3). Using the Sequel II System, we have reported that CCG interruptions at the 3′ end of CTG repeat expansions are associated with CTG stabilization/contraction across generations. Here, the Sequel II System makes it possible to estimate the frequency distribution of the CTG repeat. The level of somatic mosaicism is the highest in L2 and L3 with the largest repeat. In family E, the degree of somatic mosaicism is higher in E1 than in E3, suggesting an age- and size-dependent effect on the somatic mosaicism of the DM1 locus. Strikingly, our new data are consistent with previous studies using conventional PCR or small pool PCR, which showed that the dynamics of CTG repeat instability are altered by repeat interruptions and by the repeat size and age of DM1 patients [10,16,17,22,44–46]. The E3 fetus exhibits two major CTG repeat lengths with approximately 170 and 215 CTG repeats, suggesting that the CTG repeat instability is already detectable at the early stage of embryogenesis as previously reported in the literature [47,48].
The Sequel II System successfully sequences CTG repeat expansions in our DM1 families. This new technology is a straightforward way to detect clinically significant repeat changes and estimate the size of the repeat in blood using targeted sequencing with PacBio SMRT sequencing. Despite the advanced PacBio technology, amplicon-based long-read sequencing still depends on PCR and the inherent bias towards preferential amplification of smaller repeats.
To overcome this limitation, amplification-free targeted sequencing was first described for the Fuchs' endothelial corneal dystrophy-associated Transcription Factor 4 (TCF4) CTG triplet repeat [49]. The procedure consists of sequencing targeted genomic regions, without amplification, on a PacBio System by using Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-Cas9 enrichment technology [38]. This approach should improve the analysis of CTG repeat expansions and somatic mosaicism in DM1 and also in other TNR diseases.
As noted above, we have identified family E with de novo CCG interruptions at the 3′ end of the CTG repeat expansion (Table 1). The percentage of interrupted expanded alleles has been estimated at 3-8% in the non-African DM1 population (Table 3 and [16–20,22,24–32]). To date, the origin of the interruptions remains very obscure. In our study, we reported a normal interrupted allele with 37 stable CTG repeats (5′-(CTG)6(CCGCTG)13(CTG)5-3′) associated with haplotype A in a large French family (two paternal transmissions and one maternal transmission). Our data suggest that the size and interruption pattern of this allele remain stable through generations. Two other analyses showed that stable CCG-interrupted 37 or 41 repeat alleles share a common haplotype A with the DM1 mutation [17,29]. The DM1 haplotype A was also found in DM1 patients with interrupted expanded alleles, suggesting that the normal allele might be a source of imperfect expanded alleles found in less than 10% of DM1 patients [16,17,29,39–41]. However, the profile/type of interrupted alleles found in patients carrying 37, 38, 41, 43 or more than 50 repeats is extremely variable within DM1 families and also between DM1 families (Table 3), which does not suggest any haplotype specificity of the interrupted alleles. In addition, stabilization of a repeat by interruptions does not favor the hypothesis of an interrupted normal allele as the source of the interrupted expanded allele in the DM1 population [10,16,17,22,31]. The heterogeneity of the number and type of interruptions observed in the interrupted expanded alleles suggests new mechanisms leading to base substitution in the sequence and/or duplication of existing interruptions in the repeated sequence (Table 3).
The emergence of interruptions can be caused by multiple processes including spontaneous DNA damage, DNA repair and DNA polymerase errors occurring in germ cells and somatic cells throughout embryogenesis and the lifetime of DM1 patients.
To conclude, we used the latest generation of the long-read sequencing system in DM1 patients with more than 1000 CTG repeats that allows detection of a single nucleotide change in the sequence and estimates the size of the large repeated sequence and somatic mosaicism at the same time. SMRT sequencing opens new avenues for DM1 disease and will provide a better understanding of the clinical and genetic variability observed in DM1 through global analysis. Growth in users of SMRT sequencing and reduction in its price will enable SMRT sequencing to be implemented as a routine molecular diagnostic method offering the best diagnostics and prognosis for patients in the near future. Our study reinforced the idea that interrupted alleles do not originate from an ancestral/normal allele but from unknown mechanisms occurring both in the germline and in somatic cells.

Patient Recruitment
Individuals from family G37 and DM1 patients were recruited by the Genetics Department of the Hospital of Nantes, the Genetics Department of the Hospital of Toulouse, the Genetics Department of the Necker-Enfants Malades Hospital and the DM-Scope registry [50] in France. Major clinical data are available for DM1 patients from family E. Each patient gave informed consent stating that their DNA samples could be used for research purposes. The individuals A4.1, 1201, B2 and 5289 described in Tomé et al. and analyzed by SMRT sequencing in this study are not related to each other [16].

CTG Repeat Amplification
To precisely estimate the inherited CTG repeat length in all individuals, 5 ng of DNA from blood or trophoblast was amplified in a 25-µL reaction using 0.4 µM ST300F (5′-GAACTGTCTTCGACTCCGGG-3′) and ST300R (5′-GCACTTTGCGAACCAACGAT-3′) primers, 1× Custom master mix (Thermo Fisher Scientific, Courtaboeuf, France) and 0.04 U Thermoperfect Taq polymerase (Peak International Products b.v, LZ Eerbeek, Netherlands). The following cycling conditions were used: 5 min at 96 °C; 45 s at 96 °C, 30 s at 60 °C and 3 min at 72 °C (30 cycles); 1 min at 60 °C and 10 min at 72 °C (1 cycle). PCR product was mixed with orange DNA loading dye and run on a 1.5% agarose gel at 120 V. The size of the PCR products was measured using Bio-Rad's Image Lab software. The approximate number of triplet repeats is obtained by subtracting 361 bp (corresponding to the combined size of the 5′ and 3′ flanking regions) from the PCR product length and dividing the result by 3.
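The conversion from PCR product length to repeat number described above can be written as a short sketch. The 361 bp flank comes from the text; the function names and the rounding to the nearest integer are assumptions made for illustration, since a gel-based size estimate is itself approximate.

```python
FLANK_BP = 361  # combined 5' and 3' flanking sequence in the amplicon (from the text)

def repeats_from_amplicon(product_bp: float, flank_bp: int = FLANK_BP) -> int:
    """Approximate CTG repeat number from an estimated PCR product length."""
    if product_bp < flank_bp:
        raise ValueError("product is shorter than the flanking sequence")
    # Rounding to the nearest whole repeat is an illustrative assumption.
    return round((product_bp - flank_bp) / 3)

def amplicon_from_repeats(n_repeats: int, flank_bp: int = FLANK_BP) -> int:
    """Expected PCR product length for a given repeat number (inverse relation)."""
    return flank_bp + 3 * n_repeats

print(repeats_from_amplicon(472))   # product carrying a 37-repeat allele
print(amplicon_from_repeats(447))   # expected product for 447 repeats
```

For example, a 37-repeat allele is expected to give a product of 361 + 3 × 37 = 472 bp.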

3′ Triplet-Primed PCR (TP-PCR)
To analyze the purity of the CTG repeat tract at the 3′ end of the CTG repeat array, TP-PCR was performed on both strands of the CTG repeat [32,34]. The 3′ end of the CTG repeat was amplified using the primer downstream of the CTG repeat Somy4R-FAM (5′-FAM-CGG GTT TGG CAA AAG CAA ATT TCC CGA-3′), P3R (5′-TAC GCA TCC CAG TTT GAG ACG-3′) and P4CTG (5′-TAC GCA TCC CAG TTT GAG ACG TGC TGC TGC TGC TGC T-3′) primers as described in Tomé et al. [33]. Briefly, 20-100 ng of DNA from blood and trophoblast was amplified using 0.4 µM Somy4R-FAM and P3R primers and 0.04 µM P4CTG primer, 1× Custom master mix (Thermo Fisher Scientific, Courtaboeuf, France) and 0.06 U Thermoperfect Taq polymerase (Peak International Products b.v, LZ Eerbeek, Netherlands). The conditions of TP-PCR were as follows: denaturation at 94 °C for 5 min followed by 30 cycles at 94 °C for 1 min, 68 °C for 1 min 30 s and 72 °C for 2 min and a final extension step at 72 °C for 10 min (1 cycle). The amplified product was analyzed using a 3500 XL genetic analyzer (Applied Biosystems, Foster City, CA, USA) and Gene Mapper software (Thermo Fisher Scientific, Courtaboeuf, France).

CTG Repeat Sequencing
CTG repeat tracts were sequenced as described in Tomé et al. [16]. Briefly, normal and expanded CTG repeat alleles were amplified by PCR using ST300F and ST300R primers and sequenced on a 3500 XL genetic analyzer (Applied Biosystems, Foster City, CA, USA). When necessary, purified PCR products were cloned using a TOPO-TA cloning kit (Thermo Fisher Scientific, Courtaboeuf, France) and each clone was sequenced using M13F (-20) (5′-GTA AAA CGA CGG CCA G-3′) and M13R (5′-CAG GAA ACA GCT ATG AC-3′) primers. MacVector software was used to analyze the sequence.

Sequel II System from PacBio (PacBio and Sequel Are Trademarks of Pacific Biosciences)
Generation of amplicons with the CTG repeat expansion. Normal and expanded CTG repeat alleles were amplified by PCR using barcoded ST300-F and ST300-R primers (Table 4). After amplification, the PCR products for each sample were pooled and purified using the 0.5X (DM1 patients <900 CTG) or 0.45X (DM1 patients >1000 CTG) AMPure PB beads (Pacific Biosciences, Menlo Park, CA, USA) clean-up procedure. AMPure PB beads were used to remove unbound primers and the PCR product corresponding to the DMPK normal allele. The PCR product corresponding to expanded alleles was quantified by Qubit fluorometric quantification (Thermo Fisher Scientific, Courtaboeuf, France). The quality of each purified PCR product pool was tested on a 1.5% agarose gel. [51]. Binding was performed with the Sequel II Binding Kit 2.1. Sequel II System run conditions included a 1-h pre-extension and 25-h movie time per SMRT Cell.
Bioinformatic analyses. Single molecule circular consensus sequences (CCS or HiFi reads) were generated from raw sequencing data using CCS version 5.0.0 (https://github.com/PacificBiosciences/ccs). Consensus reads were filtered for sequences having ≥3 passes and a minimum mean read accuracy of QV20, and sample reads were demultiplexed using lima version 2.0.1. HiFi reads were aligned to the reference using pbmm2 version 1.4.0. Repeat motifs were counted and clustered by allele using RepeatAnalysisTools (https://github.com/PacificBiosciences/apps-scripts/tree/master/RepeatAnalysisTools).
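The read filtering criteria stated above (≥3 passes and a mean read accuracy of at least QV20) can be illustrated with a minimal sketch. It is not the CCS implementation: the `CcsRead` record, its field names and the small numerical tolerance are assumptions made for illustration. The only relation taken as given is the standard Phred scale, QV = −10·log10(1 − accuracy), under which QV20 corresponds to 99% accuracy.

```python
import math
from dataclasses import dataclass

@dataclass
class CcsRead:
    """Hypothetical per-read summary; field names are illustrative only."""
    name: str
    passes: int       # number of passes contributing to the consensus
    accuracy: float   # predicted mean read accuracy, between 0 and 1

def accuracy_to_qv(accuracy: float, cap: float = 60.0) -> float:
    """Convert a predicted accuracy to a Phred-scaled quality value."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie between 0 and 1")
    if accuracy == 1.0:
        return cap  # avoid log10(0); the cap is an arbitrary illustrative choice
    return min(cap, -10.0 * math.log10(1.0 - accuracy))

def keep_read(read: CcsRead, min_passes: int = 3, min_qv: float = 20.0) -> bool:
    """Apply the two thresholds given in the text to one read."""
    # A tiny tolerance guards against floating-point rounding at the threshold.
    return read.passes >= min_passes and accuracy_to_qv(read.accuracy) + 1e-9 >= min_qv

reads = [
    CcsRead("read_a", passes=12, accuracy=0.999),  # passes both filters
    CcsRead("read_b", passes=2, accuracy=0.999),   # too few passes
    CcsRead("read_c", passes=8, accuracy=0.98),    # below QV20
]
kept = [r.name for r in reads if keep_read(r)]
print(kept)
```

Long expanded alleles complete fewer passes within a fixed movie time than short ones, so a pass threshold of this kind can interact with repeat length; the sketch does not attempt to model that effect.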

Haplotype Analyses
Polymorphisms flanking the CTG repeat expansion were genotyped using targeted re-sequencing developed by the Genomic Platform of the Imagine Institute. Briefly, 1 µg of genomic DNA was prepared and quantified by Qubit fluorometric quantification (Thermo Fisher Scientific, Courtaboeuf, France). In order to sequence the DM1 locus, a cosmid with a large human fragment of 45 kb containing the DMPK gene with 55 repeats and the flanking sequence was used to target the region of interest. The DM1 locus was sequenced using Illumina sequencing technology.

Institutional Review Board Statement: The study was conducted according to the guidelines of the Declaration of Helsinki. Blood samples were collected accordingly following the consent procedure that was approved by the national ethics committee CCTIRS (Advisory Committee on Information Processing in Material Research in the Field of Health); adult patients or legal guardians received an information letter presenting the study.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: All the data necessary to evaluate the conclusions in this study are present in the article. Additional data related to this paper may be requested from the authors.