Precise Epigenetic Analysis Using Targeted Bisulfite Genomic Sequencing Distinguishes FSHD1, FSHD2, and Healthy Subjects

The true prevalence of facioscapulohumeral muscular dystrophy (FSHD) is unknown due to difficulties with accurate clinical evaluation and the complexities of current genetic diagnostics. Interestingly, all forms of FSHD are linked to epigenetic changes in the chromosome 4q35 D4Z4 macrosatellite, suggesting that epigenetic analysis could provide an avenue for sequence-based FSHD diagnostics. However, studies assessing DNA methylation at the FSHD locus have produced conflicting results; thus, the utility of this technique as an FSHD diagnostic remains controversial. Here, we critically compared two protocols for epigenetic analysis of the FSHD region using bisulfite genomic sequencing: Jones et al., that contends to be individually diagnostic for FSHD1 and FSHD2, and Gaillard et al., that can identify some changes in DNA methylation levels between groups of clinically affected FSHD and healthy subjects, but is not individually diagnostic for any form of FSHD. We performed both sets of assays on the same genetically confirmed samples and showed that this discrepancy was due strictly to differences in amplicon specificity. We propose that the epigenetic status of the FSHD-associated D4Z4 arrays, when accurately assessed, is a diagnostic for genetic FSHD and can readily distinguish between healthy, FSHD1 and FSHD2. Thus, epigenetic diagnosis of FSHD, which can be performed on saliva DNA, will greatly increase accessibility to FSHD diagnostics for populations around the world.


Introduction
Facioscapulohumeral muscular dystrophy (FSHD) is typically an adult onset myopathy with variable penetrance that affects males and females of all ages [1,2]. There are two genetic classes of FSHD that are clinically indistinguishable [3]. The most common form, FSHD1, accounts for~95% of cases and is caused by autosomal dominant heterologous deletions within the chromosome 4q35.2 D4Z4 macrosatellite array [4,5] that lead to a local epigenetic dysregulation of the contracted array [6,7]. The remainder of cases are classified as FSHD2, which has a digenic inheritance and is caused by mutations in genes that encode epigenetic repressors of the 4q35 and 10q26 D4Z4 arrays when combined with specific genetic requirements on 4q35 [8][9][10][11][12]. Therefore, the epigenetic disruption in FSHD2 is more widespread than in FSHD1 [13], affecting the D4Z4 arrays on all four alleles, while only the D4Z4 array on the contracted chromosome 4 is affected in FSHD1 [9,13]. Regardless, in all cases of FSHD, these mutations lead to the epigenetic disruption of the 4q35 D4Z4 array and increased expression of the pathogenic DUX4 gene from the distal-most repeat unit (RU) within the array [14][15][16][17][18][19][20].
The pathogenic stable expression of the 4q35 DUX4 gene highlights another unifying genetic feature of FSHD1 and FSHD2, the requirement for an FSHD permissive distal sequence [17,21,22]. Although hundreds of homologous D4Z4 repeats encoding similar DUX family sequences are scattered throughout the human genome, only the 4q35 and 10q26 D4Z4 arrays are epigenetically dysregulated in FSHD [23], and only gene expression from the 4q35 D4Z4 array is associated with FSHD [24]. Despite the entire DUX4 protein coding sequence residing within each 4q35 and 10q26 D4Z4 repeat [14], these all lack the polyadenylation signal (PAS) required for the production of a stable mRNA. However,~50% of human chromosome 4s contain a sequence immediately distal to the chromosome 4q35 D4Z4 array, termed 4qA [21], that contains a third exon with a PAS that, when spliced into the DUX4 transcript, creates a mature DUX4 mRNA leading to DUX4 protein production [17]. These PAS-containing 4qA alleles are termed FSHD permissive chromosomes [17,21]. The remaining~50% of chromosome 4s have an entirely different distal sequence, termed 4qB [21], and are FSHD nonpermissive since they do not contain a PAS and therefore cannot produce a stable DUX4 transcript [17,21]. Contractions on 4qB alleles do not cause FSHD [22]. Further complicating matters, there are some specific sequence variations of 4qA that are not associated with FSHD. Thus, when performing FSHD genetic diagnostics, one must assay the specific D4Z4 array (chromosome 4q35), the specific D4Z4 distal sequence [25], and potentially the state of the FSHD2 genes (most commonly SMCHD1) [10].
As one might expect with such complicated genetics, FSHD is not amenable to a typical genetic analysis, such as using neuromuscular disease gene panels or whole exome sequencing, and instead requires specialized techniques specifically targeting FSHD [10,[26][27][28][29][30]. To date, all the approved methods for FSHD1 genetic testing physically measure the size of the 4q35 D4Z4 arrays and thus require specially prepared high quality and high molecular weight genomic DNA (gDNA) and cannot be performed on saliva DNA or most biobanked DNAs [10,[26][27][28][29][30]. These issues account for the fact that FSHD genetic testing is expensive, only performed at a few sites worldwide, and therefore inaccessible to many at-risk individuals around the world. More accessible FSHD genetic diagnostics are sorely needed.
Since the epigenetic dysregulation of 4q35 is common to both forms of FSHD and distinguishes FSHD from healthy subjects or those with non-FSHD myopathies [6,9,31], we developed a bisulfite sequencing (BSS)-based PCR (BS-PCR) approach for analyzing the methylation state of the key FSHD regions for use as a potential diagnostic (Figure 1, green, blue, and orange) [32]. While DNA hypomethylation has been used to help diagnose FSHD2 for a number of years [9,31,33], our method (Jones et al.) is currently the only one reported that can specifically identify both forms of FSHD using DNA methylation while concurrently distinguishing FSHD1 from FSHD2 [32]. Thus, we concluded that DNA methylation assayed by targeted BSS could be used as a molecular diagnostic for FSHD1 and FSHD2. Subsequently, a similar targeted BSS approach (Gaillard et al.) was published ( Figure 1, pink) that identified gross differences in methylation levels between samples from healthy and clinically affected FSHD populations but with high overlap between the two groups, and did not find significant methylation differences between a cohort of genetically healthy (non-FSHD) individuals and asymptomatic subjects with genetic FSHD1 [34]. This result was in contrast to two other studies on the epigenetics of asymptomatic subjects that reported significantly higher D4Z4 methylation than in clinically affected FSHD1 subjects, but significantly lower methylation than in genetically healthy subjects [35,36]. In addition, although FSHD2 has been generally characterized by the FSHD field as distinctly different from FSHD1 and healthy subjects by hypomethylation of both the 4q35 and 10q26 D4Z4 arrays, shown by <30% DNA methylation using methylation sensitive restriction assay for the proximal FseI site [10,13,37], Gaillard et al. reported no significant methylation differences between groups or individual cases of FSHD1 and FSHD2 while reporting strikingly high levels of DNA methylation (40-70%) in their FSHD2 samples [34]. The authors concluded that clinical FSHD is epigenetically distinct from healthy controls; however, DNA methylation does not distinguish all individuals with an FSHD D4Z4 contraction from those without and does not distinguish FSHD1 from FSHD2. Further studies continued to show significant differences between the epigenetic signatures obtained using the two protocols [34,36,[38][39][40][41][42], leading some in the field to dismiss DNA methylation as not being diagnostic for, or even relevant for FSHD [43], with the discrepancies accounted for by different patient populations and potential technical concerns. the epigenetic signatures obtained using the two protocols [34,36,[38][39][40][41][42], leading some in the field to dismiss DNA methylation as not being diagnostic for, or even relevant for FSHD [43], with the discrepancies accounted for by different patient populations and potential technical concerns.  [32]). The genomic regions assayed by the three most commonly used methods for FSHD diagnostics are shown. The methylation sensitive restriction enzyme (MSRE) digestion and Southern blotting method assays the methylation state of the most proximal FseI restriction enzyme site (yellow highlight) on both alleles for the 4q and 10q chromosomes when probed with p13E-11 [25,30]. The DR1 BSS assay identifies FSHD2-specific DNA hypomethylation [33]. The BSS assays used by that are also present on both alleles of the 4q and 10q chromosomes [34]. In contrast, the BSS assays used by Jones et al. amplify two products that are specific for the distal-most chromosome 4q, one for 4qA alleles (BSSA, blue), and the other for 4qA-L alleles (BSSL, orange), and one product upstream of the DUX4 ORF that is present on all four alleles (BSSX, green) [32]. The genomic region amplified by the BIS-3′ assay is completely within the BSSA amplified region and the DR1 region is completely within the BSSX region.
Recently, the genetic connection between FSHD2 and arhinia/Bosma arhinia microphthalmia syndrome (BAMS), caused by mutations in the SMCHD1 gene [38,39], has led to a wider interest in the epigenetic analyses of the 4q35 locus using these two protocols. Thus, it is even more important and timely to clarify the discrepancies between the analyses. Here, we investigated the differing results and interpretations between these two techniques by performing a direct comparison using the original published protocols [32,34] on the same gDNA samples. We found that a lack of amplicon specificity for the BS-PCR primers used to amplify the 4q35 and 10q26 D4Z4 regions in the assay by Gaillard  [32]). The genomic regions assayed by the three most commonly used methods for FSHD diagnostics are shown. The methylation sensitive restriction enzyme (MSRE) digestion and Southern blotting method assays the methylation state of the most proximal FseI restriction enzyme site (yellow highlight) on both alleles for the 4q and 10q chromosomes when probed with p13E-11 [25,30]. The DR1 BSS assay identifies FSHD2-specific DNA hypomethylation [33]. The BSS assays used by Gaillard et al. separately amplify three different regions of the D4Z4 (BIS-5 , BIS-mid, BIS-3 ; pink) that are also present on both alleles of the 4q and 10q chromosomes [34]. In contrast, the BSS assays used by Jones et al. amplify two products that are specific for the distal-most chromosome 4q, one for 4qA alleles (BSSA, blue), and the other for 4qA-L alleles (BSSL, orange), and one product upstream of the DUX4 ORF that is present on all four alleles (BSSX, green) [32]. The genomic region amplified by the BIS-3 assay is completely within the BSSA amplified region and the DR1 region is completely within the BSSX region.
Recently, the genetic connection between FSHD2 and arhinia/Bosma arhinia microphthalmia syndrome (BAMS), caused by mutations in the SMCHD1 gene [38,39], has led to a wider interest in the epigenetic analyses of the 4q35 locus using these two protocols. Thus, it is even more important and timely to clarify the discrepancies between the analyses. Here, we investigated the differing results and interpretations between these two techniques by performing a direct comparison using the original published protocols [32,34] on the same gDNA samples. We found that a lack of amplicon specificity for the BS-PCR primers used to amplify the 4q35 and 10q26 D4Z4 regions in the assay by Gaillard et al. [34] accounts for the different results between the two techniques. Importantly, we show that DNA methylation of the 4q35 D4Z4 arrays, when accurately and specifically assayed [32,36], is in fact diagnostic for FSHD and distinguishes FSHD1 from FSHD2. In addition, DNA methylation data obtained using oligonucleotide sequences (including more recent NGS-based analysis) from the Gaillard et al. study [34] should be interpreted with extreme caution [38,[40][41][42].

DNA Methylation Analysis
Genomic DNA samples (1.5 µg) were bisulfite converted using the EpiTect Bisulfite Kit (Qiagen, Germantown, MD, USA) following the manufacturer's protocol. All PCRs were performed using a BioRad C1000 Touch thermal cycler. The 4qA specific BSSA (300 ng of converted DNA), 4qAL specific BSSL (150 ng of converted DNA), and 4q/10q D4Z4 specific BSSX (150 ng of converted DNA) BS-PCRs were performed as described [32,36]. The BIS-3 , BIS-5 , and BIS-Mid BS-PCRs (150 ng of converted DNA for each) used published oligonucleotide primers (Table S1, with corrected orientation) [34] and appropriate thermocycler conditions to amplify these D4Z4 regions. Briefly, the BIS-3 , BIS-5 , and BIS-Mid regions were amplified as follows: 94 • C for 2 min, 35 cycles of 94 • C for 15 s, 54 • C for 15 s, 72 • C for 25 s, followed by 72 • C for 10 min. All PCRs used GoTaq HS Polymerase (Promega) and all oligonucleotide primers used are listed in Table S3. All PCR products were cloned into the pGEM-T Easy vector (Promega) and at least 15 independent colonies were sequenced for each region and analyzed using the bisulfite analysis website, BISMA (http://services.ibc.uni-stuttgart.de/BDPC/BISMA/, accessed on 30 June 2021) with default parameters [45]. Any sequences corresponding to 4A166 or 10A166, if present due to rare aberrant amplification, were readily identified by SNPs and eliminated from the analysis.

Results
Here, we addressed the conflicting published data regarding FSHD-relevant DNA methylation of the chromosome 4q35 and 10q26 D4Z4 macrosatellite repeat regions from two similar approaches [32,34] by directly comparing the results of the protocols performed on the same genomic DNA samples. The Jones et al. method analyzes two regions of the FSHD-associated D4Z4 array, one specific for the distal-most chromosome 4qA or 4qAL RU (BSSA or BSSL) and another that specifically amplifies all 4q and 10q D4Z4 RUs (BSSX) ( Figure 1). Concerns raised with the Jones et al. protocol are: (1) the inclusion of a CpG in one of the primer sequences may bias the results with respect to methylation status and (2) the distal-most D4Z4 repeat may not be representative of the epigenetics of the whole D4Z4 array [32,40,43]. In contrast, the Gaillard et al. method analyzes three regions of the D4Z4 RU (BIS-5 , BIS-mid, and BIS-3 ), all of which are present in all chromosome 4q and 10q D4Z4 RUs (Figure 1), and does not include any CpG in the primer design. However, one serious concern is the large number of missing predicted CpGs in the products assayed despite a >90% sequence identity cut-off for analysis, suggesting amplification of alternative D4Z4 loci [23,34]. The two protocols assay slightly different regions of D4Z4 and a direct comparison cannot be made for all of them ( Figure 1); however, the BIS-3 amplicon of the Gaillard et al. protocol is completely contained within the BSSA amplicon of the Jones et al. protocol (Figures 1-4), thereby allowing at least one direct comparison of specificity and interpretation of results between the two assays. An additional complication of comparing the two approaches is the metric each uses for determining if a sample is hypomethylated or not; the Gaillard et al. approach uses the overall average methylation of all BS-PCR products analyzed in each assay, while the Jones et al. approach uses a quartile analysis for distinguishing between FSHD and healthy (BSSA or BSSL assay) and the average methylation for distinguishing FSHD1 from FSHD2 (BSSX assay). Therefore, for this comparative study, we used both sets of metrics on both sets of assays (Tables 2 and 3).
were almost identical to ours, showing hypermethylation from all three amplicons and many expected CpGs missing from the BIS-Mid and BIS-3' assays despite the same 90% lower threshold sequence identity cutoff [34], validating our results using their primer sets. Regardless, despite these concerns, in these healthy subjects, all assays tested and analyzed by either method indicated hypermethylated D4Z4s, as expected [9,13,31,48]. From an FSHD diagnostics standpoint, each healthy individual would be correctly characterized by both assays (Tables 2 and 3).          [32,34]. Methylation analysis of the gDNAs from five healthy individuals (Figure 2, Figure S1 and S2) showed that both sets of assays similarly found all five samples to be hypermethylated > 35%, regardless of using average DNA methylation or quartile analysis (Table 2, Table 3, and Table S2), as was expected for healthy controls and consistent with prior published results using the assays [9,23,32,34,36,46]. However, upon viewing the individual chromosome methylation map reads for the two assays, we found a disconcerting difference in the data obtained from the two approaches. The Jones et al. assay showed almost complete coverage (>98%) of all expected CpGs in both regions amplified based on the chromosome 4q35 D4Z4 sequence, as indicated by mostly blue and red squares (Figure 2, Figure S1 and S2). In contrast, two of the assays in the Gaillard et al. protocol, BIS-Mid and BIS-3 , produced methylation signatures that were missing many CpGs based on the reference D4Z4 sequences being amplified, as indicated by the numerous white squares. In fact, in the five control subjects analyzed, the BIS-Mid assay only identified 75.8%, 82.1%, 77.0%, 79.1% and 83.6% of the expected CpGs, while the BIS-3 assay identified only 67.9%, 69.3%, 70.9%, 71.8% and 65.6% of the expected CpGs in the amplified regions from the chromosome 4q35 D4Z4 ( Figure  S3). This is despite the BISMA analysis software using the standard metrics for inclusion of >95% conversion rate and a 90% lower threshold for sequence identity [45], which indicates that the amplified sequences are highly similar to the expected sequences over their entire length but not with respect to the CpG sites. It should be noted that the Gaillard et al. study utilized the BiQ Analyzer software [47] for their methylation analysis, presumably under default parameters, instead of BISMA, although this should not change the output data. Regardless, since we also slightly altered the PCR conditions from the published method and used a different bisulfite conversion protocol for standardization [34], we reviewed the primary data in the original Gaillard et al. paper for a comparison. The figures presented show that their control methylation data and graphic methylation representations were almost identical to ours, showing hypermethylation from all three amplicons and many expected CpGs missing from the BIS-Mid and BIS-3' assays despite the same 90% lower threshold sequence identity cutoff [34], validating our results using their primer sets. Regardless, despite these concerns, in these healthy subjects, all assays tested and analyzed by either method indicated hypermethylated D4Z4s, as expected [9,13,31,48]. From an FSHD diagnostics standpoint, each healthy individual would be correctly characterized by both assays (Tables 2 and 3).
We similarly analyzed gDNA obtained from five FSHD1 subjects (Figure 3 and Figure S4) and six FSHD2 subjects (Figure 4 and Figure S5) using both assays. Again, the BSSA, BSSX, and BIS-5'assays identified nearly 100% of the expected CpGs in all five samples while the BIS-MID and BIS-3' assays failed to identify~25% of the expected CpGs (74.6% to 83.6% and 70.5% to 74.6%, respectively, Figure S3) for each subject, as indicated by white boxes in the graphic representations. However, for these FSHD samples, the reported methylation profiles produced significantly different results and diagnostic interpretations. The BSSA assay was designed to distinguish FSHD from healthy methylation based on <25% methylation for the first quartile (Q1) if an individual has two FSHD permissive 4A161 chromosomes or the second quartile (Q2) if an individual has one FSHD permissive 4A161 chromosome (Table 1). It should be noted that the 4A166 allele, while technically FSHD permissive due to the presence of the DUX4 PAS in exon 3, is not associated with FSHD and is not amplified by the BSSA assay [32,36,49]. Thus, for subjects such as F1-03 and F2-06 that are 4A161/4A166, Q2 is used for the key methylation assessment. Overall, this quartile methylation metric revealed FSHD levels of DNA hypomethylation for all fourteen genetically confirmed FSHD subjects in this study (Table 2), consistent with prior reports [32,36]. The BSSX assay was designed to distinguish FSHD1 methylation from FSHD2 methylation using the average methylation for all 4q35 and 10q26 D4Z4 BS-PCR products analyzed. This assay showed the expected FSHD1 methylation signature (>35%) for all five FSHD1 subjects and the expected FSHD2 methylation signature (<30%) for all six FSHD2 subjects (Table 2, Figure 5 and Figure S6). Thus, the analysis using the Jones et al. assays was consistent with prior published data [32,36,39,50] and consistent with known FSHD epigenetics [9,10,13,31,46,48]. In contrast, the BIS-5 , BIS-Mid, and BIS-3 assays found no differences in DNA methylation between the FSHD1 and healthy subjects (Table 3). Those assays fared slightly better with FSHD2. While the BIS-5 assay found characteristic FSHD2 hypomethylation (<30%) in four out of the six FSHD2 samples, the BIS-Mid and BIS-3 assays found no differences between FSHD2 and healthy subjects (Table 3 and Figure 5). Similarly, these results are consistent with the published data using these assays [34,38,40,42].
of the 4q35 D4Z4 array including the D4Z4 (orange), DUX4-fl ORF (green), DUX4 Intron 1, Exon 2, Intron 2 (purple), and Exon 3 (pink), with exact location indicated by base pairs (bp) downstream from the KpnI site of the distal D4Z4 RU. Blue boxes indicate unmethylated CpGs, red boxes indicate methylated CpGs, and white boxes indicate no CpG where one is expected. Average methylation percentages are denoted "Avg" and have values of 15.2% for BSSX, 51.3% for BIS-5′, 61.9% for BIS-Mid, 61.2% for BIS-3′, and 13.1% for BSSA. This FSHD2 subject has a haplotype of 4A161/4B163. Therefore, the diagnostic methylation percentages of the second quartile (Q2) are included for BSSA (10.7%) and BIS-3′ (70%) for comparison.  The discrepancy between the techniques with respect to FSHD2 methylation is particularly concerning as DNA hypomethylation (<30% on average) of all four of the relevant chromosome 4q35 and 10q26 D4Z4 arrays, as determined by MSRE assay, has long been considered diagnostic for FSHD2 [10,25]. Similarly, the DR1 BSS assay independently validated 4q35 and 10q26 D4Z4 hypomethylation as a signature of FSHD2 [33]. Therefore, we plotted the FSHD2 data obtained from all five assays for comparison ( Figure 5). Conceptually, the BSSX, BIS-5 , BIS-Mid, and BIS-3 each amplify from all four relevant D4Z4 arrays (Figure 1). Consistent with previous independent data, the BSSX assay reports <30% average methylation for all six FSHD2 samples (Table 2 and Figure 5). However, the average methylation as well as the vast majority of BSS reads from the BIS-Mid and BIS-3 assays never reach near to the predicted FSHD2 levels of hypomethylation. The BIS-5 assay, which has the best concordance of analyzed CpGs to expected CpGs of the three Gaillard et al. amplicons (Figure S3), reports median FSHD2 levels of methylation for four of the six FSHD2 samples (Table 3 and Figure 5). This further supports that the Jones et al. amplicons are specifically assaying the relevant FSHD regions and the Gaillard et al. amplicons, and the BIS-mid and BIS-3' in particular, are likely not specific to the FSHD region.
Overall, the presented data supports that the reported discrepancies for FSHD1 and FSHD2 methylation between the two BSS methods are not due to differences between the individuals being assayed or to epigenetic and genetic differences within the 4q35 D4Z4 RUs [40], but are instead due to differences in the assays themselves, especially with regard to specificity. It should be noted that there are several other D4Z4 arrays in the human genome, as discussed below, that have highly similar sequences to the 4q and 10q D4Z4s, but are genetically distinct and not epigenetically dysregulated in FSHD1 or FSHD2 [23].
As shown here and in previous studies [32,36,50], the BSSA assay very accurately identifies FSHD-specific hypomethylation in both forms of FSHD, but not controls. However, a concern was raised that the BSSA results may not be a true representation of the potentially varied methylation states of the chromosome 4 D4Z4s in all cells and instead could be selectively amplifying hypomethylated chromosomes due to the design of the oligonucleotide primers used [43]. The basis of this concern is that some of the oligonucleotide primers used in the Jones et al. assays encompass sequences that contain CpGs. Since CpGs can exist as either methylated or unmethylated, and the two states would give a different DNA sequence in the bisulfite-converted DNA (CpG when methylated or UpG when unmethylated), a primer designed against one or the other would preferentially amplify that state of DNA. Thus, CpGs are typically avoided in BSS primer design when an unbiased result is desired. However, this bias can be avoided, as we have done. In the BSSA amplicon, the two nested reverse primers are in a region that contains a single CpG, which was necessary to achieve the desired specificity for the distal 4qA D4Z4 RU. Similarly, the reverse primer for the BSSX amplicon contains two CpGs. Since specificity for the correct D4Z4 repeat array was the most important factor in primer design, these CpG base pairs were included, but only after taking into consideration the two possible states. As shown in Table S1, primers BSSA-3742R, BSSA-3626R, and BSSX-1036R contain an "R" designation at the CpGs in question, which corresponds to either a G or A at that base randomly included in the primer synthesis. Since there is only one "R" base pair per BSSA primer sequence, statistically 50% of the primers will be 100% complementary to the methylated state (a G to base pair with a C) and 50% of the primers will be 100% complementary to the unmethylated state (an A to base pair with the U) and they will be in equal abundance in the BS-PCR. The difference in melting temperatures between the "A" and "G" primers is negligible. However, to determine experimentally if there is any preference for a methylation state in this assay, four additional subjects (for a total of five) with genetically confirmed FSHD1, a single contracted 4qA allele (hypomethylated), and a second noncontracted 4qA allele (hypermethylated) and three additional healthy subjects (for a total of four) with two noncontracted 4qA alleles were tested using the BSSA assay. If there was a preference for one state or the other, we would expect a shift in the results away from 50%. As expected, we found two roughly equivalent pools of chromosomes with respect to methylation state in FSHD1 ( Table 2, Table S3 and Figure S6), with no apparent preference for the methylation state of the amplified chromosome, as indicated by the ready amplification of chromosomes with between 0 to 80% methylation in FSHD1 or 20 to 78% in the controls.
Overall, this direct comparison of two commonly used BSS methods for FSHD indicates that the Jones et al. methodology produces an accurate assessment of the DNA methylation state of the FSHD-associated chromosome 4q35 D4Z4 array that is consistent with prior epigenetic analyses of FSHD using other methods [6,7,9,13,31,46], and accurately distinguishes FSHD from healthy, as well as distinguishing FSHD1 from FSHD2. In contrast, the Gaillard et al. methodology does not distinguish FSHD from healthy, does not accurately assess FSHD1 methylation, and only occasionally correctly identifies an FSHD2 methylation state with one of their three assays. Therefore, published data produced from the Gaillard et al. assays, including the recent NGS version based on the same primer sequences, should be interpreted with extreme caution [34,40,42,43,51].

Discussion
The question has been raised, "Does DNA methylation matter in FSHD?" [43] There are two key requirements for addressing this important question: (1) the DNA methylation status of the FSHD locus must be accurately and specifically analyzed, and (2) the subjects analyzed must have a correct clinical diagnosis. Here, we further validated our BSS-based epigenetic diagnostic protocol for FSHD [32] and directly addressed a controversy in the field regarding the utility of BSS assays for accurately assessing the epigenetic status of the chromosome 4q35 and 10q26 D4Z4 repeat arrays. A prior attempt to reconcile the differences between two commonly used assays found the same reported discrepancy and thus came to the conclusion that the distal-most D4Z4 RU, which is specifically assayed in the Jones et al. BSSA method, must be epigenetically distinct from the rest of the contracted (FSHD1) 4q35 or dysregulated 4q35 and 10q26 (FSHD2) D4Z4 RUs [40]. However, their analysis only showed summaries of the methylation data that could not be evaluated for sequence integrity. We believe the simpler and more likely explanation is that the Gaillard et al. methodology results in amplification of divergent D4Z4s that are irrelevant to FSHD.
There are many similar D4Z4 repeat arrays in the human genome, yet only the chromosome 4q35 and 10q26 D4Z4 arrays, which have >98% sequence identity between their respective D4Z4 RUs and are the only D4Z4 RUs to encode an intact DUX4 open reading frame (ORF) [5,28,52,53], are relevant to FSHD genetics and epigenetics [23]. Thus, when investigating FSHD epigenetics, it is extremely important to assay only these specific D4Z4 arrays and RUs. To accomplish this difficult feat, the original technique to assess the epigenetic state of the D4Z4 regions, methyl-sensitive restriction enzyme digestions (MSRE) using FseI digestion of the proximal 4q and 10q D4Z4 RU (Figure 1), utilized Southern blotting probed with the p13E-11 DNA sequence that is specific for the region centromeric to these two D4Z4 arrays [5,6,9,10,13,31,46]. These studies showed that only the contracted 4q35 D4Z4 array was hypomethylated in FSHD1 while all four 4q35 and 10q26 D4Z4 arrays were hypomethylated in FSHD2. Subsequently, multiple independent BSS assays (Hartweck et al., Jones et al., and Calandra et al.) were used to confirm the allelespecific FSHD1 hypomethylation profile [7,32,36] and the broader FSHD2 hypomethylation profile [7,32,33]. Only the Gaillard et al. BSS analysis has continually failed to produce these distinct hypomethylation profiles, and the graphic representations of the data, when shown, strongly support that the nonspecific amplification of divergent D4Z4 RUs is to blame.
The D4Z4 RUs from chromosomes 4q and 10q are nearly identical (>98%) across their entire 3.3kb DNA sequences [5,28,52,53], including most of the CpGs, with only a couple of nucleotide variants [23], and this is supported by the near 100% coverage of expected CpGs shown in the graphic representation of the data from the Jones et al. BS-PCR amplicons (Figures 2-4 and Figures S1-S6, red and blue boxes). In addition, long-read sequencing through a 13 RU 4qA D4Z4 BAC shows a nearly identical sequence for each D4Z4 RU [28]. Conversely, the D4Z4 arrays on chromosomes 3, 13, 14, 15, 21, 22, and the Y chromosome, which are not epigenetically dysregulated in either form of FSHD, have 30-60 nucleotide variants per RU just in the~500 bp region corresponding to their putative DUX4 ORFs, and 70% of those variants are C/G to A/T transitions, thus eliminating potential methylation sites [23]. BS-PCR amplification and analysis from these divergent D4Z4 RUs would not be eliminated by a 90% lower threshold cut-off for sequence identity and would allow for the inclusion of these hypermethylated BSS reads in all samples, which would skew the analysis towards more reported methylation. A tell-tale sign that this is happening would be the absence of multiple predicted CpGs (shown as white boxes in the graphic representations of methylation data) and suspiciously high methylation in FSHD2 samples. As shown in Figure 4 and Figure S4, the BIS-Mid and BIS-3 assays likely amplify these divergent and perpetually hypermethylated D4Z4 RUs. Interestingly, in the graphic representations of the FSHD2 data for the BIS amplicons (Figure 4 and Figures S4), which are ordered by highest (top) to lowest (bottom) methylation, the few sequences with 100% CpG coverage (no white boxes) are also those with the lowest percent methylation (bottom) and are more consistent with FSHD2 levels (<30%) of methylation. In addition, considering that the region amplified by the BIS-3' assay, with numerous missing CpGs in the analysis, is completely contained within the BSSA assay region (Figures 1-4, Figures S1 and S3-S5), which identified >98% of expected CpGs, it is clear that the BIS-3 assay is not specific for the 4q D4Z4, while the BSSA assay is specific. Thus, we conclude that the data obtained from these two amplicons is nonspecific for the 4q and 10q D4Z4 arrays and thus not relevant for FSHD. The BIS-5 assay appears more specific than the other Gaillard et al. assays; however, since it analyzes a region that can only be used to assess FSHD2, the results of this assay have no bearing on FSHD1 epigenetics.
The above data supports that, when properly analyzed, the BSS analysis of the FSHD locus is a diagnostic for genetic FSHD1 and FSHD2 [32], which would represent a significant breakthrough in the field with respect to global accessibility. Traditional FSHD diagnostics, using pulsed-field gel electrophoresis and Southern blotting, and the new single molecule fluorescent techniques physically measure the sizes of the 4q35 D4Z4 arrays [25][26][27]29,30] and long-read diagnostic DNA sequencing assays being developed for sequencing through an entire FSHD1-contracted D4Z4 array [28,54] all require high quality HMW gDNA and cannot be performed on the gDNA isolated from saliva samples or on biobanked gDNA, greatly limiting their applicability. In contrast, epigenetic analysis using BSS is PCR-based and can be performed on gDNA isolated from any source, including saliva, which can be collected through the mail, making this type of genetic testing highly accessible. However, due to the referenced conflicting published BSS data for FSHD between Jones et al. (2014) [32] and Gaillard et al. (2014) [34], there has been a disagreement in the field as to which, if any, BSS-based analysis accurately represents the epigenetic status of the chromosome 4q35 disease locus. We have now clarified the FSHD BSS landscape and conclude that, in theory, BSS using the Jones et al. method could be used for accurate molecular diagnosis of genetic FSHD1 and FSHD2. However, larger controlled cohorts need to be analyzed and reported before this technique is put into clinically relevant practice.

Conclusions
We directly addressed two conflicting reports on the utility of BSS analysis for FSHD diagnostics and interpretations for FSHD and arhinia epigenetic studies. We determined that the discrepancies are due to differences in specificity of the amplicons for the FSHD locus. We conclude that the original Jones et al. protocol (2014) is highly specific for the FSHD-associated chromosome 4q35 D4Z4 array and is diagnostic for distinguishing FSHD from healthy subjects and FSHD1 from FSHD2 subjects. In contrast, the Gaillard et al. protocol (2014) does not accurately report the methylation state of the FSHD locus and instead nonspecifically amplifies related but unlinked D4Z4 repeats that are not epigenetically dysregulated in FSHD, thereby skewing the data towards reporting an inaccurate, more methylated state [23]. Thus, DNA methylation data obtained for FSHD and arhinia studies based on the Gaillard et al. BS-PCR amplicons should be interpreted with caution, if not revisited.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/diagnostics11081469/s1, Table S1: Oligonucleotide primers used for BS-PCR (5 →3 ), Table  S2: Epigenetic analysis using the Jones et al. BSSL method for 4qAL alleles, Table S3: Epigenetic analysis using the Jones et al. BSSA method on subjects with a 4A161/4A161 haplotype, Figure S1: BSS analysis comparing healthy epigenetic signatures, Figure S2: BSS analysis using the BSSL assay for two healthy subjects, Figure S3: Graphs plotting the percentage of CpGs analyzed compared with those in the reference sequence for each BSS assay are indicative of the specificity of each amplicon for the 4q35 and 10q26 D4Z4 arrays, Figure S4: BSS analysis comparing FSHD1 epigenetic signatures, Figure S5: BSS analysis comparing FSHD2 epigenetic signatures, Figure S6: BSSA analysis of FSHD1 and healthy controls with 4A161/4A161 haplotype finds no bias for the methylation state.  Institutional Review Board Statement: The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the University of Nevada, Reno Institutional Review Board (#1316095, approved on 9 October 2018) and all qualified subjects provided written informed consent. De-identified samples provided were determined by the IRB to be exempt (Exemption Category #4).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
All data is reported in this manuscript and supplementary information.