Diagnostic Allele-Specific PCR for the Identification of Candida auris Clades

Candida auris is an opportunistic pathogenic yeast that emerged worldwide during the past decade. This fungal pathogen poses a significant public health threat due to common multidrug resistance (MDR), alarming hospital outbreaks, and frequent misidentification. Genomic analyses have identified five distinct clades that are linked to five geographic areas of origin and characterized by differences in several phenotypic traits such as virulence and drug resistance. Typing of C. auris strains and the identification of clades can be a powerful tool in molecular epidemiology and might be of clinical importance by estimating outbreak and MDR potential. As C. auris has caused global outbreaks, including in low-income countries, typing C. auris strains quickly and inexpensively is highly valuable. We report five allele-specific polymerase chain reaction (AS-PCR) assays for the identification of C. auris and each of the five described clades of C. auris based on conserved mutations in the internal transcribed spacer (ITS) rDNA region and a clade-specific gene cluster. This PCR method provides a fast, cheap, sequencing-free diagnostic tool for the identification of C. auris, C. auris clades, and potentially, the discovery of new clades.

Currently, the most reliable, efficient, and therefore, recommended methods for C. auris identification are matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) and rDNA sequencing [4,8]. The disadvantage of these methods is the cost, high-tech equipment, and skilled labor they require. Recently, several PCR amplification of a C. auris-specific ITS2 amplicon but are insensitive to the species that C. auris is commonly misidentified as. For clade identification, we designed a duplex PCR that targets a region of ITS1 that is divergent for four of the five described clades (clades I and III have identical ITS sequences), while we used a clade III/V-specific gene cluster to discriminate clades I and III.
This assay can be used as a tool for epidemiological research and is, due to its low-cost and low-tech requirements, ideal for lower-budget research settings, including developing regions in which C. auris outbreaks have frequently occurred [14,41]. Moreover, because clades differ in phenotypic traits such as virulence and drug resistance, clade diagnostics can be an important clinical tool, enabling a quick assessment of the risk of resistance-induced treatment failure or of hospital outbreaks. Lastly, our clade diagnostic tool, along with a more in-depth analysis of the ITS region, can serve to identify potential new clades of C. auris.

Sequence Analysis for Species- and Clade-Specific Allele Selection
For the identification of C. auris, the ITS region was investigated for species-specific SNPs and/or indels. An alignment of 285-709 bp covering the ITS regions of 32 species was created in CLC Main Workbench v8.1 (Qiagen®, Hilden, Germany) using type/reference strain sequences from NCBI GenBank (ncbi.nlm.nih.gov/genbank/ (accessed on 8 February 2021)). Information about all sequences included in this alignment is listed in Table A1 (Appendix A). The selection of species was based on literature concerning the (mis)identification of C. auris [4,7,42-44] and includes most pathogenic Candida species [45]. The ITS alignment was manually searched for C. auris-specific regions to design a C. auris-specific AS-PCR primer.
For the identification of C. auris clades, a 212-214 bp ITS alignment was created that contained 121 sequences: 96 typed C. auris strains from clades I, II, III, and IV as reported by Vatanshenassan et al. [31] (NCBI GenBank accessions MN242989-MN243084), two Iranian clade V strains [21,22] (NCBI GenBank accessions MW019910.1 and MZ389242), and 23 C. auris strains from an in-house collection of isolates from all five clades. The latter consists of 15 clade I strains, three clade III strains, three clade IV strains, one clade II strain, and one clade V strain and includes at least one typed strain per clade for which whole-genome sequencing data are available (e.g., clade I: B8441, clade II: B11220, clade III: B11221, clade IV: B11244 [2,3,13,17], and clade V [12]: B18474/IFRC2087; see GenBank accessions PRJNA328792 and PRJNA541007, respectively). Strain information of the in-house collection is summarized in Table A2 (Appendix A). This alignment was manually searched for clade-specific regions to design AS-PCR primers. To illustrate clade divergence based on ITS, a UPGMA (unweighted pair group method with arithmetic mean) phylogenetic tree with Jukes-Cantor correction and 1000 bootstraps was created in CLC Main Workbench v8.1 (Qiagen®, Hilden, Germany). To differentiate between clade I and III isolates, which have identical ITS sequences, an L-rhamnose-1-dehydrogenase (RHA1) gene was targeted. RHA1 is part of a rhamnose assimilation gene cluster present in clade III but not in clades I, II, and IV [46]. As data on the presence of this gene cluster were lacking for clade V, the RHA1 sequence was extracted from the genome of the clade III C. auris isolate B11221 (accession PRJNA328792) and used as a query for an NCBI BLAST® search against the SRA Illumina data of one Iranian clade V isolate, B18474/IFRC2087 (SRX5786024) [12]. Reads were assembled in CLC Main Workbench v21.0.3 (Qiagen®, Hilden, Germany).
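As a rough illustration of the Jukes-Cantor correction mentioned above, the following Python sketch computes the uncorrected proportion of differing sites (p) between two aligned sequences and converts it to a Jukes-Cantor distance, d = -(3/4) ln(1 - (4/3) p). It is not the CLC Main Workbench implementation. The function names, the toy sequences, and the choice to skip gap columns are assumptions made for this example only.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped; this
    gap handling is an assumption of the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance d = -(3/4) * ln(1 - (4/3) * p)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Made-up aligned fragments, for illustration only (not real ITS data).
toy_a = "ACGTACGTAC-T"
toy_b = "ACGTTCGTACGT"
p = p_distance(toy_a, toy_b)
print(f"p = {p:.4f}, JC distance = {jukes_cantor(p):.4f}")
```

A pairwise matrix of such distances is the input that a UPGMA clustering step would use to build the dendrogram.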

Strains and Media
To test the AS-PCR assay for C. auris identification, a panel of 15 species was used. This panel was composed of species for which the ITS region shows similarities to the C. auris ITS region and/or species that C. auris has been misidentified as [4]. To test the AS-PCRs for clade identification, a panel of 23 C. auris strains was used. The origin, reference number (if assigned), and clade of each C. auris strain used in this project, as identified by ITS sequencing, microsatellite typing, and/or whole-genome sequencing, are listed in Table A2 (Appendix A).
All strains were grown on yeast peptone dextrose (YPD, 2% glucose) agar at 30 °C or 37 °C and stored in YPD liquid medium containing 25% glycerol at −80 °C.

PCR and Sequencing
The ITS region of all C. auris and non-C. auris strains used in this study (see Section 2.2) was sequenced to confirm the targeted variable regions. The targeted sequences were amplified by PCR using Q5® High-Fidelity DNA polymerase (New England Biolabs Inc., Ipswich, MA, USA). The total reaction volume of 50 µL contained 500 ng of purified DNA, 5 µL of dNTPs (2.5 mM), 10 µL of 5× Q5 buffer, 0.5 µL of Q5 polymerase, and 0.4 µL each of the universal fungal barcoding primers ITS1 and ITS4 (100 µM) (Table 1) [47]. The PCR program consisted of initial denaturation at 98 °C for 30 s, 30 cycles of 98 °C for 10 s, 59 °C for 25 s, and 72 °C for 30 s, and a final elongation step at 72 °C for 2 min in a Labcycler Basic thermocycler (Bioké, Leiden, The Netherlands). Correct amplification was verified by gel electrophoresis of 5 µL of the PCR product on a 1% agarose gel. Sanger sequencing (TubeSeq service) was performed by Eurofins (Nazareth, Belgium).

Allele-Specific Primer Design
Primers for AS-PCR were designed following the method of Liu et al. [34]. All allele-specific primers were designed in silico using CLC Main Workbench v8.1 (Qiagen®) and are listed in Table 1. To further increase primer specificity, additional mismatches at the 3′ end of the allele-specific primer were implemented in several primers (see Figure 1). By including additional mismatches, the specificity of the primers for the correct allele can be increased at higher annealing temperatures [34].
Specific polymorphic nucleotides are shown in bold, primers are shown in gray, and mismatches contained within the primer are shown in red. 1 Nucleotide position from ITS1 forward primer. 2 Length based on ITS region of clade I.
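The idea behind the 3′-end mismatches can be sketched in Python as follows. The sketch lists the positions, counted from the 3′ terminus, at which a primer differs from a candidate binding site. A primer carrying one deliberate mismatch still differs from its target allele at only that position, while it differs from a non-target allele at two positions near the 3′ end. All sequences and function names here are hypothetical; they are not the primers of Table 1, and the sketch does not model annealing thermodynamics.

```python
from typing import List


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def mismatches_from_3prime(primer: str, site: str) -> List[int]:
    """Positions (1 = 3'-terminal base) where primer and site differ.

    Both sequences are given 5'->3' in the sense of the primer and
    must have the same length.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have equal length")
    pairs = zip(reversed(primer.upper()), reversed(site.upper()))
    return [pos for pos, (p, s) in enumerate(pairs, start=1) if p != s]


# Hypothetical top-strand binding sites for a reverse primer; the two
# alleles differ by a single nucleotide (first base shown here).
target_top = "GCTAGGATCC"
off_target_top = "ACTAGGATCC"

# Hypothetical reverse primer: matches the target allele at its
# 3'-terminal base and carries one deliberate mismatch at position 3.
primer = "GGATCCTTGC"

print(mismatches_from_3prime(primer, reverse_complement(target_top)))
print(mismatches_from_3prime(primer, reverse_complement(off_target_top)))
```

For a reverse primer, the top-strand site is reverse-complemented first so that both sequences are compared in the primer's own orientation.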

AS-PCR
All AS-PCRs were performed using Taq DNA polymerase with Standard Taq buffer (New England Biolabs Inc.). The total reaction volume of 20 µL contained 2 µL of purified DNA extract (10 ng/µL), 1.6 µL of dNTPs (2.5 mM), 2 µL of 10× Standard Taq buffer, 0.1 µL of Taq DNA polymerase, and the following primers: 0.1 µL each of the ITS1 forward primer (100 µM) and the ITS_Cau_R reverse primer (100 µM) for species identification (simplex PCR), or 0.2 µL of the ITS1 forward primer (100 µM), 0.1 µL of the ITS_Cau_R reverse primer (100 µM), and 0.2 µL of the clade-specific reverse primer (100 µM) for clade identification (duplex PCR). The PCR program consisted of initial denaturation at 95 °C for 30 s, followed by 35 cycles of denaturation at 95 °C for 20 s, primer annealing at a primer-specific temperature (Ta, see Table 1) for 30 s, and amplicon elongation at 68 °C for 30 s. The PCR was terminated by a final elongation at 68 °C for 5 min. All reactions were performed using a Labcycler Basic thermocycler (Bioké). Correct amplification was verified by 2% agarose gel electrophoresis of 10 µL of the PCR product. The specific annealing temperature of each AS-primer was determined by performing the same procedure as described above but using a 12-step temperature gradient from 50 °C to 70 °C or from 60 °C to 80 °C as annealing temperature. From the window of specific amplification, one temperature was selected as the annealing temperature (Ta).
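The final concentrations implied by these pipetting volumes follow from a simple dilution calculation (stock concentration × added volume / total volume). The short Python sketch below works this out for the 20 µL AS-PCR reaction. The function and variable names are chosen for illustration, and the calculation assumes the stated stock concentrations are exact.

```python
def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """Final concentration after dilution, in the same unit as the stock."""
    return stock * added_ul / total_ul


TOTAL_UL = 20.0  # total AS-PCR reaction volume in µL

# Primers are added from 100 µM stocks.
primer_0_1_um = final_concentration(100.0, 0.1, TOTAL_UL)  # 0.1 µL added
primer_0_2_um = final_concentration(100.0, 0.2, TOTAL_UL)  # 0.2 µL added

# dNTPs are added from a 2.5 mM stock.
dntp_mm = final_concentration(2.5, 1.6, TOTAL_UL)

# Template: 2 µL of a 10 ng/µL DNA extract.
template_ng = 2.0 * 10.0

print(f"primer (0.1 µL): {primer_0_1_um:.2f} µM")
print(f"primer (0.2 µL): {primer_0_2_um:.2f} µM")
print(f"dNTPs: {dntp_mm:.2f} mM")
print(f"template DNA: {template_ng:.0f} ng per reaction")
```

Under these assumptions, primers added at 0.1 µL and 0.2 µL end up at 0.5 µM and 1.0 µM, respectively, dNTPs at 0.2 mM, and each reaction contains 20 ng of template DNA.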

C. auris Identification
The alignment of 32 species showed great variation within the ITS regions. Figure 1A shows a fragment of the ITS2 region on which the C. auris-specific reverse primer (ITS_Cau_R) was designed, amplifying a fragment of 293-296 bp when paired with the universal fungal barcoding forward primer ITS1 [47]. The allele-specific reverse primer contains a G-T mismatch at the third position from the 3′ end to increase specificity for C. auris. At an annealing temperature of 78 °C, C. auris DNA is amplified, but DNA of C. haemulonii, C. pseudohaemulonii, C. duobushaemulonii, C. albicans, C. glabrata, C. dubliniensis, C. tropicalis, C. parapsilosis, C. orthopsilosis, C. metapsilosis, C. lusitaniae, C. krusei, C. sake, or S. cerevisiae is not, as shown in Figure 2A. Sequencing the ITS region of our panel of 15 species confirmed the variability in the targeted ITS2 region. This region did not show variability between C. auris clades.

C. auris Clade Identification
The ITS alignment of 121 C. auris strains from all five C. auris clades showed no intra-clade variability within the ITS region. Between clades, clade-specific polymorphisms were found in both the ITS1 and the ITS2 regions. This resulted in four main ITS-based clusters (clades I and III, clade II, clade IV, and clade V) in a phylogenetic tree (Figure A1, Appendix A). A rhamnose assimilation gene cluster, reported to be present in clade III but deleted in clades I, II, and IV [46], was found to be present in the three clade V isolates that were investigated. Targeting this gene cluster and a clade V-specific ITS region enabled discrimination of clade III from clade I (Figure 2C,E).
Several strains from our in-house collection were typed on the basis of whole-genome sequencing data (e.g., strain 14 (B8441, clade I), strain 23 (B11220, clade II), strain 3 (B11221, clade III), strain 2 (B11245, clade IV), and strain 22 (B18474, clade V); see Table A2, Appendix A). We optimized and tested our AS-PCR primers using these strains and confirmed correct clade identification (Figure 2B-E). Moreover, correct placement of the sequenced ITS region of these strains within the correct clade cluster reconfirmed clade phylogeny (Figure A1, Appendix A).
To test our AS-PCR for clade identification, 23 C. auris strains (15 clade I strains, one clade II strain, three clade III strains, three clade IV strains, and one clade V strain) were screened. Information about these strains is summarized in Table A2 (Appendix A). The results of the four allele-specific multiplex PCRs, in which the C. auris-specific amplicon (primers ITS1 and ITS_Cau_R) is duplexed with the clade II (primers ITS1 and ITS_CauCII_R)-, clade III and V (primers RHA1_CauCIII/V_F and RHA1_CauCIII/V_R)-, clade IV and V (primers ITS1 and ITS_CauCIV/V_R)-, and clade V (primers ITS1 and ITS_CauCV_R)-specific amplicons, are shown in Figure A2 (Appendix A). This shows that our multiplex AS-PCR assays for clade detection are 100% specific for this panel of strains. Additionally, our multiplex design decreases the chance of false-negative results, as the C. auris-specific amplicon works as a positive control for successful PCR amplification.
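The way the four duplex results combine into a clade call can be sketched as a small lookup in Python. The expected band patterns below are inferred from the primer specificities described above; in particular, the assumption that clade I is recognized by a positive C. auris amplicon with all four clade-specific amplicons absent is an interpretation rather than something stated explicitly. The function name, the argument names, and the use of a single flag for the C. auris-specific band (which in practice is read in each duplex reaction) are simplifications for illustration.

```python
# Expected presence (True) or absence (False) of each clade-specific band,
# in the order: clade II, clade III/V (RHA1), clade IV/V, clade V.
EXPECTED_PATTERNS = {
    (False, False, False, False): "clade I",
    (True, False, False, False): "clade II",
    (False, True, False, False): "clade III",
    (False, False, True, False): "clade IV",
    (False, True, True, True): "clade V",
}


def call_clade(species_band: bool, clade_ii: bool, rha1_iii_v: bool,
               its_iv_v: bool, its_v: bool) -> str:
    """Combine the band patterns of the four duplex AS-PCRs into a call."""
    if not species_band:
        # Without the internal control band, no clade call is made.
        return "no C. auris amplicon (not C. auris or failed PCR)"
    pattern = (clade_ii, rha1_iii_v, its_iv_v, its_v)
    return EXPECTED_PATTERNS.get(
        pattern, "unexpected pattern (retest; possibly an undescribed clade)"
    )


print(call_clade(True, False, True, False, False))
print(call_clade(True, False, True, True, True))
print(call_clade(True, True, True, False, False))
```

Any pattern outside the five expected ones is reported as unexpected rather than forced into a clade, so that such isolates can be followed up by sequencing or microsatellite typing.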

Discussion
In this study, we show that combining the variability in the ITS region with the presence or absence of a rhamnose assimilation gene cluster can be used to identify C. auris and to determine to which of the five main clades a C. auris strain belongs. The four AS-PCR assays for clade identification each consist of a duplex PCR reaction with a C. auris-specific amplicon and a clade-specific amplicon. This reduces the chance of false-negative results in screening assays, as the C. auris-specific amplicon serves as an internal control. AS-PCR provides a rapid, low-cost, low-tech alternative to other clade identification methods such as (genome) sequencing or microsatellite typing. Nevertheless, we recommend validating this diagnostic assay with sequencing and typed reference strains before use in screenings, as PCR-based methods can show variability due to technical discrepancies.
Molecular diagnostics are some of the most reliable identification methods for microorganisms. However, sequencing of molecular barcodes such as ITS requires time, specialized equipment, and analysis. Therefore, several sequencing-independent DNA-based diagnostic assays have been developed. These can be divided into three groups: end-point PCR assays (simplex or multiplex PCR and gel electrophoresis), quantitative PCR (qPCR) methods, and nonconventional detection methods such as loop-mediated isothermal amplification (LAMP) or T2 nuclear magnetic resonance measurement [9]. The method we report here belongs to the first group and only requires standard PCR reagents, a thermocycler, and a gel electrophoresis setup. Such a method is ideally suited for low-budget research settings. Nevertheless, our method has some potential disadvantages. It is, like most diagnostic methods, a culture-dependent assay. Additionally, non-auris Candida species cannot be detected, and typing is limited to the five currently described clades. However, AS-PCR can be optimized as a qPCR assay for culture-independent diagnosis, as reported with other qPCR methods for detection in clinical samples [8], and other species-specific primers can be designed.
We used the ITS region for species identification as this is the primary fungal barcoding marker, proposed by the Fungal Barcoding Consortium [48]. The ITS region consists of two spacer sequences surrounding the 5.8S rRNA gene in the ribosomal cistron and allows successful identification of a broad range of fungi with a clearly defined barcoding gap between inter- and intraspecific variation [48]. Moreover, single-copy protein-coding regions often show lower PCR amplification and sequencing success compared to the multicopy ITS region, which yields a PCR amplification success of 100% for Saccharomycotina, the subphylum to which Candida species belong [48]. ITS sequencing analyses [4], as well as ITS-based diagnostic PCR assays [44,49,50], have been widely reported for C. auris identification. In addition to species-level identification, ITS sequencing has been used for typing C. auris and other fungal species in the past [31,48,51]. Here, we show that the ITS sequence harbors sufficient interclade diversity to discriminate four out of five clades. An L-rhamnose gene cluster was used to discriminate clade III from clade I, as they share the same ITS sequence. The L-rhamnose gene cluster contains seven genes (RHA1, LRA1, LRA2, LRA3, two copies of TRC1, and an MFS transporter) which are absent in clade I, II, and IV isolates but present in clade III [46]. This pattern was discovered by testing an updated Vitek 2 yeast identification system, in which all clade III isolates but hardly any other C. auris isolates or C. haemulonii showed the ability to assimilate L-rhamnose [46]. This phenotype has not been validated for clade V isolates, but here we show that the L-rhamnose gene cluster is present in three clade V strains, including the type specimen [12].
Clade typing can be of significant epidemiological value, as it provides information on the origin of a strain and can help to monitor nosocomial transmission. Moreover, clade diagnosis can have clinical value, as clade-associated virulence and resistance implications have been reported [2,3,12,17,20-27,30,52]. Identifying the clade to which C. auris isolates in an outbreak belong might, thus, have implications for the choice of treatment and the use of infection control and prevention measures. Clade identification is also essential in scientific research. Molecular and pharmaceutical research on C. auris requires the use of isolates from different clades because of the pronounced phenotypic differences between clades and the possible implications for scientific conclusions on a species level [2,3,17,20-27,30,52]. Another useful purpose of screening isolates with this diagnostic AS-PCR assay is the potential to identify new clades. When the pattern of clade-specific PCRs differs from the expected outcome, this could be an indication of a novel, undescribed clade. Whole-genome sequencing and microsatellite typing will provide more in-depth insight in such circumstances. Detection of new clades could have a profound impact on our understanding of the emergence and epidemiology of this novel fungal pathogen.
In conclusion, we provide a molecular diagnostic assay to identify C. auris and the five currently described C. auris clades, which can be used for epidemiological, pharmaceutical, clinical, and molecular research. Its low cost, low-tech requirements, and fast readout make it a convenient tool, ideal for low-budget research settings. As the number of reported C. auris cases and outbreaks is still on the rise, the use and development of up-to-date, reliable identification and typing tools is essential.

Table A1. All species included in the alignment to identify a C. auris-specific ITS region for the design of primer ITS_Cau_R. The species and strain names are given, along with the ITS sequence NCBI GenBank accession and the size of the sequence in the alignment after trimming.
Figure A1. UPGMA dendrogram of a 248-251 bp ITS alignment of sequences of 121 C. auris strains: 96 typed strains from four clades [31] and three clade V strains [12,21,22], represented by their GenBank accessions, and 23 strains from our in-house collection, represented by the numbers 1 to 23 (see Table A2). Four ITS-based clusters are shown: clades I and III, clade II, clade IV, and clade V. The scale bar represents the percentage of nucleotide variation.
Figure A2. AS-PCR results of the four clade-specific PCRs for 23 strains of C. auris from our in-house collection, represented by the numbers 1-23 (see Table A2)