Kompetitive Allele Specific PCR (KASP) Markers for Potato: An Effective Tool for Increased Genetic Gains

Potato virus Y (PVY) and Phytophthora infestans (Mont.) de Bary, which causes potato late blight (LB), pose serious constraints to cultivated potato through significant yield reductions, and phenotyping for resistance remains challenging. Breeding operations for vegetatively propagated crops can lead to genotype mislabeling that, in turn, reduces genetic gains. Low-density, low-cost molecular marker assessment for phenotype prediction and quality control is a viable option for breeding programs. Here, we report on the development of kompetitive allele specific PCR (KASP) markers for LB and PVY resistance, and for routine quality control (QC) assessment of different breeding populations. Two KASP markers for LB resistance and two for PVY Ryadg were validated, with an estimated assay power that ranged between 0.65 and 0.88. The developed QC KASP markers were able to discriminate breeding materials, including full-sibs and half-sibs, using tetraploid calls. Routine implementation of the developed markers in a breeding program would assist with better allocation of resources and enable precise characterization of breeding material, thereby leading to increased genetic gains.


Introduction
Vegetatively propagated, cultivated potato (Solanum tuberosum L.) is an autotetraploid crop (2n = 4x = 48), which leads to high levels of genetic heterogeneity and increases the complexity of genomic studies and breeding [1]. Phenotypic evaluations of several disease-related traits are challenging, costly, and depend heavily on the environmental conditions in test sites.
Tetraploid potato recurrent selection, as applied in the International Potato Center (CIP) breeding programs, consists of creating variability from parents with known breeding values and selecting for traits of interest (including disease resistance, yield, tuber quality traits, and environmental adaptation) through various selection stages [2]. Hence, heterozygous individuals are multiplied and tested through several stages in different environmental conditions depending on the breeding stage, with the aim of identifying parents for the next recurrent selection cycle, advanced clones for material sharing, and promising clones for variety release. Therefore, clones evaluated in advanced and multi-environment trials are expected to be no different from those in early observational trials. To prevent the spread of viral diseases, tubers from field trials are generally not used as seed; instead, tuber multiplication is conducted in parallel with field trials, and tuber seeds of the breeding populations are maintained in controlled environments. However, this process can increase the odds of genotype mislabeling and sampling errors occurring throughout the breeding cycle and across environments in the same breeding stage. Such mistakes waste time and resources and reduce the achieved genetic gains because of low selection accuracy during the breeding cycle. Therefore, quality control (QC) analysis is essential to ensure the traceability and identification of clones at key stages of the breeding process (e.g., identity control and hybrid purity) and during conservation.
Figure 1. Workflow for KASP marker development and validation from association analysis or SNP marker selection. QTL, quantitative trait loci; GWAS, genome-wide association study; Concentrations, two different DNA concentrations (standard and diluted), whereby results from the two concentrations allow for the determination of the assay's stability and, therefore, sensitivity to DNA concentration; SNP, single nucleotide polymorphism; PCN, potato cyst nematodes.
Developed KASP markers are now routinely used in the CIP breeding programs for characterizing breeding material with respect to these important diseases affecting potato cultivation. Due to their discriminatory ability, SNP markers are ideal for developing low-cost QC KASP marker sets. Discrimination of breeding material is possible using a relatively small set of appropriately selected SNP markers. Our objective is to report on the development and use of KASP markers for LB and Ryadg resistance and QC analysis for CIP breeding programs.

Trait KASP Marker Development
The M6 marker allele sequences representing resistant and susceptible alleles, previously described in [17], were aligned for SNP identification to develop markers linked to the Ryadg gene. In total, nine SNPs or indels were selected for KASP marker design, and the amplification results were compared with a high-resolution melting (HRM) assay using probe M6P2 [17] on a quality evaluation set of 72 tetraploid potato genotypes from the CIP breeding program. The late blight markers were selected from previous association studies using genotyping by sequencing (GBS) [38] or SolCAP [39] markers, and the KASP results were compared with the original marker scores using sets of 73 and 78 tetraploid potato clones from the CIP breeding program, respectively. Validated SNP markers in the late blight resistance locus on chromosome 9 and in the Ryadg M6 resistance locus on chromosome 11 were converted into KASP markers by LGC, Biosearch Technologies (https://www.biosearchtech.com/, accessed on 10 September 2021) and analyzed by Intertek® (https://www.intertek.com/agriculture/agritech/, accessed on 21 July 2020).

Assay Verification and Routine Analysis
Assay verification was conducted using a set of 78 advanced tetraploid clones (S1) from the CIP breeding program with known resistance to PVY [40] and late blight [41]. The potato clones originated from the CIP breeding populations A (1 clone), B3 (7 clones), BW (3 clones), and LTVR (67 clones) (Supplementary Tables S1 and S2). A detailed description of the CIP potato breeding populations is available in [38]. Assay verification of the PVY markers, snpST0052 and snpST0073, was conducted by comparing the 78 available PVY resistance phenotypes, for which results of enzyme-linked immunosorbent assay (ELISA) tests had previously been recorded as part of the breeding program, with their corresponding genotypes. Clones were classified into five categories: extremely resistant, resistant, hypersensitive, susceptible, and highly susceptible. For the verification calculations, the first three phenotypic categories were treated as resistant to PVY and the last two categories were considered to be susceptible. There were three possible genotype categories for the markers snpST0052 and snpST0073: AA, AG, and GG, where A and G are the dominant favorable alleles linked to the resistance phenotype for snpST0052 and snpST0073, respectively.
Assay verification was similarly conducted for the LB markers snpST0020 and snpST0023 by comparing resistance phenotypes recorded in Peru with their respective genotypes. The LB phenotypic data were available for all S1 samples. Clones were classified into seven categories: extremely resistant, highly resistant, resistant, moderately resistant, moderately susceptible, susceptible, and highly susceptible. For the verification calculations, the first three categories were considered to be resistant to late blight and the remaining categories were considered to be susceptible. There were three possible genotype categories for each of the tested markers, i.e., snpST0020: CC, CA, and AA and snpST0023: TT, TG, and GG, where C and G are the favorable alleles linked to the resistance phenotype for snpST0020 and snpST0023, respectively.
The numbers of resistant and susceptible potato clones in S1 were counted in each marker category. The false positive rate (α), the false negative rate (β), and the assay power (s) were determined as follows: α = FP/(FP + TN), β = FN/(FN + TP), and s = 1 − β, where FP, FN, TN, and TP are the numbers of false positives, false negatives, true negatives, and true positives, respectively.
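As a worked illustration of the error-rate definitions above, the following Python sketch computes α, β, and s from a 2 × 2 table of marker call versus phenotype. The function name and the counts are hypothetical placeholders chosen for illustration; they are not values from Table 4.

```python
def assay_metrics(tp, fp, tn, fn):
    """Return (alpha, beta, power) for a marker scored against phenotypes.

    tp: resistant clones carrying the favorable allele
    fp: susceptible clones carrying the favorable allele
    tn: susceptible clones lacking the favorable allele
    fn: resistant clones lacking the favorable allele
    """
    alpha = fp / (fp + tn)  # false positive rate
    beta = fn / (fn + tp)   # false negative rate
    power = 1 - beta        # assay power, s = 1 - beta
    return alpha, beta, power


# Hypothetical counts for illustration only (not taken from Table 4).
alpha, beta, power = assay_metrics(tp=40, fp=2, tn=18, fn=10)
print(f"alpha={alpha:.2f} beta={beta:.2f} s={power:.2f}")
```

With these made-up counts, 2 of 20 susceptible clones carry the favorable allele and 10 of 50 resistant clones lack it, so the power is governed entirely by the resistant clones that the marker misses.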
Furthermore, samples of 59 clones from the late blight heat tolerant (LBHT) [42] and the LBHT × LTVR (LTVR = lowland tropics virus resistance) [2] breeding populations (S2) were tested using the validated KASP markers for resistance to late blight and PVY (Table 1). The genotypic data of the 59 clones were compared with LB phenotypic data previously recorded in field trials in Oxapampa (a high late blight pressure environment) between 2012 and 2018.

QC Marker Selection Pipeline
The 8303 SNP SolCAP Infinium array [43] was used to genotype 206 clones from two CIP breeding populations, i.e., B3 [44] and LTVR [2]. Marker selection from the 3285 markers with no missing data is summarized in Figure 2. Markers with a minor allele frequency (MAF) < 0.2 were removed separately from each population, and only the 1523 markers retained in both populations were considered for further steps. Low sequencing depth in next-generation sequencing methods can lead to an overestimation of homozygous calls and, therefore, to an underestimation of heterozygosity, in addition to genotyping errors and frequent missing data that, in turn, can lead to biases in subsequent population genetic analyses [3,45–47]. Therefore, for further analyses, the markers were reduced to their diploid form (AAAA = AA, BBBB = BB, heterozygous (AAAB, AABB, and ABBB) = AB) to reduce heterozygous dosage bias. Further MAF filtering was performed at the diploid level, and 19 markers with MAF < 0.2 were removed. Physical positions and the allele variants (DM v03) of the 1504 remaining markers were retrieved from the SolCAP database (http://solanaceae.plantbiology.msu.edu/, accessed on 23 January 2020). The linkage disequilibrium (LD) between markers was computed separately for each chromosome and each population using the R package "genetics" [48]. Finally, two markers on each chromosome that had the lowest median LD (either in both populations or in one population) and lay in different linkage blocks were selected for KASP SNP genotyping (Intertek®).
The QC marker assessment was conducted on a subset of the LBHT population (second cycle of recurrent selection). Samples from field trials were taken at two different stages of the breeding cycle. The preliminary trial consisted of about 2500 clones, selected from the greenhouse observational evaluation of the first clonal generation, and was evaluated in Oxapampa in 2019. The intermediate trials, with 500 clones selected from Oxapampa, were planted in 2020 at three locations, i.e., Huanuco, Huancayo, and Oxapampa, with two replications per location. A safeguard copy of all clones evaluated in the fields was maintained at the La Molina research station by the breeder; therefore, these genotypes are considered to be the reference for any breeding material identity verification.
The 114 clones sampled from the Huancayo intermediate trial (Hyo) were randomly selected from the 500 tested clones. Samples in Hyo (224 samples in total) were taken from the first replication (r1, one single sample of 96 different clones) and the second replication (r2, 77 of the 96 r1 clones). Additionally, three samples (a, b, and c) of 17 clones were taken in 17 random plots of the first replication in Hyo to assess potential within-plot variation. The 34 clones sampled from the Oxapampa preliminary trial were randomly selected from the 114 Hyo-sampled clones. As a reference for the field experiments, each of the 114 clones was also sampled from the La Molina safeguard in 2020 (Table 2). The samples were collected from young leaves and desiccated in silica gel before storage. In total, 376 samples were prepared together for assessment with the 21 QC KASP markers. The DNA extraction and marker analysis were performed by Intertek®.
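The diploid reduction and MAF filter described in the marker selection pipeline above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the marker names and genotype calls are hypothetical, and the MAF is computed here from allele counts in the collapsed diploid calls, which the text does not specify in detail.

```python
from collections import Counter


def to_diploid(call):
    """Collapse a tetraploid call such as 'AAAB' to 'AA', 'AB', or 'BB'."""
    alleles = set(call)
    if alleles == {"A"}:
        return "AA"
    if alleles == {"B"}:
        return "BB"
    return "AB"  # AAAB, AABB, and ABBB all become AB


def minor_allele_frequency(diploid_calls):
    """MAF of a biallelic marker from diploid calls (assumed allele-count based)."""
    counts = Counter("".join(diploid_calls))  # counts of 'A' and 'B' alleles
    total = counts["A"] + counts["B"]
    return min(counts["A"], counts["B"]) / total


# Hypothetical calls for two markers across six clones.
markers = {
    "m1": ["AAAA", "AAAB", "AABB", "BBBB", "ABBB", "AAAA"],
    "m2": ["AAAA", "AAAA", "AAAA", "AAAB", "AAAA", "AAAA"],
}
diploid = {name: [to_diploid(c) for c in calls] for name, calls in markers.items()}

# Keep markers with MAF >= 0.2, mirroring the threshold used in the text.
kept = [name for name, calls in diploid.items()
        if minor_allele_frequency(calls) >= 0.2]
print(kept)
```

In this toy example, the nearly monomorphic marker m2 falls below the 0.2 threshold and is dropped, while m1 is retained.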

QC Data Analysis
The snpST00174 marker was removed from the following analysis because it had a high number of missing genotypes (over 75%). Tetraploid allele dosage calling was performed from raw intensity data using the R [49] package fitTetra [50]. To evaluate the discriminatory effectiveness of the developed markers on the breeding material, dendrograms using diploid and tetraploid genotypes were generated from the 114 reference samples with neighbor joining for clustering [51]. Samples with more than 50% missing data (10 out of 20 markers) and markers with more than 30% missing data were excluded from the QC analysis. For each clone, the genetic distance between test samples and the greenhouse reference samples was calculated using the ape package [51]. Samples were considered to be different from the reference when their genetic distance from it exceeded an arbitrary threshold of 10% (~2 markers).
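The identity check described above can be illustrated with a short Python sketch. It is a simplified stand-in, not the ape-based computation used in the study: the distance here is assumed to be the fraction of markers, called in both samples, whose dosage differs, and all dosage profiles are hypothetical.

```python
def mismatch_fraction(test, reference):
    """Fraction of markers, called in both samples, whose dosage differs.

    Calls are allele dosages 0-4; None marks a missing call.
    """
    shared = [(t, r) for t, r in zip(test, reference)
              if t is not None and r is not None]
    if not shared:
        raise ValueError("no markers called in both samples")
    return sum(t != r for t, r in shared) / len(shared)


THRESHOLD = 0.10  # arbitrary 10% cutoff from the text (~2 of 20 markers)

# Hypothetical dosage calls at 20 QC markers for a greenhouse reference.
reference = [0, 1, 2, 3, 4] * 4

same_clone = list(reference)
same_clone[3] = None             # one missing call, otherwise identical

off_type = list(reference)
off_type[0] = off_type[5] = off_type[10] = 4  # three discordant markers

results = {}
for label, sample in [("same_clone", same_clone), ("off_type", off_type)]:
    d = mismatch_fraction(sample, reference)
    results[label] = d > THRESHOLD  # True means "different from reference"
    print(label, round(d, 3), "different" if d > THRESHOLD else "matches reference")
```

A sample with three discordant markers out of 20 exceeds the 10% cutoff and is flagged, whereas a single missing call does not affect the comparison.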

Trait Markers for Late Blight and PVY Resistance
Selected Markers and KASP Assay Verification
The two KASP markers for PVY resistance, i.e., snpST0052 and snpST0073, show 100% concordance with the M6P2 HRM assay (Table 3). Interestingly, the snpST0050 marker that is diagnostic for the same SNP as M6P2 shows relatively lower concordance. The genotypic analysis of 78 clones in S1 with the PVY resistance markers snpST0052 and snpST0073 classified 62 clones as resistant and 16 clones as susceptible to PVY (Supplementary Table S1). The very low type I error and relatively high assay power (s = 0.83), based on the count of clones of S1 with the three different marker genotypes and their PVY resistance phenotypes (Table 4), suggest that the PVY markers snpST0052 and snpST0073 are excellent markers for MAS in CIP potato breeding programs. An SNP marker for Ryadg was recently developed and found to perform more accurately than all previously used Ry markers in an Australian potato breeding germplasm [52]. A direct comparison with our assay is not straightforward, since the false positive rate, false negative rate, and assay power were not reported. Although the SNP markers from M6 were not reported in [52], they yielded a strong assay power when validated with our material. Since M45 is physically closer to the Ry locus than M6, it can be expected to explain more phenotypic variation. It would be worthwhile evaluating the performance of the KASP markers based on SNP3279 on the CIP germplasm to assess its informativeness and possible use in the breeding program.
Table 3. Quality evaluation of the single nucleotide polymorphism (SNP) markers converted into the kompetitive allele specific PCR (KASP) system. Markers selected for breeding are indicated with an asterisk. SNP ID is the identification in the Intertek potato KASP marker system.
Eight of the eleven tested markers for late blight resistance showed over 90% concordance between the original marker score and the KASP marker score (Table 3). The two best markers, snpST0020 and snpST0023, were selected as markers for routine genotyping.

The snpST0020 marker appears to be much better suited to the gene pool used here than snpST0023. The assay efficiency for snpST0023, based on the number of potato clones with the three different marker genotypes and their LB resistance phenotypes (Table 4), indicates the poor performance of this marker in the tested material. The original markers that were converted into KASP assays had been found to be significantly associated with LB resistance in different sets of breeding germplasm. The snpST0023 marker was originally identified in the B3 population [39], which was represented by only seven samples in the validation panel (Supplementary Table S2), while the snpST0020 marker was discovered using a larger diversity panel that included both B3 and LTVR populations [38].

Markers Tested on Two Different Breeding Populations
We found clones resistant to LB and PVY in both the LBHT and LBHT × LTVR S2 samples (Figure 3). In both populations, 23 clones had the late blight resistance genotype with snpST0020 and snpST0023. Six of the seven genotypes with only the snpST0023 resistance allele were from the LBHT × LTVR population. Likewise, 15 of 59 clones showed resistance with both Ryadg markers. The poor correspondence between LB marker assessment and the phenotypes of the S2 samples from the LBHT × LTVR population (Figure 4) may be due to recombination between the markers and the resistance gene or to bias when the phenotypic evaluation was conducted in the greenhouse.
These two LB markers have a physical distance of approximately 1 Mb in the potato DM_v6.1 reference genome, while the two Ryadg markers are separated by 106 bases along chromosome 11 (Table 1). The LB markers are tightly linked in the CIP B3 population, as they are always found together in the resistant genotypes (in S1 samples). In the LTVR population, however, there is recombination between the markers, and only snpST0020 is significantly associated with late blight resistance in this population. Screenhouse experiments are labor demanding and costly, and field evaluations depend heavily on environmental conditions at the testing sites, which are often erratic [53,54]. These molecular markers will allow precise and cost-effective characterization of breeding material for PVY and late blight as compared with phenotypic evaluations. However, the genetic distance between the markers used and the LB and PVY resistance genes determines the accuracy of marker-assisted selection. The suitability of LB markers for each breeding population must be assessed, particularly when there is recombination between the two resistance alleles.

Tetraploid Calls Enhance QC Marker Efficiency
The phylogenetic tree constructed using the diploid calls could not separate every pair of clones, whereas the tetraploid-based tree clearly differentiated all genotypes, including full-sibs and half-sibs (Figure 5). Although three markers could not be validated and another marker was removed due to missing data, including the tetraploid allele dosage information for each marker-genotype combination increased the number of distinguishable heterozygous classes. Three levels of heterozygosity could be obtained (simplex, duplex, and triplex), which allowed the separation of samples that share a single heterozygous class at the diploid level. Further, half-sibs and full-sibs did not cluster together (Figure 5), indicating the good discrimination ability of the markers despite their relatively low number. Therefore, such markers can be used for identity analysis and are expected to efficiently serve fingerprinting purposes. Although breeding at the polyploid level is complex [55,56], four alleles at each locus appear to be beneficial for QC analysis since few discriminatory markers are needed to efficiently separate the material. Fewer markers generate less data and should contribute to a relatively lower QC genotyping cost (marker design and routine use) and fewer datapoints per genotype (computation) as compared with diploid breeding material [57–60].
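The gain from dosage calls can be illustrated with a small Python sketch. The two dosage profiles are hypothetical and were chosen so that the clones differ only in their heterozygous dosage class; the count of possible profiles assumes biallelic markers with three diploid or five tetraploid genotype classes each.

```python
def collapse(dosage):
    """Map a tetraploid dosage (0-4) to a diploid-style class (0, 1, 2)."""
    if dosage == 0:
        return 0
    if dosage == 4:
        return 2
    return 1  # simplex, duplex, and triplex become one heterozygous class


# Hypothetical dosage profiles of two full-sibs at five markers.
sib_a = [1, 2, 0, 4, 3]
sib_b = [3, 1, 0, 4, 2]

diploid_a = [collapse(d) for d in sib_a]
diploid_b = [collapse(d) for d in sib_b]

print("distinct with dosage calls:", sib_a != sib_b)
print("distinct with diploid calls:", diploid_a != diploid_b)

# Upper bound on distinguishable profiles for a 20-marker set.
n_markers = 20
print("possible diploid profiles:", 3 ** n_markers)
print("possible tetraploid profiles:", 5 ** n_markers)
```

The two sibs are indistinguishable once their calls are collapsed to the diploid form, which mirrors why the diploid-based tree could not separate every pair of clones.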

Discrepancy in Breeding Material Genetic Identity Revealed by QC Markers
Considering the clones from the greenhouse as the reference genotypes, we found dissimilarities with clones from the preliminary and the intermediate trials as well as differences within the Huancayo intermediate trial (Figure 6). Among the 38 clones sampled in Oxapampa, 7 (18.4%) were different from the reference as compared with 25 of 224 samples (11.2%) in Huancayo. Within the Huancayo trial, seven sampled clones from the first replication were different from their respective greenhouse reference sample. There was intra-plot variation in 2 of the 17 plots where three samples were taken. Additionally, we found 2 dissimilarities among the 77 clones sampled in both replications of the Huancayo intermediate trial.
Mislabeled genotypes seem to occur more frequently in breeding stages that involve a large number of clones. Although mislabeled genotypes are common in vegetatively propagated crops and QC screening of all evaluated clones would increase operational costs, testing a subsample of the breeding material, if not all of it, at each stage of the breeding program and at the seed multiplication sites may be worthwhile for maximizing breeding outputs. The higher rate of mislabeled genotypes in the preliminary trial, as compared with the intermediate trial, could have arisen at any step, including seed multiplication, field operations, material sampling, and QC evaluation in the lab. Finding the problematic step or steps is essential for consistent selection and evaluation of breeding material throughout breeding cycles. Accurate clonal identity has important implications for breeding progress since mislabeled clones can significantly affect the expected gains from breeding. Thus, the development of this QC marker set and its proper implementation in routine breeding programs is crucial and would be an effective strategy for reducing mislabeling and achieving targeted genetic gains. The trade-off between the number of tested genotypes and the number of markers should be defined for each breeding program, taking into consideration genotyping costs and the available implementation budget.

Routine Use of KASP Markers
SNP markers for late blight and PVY resistance, two important traits in potato, were successfully converted into KASP markers, which are amenable to application in a high throughput system. The dosage level for each marker can be computed, and selection of susceptible or simplex parents can be avoided to generate progenies with resistance to LB or PVY. More sources of resistance and markers for PVY and LB should be identified and introgressed into the breeding material, with the purpose of stacking several resistance genes with molecular-assisted selection. Identified resistance SNPs can be further incorporated in any targeted sequencing platform for more accurate genomic prediction.
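To illustrate why susceptible or simplex parents are less desirable, the Python sketch below computes the expected fraction of progeny that carry at least one copy of a favorable allele for given parental dosages. It is a simplified model with labeled assumptions that are not taken from the study: tetrasomic inheritance with random chromosome segregation, no double reduction, and a marker allele in complete linkage with a dominant resistance gene. The function names are hypothetical.

```python
from math import comb


def gamete_probs(dosage):
    """P(gamete carries k favorable alleles), k = 0, 1, 2, for a tetraploid parent.

    Assumes random chromosome segregation without double reduction.
    """
    return [comb(dosage, k) * comb(4 - dosage, 2 - k) / comb(4, 2)
            for k in range(3)]


def fraction_with_allele(dosage_p1, dosage_p2):
    """Expected fraction of progeny carrying at least one favorable allele."""
    p_null = gamete_probs(dosage_p1)[0] * gamete_probs(dosage_p2)[0]
    return 1 - p_null


# Crosses of a resistant parent (simplex, duplex, triplex) with a nulliplex parent.
for dosage in (1, 2, 3):
    frac = fraction_with_allele(dosage, 0)
    print(f"dosage {dosage} x nulliplex: {frac:.3f}")
```

Under these assumptions, a simplex parent crossed with a nulliplex parent transmits the allele to half of the progeny, a duplex parent to five sixths, and a triplex parent to all of them, which is the rationale for preferring higher-dosage parents.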
The set of KASP markers tested and validated for quality control in the LBHT population may be suitable for the LTVR population. There is a need to define a QC analysis pipeline for routine use throughout the breeding cycle. Thousands of clones are tested in the early generations of a breeding program, and QC analysis may be a very costly activity. QC analysis on only a subset of a whole population will not reveal information about the untested clones. A mislabeled untested genotype can still be selected and, therefore, may show poor performance when the true tuber seeds (from multiplication plots) are used for subsequent field experiments. Although testing a subset of evaluated genotypes would be more efficient than not testing at all, for routine application, QC markers should be defined to include as many genotypes as possible. Systematic QC assessment of parental plants in crossing blocks, complemented by precocious crosses, is envisaged, though further analyses that test (and confirm) the abilities of the QC markers to assess parental purity in hybrids are needed.
Routine QC genotyping at all stages could be made possible by increasing genotyping budgets or reducing genotyping costs. Including QC markers in a targeted sequencing platform or selecting QC markers from such a platform and converting them into KASP markers could also reduce the long-term QC genotyping costs. When genotyping for genomic prediction in the early stages, the reference QC marker data would be generated from the targeted sequencing work, and KASP markers could be implemented in the QC test in the next stages. There must be maximal correspondence between these two marker-sets to allow a fair comparison. In any case, the KASP QC analysis could be a key molecular tool if applied at all stages of the breeding process that could contribute to more accurate selection of breeding products and, thus, increased genetic gains.
In the CIP potato breeding program, KASP marker assessment is outsourced and, thus, only minimal work is required for sample preparation. The turnaround time between sending the samples to Intertek® and receiving the genotyping results is approximately 14 days, which is more than sufficient to allow a decision based on the marker genotype before the next planting or crossing season. Therefore, the developed markers present great potential, and their implementation will be beneficial to breeding programs.

Conclusions
The KASP markers for Ryadg and late blight that were developed using different CIP breeding materials are valuable tools for a time-effective characterization of breeding material with a reduction in operational costs. Although further inclusion of identified loci into a targeted genotyping platform may increase prediction accuracy in a genomic selection program, testing more markers located in the putative gene regions on a specific germplasm is worthwhile. The selection accuracy can be increased with routine use of developed QC markers in the identity verification of evaluated clones or parental lines in crossing blocks. The trait and QC marker assessment can be directly implemented in breeding programs using the same germplasm base and seed multiplication fields as that of the CIP-developed varieties; for application in any other germplasm, the efficiency of developed markers should first be assessed.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/agronomy11112315/s1. Table S1: Quality evaluation of the PVY KASP markers compared with the original HRM assay. Table S2: Quality evaluation of the LB KASP markers compared with the original GBS or SolCAP assay.
Funding: This research was funded by the CGIAR Research Program on Roots, Tubers, and Bananas (RTB), and USAID. The authors are thankful to the "Shared Industrial-Scale Low-Density SNP Genotyping for CGIAR and Partner Breeding Programs Serving SSA and SA" (High Throughput Genotyping-HTPG) BMGF funded project (OPP1130244) for genotyping support.