GoPrime: Development of an In Silico Framework to Predict the Performance of Real-Time PCR Primers and Probes Using Foot-and-Mouth Disease Virus as a Model.

Real-time PCR (rPCR) is a widely accepted diagnostic tool for the detection and quantification of nucleic acid targets. In order for these assays to achieve high sensitivity and specificity, primer and probe-template complementarity is essential; however, mismatches are often unavoidable and can result in false-negative results and errors in quantifying target sequences. Primer and probe sequences therefore require continual evaluation to ensure they remain fit for purpose. This paper describes the development of a linear model and associated computational tool (GoPrime) designed to predict the performance of rPCR primers and probes across multiple sequence data. Empirical data were generated using DNA oligonucleotides (n = 90) that systematically introduced variation in the primer and probe target regions of a diagnostic assay routinely used to detect foot-and-mouth disease virus (FMDV); an animal virus that exhibits a high degree of sequence variability. These assays revealed consistent impacts of patterns of substitutions in primer and probe-sites on rPCR cycle threshold (CT) and limit of detection (LOD). These data were used to populate GoPrime, which was subsequently used to predict rPCR results for DNA templates (n = 7) representing the natural sequence variability within FMDV. GoPrime was also applicable to other areas of the FMDV genome, with predictions for the likely targets of a FMDV-typing assay consistent with published experimental data. Although further work is required to improve these tools, including assessing the impact of primer-template mismatches in the reverse transcription step and the broader impact of mismatches for other assays, these data support the use of mathematical models for rapidly predicting the performance of rPCR primers and probes in silico.


Introduction
Real-time PCR (rPCR) has become an essential tool in molecular biology and is routinely used for detection, quantification, and differentiation of nucleic acids in both research and diagnostic settings [1][2][3]. Central to the specificity and sensitivity of rPCR assays are the primers and probes, with amplification affected by factors such as primer and probe-template complementarity and the presence of secondary structures (e.g., primer dimers) [4]. However, designing primers and probes with full sequence complementarity to all the required targets can be problematic. For instance, when considering RNA viruses such as foot-and-mouth disease virus (FMDV), the high mutation rate (in the range of 10 −3 to 10 −5 per nucleotide site, per genome replication [5,6]) can result in fully conserved regions being too short to accommodate primer and probe sets. This is especially true when designing assays to target the more varied genomic regions for serotype/strain differentiation [7][8][9][10]. Consequently, primer and probe-template mismatches are often unavoidable and a compromise approach that accommodates sequence mismatches is often adopted to design diagnostic tests.
Primer and probe-template mismatches can be especially problematic when considering the use of rPCR for diagnostic purposes. By impacting rPCR amplification, mismatches can alter the cycle threshold (C T ) at which targets are detected, leading to errors in nucleic acid quantification. For instance, a single internally located mismatch can result in up to a 1000-fold underestimation of initial copy number [18]. Notably, mismatches at the 3 -end of primers have been shown to produce effects ranging from a two-fold underestimation of initial copy number to complete prevention of amplification, thus leading to false-negative results [22]. As such, false-negative rPCR results may be common, especially in the instance of assays detecting lower viral loads (for FMDV, this is common in oesophageal-pharyngeal fluid and environmental samples).
These effects of mismatches result in the requirement for continual evaluation of primer and probe sequences. In addition to laborious manual laboratory-based screening, primer and probe validation traditionally occurs though Basic Local Alignment Search Tool (BLAST) searches against publicly available sequences [23], with tools now developed to automate the process [24][25][26]. However, despite numerous studies into the effects of mismatches, no primer evaluation programs to date have been developed using experimental data, with target sequences only reported as putative hits or misses. With rPCR assays requiring different performance criteria depending upon their use, the provision of binary predictions is limited. For example, high specificity is paramount for assays used to differentiate between diseases, high sensitivity is required for assays used to confirm negative results and an awareness of cross-reactivity is important for assays that distinguish between closely related sequences, such as FMDV serotypes and viral lineages [9]. As such, the availability of a quantitative primer/probe validation program could support rPCR evaluation by giving researchers and diagnosticians the ability to rapidly predict whether assays are fit for purpose. This paper describes the impacts of different primer and probe-template mismatches on C T and limit of detection (LOD) on rPCR, and the presentation of a primer and probe evaluation framework (GoPrime), in order to ascertain to what extent the effect of mismatches on template cDNA can be predicted. Further analysis is required to assess the effects of primer-template mismatches during the reverse transcription step.

The Effects of Primer and Probe-Template Mismatches
For all experimental analyses, primer sequences were kept the same and the template sequences were varied. The primer and probe sequences, published by Callahan et al. (2002), were as follows: forward primer: 5 -ACT GGG TTT TAC AAA CCT GTG A-3 (Tm: 56.5 • C); reverse primer, 5 -GCG AGT CCT GCC ACG GA-3 (Tm: 60 • C); and probe 5 -(6FAM)TCC TTT GCA CGC CGT GGG AC(TAMRA)-3 [27] (Tm calculated at www.eurofinsgenomics.eu/en/ecom/tools/oligo-analysis/). Linear DNA oligonucleotide templates of 109 bp (Sigma-Aldrich, Munich, Germany) were designed around the cDNA target region for a published assay [27]: a conserved region of the FMDV genome (3D pol -coding region). Ninety templates were ordered, each designed to evaluate the consequences of different variations in the primer or probe binding regions (Table 1). For example, variations across the length of the primer and probe target regions were designed to investigate the effect of position, with different bases substituted to study the effects of mismatch type and mismatch quantity. Sequences were based on FMDV O/UKG/35/2001 (accession number KR265074, nucleotides 7862-7970). In addition, a template with full primer/probe-template complementarity was ordered and used as the reference template (R) for all rPCR runs. Sequences were based on FMDV O/UKG/35/2001 (accession number KR265074, nucleotides 7862-7970). In addition, a template with full primer/probe-template complementarity was ordered and used as the reference template (R) for all rPCR runs. Table 1. Linear DNA oligonucleotide templates for real-time PCR targets (5ʹ-3ʹ).

Development of GoPrime
In order to ascertain the effect of each primer and probe-template mismatch type on ΔCT and ΔLOD, linear model analysis was performed. Linear model variables ( Table 2) were selected based on statistical analysis (to ascertain which primer and probe-template mismatch locations were Pathogens 2020, 9, 303 5 of 15

Real-Time PCR
rPCR reactions were performed using two Taq-based rPCR kits.: (1) Excite TM UF 2x Master Mix (Excite TM UF) (Quantig Ltd., Camberley, UK), a Taq-based rPCR kit, was selected as it required minimal reaction set-up, increasing the likelihood of assay variation being attributed to target sequence differences rather than human variability. Reactions were performed in a total of 20 µL, containing: 5 µL template, 10 µL 2x master mix, 50 nM ROX reference dye, 1.6 µL of each primer (16 pmol), 1.2 µL of probe (6 pmol) (primers and probes final concentrations as previously described [28]) and made up to volume with nuclease-free water (NFW). Thermal cycling conditions were 95 • C for 3 min, followed by 50 cycles of 95 • C for 5 s and 60 • C for 20 s. (2) SuperScript™ III Platinum™ One-Step qRT-PCR Kit (SSIII TM ) (Thermo Fisher Scientific, Waltham, MA, USA) was chosen as it is a commonly used Taq-based kit. Reagents, parameters, primer/probe final concentrations and thermal cycling conditions were as previously reported [28]. Reactions were performed in a total of 25 µL, containing: 5 µL template, 12.5 µL 2x buffer, 0.5 µL of Superscript III enzyme mix (both supplied with the kit), 50 nM ROX reference dye, 2 µL of each primer (20 pmol), 1.5 µL of probe (7.5 pmol) [27] and made up to volume with NFW. Thermal cycling conditions were 95 • C for 10 min, followed by 50 cycles of 95 • C for 15 s and 60 • C for 1 min. The reverse transcription (RT) step was omitted from the published protocol [28].
All rPCR reactions were performed on an ABI ViiA TM 7 real-time PCR system thermocycler (Thermo Fisher Scientific). Positive reactions were defined as those that gave a detectable C T (no C T cut-off set). Initial rPCR reactions were performed in duplicate using 10 6 copies of template. Where C T values were detected, further rPCR reactions were performed in duplicate across a log 10 dilution series of template (10 6 -10 0 copies/reaction) in 0.1 µg/µL carrier RNA (Ambion®, Austin, TX, USA, Thermo Fisher Scientific, Waltham, MA, USA). The effect of mismatches on rPCR were determined by calculating the change in C T (∆C T ) and change in LOD (∆LOD) between the reference and varying oligonucleotide DNA templates.

Development of GoPrime
In order to ascertain the effect of each primer and probe-template mismatch type on ∆C T and ∆LOD, linear model analysis was performed. Linear model variables (Table 2) were selected based on statistical analysis (to ascertain which primer and probe-template mismatch locations were statistically different from one another) and published data (Supplementary Data Table S1). Table 2. Variables included in the linear model analysis.

Probe
Percentage mismatch Mismatches were grouped as one of two types: (type 1) purine-pyrimidine mismatch (G-T or C-A nucleotide base pairing, leading to a minor conformational change in the primer/probe-template duplex); (type 2) purine-purine or pyrimidine-pyrimidine mismatch (G-A, A-A, G-G, C-T, T-T or C-C nucleotide base pairing, leading to a major conformational change in the primer/probe-template duplex).
Linear model analysis [29] was performed in R [30], using the variables stated in Table 2, and all quantitative data collected (90 templates, all template dilutions, using both Taq-based kits, to analyze the average effects of mismatches [one linear model looked primer-template mismatches; a second linear model was used to look at probe-template mismatches]). The results of the linear models were used to parameterize GoPrime: a computational tool for predicting the effects of mismatches on rPCR. GoPrime was built by implementing the primer and probe mismatch rules and C T penalties in a computer program written in the Java programming language. GoPrime takes as input fasta sequence files of the primers/probe and target sequences. It calculates an expected change in C T for each target sequence set based on the number and type of mismatches between the primer/probe and template present. The GoPrime program is freely available from: https://github.com/rjorton/GoPrime.

Evaluating GoPrime as a Predictor of rPCR Performance
In order to evaluate GoPrime on naturally occurring sequence variations, a search was performed using the National Centre for Biotechnology Information (NCBI) nucleotide database (NCBI, 2017). Seven FMDV sequences, which contained naturally occurring variations in the Callahan et al. (2002) primer and probe target regions [27], were selected and ordered as linear DNA oligonucleotides of 109 bp (Table 3). These were used as template in rPCR, using both the Excite TM UF and SSIII TM protocols, with ∆C T and ∆LOD results compared with predictions from GoPrime.   Target  R  ACTGGGTTTTACAAACCTGTGA  TCCTTTGCACGCCGTGGGAC  TCCGTGGCAGGACTCGC  JX040500  ACTGGGTTTTACAAACCTATGA  TCCTTTGCACGCCGTGGGAC  TCCGTGGCAGGACTCGC  KC440884  ACTGGATTTTATAAACCTGTGA  TCCTTTGCACGCCGTGGGAC  TCCGTGGCAGGACTCGC  AY593802  ACTGGGTTTTACAAACCTGTGA  TCCTTCGCACGCCGTGGGAC  TCTGTGGCAGGGCTCGC  KC440883  ACTGGGTTTTACAAACCTGTGA  TCCTTTGCACGCCGTGGGAC  TCTGTGGCGGGACTCGC  AY593812  ACTGGGTTTTACAAACCTGTGA  TCCTTTGCACGCCGTGGGAC  TCAGTGGCAGGACTCGC  KF112882  ACTGGGTTTTACAAACCTGTGA  CCCTTTGCACGCCGTGGGAC  TCCGTGGCAGGACTCGC  HM191257 ACTGGGTTTTACAAACCTGTGA TCCTTCGCACGCCGTGGGAC TCTGTGGCAGGACTCGC The primer/probe binding regions of the seven DNA oligonucleotides ordered to test the program (109 base pairs in length, with regions between primers consistent with the sequences for each accession number. The black sequence (top row) represents the reference template (R); grey sequences represent the varying DNA templates, with black highlighted bases depicting primer/probe-template mismatch sites. In order to ascertain how transferable GoPrime was to other areas of the FMDV genome, GoPrime was used to analyze four FMDV-typing assays designed to target the less conserved VP1/2A-coding regions [9]. The four sets of primers and probes (specific for either serotype A, O, Southern African Territories [SAT] 1 or SAT 2 field viruses circulating in East Africa) were evaluated against the 66 VP1/2A-coding sequences used in the initial laboratory-based evaluation (A = 15; O = 20; SAT 1 = 19; SAT 2 = 12). The published experimental results [9] and GoPrime predictions for the likely targets of each assay were then compared and displayed using GoPrimeTree. GoPrimeTree is a simple Java program that takes an existing nexus tree of the target sequences, and adds FigTree [31] readable coloring to the tree tips based on the GoPrime predicted ΔCT values; different shades of the colors green, orange, and red were used to represent target sequences with ΔCT's between 0-10, 10-20, and 20-30, respectively, whilst black was used for sequences that failed to amplify.

Statistical and Phylogenetic Analysis
Statistical analyses were performed using R [30]. Phylogenetic trees for the visualization of ΔCT GoPrime results across FMDV serotypes were produced from sequence alignments in Mega (version 7.0.21) [32] using the neighbor-joining method [33] and viewed in FigTree (version 1.4.3) [31].

The Effects of Primer/Probe-Template Mismatches on rPCR
For this assay, single mismatches between the template and the primer, at the 3ʹ-end of the and GoPrime predictions for the likely targets of each assay were then compared and displayed using GoPrimeTree. GoPrimeTree is a simple Java program that takes an existing nexus tree of the target sequences, and adds FigTree [31] readable coloring to the tree tips based on the GoPrime predicted ∆C T values; different shades of the colors green, orange, and red were used to represent target sequences with ∆C T 's between 0-10, 10-20, and 20-30, respectively, whilst black was used for sequences that failed to amplify.

Statistical and Phylogenetic Analysis
Statistical analyses were performed using R [30]. Phylogenetic trees for the visualization of ∆C T GoPrime results across FMDV serotypes were produced from sequence alignments in Mega (version 7.0.21) [32] using the neighbor-joining method [33] and viewed in FigTree (version 1.4.3) [31].

The Effects of Primer/Probe-Template Mismatches on rPCR
For this assay, single mismatches between the template and the primer, at the 3 -end of the primer (using both rPCR kits) had the most detrimental effect on C T (average ∆C T of between 6.38 and 11.57 across the dilution series) (Figure 1). Furthermore, the type of mismatch was shown to be important. For instance, at the 3 -end of the reverse primer, a C-A mismatch (type 1 mismatch: purine-pyrimidine) resulted in an average effect of ∆C T of 8.59 across the dilution series, whereas a G-A mismatch (type 2 mismatch: purine-purine or pyrimidine-pyrimidine) in the same location resulted in an average effect of ∆C T of 11.57 across the dilution series ( Figure 1A). Multiple mismatches in the 3 -end of the primers (either (i) both in one primer or (ii) one the forward and one in the reverse primer) either prevented amplification from occurring or required high template copy number (10 4 -10 6 copies per reaction) for any amplification to occur, depending on the mismatches ( Figure 1B,C).  When studying the effect of primer-template percentage complementarity across the total length of the primers, a minimum of 82.05% primer-template match between the forward and reverse primers (combined) was required for amplification to occur ( Figure 1D). When studying the effect of the mismatches across the length of the probe, a minimum of 85% probe-template complementarity was required in order for effective detection to occur, impacting upon rPCR ∆C T by 6.58 on average across the dilution series ( Figure 2). Step qRT-PCR Kit) and a dilution series of template (10 6 -10 0 copies/reaction). The limit of detection (LOD) for each template is defined as the lowest dilution where all replicates were positive (displayed in grey text). Error bars represent the standard deviation. (P) probe; percentages represent probe-template complementarity. Template number refers to Table 1.

Development of GoPrime
Using linear model analysis, the average effect of each primer and probe-template mismatch type was determined, accounting for both single and multiple mismatches in the primer and probe binding regions, by implementing additive and dampening effects where necessary (Table 4). These results from the linear model, the minimum match percentages (82.05% primers combined, 85% probe), combined maximum of 2 mutations at the 3ʹ end of primers, and ΔCT penalties for mutation types/combinations were used to parameterize the GoPrime program.
To use GoPrime, users provide a file of their primer/probe sequences (5ʹ-3ʹ fasta format) and a file of template sequences to be analyzed (5ʹ-3ʹ fasta format). GoPrime first analyzes the template, in both orientations, for possible forward and reverse primer targets ( Figure 3); at this stage, forward and reverse primers are analyzed individually to generate potential targets to investigate further as possible pairs. This is done based on the minimum requirements for primer-template complementarity, which were defined during data analysis (Table 4). Once possible forward and reverse primer targets are identified, they are evaluated as possible primer pairs (Figure 3). Potential probe targets (optional) are then identified between the primer pairs, searching again in both orientations based on the probe-template mismatch limits determined during data analysis (Table 4). The effects of probe-template mismatches on real-time PCR (rPCR) cycle threshold. Results represent the average increase in cycle threshold (C T ) from a perfectly matched template, across two rPCR kits (Excite TM UF 2x Master and SuperScript™ III Platinum™ One-Step qRT-PCR Kit) and a dilution series of template (10 6 -10 0 copies/reaction). The limit of detection (LOD) for each template is defined as the lowest dilution where all replicates were positive (displayed in grey text). Error bars represent the standard deviation. (P) probe; percentages represent probe-template complementarity. Template number refers to Table 1.

Development of GoPrime
Using linear model analysis, the average effect of each primer and probe-template mismatch type was determined, accounting for both single and multiple mismatches in the primer and probe binding regions, by implementing additive and dampening effects where necessary (Table 4). These results from the linear model, the minimum match percentages (82.05% primers combined, 85% probe), combined maximum of 2 mutations at the 3 end of primers, and ∆C T penalties for mutation types/combinations were used to parameterize the GoPrime program.
To use GoPrime, users provide a file of their primer/probe sequences (5 -3 fasta format) and a file of template sequences to be analyzed (5 -3 fasta format). GoPrime first analyzes the template, in both orientations, for possible forward and reverse primer targets ( Figure 3); at this stage, forward and reverse primers are analyzed individually to generate potential targets to investigate further as possible pairs. This is done based on the minimum requirements for primer-template complementarity, which were defined during data analysis (Table 4). Once possible forward and reverse primer targets are identified, they are evaluated as possible primer pairs (Figure 3). Potential probe targets (optional) are then identified between the primer pairs, searching again in both orientations based on the probe-template mismatch limits determined during data analysis (Table 4).  ). Mismatches were grouped as one of two types: (type 1) purine-pyrimidine mismatch (G-T; C-A: minor conformational change in the primer/probe-template duplex); (type 2) purine-purine or pyrimidine-pyrimidine mismatch (G-A; A-A; G-G; C-T; T-T; C-C: major conformational change in the primer/probe-template duplex). One linear model looked primer-template mismatches; a second linear model was used to look at probe-template mismatches. *If (for example) a type nt 1 mismatch was present, the percentage mismatch ΔCT would be calculated and an additional nt 1 mismatch ΔCT penalty added. ΔCT, SE, and t value given to 2 decimal places. **Insufficient oligos to calculate with accuracy, therefore GoPrime calculates this based on ΔCT of nt 3-4 mismatch (type 1) × 2.
Once a potentially suitable primer and probe set has been identified, GoPrime uses the parameters determined in the linear model to predict whether amplification is likely to occur and the effect of any mismatches present on ΔCT and ΔLOD, implementing additive/dampening effects of multiple mismatches if applicable. GoPrime provides the outputs as two separate text files. Firstly, a Figure 3. GoPrime flow diagram. GoPrime examines sets of primer sequences (optionally including a probe sequence), searches the target genome sequences for potential matches, then predicts the effect of any primers/probe-template mismatches present on real-time PCR cycle threshold and limit of detection.   +0.43 [2dp]). Mismatches were grouped as one of two types: (type 1) purine-pyrimidine mismatch (G-T; C-A: minor conformational change in the primer/probe-template duplex); (type 2) purine-purine or pyrimidine-pyrimidine mismatch (G-A; A-A; G-G; C-T; T-T; C-C: major conformational change in the primer/probe-template duplex). One linear model looked primer-template mismatches; a second linear model was used to look at probe-template mismatches. * If (for example) a type nt 1 mismatch was present, the percentage mismatch ∆C T would be calculated and an additional nt 1 mismatch ∆C T penalty added. ∆C T , SE, and t value given to 2 decimal places. ** Insufficient oligos to calculate with accuracy, therefore GoPrime calculates this based on ∆C T of nt 3-4 mismatch (type 1) × 2.
Once a potentially suitable primer and probe set has been identified, GoPrime uses the parameters determined in the linear model to predict whether amplification is likely to occur and the effect of any mismatches present on ∆C T and ∆LOD, implementing additive/dampening effects of multiple mismatches if applicable. GoPrime provides the outputs as two separate text files. Firstly, a simple analysis, which provides each sequence name against the predicted ∆C T and ∆LOD, number of mismatches present and the likely amplicon length. The second analysis provides more detail, including the position and orientation of each likely primer/probe target and number and the type of any mismatches present (Figure 3, Supplementary Data Figure S2).
If users have a corresponding phylogenetic tree of their target sequences, they can visualize the results by using the GoPrimeTree program which will color the tip labels of the tree based on the GoPrime ∆C T results. In order to use this, users provide the text file output of GoPrime in addition to a phylogenetic tree (nexus format). GoPrimeTree color codes the sequences according to the predicted ∆C T , via annotation of the nexus tree file, so that the predicted targets and effect of the primer/probe-template mismatches can be easily visualized across multiple target sequences in FigTree [31].

Evaluating GoPrime as a Predictor of rPCR Performance
Evaluation of GoPrime using the seven DNA oligonucleotide templates containing sequence variations, observed in naturally occurring FMDV isolates in the Callahan et al. (2002) target region [27] ( Table 3), showed that GoPrime on average predicted the ∆C T of reactions 1.49 (SD 1.20; range 0.01-6.37) away from the observed result, when looking at all data points across the dilution series ( Figure 4). In addition, GoPrime on average predicted the ∆LOD of reactions 0.63 (range 0.30-1.52) away from the observed result ( Figure 4, Supplementary Data Table S3).
Pathogens 2020, 9, x FOR PEER REVIEW 10 of 15 simple analysis, which provides each sequence name against the predicted ΔCT and ΔLOD, number of mismatches present and the likely amplicon length. The second analysis provides more detail, including the position and orientation of each likely primer/probe target and number and the type of any mismatches present (Figure 3, Supplementary Data Figure S2). If users have a corresponding phylogenetic tree of their target sequences, they can visualize the results by using the GoPrimeTree program which will color the tip labels of the tree based on the GoPrime ΔCT results. In order to use this, users provide the text file output of GoPrime in addition to a phylogenetic tree (nexus format). GoPrimeTree color codes the sequences according to the predicted ΔCT, via annotation of the nexus tree file, so that the predicted targets and effect of the primer/probe-template mismatches can be easily visualized across multiple target sequences in FigTree [31].

Evaluating GoPrime as a Predictor of rPCR Performance
Evaluation of GoPrime using the seven DNA oligonucleotide templates containing sequence variations, observed in naturally occurring FMDV isolates in the Callahan et al. (2002) target region [27] (Table 3), showed that GoPrime on average predicted the ΔCT of reactions 1.49 (SD 1.20; range 0.01-6.37) away from the observed result, when looking at all data points across the dilution series ( Figure 4). In addition, GoPrime on average predicted the ΔLOD of reactions 0.63 (range 0.30-1.52) away from the observed result ( Figure 4, Supplementary Data Table S3).  Although GoPrime can only be currently used to evaluate rPCR and two-step rRT-PCR assays (in which RT and PCR stages are separate and mismatches persist through to cDNA), GoPrime was able to identify the likely targets for four FMDV-typing assays (East Africa specific for serotypes A, O, SAT 1, and SAT 2). The targets identified were consistent with previously published results, with each set of serotype-specific primers identifying the template sequences within their target serotype. However, cross-reactivity between serotypes was not predicted by GoPrime, which was evident in the published results [9] ( Figure 5, Supplementary Data Figure S4).
Pathogens 2020, 9, x FOR PEER REVIEW 11 of 15 Excite TM UF 2x Master Mix; (D) observed change in limit of detection for SuperScript™ III Platinum™ One-Step qRT-PCR Kit. For the observed results, points represent the average change in cycle threshold or limit of detection across all dilutions (10 6 -10 0 ) of the starting template.
Although GoPrime can only be currently used to evaluate rPCR and two-step rRT-PCR assays (in which RT and PCR stages are separate and mismatches persist through to cDNA), GoPrime was able to identify the likely targets for four FMDV-typing assays (East Africa specific for serotypes A, O, SAT 1, and SAT 2). The targets identified were consistent with previously published results, with each set of serotype-specific primers identifying the template sequences within their target serotype. However, cross-reactivity between serotypes was not predicted by GoPrime, which was evident in the published results [9] ( Figure 5, Supplementary Data Figure S4).

Discussion
Primer and/or probe-template mismatches are often unavoidable in rPCR, leading to the requirement for continual monitoring of oligonucleotides used in assays against available sequence data, to ensure that assays remain fit for purpose. Consequently, the ability to quantitatively evaluate the performance of rPCR primers and probes in silico could aid researchers and diagnosticians by rapidly predicting the effects of mismatches present on rPCR amplification, which is not possible using current methods.
The GoPrime framework is currently limited to rPCR (DNA template). For GoPrime to predict the effect of mismatches on one-step rRT-PCR, where gene-specific primers are used in both the RT and rPCR stages, further work is required to analyze the effect of mismatches between RNA templates and primers during the reverse transcription step. Although GoPrime should be applicable to two-step rRT-PCR, where the use of Oligo(dT) or random hexamers for the RT stage results in primer and probe-template mismatches persisting from RNA though to cDNA, further analysis is required to confirm this. Furthermore, our experimental design is currently limited to a single real-time PCR machine and two PCR kits. With inherent differences between PCR machines and due to different rRT-PCR kits having previously been shown to differ in their tolerance to mismatches [22], it would be beneficial to investigate how the general mismatch rules reported are affected by factors such as equipment performance and assay format.
Empirical data generated in this study were consistent with previous publications: mismatches in the 3 -end of primers had a more detrimental effect on rPCR amplification than those located towards the 5 -end, due to disruption of the DNA polymerase active site [11,16,18,20,22]. The effect of single mismatches within the 3 -end region displayed a consistent pattern, related to both nucleotide position and mismatch type. The effect of probe-template mismatches also displayed a consistent pattern, however, further testing and analysis on the positional effects of probe-template mismatches is required in order for these positional effects to be accurately included within GoPrime, in addition to probe-template percentage complementarity. For example, mismatches in the center of the probe have been previously reported to destabilize probe-template annealing [20]. At present, GoPrime splits mismatches into two types: (type 1) purine-pyrimidine mismatch (G-T or C-A nucleotide base pairing, leading to a minor conformational change in the primer/probe-template duplex) and (type 2) purine-purine or pyrimidine-pyrimidine mismatch (G-A, A-A, G-G, C-T, T-T or C-C nucleotide base pairing, leading to a major conformational change in the primer/probe-template duplex). Furthermore, the current model of GoPrime looks at the average effect of primer/probe-template mismatches across all template copy numbers. Future frameworks, to improve accuracy, could consider categorizing mismatches further and could include the effect of template copy number and PCR efficiency.
With rPCR assays requiring different performance criteria depending upon their use, the ability of GoPrime to quantitatively predict the effect of primer/probe-template mismatches on both C T and LOD could help diagnosticians accurately assess whether rPCR assays are fit for their intended use. For instance, predicting whether assays that aim to differentiate between serotypes are specific to that serotype. Furthermore, some rPCR assays might give positive results in spite of mismatches when high viral loads are present (such as in acute stages of disease), but generate a false-negative in the presence of lower viral loads (such as in oesophageal-pharyngeal fluid or environmental samples for FMDV). Although the experimental data gained in this study was specific to the rPCR conditions and primer/probe sequences evaluated, GoPrime predicted the likely positive targets of four FMDV-typing assays, which target alternative regions of the FMDV genome. However, as cross-reactivity between serotypes was not predicted, the next stage for analysis would be to include ∆C T data from other assays and to test the program across other organisms, to analyze GoPrime's versatility and broaden the genomic context of the analysis. Furthermore, the analysis for GoPrime is currently based on data from synthesized DNA oligonucleotides diluted in carrier RNA. Further work is required to ensure that the general mismatch rules are consistent across clinical samples, which have more complex matrices including background nucleic acid and mixed DNA populations.
In conclusion, this paper describes the development of GoPrime: a freely available primer evaluation program which predicts the likely performance of primer/probe sets across multiple sequence data. Experimental data suggested that mismatch impacts follow a consistent pattern, enabling GoPrime to be parametrized from experimental observations. Within this study, GoPrime was only validated against primers and probes targeting FMDV, and a further research avenue would be to challenge GoPrime with alternative targets, to ascertain the broader prediction accuracy of this tool with additional assay targets. By providing a novel quantitative approach to primer/probe evaluation, GoPrime offers increased flexibility to the user by not only predicting the likely targets of primer/probe sets, but also estimating the effects of any mismatches present on C T and LOD in silico, thereby enabling selection of the most appropriate primer/probe combination given the research question and diagnostic sample.
Supplementary Materials: The following are available online at http://www.mdpi.com/2076-0817/9/4/303/s1, Table S1: Statistical analyses to determine linear model variables; Figure S2: GoPrime example output; Table S3: Predicted and observed ∆CT and ∆LOD for the linear DNA templates representing FMDV field isolates for testing GoPrime predictions; Figure S4: Using GoPrime to predict the likely targets of four foot-and-mouth disease virus (FMDV)-typing RT-PCR assays.