Peer-Review Record

Temporal Dynamics of Recombination in Field Isolates of Foot-and-Mouth Disease Virus

Viruses 2026, 18(2), 262; https://doi.org/10.3390/v18020262

by Mate Malichava^1, Alexander Lukashev^1,2 and Yulia Aleshina^1,3,*

Reviewer 1:

Ian Fish

Reviewer 2:

Shaohui Ma

Submission received: 27 December 2025 / Revised: 3 February 2026 / Accepted: 17 February 2026 / Published: 19 February 2026

(This article belongs to the Special Issue Foot-and-Mouth Disease Virus)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Manuscript Authors and Journal Editors:

The manuscript “Temporal Dynamics of Recombination in Field Isolates of Foot-and-Mouth Disease Virus” by M Malichava, A Lukashev, and Y Aleshina presents a systematic analysis of publicly available full coding frame FMDV sequences with a focus on measuring the chronological and serotype-specific characteristics of recombination events. Although several papers have done similar analyses, the present work provides new information regarding the lifespan of recombinant virus forms, including differences between serotypes.

The paper is very well written and organized with good quality data and (albeit huge) figures. Nevertheless, there are some significant data issues that should be addressed, some metadata to be included in analysis as well as a few things to be added to discussion, detailed below.

Most Important

Sample origins:

One sample says it is from the USA, dated 2017. This is clearly a laboratory-derived sample, as the United States has not had an outbreak of FMDV. The UK samples appear to be from 2001 and 2007, years of well-known lab leaks, but the dates would not correlate with natural spread. There are or have been research facilities in many of these countries that have worked on FMDV – China, Thailand, Vietnam, S Korea, Japan, Netherlands, Spain, Germany, India, US, UK, South Africa, etc. Therefore, it is unclear how many other samples are in fact laboratory-derived, so the authors should re-assess the list.

Perhaps more importantly, there is a significant issue with the nature of the samples’ origins – some of them are not from outbreaks, but from persistently infected animals – FMDV carriers. Carriers, as a general rule, do not transmit FMDV, so these are most likely dead-end viruses and are not part of natural transmission pathways or FMDV evolution. In fact, these animals may have been either subclinically infected, persistently infected, or even both at the same time (with different viruses). It is not clear how the removal of these sequences would affect the results, but the authors should at least check this and discuss the issue. Some examples (a significant number, actually) would be those in Table S1 with names that end in “_pro” (Vietnam and Pakistan), for probang, which is the tool used to sample potential carriers. Also, since persistent infection can last a long time, it would be hard to know how to properly date these isolates (if the authors do choose to keep them).

Medium importance

Inclusion of host metadata:

It looks like the hosts are known for most of the samples that they have included – did the authors not find differences in recombination between hosts or potential host pools? For viral recombination to take place, two different viruses must enter a host cell (coinfection) – this is surely more common in some hosts than others or perhaps in some environments – larger farms or trade markets? This would also be interesting because larger ruminants (cattle) are more likely to be vaccinated, so a relationship between vaccination and recombination dynamics might be gleaned.

Inter- vs. Intratypic recombination

Is it possible for the authors to investigate if there are differences between recombination (hotspots or temporal dynamics) when the two parental viruses belonged to different serotypes vs. the same serotype?

Maybe Important?

Ambiguity of nonstructural region naming

Throughout the paper, there are instances when the authors mention recombination in non-structural regions associated with serotype. However, it is not always clear how they define what serotype these nonstructural regions belong to or are derived from (originally?).

For example, line 488 says “…and recombination breakpoint in the RdRp-coding region 3D in serotype A.” Do they mean that the genome in the 5’ direction of the breakpoint was serotype A (since it would have an A capsid as part of it?), and that the 3’ portion may or may not have come from an A virus?

The authors define the lifespan of a recombinant as “circulating virus lineages that originated from a recombination event and did not undergo additional recombination events”. Some FMDV publications have shown multiple recombination events having taken place within a single host (essentially at the same time) – would any recombinant that comes out of this type of passage be eliminated from the analysis or be considered age 0?

Discussion:

Can the authors include their ideas regarding the reasons for the recombination hotspots they have identified, not including those surrounding the capsid/P1? For example – they say that there is one at the 2C/3A junction in serotypes A and SAT2 – would this have to do with the functions of either of those encoded proteins relative to the parent or donor lineage? The real overall value of this type of analysis is in hypothesis generation, so it is best to include various hypotheses that might follow from their findings.

Minor Items

The word ‘other’ in the second to last Abstract sentence implies that humans get FMD.

Line 94: Not sure about the novelty. Does not the program RDP function by comparing trees to assess recombination?

Table 2: It is not clear why type O is in 4 sections here. Is O partitioned into 4 recombination-free segments or are these 4 specific lineages? Also, the title says, “structural region”, but should it say averaged across recombination-free segments/regions? The 10^-3 looks like 10-3.

Line 366: Kenya not Kenia

Line 384-385: this sentence should be rewritten or removed.

Figure 5E and F: are these trend lines correct? Also, the SAT2 does not look like a negative correlation, but none?

Abbreviations: SAT’s S is for Southern

Author Response

We sincerely thank the reviewer for their time, expertise, and constructive feedback on our manuscript. Their comments have been invaluable in helping us improve the paper. Below, we provide a point-by-point response to all comments.

Most Important

Sample origins:

Reply:

We agree that these sequences could introduce bias into molecular dating and evolutionary analyses. Although we removed the most obvious non-natural sequences in the initial dataset, in response to this comment we undertook a comprehensive reassessment of our datasets.

We thoroughly reviewed all isolates by checking original publications to verify their epidemiological origins. For countries known to work with FMDV in research settings, we confirmed that the included isolates were collected during documented natural outbreaks within that country. This process led to the identification and removal of several groups of sequences. A total of 46 isolates of confirmed experimental origin or a non-confirmed natural origin, along with 46 isolates from documented laboratory leaks, were removed. Furthermore, 261 isolates sampled from persistently infected animals were excluded (serotype O – 119, A – 88, Asia1 – 53, SAT3 – 3).

Regarding the UK isolates raised by the reviewer, the UK 2007 isolates were removed prior to the analysis in BEAST in the initial version of the manuscript, as they showed clear signs of a lab leak, such as grouping with an archival strain and exhibiting extremely low substitution rates. The UK 2001 isolates, however, did not exhibit these hallmarks; they grouped with contemporary Japanese outbreak viruses from the PanAsia lineage and passed our temporal signal checks in TempEst and BEAST. Also, we did not find any published reference on the unnatural origin of the 2001 UK isolates. Therefore, we retained them.

In total, 353 isolates were removed from the analysis. This is now described in the revised Materials and Methods section and noted in subsection 3.4:

Lines 112-115 “Sequences were additionally filtered by “isolation source” qualifier and “TITLE” field in GenBank entry to exclude those from inactivated vaccines, experimentally infected animals, and laboratory-derived strains.”

Lines 134-140: “A significant number of virus isolates (N=261) originated from persistently infected animals. Since the FMDV carriers are not likely to take part in natural transmission, the persistent infection can last for years and could affect the molecular dating analyses. These sequences were excluded from molecular dating analysis. Also, sequences associated with the FMD outbreak in the UK in August 2007 caused by a release of vaccine strain O1 BFS [42] were excluded prior to recombination temporal dynamics analysis.”

Lines 466-469: “Notably, the initial dataset comprised 261 isolates from persistently infected animals. Since the persistent FMDV infection can last up to several years [70] and it is not clear how to date such isolates, these isolates were excluded from molecular dating analysis.”

Since persistent FMDV isolates are still natural viruses, they were not excluded from the recombination breakpoint analysis but were excluded from molecular clock analysis. The removal of the 46 lab strains did not significantly alter the detection or distribution of recombination breakpoints in the RDP4 analysis. Therefore, the related figures (Figures 2, 4, and S3) were not changed. For the molecular dating analysis in BEAST, the removal of lab sequences and probang (persistent) sequences improved the temporal signal and the MCMC convergence. The mean estimates for substitution rates and tMRCA of the structural (P1) region were largely robust (Table 1).

The effect of dataset refinement on the analysis of recombinant form lifetimes differed between serotypes. For serotype Asia1, where a significant proportion of sequences were removed (53 out of 122 sequences), we observed a substantial decrease in the estimated median lifetimes. For serotype O, filtering reduced the variance in these lifetime estimates, increasing their consistency. The specific results are presented in Table 2. Therefore, the suggestion of the reviewer led to a clear improvement of the output.

All results and discussion sections have been updated to reflect the analyses performed on this refined dataset. We have revised Tables 1, 2, and 3; Figures 3, 5, and 6; and Supplementary Figures S4 through S12. Relevant descriptive text has been added to the Materials and Methods and Results sections.

Table 1. Impact of dataset refinement on the substitution rates and tMRCA inferred for the recombination-free structural region of FMDV serotypes, estimated using Bayesian phylogenetic inference (BEAST2). For serotype O, results are shown for analyses performed on four randomly subsampled datasets.

Serotype	Substitution rate [95% HPD confidence interval] × 10−3, s/s/y	tMRCA [95% HPD confidence interval], years	Substitution rate after removal of experimental isolates and isolates from persistent infections [95% HPD confidence interval] × 10−3, s/s/y	tMRCA after removal of experimental isolates and isolates from FMDV-carriers [95% HPD confidence interval], years
O	2.69 [2.1–3.33]	122 [94.69–156.34]	2.76 [2.1–3.41]	117 [92.6–145.3]
	2.92 [2.33–3.55]	111 [88.95–138.75]	3.22 [2.54–3.94]	107 [88.36–127.99]
	3.13 [2.45–3.85]	103 [82.69–126.05]	3.37 [2.65–4.1]	98 [81.95–117.23]
	2.91 [2.22–3.6]	116 [92.53–143.85]	2.83 [2.22–3.52]	113 [91.26–138.4]
A	4.81 [3.96–5.63]	132 [99.75–174.79]	4.35 [2.7–6.46]	139 [102.87–186.6]
Asia1	3.33 [2.16–4.57]	173 [88.04–291.35]	3.6 [2.49–4.78]	104 [73.99–143.99]
SAT1	1.74 [0.97–2.64]	208 [108.1109–618.647]	1.78 [1.01–2.65]	206 [130.5–295.53]

Table 2. Impact of dataset refinement on the estimates of median lifetimes of recombinant forms resulting from recombination of nonstructural genome regions relative to the structural region.

Serotype	Non-structural genome region	Median half-life time of recombinant forms, years (initial manuscript version)	Median half-life time of recombinant forms after removal of experimental isolates and isolates from FMDV-carriers, years
Asia1	Lpro	13.09	5.83
Asia1	3' part of 2C (P2)-P3	6.1	1.71
A	Lpro	4.93	5.54
	2C	3.96	4.07
	3A-3B-3Cpro-5' part of 3D (P3)	3.82	3.97
	3' part of 3D	3.77	3.97
O	Lpro	8.83	9.68
		4.77	8.39
		6.9	6.93
		5.82	6.09
		Mean=6.58, SD =1.73	Mean=7.77, SD =1.58
	2C (P2)-3A (P3)	10.83	11.08
		11.26	12.23
		12.34	9.57
		9.79	10.38
		Mean=11.05, SD=1.06	Mean=10.81, SD=1.13
	3C-3D (P3)	13.98	16.17
		22.17	16.62
		20.31	17.99
		16.75	16.57
		Mean=18.3, SD=3.66	Mean=16.83, SD=0.79
SAT1	Lpro	4.95	3.40
SAT1	2C (P2)-P3	9.42	11.30
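
As a quick check on the Mean and SD rows reported for serotype O, the sketch below recomputes them from the four subsample values in the initial-version column of the table above. It assumes that SD is the sample standard deviation (n - 1 denominator), which the response does not state explicitly; the dictionary and function names are illustrative only.

```python
from statistics import mean, stdev

# Median half-life estimates (years) for the four random serotype O subsamples,
# copied from the "initial manuscript version" column of the table above.
initial_estimates = {
    "Lpro": [8.83, 4.77, 6.9, 5.82],
    "2C (P2)-3A (P3)": [10.83, 11.26, 12.34, 9.79],
    "3C-3D (P3)": [13.98, 22.17, 20.31, 16.75],
}


def summarize(values):
    """Return (mean, sample standard deviation) of the subsample estimates."""
    # statistics.stdev uses the n - 1 denominator (assumed here, not confirmed).
    return mean(values), stdev(values)


for region, values in initial_estimates.items():
    region_mean, region_sd = summarize(values)
    print(f"{region}: mean={region_mean:.2f}, SD={region_sd:.2f}")
```

With only four subsamples per region, the SD is itself a rough estimate, so small differences between the two columns should be read with caution.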

Medium importance

Inclusion of host metadata:

Reply:

Indeed, differences in host-specific recombination dynamics could be very interesting to explore, as host ecology could influence the co-infection and recombination opportunity. Unfortunately, the available full-genome sequence data is heavily biased toward viruses isolated from cattle. While some sequences from other hosts (e.g., pigs, small ruminants) are available, their limited number does not allow for a statistically robust or biologically meaningful comparison at this time. We attempted to color the trees according to the host species in response to this comment, but did not see any consistent pattern that could be identified on the background of significant host- and outbreak-dependent sampling biases.

Inter- vs. Intratypic recombination

Reply:

This is an interesting suggestion, but it is hard to address. There was no clearly identifiable difference, and such a difference may be difficult to identify in principle, because the sampling coverage of the genome space is not exhaustive, and it is not possible to distinguish an inter-serotype event from an intra-serotype event followed by another inter-serotype event. Also, on many occasions only one recombination partner was clear, while the other one was unknown (some distant variant not sampled or extinct).

Maybe Important?

Ambiguity of nonstructural region naming

Reply:

This is a very important comment, but unfortunately it may be impossible in principle to explicitly distinguish intra- and inter-serotype events. To avoid confusion, “intra-serotype” was changed to “observed in the dataset of sequences of the same serotype”.

Reply:

In general, our approach is designed to study inter-host recombination dynamics at a phylogenetic scale and is not suitable for resolving intra-host recombination events within a population of very similar sequences, which occur nearly simultaneously within a single host, hypothetically within each cell and each replication cycle.

The detection of recombination used in this study was based on tree comparison and requires lineages to diverge sufficiently to generate a reliable phylogenetic signal and a well-resolved phylogeny. Intra-host evolution typically lacks this divergence due to its short timescale, making such events undetectable or indistinguishable at a phylogenetic scale. Consequently, any recombinant emerging from intra-host passage with multiple near-simultaneous recombinations would rather make up a poorly resolved clade. This ties directly to the limitation we discuss (lines 641-644): poor phylogenetic resolution in some clades (due to insufficient sequence divergence) led us to adopt a “soft” definition of recombinant forms: clades defined by strongly supported bipartitions (ultrafast bootstrap >95% or posterior probability >0.9) across parental trees, rather than exactly matching subtrees. Due to poor resolution within clades with similar sequences and poor branch support it is not possible to distinguish the recombination events within them from artifacts from insufficient phylogenetic signal. Poor resolution thus provides a conservative upper bound on recombinant lifespans, potentially grouping multiple events into a single form rather than precisely dating them.

Also, including intra-host sequences into a time-scaled phylogeny is problematic, as they reflect fundamentally different evolutionary dynamics and should not be analyzed jointly.
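
The “soft” definition described above can be illustrated with a minimal sketch. This is not the authors' software: it only shows the idea of keeping clades that are strongly supported in two trees built from different genome regions. The clade representation (a set of tip labels plus a support value on a 0 to 1 scale), the toy trees, and the 0.95 threshold are all assumptions made for illustration.

```python
# Hypothetical threshold on a 0-1 support scale (illustrative assumption).
SUPPORT_THRESHOLD = 0.95


def supported_clades(clades, threshold=SUPPORT_THRESHOLD):
    """Keep only clades whose branch support exceeds the threshold."""
    return {frozenset(tips) for tips, support in clades if support > threshold}


def shared_clades(tree_a, tree_b, threshold=SUPPORT_THRESHOLD):
    """Clades that are strongly supported in both trees."""
    return supported_clades(tree_a, threshold) & supported_clades(tree_b, threshold)


# Toy trees: each clade is (set of tip labels, support). Labels are made up.
structural_tree = [
    ({"iso1", "iso2", "iso3"}, 0.99),
    ({"iso1", "iso2"}, 0.62),  # poorly supported, ignored
    ({"iso4", "iso5"}, 0.97),
]
nonstructural_tree = [
    ({"iso1", "iso2", "iso3"}, 0.98),
    ({"iso4", "iso6"}, 0.96),  # iso4 groups differently in this region
]

for clade in shared_clades(structural_tree, nonstructural_tree):
    print(sorted(clade))
```

In this toy case only the three-isolate clade survives in both trees, while the poorly supported inner clade is ignored rather than treated as evidence of recombination, which mirrors the conservative behavior described in the reply.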

Discussion:

Reply:

There may indeed be factors that could hypothetically affect recombination incidence, such as sequence homology and RNA secondary structure. The FMDV genome, like that of most RNA viruses, is full of 3D structure (GORS) according to the “fold as you wish but fold you must” principle. While it is possible in theory to relate this structure to recombination hotspots, a recent study (Lasecka et al, 2021; 10.1128/mSphere.00015-21) did not detect any excessive structure around the massive universal hotspots on the edges of P1. Therefore, it is even less likely that such a connection would be detected for the hotspot at 2C-3A in serotypes A and SAT2. Such an analysis would be further complicated by the fact that these hotspots were formed by just a few events. It seems more likely that protein compatibility was involved. This discussion was added to the text:

Lines 575-590: “Our analysis revealed notable variations in recombination hotspots for serotype-specific datasets. Beyond the mentioned hotspots, significant clustering of recombination breakpoints was identified at specific loci, such as within the 3B region in serotype O and at the 2C/3A junction in serotypes A, SAT2. Although these hotspots passed a formal significance test, they were made up by just a few recombination events; therefore, it may be premature to draw further conclusions from their detection. Furthermore, the frequency of genomic block exchanges varied among serotypes. For instance, while serotypes O and A exhibited more frequent exchanges of the P1 block relative to the P3 rather than Lpro region, Asia1 and SAT2 showed a comparable frequency of P1 exchange with both Lpro and P3. In SAT1, the Lpro region was the most commonly transferred. Collectively, the breakpoints at both edges of the P1 region appear comparably important, but the recombination landscape in FMDV is more complex. As the sequence sampling was notably biased for some serotypes, we can only conclude that there is no consistent difference between recombination frequency on the 5’ and 3’ boundary of the P1 genome region.”

Minor Items

The word ‘other’ in the second to last Abstract sentence implies that humans get FMD.

Abbreviations: SAT’s S is for Southern

Line 386: Kenya not Kenia

Reply: Thank you for pointing out these discrepancies; they were corrected (Lines 27, 58, 386).

Line 94: Not sure about the novelty. Does not the program RDP function by comparing trees to assess recombination?

Reply:

Thank you for this comment; we appreciate the opportunity to clarify this distinction. Indeed, algorithms implemented in RDP4 use tree comparisons as part of recombination detection, but they do not analyze the dynamics of recombination in time. The conceptual framework of manually comparing trees to estimate recombinant clade lifetimes has been described in our publications on other viruses (Lukashev et al, 2014; Vakulenko et al, 2023). The methodological novelty of this publication is the development of an automated, standardized software implementation of this method. This software is not designed for primary detection of recombination. Instead, it automates the process of comparing trees for already-identified regions that were exchanged due to recombination and calculates lifespans for coinciding clades. We revised the sentence to make this distinction and the nature of our novel implementation much clearer.

Lines 97-100: “Furthermore, we implemented an automated method to assess the temporal dynamics of recombination. This method is based upon a published manual approach [38,39] which was developed here to automatically compare time-scaled phylogenetic trees from recombinant genomic regions to calculate the lifetimes and half-life of recombinant forms.”

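Table 2: It is not clear why type O is in 4 sections here. Is O partitioned into 4 recombination-free segments or are these 4 specific lineages? Also, the title says, “structural region”, but should it say averaged across recombination-free segments/regions? The 10^-3 looks like 10-3.

Reply: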

The Bayesian phylogenetic analysis was performed specifically for the defined recombinant-free structural region of each serotype. For serotype O, the complete dataset was exceptionally large, making the analysis of the whole dataset computationally prohibitive. To address this, we generated four random subsamples from the full serotype O dataset (as detailed in Section 2.1.3, “Subsampling of serotype O sequences for Bayesian phylogenetic analysis”) and performed the analysis on each subsample independently. Consequently, the four entries for serotype O in Table 2 correspond to the results from these four distinct analyses of random subsamples; they do not represent specific lineages or partitioned segments of the genome. To eliminate any ambiguity, we have updated the caption of Table 2 as follows:

Lines 498-500: "Table 2. Substitution rates and tMRCA inferred for recombination-free structural region (Figure 4) of FMDV serotypes, estimated using Bayesian phylogenetic inference (BEAST2). For serotype O, results are shown for analyses performed on four randomly subsampled datasets. "

Finally, the notation has been corrected from "10-3" to "10⁻³"

Line 384-385: this sentence should be rewritten or removed.

Reply: The sentence was removed.

Figure 5E and F: are these trend lines correct? Also, the SAT2 does not look like a negative correlation, but none?

Reply:

Thank you for raising this question. We have verified the calculations, and the trend lines are correct. The initially reported Spearman correlation for SAT2 (ρ = -0.34) was calculated accurately. Given that the root-to-tip regression explicitly tests for a linear relationship between genetic divergence and time, the Pearson coefficient is conceptually more appropriate than the rank-based Spearman coefficient. For SAT2, the Pearson correlation was weak but significant (ρ = 0.22, p-value=0.008). To avoid potential confusion, the corresponding text was edited as follows:

Lines 467-476: “The temporal signal was present in P1 in the whole dataset, but it had to be confirmed in datasets of distinct serotype sequences. In most serotype datasets there was a reliable correlation between root-to-tip distances and sampling dates upon TempEst analysis. However, the correlation was weak (though significant) in SAT2 (Pearson ρ =0.22, p-value=0.009) (Figure 5E). To formally evaluate temporal signal, we performed a Bayesian Estimation of Temporal Signal (BETS) analysis implemented in BEAST2, which compares a model that utilizes the true sampling dates to a null model where all dates are set equal. This test confirmed a detectable temporal signal for all serotypes, except for serotypes SAT2 (Table S2). Consequently, SAT2 was excluded from subsequent analysis of recombination temporal dynamics.”
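
The root-to-tip check discussed in this reply is an ordinary least-squares fit of genetic distance on sampling date. In such a fit the slope approximates the substitution rate, the x-intercept approximates the date of the root, and the Pearson coefficient measures how linear the relationship is. The sketch below uses made-up, perfectly clock-like toy data; it is not the TempEst implementation, and the function name and values are assumptions for illustration.

```python
from math import sqrt


def root_to_tip_regression(dates, distances):
    """Least-squares fit of distance on date.

    Returns (slope, intercept, pearson_r).
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx                    # substitutions/site/year
    intercept = mean_y - slope * mean_x  # distance at year 0
    pearson_r = sxy / sqrt(sxx * syy)
    return slope, intercept, pearson_r


# Hypothetical sampling years and root-to-tip distances (substitutions/site).
toy_dates = [1990.0, 2000.0, 2010.0, 2020.0]
toy_distances = [0.03, 0.06, 0.09, 0.12]

slope, intercept, r = root_to_tip_regression(toy_dates, toy_distances)
root_date = -intercept / slope  # x-intercept: date at which distance is zero
print(f"rate={slope:.4f}, r={r:.2f}, root date={root_date:.0f}")
```

A weak coefficient such as the one reported for SAT2 would correspond to points scattered widely around the fitted line, which is why a formal test was used in addition to this regression.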

Reviewer 2 Report

Comments and Suggestions for Authors

The article titled "Temporal Dynamics of Recombination in Field Isolates of Foot-and-Mouth Disease Virus" provides a comprehensive analysis of recombination patterns and their temporal dynamics in Foot-and-Mouth Disease Virus (FMDV). The study introduces a novel method for assessing the temporal dynamics of recombination by comparing time-scaled phylogenetic trees constructed from different genomic regions. This approach can be applied to other RNA viruses, enhancing the study of viral evolution. However, several areas require clarification, correction, or improvement.
1. The study relies heavily on phylogenetic trees to infer recombination events and their temporal dynamics. However, the accuracy of these trees can be influenced by factors such as sequence alignment quality, substitution model selection, and tree inference methods. The authors should provide more details on how they validated the robustness of their phylogenetic analyses.
2. The authors use a temporal signal to analyze the temporal dynamics of recombination. However, the validation of this signal, particularly for serotypes with limited data, is not thoroughly discussed. Ensuring the temporal signal is robust across all serotypes is crucial for the validity of the results.
3. The estimated lifetimes of recombinant forms vary significantly across serotypes and genomic regions. The authors should provide a more detailed discussion on the potential factors contributing to these variations and how they might impact the interpretation of the results.
4. Some of the figures, particularly those showing recombination breakpoints and heatmaps, lack clear legends and labels. This makes it difficult for readers to interpret the data accurately. The authors should ensure that all figures are accompanied by detailed legends that explain the data presented.
5. The standard deviation of the estimated half-lives of recombinant forms is relatively high, indicating significant variability. The authors should discuss the implications of this variability and how it might affect the reliability of their conclusions.

Author Response

Dear reviewer, we sincerely appreciate your time and expertise in reviewing our manuscript. Your thoughtful comments and suggestions enabled us to improve our manuscript. The point-by-point response is provided below.

The study relies heavily on phylogenetic trees to infer recombination events and their temporal dynamics. However, the accuracy of these trees can be influenced by factors such as sequence alignment quality, substitution model selection, and tree inference methods. The authors should provide more details on how they validated the robustness of their phylogenetic analyses.

Reply:

We have taken several steps to ensure the reliability of our alignments and tree inferences, as detailed below.

All multiple sequence alignments were manually inspected using BioEdit and Jalview to verify correct reading frames and detect potential artifacts (e.g., unexpected stop codons or frameshifts). The whole-genome (species-level) alignment of the polyprotein-coding region contained mainly short in-frame insertions or deletions (typically 3–9 nucleotides). Overall, 93% of alignment columns (6,750 of 7,272) were gap-free, ensuring a reliable phylogenetic signal. Recombinant-free regions were excised from this high-quality alignment. Serotype-level alignments showed even higher proportions of gap-free columns. The manuscript was revised as follows:

Lines 130-132: “The resulting alignment of ORF encoding the polyprotein was comprised of 1,439 sequences and was examined for quality manually in JalView and BioEdit programs.”
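
The gap-free proportion quoted above (6,750 of 7,272 columns, about 93%) is a simple column count. The sketch below shows one way such a figure could be computed for an alignment held as a list of equal-length strings. The function name and the toy alignment are assumptions for illustration and do not reproduce the authors' workflow in BioEdit or Jalview.

```python
def gap_free_fraction(alignment, gap_chars="-"):
    """Fraction of alignment columns that contain no gap character."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length")
    gap_free = sum(
        1
        for column in zip(*alignment)
        if not any(base in gap_chars for base in column)
    )
    return gap_free / length


# Toy alignment with two in-frame 3-nt deletions (made-up sequences).
toy_alignment = [
    "ATGGCA---TTT",
    "ATGGCAGGATTT",
    "ATG---GGATTT",
]
print(f"toy alignment: {gap_free_fraction(toy_alignment):.0%} gap-free")

# The proportion reported in the response, recomputed from its two counts.
print(f"reported: {6750 / 7272:.1%}")
```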

We followed rigorous model-selection procedures for both Maximum Likelihood (ML) and Bayesian analyses. For ML trees (IQ-TREE), the best-fit substitution model was selected using ModelFinder based on the Bayesian Information Criterion (BIC) (described in lines 184-187). For Bayesian phylogenetic analysis, models were chosen via the Nested Sampling method to ensure appropriate tree branching and clock model specification (lines 207-213). The study conclusions are based on well-supported branches (ultrafast bootstrap support >95% or posterior probability >0.9).

Along with these standard procedures, to test the stability of our evolutionary estimates we performed additional analyses on four randomly subsampled datasets for serotype O. The consistency of key parameters (e.g., substitution rates, tMRCAs) across these subsets supports the robustness of our inferences. More details on estimated half-lives of recombinant forms are provided in our response to comment 4.

The authors use a temporal signal to analyze the temporal dynamics of recombination. However, the validation of this signal, particularly for serotypes with limited data, is not thoroughly discussed. Ensuring the temporal signal is robust across all serotypes is crucial for the validity of the results.

Reply:

To ensure the robustness of the temporal signal, we evaluated it using two approaches: a preliminary regression of root-to-tip distances against sampling dates (implemented in TempEst) and a formal Bayesian Estimation of Temporal Signal (BETS) test within BEAST2. The formal BETS test compares two models—one incorporating the true sampling times and another constraining all samples to be contemporaneous—to objectively determine if a dataset contains a temporal signal. A dataset was considered to have a significant temporal signal if the log Bayes factor (>5) supported the model with correct sampling dates (lines 214-219). The marginal likelihoods used to calculate these Bayes factors are provided in Table S2.
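
The decision rule described above reduces to a difference of log marginal likelihoods compared against a threshold of 5. A minimal sketch follows; the marginal likelihood values are hypothetical placeholders, not the values from Table S2, and the function names are illustrative.

```python
def log_bayes_factor(log_ml_dated, log_ml_isochronous):
    """Log Bayes factor in favor of the model with true sampling dates."""
    return log_ml_dated - log_ml_isochronous


def has_temporal_signal(log_ml_dated, log_ml_isochronous, threshold=5.0):
    """Apply the log Bayes factor > 5 criterion described in the response."""
    return log_bayes_factor(log_ml_dated, log_ml_isochronous) > threshold


# Hypothetical log marginal likelihoods: (dated model, isochronous model).
examples = {
    "dataset with signal": (-12000.0, -12040.0),
    "dataset without signal": (-8000.0, -7998.0),
}

for name, (dated, isochronous) in examples.items():
    log_bf = log_bayes_factor(dated, isochronous)
    verdict = has_temporal_signal(dated, isochronous)
    print(f"{name}: log BF = {log_bf:.1f}, temporal signal = {verdict}")
```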

We have revised the Results section to clarify this validation process:

Lines 478-487: “The temporal signal was present in P1 in the whole dataset, but it had to be confirmed in datasets of distinct serotype sequences. In most serotype datasets there is a reliable correlation between root-to-tip distances and sampling dates upon TempEst analysis. However, the correlation was weak (though significant) in SAT2 (Pearson ρ =0.22, p-value=0.009) (Figure 5E). To formally evaluate temporal signal, we performed a Bayesian Estimation of Temporal Signal (BETS) analysis implemented in BEAST2, which compares a model that utilizes the true sampling dates to a null model where all dates are set equal. This test confirmed a detectable temporal signal for all serotypes, except for serotypes SAT2 (Table S2). Consequently, SAT2 was excluded from subsequent analysis of recombination temporal dynamics.”

The estimated lifetimes of recombinant forms vary significantly across serotypes and genomic regions. The authors should provide a more detailed discussion on the potential factors contributing to these variations and how they might impact the interpretation of the results.

Reply: We have expanded the Discussion section to provide a more detailed discussion of lifetime variability across serotypes and genome regions involved in recombination:

Lines 678-687: “The half-life time of recombinant forms for serotype A was more than twice as short as for serotype O. While this could be a result of sampling bias, it is interesting to note that host diversity was higher for serotype O. 86% of serotype A isolates and just 66% of serotype O isolates with a known host originated from cattle, while 1% of serotype A and 20% of serotype O isolates came from pigs. Also, serotype O, but not A full-genome sequences originated from sheep 5% and goats (1%). It is possible that circulation in diverse hosts limited probability of recombination and provided longer circulation of distinct recombinant forms; however, given the highly uneven sampling coverage in terms of countries, hosts and outbreaks this observation requires further confirmation.”

Some of the figures, particularly those showing recombination breakpoints and heatmaps, lack clear legends and labels. This makes it difficult for readers to interpret the data accurately. The authors should ensure that all figures are accompanied by detailed legends that explain the data presented.

Reply: We appreciate the reviewer's feedback regarding the clarity of figure captions. We reviewed all figure legends to make sure they are detailed and exhaustive. Specifically, we included a more detailed description of all key components and provided clearer explanation of how to interpret recombination hotspots and coldspots within the captions of Figures 2, 4, and S3.

The initial caption (first version below) was changed to the revised caption (second version below):

“Figure 2. Distribution of recombination breakpoints and heatmap of recombination events detected by RDP4 in FMDV coding sequences. (A) Breakpoint distribution plot (bottom) and schematic representation of the FMDV ORF (top). The black lines indicate the number of breakpoints detected within a 100-nt window. The light and dark gray areas represent the local 95% and 99% confidence intervals, respectively. The red and blue lines represent the global 95% and 99% confidence limits, respectively [46]. (B) Recombinant regions count matrix. Unique recombination events are plotted according to the inferred breakpoints. The color scale reflects the number of recombination events that separate nucleotide pairs. Warmer colors indicate regions that were more frequently exchanged due to recombination.”

“Figure 2. Recombination analysis of FMDV coding sequences using RDP4. (A) Distribution of inferred recombination breakpoints across the FMDV open reading frame (ORF). The top panel is a schematic of the FMDV open reading frame (ORF), showing the major functional regions (L, P1, P2, and P3) and mature proteins. Individual breakpoint locations are marked by tick marks below the plot. The bottom panel plots the number of inferred recombination breakpoints (black line) detected within a sliding 100-nt window. The light and dark gray areas represent the local 95% and 99% confidence intervals, respectively. Peaks where the black line rises above these shaded areas indicate statistically significant recombination hotspots; dips below indicate coldspots [46]. (B) Recombinant regions count matrix. Unique recombination events are mapped according to the inferred breakpoints at panel A. Each cell in the matrix corresponds to a pair of genome sites. The color of a cell (see heat scale) indicates the number of times the detected recombination events separated that specific pair of genome positions. Warmer colors (red) indicate genome regions that were more often exchanged due to recombination. The FMDV ORF schematic is aligned above for genomic reference.”

The captions for Figures 4 and S3 were changed accordingly.
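
To make the two quantities named in the Figure 2 caption concrete, the following minimal Python sketch counts breakpoints per 100-nt window (panel A) and counts how many recombination events separate a given pair of genome positions (panel B). It is illustrative only: the function names, the toy coordinates, and the half-open interval convention are assumptions, and it does not reproduce the permutation-based confidence intervals computed by RDP4.

```python
from bisect import bisect_left


def window_counts(breakpoints, genome_length, window=100, step=1):
    """Count breakpoints falling in each window [start, start + window)."""
    pts = sorted(breakpoints)
    counts = []
    for start in range(0, genome_length - window + 1, step):
        lo = bisect_left(pts, start)
        hi = bisect_left(pts, start + window)
        counts.append(hi - lo)
    return counts


def separation_count(events, i, j):
    """Count events whose exchanged region [begin, end) holds exactly one of i, j."""
    return sum((b <= i < e) != (b <= j < e) for b, e in events)


# Toy data (hypothetical coordinates, not taken from the manuscript).
toy_breakpoints = [10, 50, 120, 125, 300]
toy_events = [(100, 200), (150, 400)]

print(window_counts(toy_breakpoints, genome_length=400, window=100, step=100))
print(separation_count(toy_events, 120, 250))
```

A pair of positions counts as separated by an event when only one of the two lies inside the exchanged region, which is one plausible reading of the matrix described for panel B.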

The standard deviation of the estimated half-lives of recombinant forms is relatively high, indicating significant variability. The authors should discuss the implications of this variability and how it might affect the reliability of their conclusions.

Reply:

Indeed, the first reviewer pointed out that samples from persistently infected animals could bias the results and we have overlooked several lab-derived strains. Although we removed the most obvious non-natural sequences in the initial dataset, in response to this comment we undertook a comprehensive reassessment of our datasets and found that there were 45 isolates of confirmed experimental origin or a non-confirmed natural origin and 261 isolates sampled from persistently infected animals.

Since persistent FMDV isolates are still natural viruses, they were not excluded from the recombination breakpoint analysis but were excluded from molecular clock analysis. The removal of the 46 lab strains did not significantly alter the detection or distribution of recombination breakpoints in the RDP4 analysis. Therefore, the related figures (Figures 2, 4, and S3) were not changed. For the molecular dating analysis in BEAST, the removal of lab sequences and sequences from persistently infected animals improved the temporal signal and the MCMC convergence. The median estimates for substitution rates and tMRCA of the structural (P1) region were largely robust (Table 1).
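
The two-tier exclusion described above can be sketched in Python as follows. The `Isolate` record, the `origin` labels, and the accession strings are hypothetical and serve only to show the logic: laboratory-derived sequences are dropped from both analyses, while sequences from persistently infected animals are kept for breakpoint detection and dropped only for molecular dating.

```python
from dataclasses import dataclass


@dataclass
class Isolate:
    accession: str
    origin: str  # hypothetical labels: "field", "lab", or "persistent"


def for_breakpoint_analysis(isolates):
    """Keep every natural isolate, including those from persistent infections."""
    return [x for x in isolates if x.origin != "lab"]


def for_molecular_clock(isolates):
    """Keep only field isolates that are not from persistently infected animals."""
    return [x for x in isolates if x.origin == "field"]


# Toy dataset with placeholder accessions (not real records).
dataset = [
    Isolate("ACC0001", "field"),
    Isolate("ACC0002", "persistent"),
    Isolate("ACC0003", "lab"),
    Isolate("ACC0004", "field"),
]

print(len(for_breakpoint_analysis(dataset)), len(for_molecular_clock(dataset)))
```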

The effect of dataset refinement on the analysis of recombinant form lifetimes differed between serotypes. For serotype Asia1, where a significant proportion of sequences were removed (53 out of 122 sequences), we observed a substantial decrease in the estimated median lifetimes. For serotype O, filtering reduced the variance in these lifetime estimates, increasing their consistency, with the highest SD being 1.58 years. The specific results are presented in Table 2.

Serotype	Substitution rate [95% HPD confidence interval] × 10−3, s/s/y	tMRCA [95% HPD confidence interval], years	Substitution rate after removal of experimental isolates and isolates from persistent infections [95% HPD confidence interval] × 10−3, s/s/y	tMRCA after removal of experimental isolates and isolates from FMDV-carriers [95% HPD confidence interval], years
O	2.69 [2.1–3.33]	122 [94.69–156.34]	2.76 [2.1–3.41]	117 [92.6–145.3]
	2.92 [2.33–3.55]	111 [88.95–138.75]	3.22 [2.54–3.94]	107 [88.36–127.99]
	3.13 [2.45–3.85]	103 [82.69–126.05]	3.37 [2.65–4.1]	98 [81.95–117.23]
	2.91 [2.22–3.6]	116 [92.53–143.85]	2.83 [2.22–3.52]	113 [91.26–138.4]
A	4.81 [3.96–5.63]	132 [99.75–174.79]	4.35 [2.7–6.46]	139 [102.87–186.6]
Asia1	3.33 [2.16–4.57]	173 [88.04–291.35]	3.6 [2.49–4.78]	104 [73.99–143.99]
SAT1	1.74 [0.97–2.64]	208 [108.1109–618.647]	1.78 [1.01–2.65]	206 [130.5–295.53]

Table 2. Impact of dataset refinement on the estimates of median lifetimes of recombinant forms resulting from recombination of nonstructural genome regions relative to the structural region.

Serotype	Non-structural genome region	Median half-life time of recombinant forms, years (initial manuscript version)	Median half-life time of recombinant forms after removal of experimental isolates and isolates from FMDV-carriers, years
Asia1	Lpro	13.09	5.83
Asia1	3'part of 2C (P2)-P3	6.1	1.71
A	Lpro	4.93	5.54
	2C	3.96	4.07
	3A-3B-3Cpro-5' part of 3D (P3)	3.82	3.97
	3' part of 3D	3.77	3.97
O	Lpro	8.83	9.68
		4.77	8.39
		6.9	6.93
		5.82	6.09
		Mean=6.58, SD=1.73	Mean=7.77, SD=1.58
	2C (P2)-3A (P3)	10.83	11.0792
		11.26	12.2321
		12.34	9.5669
		9.79	10.3772
		Mean=11.05, SD=1.06	Mean=10.81, SD=1.13
	3C-3D (P3)	13.98	16.1736
		22.17	16.6199
		20.31	17.9894
		16.75	16.5706
		Mean=18.3, SD=3.66	Mean=16.83, SD=0.79
SAT1	Lpro	4.95	3.4043
SAT1	2C (P2)-P3	9.42	11.3077
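
For readers who wish to check the summary rows for serotype O, the following Python sketch recomputes the mean and standard deviation from the four subsample estimates listed in Table 2. It assumes the sample (n−1) standard deviation, which is not stated in the response; the dictionary name and layout are illustrative.

```python
from statistics import mean, stdev

# Serotype O half-life estimates (years) for the four subsampled datasets,
# copied from Table 2 as (initial manuscript, after filtering).
SEROTYPE_O = {
    "Lpro": ([8.83, 4.77, 6.90, 5.82], [9.68, 8.39, 6.93, 6.09]),
    "2C (P2)-3A (P3)": (
        [10.83, 11.26, 12.34, 9.79],
        [11.0792, 12.2321, 9.5669, 10.3772],
    ),
    "3C-3D (P3)": (
        [13.98, 22.17, 20.31, 16.75],
        [16.1736, 16.6199, 17.9894, 16.5706],
    ),
}


def summarize(values):
    """Return (mean, sample standard deviation) of the subsample estimates."""
    return mean(values), stdev(values)


for region, (initial, filtered) in SEROTYPE_O.items():
    m0, s0 = summarize(initial)
    m1, s1 = summarize(filtered)
    print(f"{region}: initial {m0:.2f} (SD {s0:.2f}), filtered {m1:.2f} (SD {s1:.2f})")
```

Under this assumption the recomputed values agree with the tabulated means and SDs to within about 0.01, which suggests that the reported SDs are sample standard deviations and that any remaining differences come from rounding of the inputs.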

An analysis of the sources and implications of the variability in half-life estimates was added to the Discussion:

Lines 660-665: “Overall, the methodology was relatively robust as the standard deviations of recombinant forms’ half-lives ranged from 0.79 to 1.58 years between the O serotypes subsampled datasets, with the overall pattern (shorter half-lives for Lpro recombinants, longer for 3D recombinants) remaining consistent. This indicates that while absolute estimates carry some uncertainty due to sampling, the comparative conclusions are reliable.”

Lines 666-677: “A critical factor for estimation of recombination forms’ lifetimes is the quality of the time-scaled phylogeny itself. Sequences of non-natural origin (e.g., laboratory strains or potential lab leaks), incorrect collection dates, or viruses from persistently infected hosts can distort molecular clock estimates and bias half-life calculations. In a key refinement suggested by a reviewer, we excluded sequences from persistently infected FMDV-carriers from the molecular dating analysis. This filtering improved the temporal signal and Markov chain Monte Carlo (MCMC) convergence. For serotype Asia1, where a significant proportion of sequences were removed (53 of 122), we observed a substantial decrease in estimated median recombinant form lifetimes. For serotype O, filtering improved consistency of the results by reducing the variance between subsamples. These results underscore the importance of dataset curation for molecular clock inference.”

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The improvements are very well done. This reviewer appreciates the tables comparing the changes made included in the response.

A few minor items:

The choices that the authors made regarding the probang-derived samples are acceptable - it is true that field samples from FMDV carriers are real field samples. However, their propensity to transmit is near zero (for the most commonly studied hosts), making them a tricky data point to utilize meaningfully in the evolutionary pathways and transmission history of FMDV.

1. The authors should include a reference of some sort to the persistence information added (~line 140). Also, it is actually not accurate to say that those samples were persistent infections - they were mostly 'possible' persistent infections - that is, a mixture of acute and persistent (or both). The method of sampling and testing for persistent infection does not discriminate between the two.

2. Line ~1090 - If the editor did not request this already, reference to the peer reviewer is not needed here. The purpose of having subject matter expert peer reviewers is precisely to point out things like this, so this sentiment should really go without saying in any (well-reviewed) scientific paper.

3. This should have no bearing on the manuscript as-is, but this reviewer is curious as to whether the authors have analyzed the recombination dynamics using full-length or nearly-full length FMDV sequences as well? They might not be statistically significant for all but serotype O viruses (or none?), but it would be interesting to know if there were any signal in the IRES, for example.

Cheers

Author Response

We sincerely appreciate the reviewer's time and their positive feedback on our revised work. Our point-by-point responses to the remaining comments are provided below.

1. The authors should include a reference of some sort to the persistence information added (~line 140). Also, it is actually not accurate to say that those samples were persistent infections - they were mostly 'possible' persistent infections - that is, a mixture of acute and persistent (or both). The method of sampling and testing for persistent infection does not discriminate between the two.

Reply: As requested, we have included a supporting reference and carefully rephrased the indicated paragraph to more accurately reflect the nature of the samples:

Lines 132-134: “A significant number of virus isolates (N=261) originated from probang-derived samples. These samples most likely represent persistent infections, as this method is routinely used to screen for potential FMDV carriers. Since the FMDV carriers are not likely to take part in natural transmission, the persistent infection can last for years and could affect the molecular dating analyses [42]. These sequences were excluded from molecular dating analysis. Also, sequences associated with the FMD outbreak in the UK in August 2007 caused by a release of vaccine strain O1 BFS [43] were excluded prior to recombination temporal dynamics analysis.”

Reply: We have removed the reference to the peer reviewer from lines 657-659 as recommended.

“A critical factor for estimation of recombination forms’ lifetimes is the quality of the time-scaled phylogeny itself. Sequences of non-natural origin (e.g., laboratory strains or potential lab leaks), incorrect collection dates, or viruses from persistently infected hosts can distort molecular clock estimates and bias half-life calculations. To mitigate this, we excluded sequences from potential FMDV carriers from the molecular dating analysis. This filtering improved the temporal signal and Markov chain Monte Carlo (MCMC) convergence.”

Reply: Our initial analysis focused on the coding sequence to maximize the dataset size, as the 5'UTR was absent from approximately 40% of sequences. To specifically address the reviewer's point, we performed an RDP4 scan on the subset of genomes with an available 5'UTR. This analysis identified several isolated recombination breakpoints in the 5'UTR but did not reveal any statistically significant recombination hotspots in this region, either in the species-level alignment or in serotype-specific (O and A) alignments.

Reviewer 2 Report

Comments and Suggestions for Authors

There are no further questions.

Author Response

We thank the reviewer for their time and expertise. We appreciate the reviewer's confirmation that our responses and manuscript changes have addressed their earlier comments.
