Peer-Review Record

Comparative Analysis of the Circular and Highly Asymmetrical Marseilleviridae Genomes

Viruses 2020, 12(11), 1270; https://doi.org/10.3390/v12111270
by Léo Blanca, Eugène Christo-Foroux, Sofia Rigou and Matthieu Legendre *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 7 October 2020 / Revised: 5 November 2020 / Accepted: 5 November 2020 / Published: 7 November 2020
(This article belongs to the Collection Unconventional Viruses)

Round 1

Reviewer 1 Report

The manuscript by Blanca et al. reports interesting findings about the genome organization of Marseilleviruses, revealing a circular structure and the presence of a generally well conserved region in all clades and a second region more subject to evolutionary pressure. Thus, it provides conclusive evidence for observations previously reported in single species. The manuscript is clear, the methods seem adequate and the discussion is complete and well presented.

The points I suggest revising concern Figures 2 and 3. The authors are urged to clarify how Figure 2 was designed. For instance, it is stated that the most divergent member of clade A is Tokyovirus, as also pointed out by Fig. S1. However, in Figure 2 Marseillevirus Shanghai, which belongs to clade A, also seems highly divergent, in particular in the rightmost conserved region. This is neither mentioned in the text nor reported in the Discussion. It is not clear to me whether I am misinterpreting the figure or whether Marseillevirus Shanghai also differs from the other members of clade A. In the latter case, the authors should comment on it.

In Figure 3, Tokyovirus, which is indicated as the most divergent member of clade A, is used as the representative for comparing nucleotide conservation with the other clades. Why was Tokyovirus used instead of the other, more conserved members of clade A? This should be indicated in the text.

Author Response

Reviewer 1

“The manuscript by Blanca et al. reports interesting findings about the genome organization of Marseilleviruses, revealing a circular structure and the presence of a generally well conserved region in all clades and a second region more subject to evolutionary pressure. Thus, it provides conclusive evidence for observations previously reported in single species. The manuscript is clear, the methods seem adequate and the discussion is complete and well presented.”

We thank the referee for pointing out the quality of this work and respond to their concerns below.

The points I suggest revising concern Figures 2 and 3. The authors are urged to clarify how Figure 2 was designed. For instance, it is stated that the most divergent member of clade A is Tokyovirus, as also pointed out by Fig. S1. However, in Figure 2 Marseillevirus Shanghai, which belongs to clade A, also seems highly divergent, in particular in the rightmost conserved region. This is neither mentioned in the text nor reported in the Discussion. It is not clear to me whether I am misinterpreting the figure or whether Marseillevirus Shanghai also differs from the other members of clade A. In the latter case, the authors should comment on it.

The remark from the referee on Fig 2 is due to a misinterpretation of our figure. Apart from tokyovirus, all viruses from clade A are almost completely collinear. Thus, the large number of rearrangements that the referee is pointing at refers to differences between viruses from clade A and viruses from clade B. In other words, we could have swapped marseillevirus shanghai with melbournevirus, cannes 8 virus or marseillevirus, the rearrangements between the last genome of clade A and the first of clade B would have been identical. To clarify these results and avoid misinterpretation we modified the figure and the corresponding legend. We hope that the reworked figure is now clearer.

In Figure 3, Tokyovirus, which is indicated as the most divergent member of clade A, is used as the representative for comparing nucleotide conservation with the other clades. Why was Tokyovirus used instead of the other, more conserved members of clade A? This should be indicated in the text.

All the viruses presented in Figure 3 are compared to the marseillevirus reference. Given the very high conservation between marseillevirus and the other viruses of clade A (roughly 98% with melbournevirus, cannes 8 virus and marseillevirus shanghai), we reasoned that a comparison with these viruses would not reveal divergent regions. Therefore, we chose the most divergent clade A virus (tokyovirus); otherwise the first line of the figure would have been completely red, with no relevant information. We added the following sentence to the results section to clarify our choice of tokyovirus as a representative of clade A:

Given that the viruses from clade A are highly conserved (Figure S1), we chose the most divergent one, namely tokyovirus, to be compared to the marseillevirus reference and highlight potential divergent regions.

Author Response File: Author Response.pdf

Reviewer 2 Report

In the manuscript "Comparative Analysis of the Circular and Highly Asymmetrical Marseilleviridae Genomes", the authors experimentally demonstrate that the Marseilleviridae genomes are circular and are composed of two parts, one gathering duplicated genes, and another one mainly composed of core genes. This work and the results are interesting. The manuscript is well written and the illustrations are well presented.

However, some points need to be clarified or discussed.

Line 33 :  “Mature particles are then released through cell lysis roughly 8h post infection”

The authors should specify that the cycle can last more than 8 hours and that, experimentally, some virions are released at 13-16h and even up to 24h.

Lines 94-95: “We performed a protein-coding genes re-annotation of all the sequences using the same gene-finding algorithm: genemarks [20] version 4.32 with the “virus” parameter”

Given the proportion of genes of bacterial origin in marseilleviruses, a brief analysis of whether ORF prediction differs significantly between the “virus” and “prokaryotic” parameters may be interesting.

Lines 103-110: please specify the thresholds for coverage and / or identity percentage to identify orthologous genes?

Lines 256-258: Might the breakpoints identified by the authors be hotspots for gene exchange? This point needs to be discussed.

The predicted ORFs on both sides of these breakpoints should be cited and, if needed, discussed.

Line 260: Please specify the locations of the pfam02399 domains along the genome.

Line 355: « Early genes are expressed from the beginning of the cycle with a peak of transcriptional activity between 1h pi and 2h pi, intermediate genes are mostly expressed between 1h pi and 4h pi and late genes from 4h pi until the end of the cycle. » Please state clearly when strain-specific genes are mostly expressed.

Lines 392-395: For each of these groups of giant viruses, please specify how the structure of the genome was predicted.

Line 417 : « Yet, these two viral families infect the exact same host and thus face the same environment. »

Do the authors talk about marseille and pithoviruses? Please clarify. Other giant viruses have the same host as marseilleviruses and pithoviruses. This statement should be nuanced.

Author Response

Reviewer 2

In the manuscript "Comparative Analysis of the Circular and Highly Asymmetrical Marseilleviridae Genomes", the authors experimentally demonstrate that the Marseilleviridae genomes are circular and are composed of two parts, one gathering duplicated genes, and another one mainly composed of core genes. This work and the results are interesting. The manuscript is well written and the illustrations are well presented.

We thank the referee for this positive comment.

However, some points need to be clarified or discussed.

Line 33: “Mature particles are then released through cell lysis roughly 8h post infection”

The authors should specify that the cycle can last more than 8 hours and that, experimentally, some virions are released at 13-16h and even up to 24h.

We modified the introduction with the following sentence to take this comment into account: The replication cycle can last more than 8h with virion released at 13-16h up to 24h pi.

Lines 94-95: “We performed a protein-coding genes re-annotation of all the sequences using the same gene-finding algorithm: genemarks [20] version 4.32 with the “virus” parameter”

Given the proportion of genes of bacterial origin in marseilleviruses, a brief analysis of whether ORF prediction differs significantly between the “virus” and “prokaryotic” parameters may be interesting.

Following the reviewer’s advice, we predicted ORFs in the marseillevirus genome with the “prokaryote” option of GeneMarkS.

We found exactly the same number of predicted protein-coding genes (509), the only difference being that alternative START codons (coding for valine and leucine) were sometimes used instead of methionine. This only slightly changed the average predicted protein length, from 223.2 to 224.3 amino acids.

We also added the following sentence, which was incomplete in our initial submission, to the materials and methods section: We performed a protein-coding genes re-annotation of all the sequences using the same gene-finding algorithm: GeneMarkS [20] version 4.32 with the “virus” parameter and kept Open Reading Frames (ORF) coding for proteins of at least 50 amino acids.

Lines 103-110: please specify the thresholds for coverage and / or identity percentage to identify orthologous genes?

We did not use any specific threshold to define orthologous genes but used the default values of OrthoFinder. Basically, this software first applies a BLAST e-value threshold of 10⁻³, then normalizes the scores by protein length and phylogenetic distance, and uses reciprocal best length-normalised hits to define orthologous groups.
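The reciprocal-best-hit idea described above can be illustrated with a minimal Python sketch. This is not OrthoFinder's implementation: the length normalisation here (bit score divided by the geometric mean of the two protein lengths) is a simplifying assumption, the phylogenetic-distance normalisation and clustering steps are omitted, and the gene names, lengths and hits are hypothetical.

```python
E_VALUE_CUTOFF = 1e-3  # e-value threshold quoted in the response above


def reciprocal_best_hits(hits, lengths, genome_of):
    """Return gene pairs from different genomes that are each other's best hit.

    hits      : iterable of (query, subject, evalue, bitscore), hypothetical BLAST rows
    lengths   : dict mapping gene -> protein length in amino acids
    genome_of : dict mapping gene -> genome the gene belongs to
    """
    # best[(query, subject genome)] = (normalised score, subject)
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > E_VALUE_CUTOFF or genome_of[query] == genome_of[subject]:
            continue
        # Simplified length normalisation (an assumption, not OrthoFinder's fit).
        score = bitscore / (lengths[query] * lengths[subject]) ** 0.5
        key = (query, genome_of[subject])
        if key not in best or score > best[key][0]:
            best[key] = (score, subject)

    pairs = set()
    for (query, _), (_, subject) in best.items():
        back = best.get((subject, genome_of[query]))
        if back is not None and back[1] == query:
            pairs.add(tuple(sorted((query, subject))))
    return sorted(pairs)


# Hypothetical toy data: two genomes with two genes each.
lengths = {"A1": 200, "A2": 400, "B1": 210, "B2": 390}
genome_of = {"A1": "virusA", "A2": "virusA", "B1": "virusB", "B2": "virusB"}
hits = [
    ("A1", "B1", 1e-50, 300.0),
    ("A1", "B2", 1e-5, 60.0),
    ("A2", "B2", 1e-80, 600.0),
    ("B1", "A1", 1e-50, 300.0),
    ("B2", "A2", 1e-80, 600.0),
    ("B2", "A1", 1e-2, 30.0),  # discarded: e-value above the cutoff
]
pairs = reciprocal_best_hits(hits, lengths, genome_of)
print(pairs)  # [('A1', 'B1'), ('A2', 'B2')]
```

In this toy example the weak A1 to B2 hit passes the e-value filter but loses to the stronger A1 to B1 hit, so only the two reciprocal pairs are retained.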

Please see the following publication for more details on the algorithm: Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015 Aug 6;16(1):157. doi: 10.1186/s13059-015-0721-2. PMID: 26243257; PMCID: PMC4531804.

Lines 256-258: Might the breakpoints identified by the authors be hotspots for gene exchange? This point needs to be discussed.

The predicted ORFs on both sides of these breakpoints should be cited and, if needed, discussed.

Following the reviewer’s advice, we checked for possible enrichment of HGTs in the neighborhood of the breakpoints. On the marseillevirus reference genome, potential HGTs are not specifically enriched in this region but are uniformly distributed across the genome (Kolmogorov-Smirnov test p-value = 0.98). Likewise, the predicted genes in these regions have no particular function. We thus feel that this does not need to be mentioned in the manuscript.

Line 260: Please specify the locations of the pfam02399 domains along the genome.

The protein-coding genes with a pfam02399 domain are evenly distributed along the genomes with no specific positional trend. We added this information to the results section using the following sentence:

The PFAM02399 domain containing genes are evenly distributed along the genomes with no specific trend in their genomic distribution.
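Both statements above concern whether gene positions are spread evenly along a genome. A one-sample Kolmogorov-Smirnov test against a uniform distribution, the test named in the response for potential HGTs, could be sketched as follows. The positions and genome length are hypothetical, the p-value uses the asymptotic Kolmogorov distribution with a common small-sample correction (an approximation), and nothing here reproduces the authors' actual analysis.

```python
import math


def ks_uniform(positions, genome_length):
    """One-sample KS test of positions against Uniform(0, genome_length).

    Returns (D, p). The p-value is asymptotic and only approximate
    for small samples.
    """
    xs = sorted(p / genome_length for p in positions)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # Distance between the uniform CDF (x) and the empirical CDF
        # just before and just after the step at x.
        d = max(d, (i + 1) / n - x, x - i / n)
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The alternating series converges slowly here; p is 1 to high precision.
        return d, 1.0
    # Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lam^2)
    p = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


# Hypothetical gene start positions on a 1000-unit genome.
even_positions = [50, 150, 250, 350, 450, 550, 650, 750, 850, 950]
clustered_positions = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

d_even, p_even = ks_uniform(even_positions, 1000)
d_clustered, p_clustered = ks_uniform(clustered_positions, 1000)
print(d_even, p_even)
print(d_clustered, p_clustered)
```

A large p-value, as for the evenly spaced toy positions, means uniformity cannot be rejected; the clustered positions give a very small p-value. The sketch treats the genome as a linear interval, which ignores the circular topology discussed in the article.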

Line 355: « Early genes are expressed from the beginning of the cycle with a peak of transcriptional activity between 1h pi and 2h pi, intermediate genes are mostly expressed between 1h pi and 4h pi and late genes from 4h pi until the end of the cycle. » Please state clearly when strain-specific genes are mostly expressed.

Neither strain-specific nor clade-specific genes have a biased distribution regarding expression timing. This is now added in the main text:

Conversely strain and clade specific genes are not specifically enriched in one of those classes (Chi-square p-value = 0.1).
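The kind of test quoted above can be illustrated with a Pearson chi-square test of independence on a table of gene counts per expression class. The counts below are hypothetical and the table layout (specific versus other genes across early, intermediate and late classes) is an assumption, since the response does not give the underlying table.

```python
import math


def chi_square_2xk(row_a, row_b):
    """Pearson chi-square test of independence for a 2 x k table of counts.

    Returns (statistic, degrees_of_freedom, p_value). The closed form
    exp(-x / 2) is exact only for 2 degrees of freedom (k = 3), so
    p_value is None for other table shapes in this sketch.
    """
    if len(row_a) != len(row_b):
        raise ValueError("both rows need the same number of classes")
    total_a, total_b = sum(row_a), sum(row_b)
    total = total_a + total_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        column_total = a + b
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column_total / total
            stat += (observed - expected) ** 2 / expected
    dof = len(row_a) - 1
    p_value = math.exp(-stat / 2.0) if dof == 2 else None
    return stat, dof, p_value


# Hypothetical counts per class: (early, intermediate, late).
specific_genes = [30, 30, 40]
other_genes = [30, 60, 110]
stat, dof, p_value = chi_square_2xk(specific_genes, other_genes)
print(stat, dof, p_value)

# Proportional counts show no association at all.
stat_null, _, p_null = chi_square_2xk([20, 30, 50], [40, 60, 100])
print(stat_null, p_null)
```

A p-value above the chosen significance level, such as the 0.1 reported in the response, means the test gives no evidence that specific genes favour one expression class.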

Lines 392-395: For each of these groups of giant viruses, please specify how the structure of the genome was predicted.

The genome topology of all these viruses is based on sequencing read mapping and genome assembly. We have now added this information to the discussion:

The six remaining families, namely the Mimiviridae [43], the pandoraviruses [44], the molliviruses [45], the faustoviruses [39], the pacmanviruses [46] and medusavirus [47] are all predicted, based on sequencing read mapping and genome assembly, to exhibit linear genomes.

Line 417: « Yet, these two viral families infect the exact same host and thus face the same environment. »

Do the authors talk about marseille and pithoviruses? Please clarify. Other giant viruses have the same host as marseilleviruses and pithoviruses. This statement should be nuanced.

In this paragraph our point was to compare the Marseilleviridae, which exhibit a circular genome, to other viruses that infect Acanthamoeba, namely the pandoraviruses, which have a linear genome and a much lower proportion of bacteria-related gene exchange. From there we hypothesize that circular genome topology might facilitate gene exchanges with bacteria. We now also mention the pithoviruses, which also have circular genomes and several bacteria-like genes.

In our opinion, this reinforces our hypothesis. We thus modified the discussion and added the following: Noteworthy, the only giant virus family exhibiting circular genomes, the pithoviruses, also contain a high proportion (38%) of cell-virus potential gene exchanges related to bacteria, although it only accounts for 8% of the total gene set [40]. This again supports the hypothesis that circular genome topology might facilitate gene transfers with this domain.

Author Response File: Author Response.pdf
