Protein Structure, Models of Sequence Evolution, and Data Type Effects in Phylogenetic Analyses of Mitochondrial Data: A Case Study in Birds
Round 1
Reviewer 1 Report
- It is best to give the result of whether the new model can improve the branch length estimation.
- The two terms, historical signal vs non-historical signals, are not clearly defined on page 3.
- Page 7, Figure 2: the panel labels (a) and (b) should be capitalized.
- Check Line 422 and line 479 for subsection number.
Author Response
Reviewer comments are reproduced followed by our responses:
1. It is best to give the result of whether the new model can improve the branch length estimation.
RESPONSE: This is an excellent question, but it is challenging to determine whether or not branch lengths are improved when the model is changed. We have added the following to the last paragraph of section 3.3:
“Both partitioned analysis and use of the mixture model had an impact on branch length estimates; relative to the tree resulting from the all sites unpartitioned analysis, the bird mtMIX treelength was 1.148 and the partitioned analysis treelength was 1.235. The ratio of the sum of the internal branch lengths to the total treelength was virtually identical across analyses (31.98% for the bird mtMIX model, 32.019% for partitioned analysis, and 32.03% for the unpartitioned analysis).”
We focused on the total treelength and the sum of the internal branch lengths because they are the easiest values to understand. We suspect the longer branches in the partitioned and mixture model analyses are better estimates of the true underlying branch lengths, but we should also remain open-minded to the possibility that they are overestimates. Ultimately, simulations are likely to be necessary to answer this question, so we feel that detailed examination of the impact of models on branch lengths lies outside the scope of this manuscript. For this reason, we also state that “However, it will be necessary to conduct simulations to understand whether the branch length estimates based on analyses using mtMIX or partitioned analyses are closer to the true branch lengths.”
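The two summary statistics discussed in this response can be sketched as follows. This is only an illustration: the function names and the branch lengths in the usage example are hypothetical and are not taken from the manuscript or its trees.

```python
def internal_fraction(internal_lengths, terminal_lengths):
    """Return the sum of internal branch lengths divided by the total treelength."""
    internal_sum = sum(internal_lengths)
    total = internal_sum + sum(terminal_lengths)
    if total <= 0:
        raise ValueError("total treelength must be positive")
    return internal_sum / total


def relative_treelength(treelength, reference_treelength):
    """Return a treelength expressed relative to a reference analysis."""
    if reference_treelength <= 0:
        raise ValueError("reference treelength must be positive")
    return treelength / reference_treelength


# Hypothetical branch lengths (substitutions per site), for illustration only.
internal = [0.1, 0.2]
terminal = [0.3, 0.4]
fraction = internal_fraction(internal, terminal)
ratio = relative_treelength(2.0, 1.0)
```

Under this convention, a relative treelength above 1 means the analysis inferred more total change than the reference analysis, while a similar internal fraction means the extra length is spread roughly proportionally over internal and terminal branches.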
2. The two terms, historical signal vs non-historical signals, are not clearly defined on page 3.
RESPONSE: We have added the following definitions:
- historical signal = site patterns that support bipartitions in the true tree
- non-historical signals = site patterns that support bipartitions that are not present in the true tree
We acknowledge that “non-historical signals” are model dependent, at least to some degree. A site pattern that supports an erroneous bipartition (i.e., the site pattern has a better likelihood on the incorrect tree than on the correct tree) given one specific model might not support that erroneous bipartition given a different model. We feel the model dependence is clear given our discussion of models.
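The likelihood criterion in the parenthetical above can be illustrated with a minimal sketch. It assumes that per-site log-likelihoods have already been computed under a single model for the correct tree and for one alternative tree, and it compares whole trees rather than individual bipartitions, which is a simplification. The function name, labels, and numbers are hypothetical.

```python
def classify_sites(lnl_correct, lnl_incorrect, tol=0.0):
    """Label each site by which tree gives it the better log-likelihood.

    lnl_correct and lnl_incorrect are per-site log-likelihoods computed
    under the same model on the correct and the incorrect tree.
    """
    if len(lnl_correct) != len(lnl_incorrect):
        raise ValueError("both trees must have one value per site")
    labels = []
    for correct, incorrect in zip(lnl_correct, lnl_incorrect):
        diff = correct - incorrect
        if diff > tol:
            labels.append("historical")
        elif diff < -tol:
            labels.append("non-historical")
        else:
            labels.append("neutral")
    return labels


# Hypothetical per-site log-likelihoods, for illustration only.
labels = classify_sites([-2.0, -3.5, -1.0], [-2.5, -3.0, -1.0])
```

Because the per-site values depend on the model used to compute them, rerunning the same comparison under a different model can change the label of a site, which is the model dependence acknowledged above.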
3. Page 7, Figure 2: the panel labels (a) and (b) should be capitalized.
RESPONSE: Fixed.
4. Check Line 422 and line 479 for subsection number.
RESPONSE: Fixed.
Reviewer 2 Report
The bird phylogeny is a “phylogenetic model system” to test new ideas and models because of the low signal in the deep branches of the tree, which has resulted in many conflicting hypotheses. The mitochondrial genome is often not considered informative enough to help on this issue because of its limited number of base pairs and non-recombining nature.
In this article, the authors investigate whether incorporating biochemical information into phylogenetic models impacts phylogenetic reconstruction based on mitogenomes. Mitochondrially encoded proteins are partitioned into transmembrane and extra-membrane parts across a large number of birds. The experiments are carefully conducted and explicitly justified. The article finds that the best fitting substitution models differ between the transmembrane and extra-membrane partitions of mitochondrial genes. Accounting for these differences made no significant difference in improving the tree topology in the case of birds, but may prove important in other taxa.
These results will prove important in a range of taxa, because mitogenomic data is often the first sequence data to be investigated in a group of interest and because of its relative ease of analysis. The article contributes an important consideration when designing a mitogenomic study, but is also careful in not overstating the importance of mitogenomics in the age of whole nuclear genomes.
I would like to emphasize how much I enjoyed reading this article. The introduction in particular stands out for its clear style and readability and is an excellent primer on phylogenetic incongruence.
I only have minor comments.
The use of the abbreviations TM and ExM vs. mtTM and mtExM changes throughout the manuscript. It would be best to use one version throughout.
Minor points
L.113 Punctuation missing
L.137,138 What did the two studies do to reach this virtual certainty? Were these simulations or investigations on a large number of mitogenomes?
L.187 bracket missing
L.194 Indicate how alignment was done, I assume by eye. Were there any insertions or deletions and how were they dealt with?
L.205 typo phylogeny?
L.395 should be figure 5
Author Response
We have reproduced the reviewer comments followed by our responses.
The bird phylogeny is a “phylogenetic model system” to test new ideas and models because of the low signal in the deep branches of the tree, which has resulted in many conflicting hypotheses. The mitochondrial genome is often not considered informative enough to help on this issue because of its limited number of base pairs and non-recombining nature.
In this article, the authors investigate whether incorporating biochemical information into phylogenetic models impacts phylogenetic reconstruction based on mitogenomes. Mitochondrially encoded proteins are partitioned into transmembrane and extra-membrane parts across a large number of birds. The experiments are carefully conducted and explicitly justified. The article finds that the best fitting substitution models differ between the transmembrane and extra-membrane partitions of mitochondrial genes. Accounting for these differences made no significant difference in improving the tree topology in the case of birds, but may prove important in other taxa.
These results will prove important in a range of taxa, because mitogenomic data is often the first sequence data to be investigated in a group of interest and because of its relative ease of analysis. The article contributes an important consideration when designing a mitogenomic study, but is also careful in not overstating the importance of mitogenomics in the age of whole nuclear genomes.
I would like to emphasize how much I enjoyed reading this article. The introduction in particular stands out for its clear style and readability and is an excellent primer on phylogenetic incongruence.
RESPONSE: We thank the reviewer for these kind words.
I only have minor comments.
The use of the abbreviations TM and ExM vs. mtTM and mtExM changes throughout the manuscript. It would be best to use one version throughout.
RESPONSE: We intended to use TM and ExM versus mtTM and mtExM in the following way:
- TM and ExM: amino acids (or codons) in transmembrane or extramembrane regions
- mtTM and mtExM: the models (the R matrices) for the TM and ExM data types, respectively.
We understand the reviewer’s concern and wanted to address this in two ways:
- We clarified the meaning of these terms. To do this we modified the methods to read as follows:
“We analyzed three amino acid datasets (TM sites, ExM sites, and all sites) using the GTR20 and mtVer [69] models. We accommodated among sites rate heterogeneity using a combination of invariant sites and Γ-distributed rates across sites. We used empirical amino acid frequencies (+F) for the mtVer. For the partitioned analysis we fixed R matrix parameters at the values estimated using the separate TM and ExM alignments, which we call the bird mtTM model and bird mtExM model (hereafter, TM and ExM will be used as abbreviations for transmembrane and extramembrane sites while mtTM and mtExM will be used for the R matrices).” The parenthetical that begins “hereafter, TM and ExM will be used…” was added.
- We checked to make sure we used the terms consistently. We found one place where it was appropriate to revise the text (in the first paragraph of section 3.3). The terms are now used in a consistent manner.
Minor points
L.113 Punctuation missing
RESPONSE: Fixed.
L.137,138 What did the two studies do to reach this virtual certainty? Were these simulations or investigations on a large number of mitogenomes?
RESPONSE: We have rewritten this as “When the results of Jones et al. [59] and Liò and Goldman [60] (which largely reflect nuclear-encoded TM proteins) are considered in light of the support for different relative exchangeabilities of amino acids in distinct structural environments [17,29–31], it seems likely that analyses focused on mitochondrially-encoded proteins will yield evidence of model differences between data types.” This weakens the statement from “virtual certainty” to “likely” and emphasizes the basis of our logic.
Our goal was to set up the idea that whether or not significant topological differences would be evident in trees based on TM versus ExM sites is a very open question, but it seems much more likely that the relative exchangeability (R) matrices will differ. However, “virtually certain” is definitely an overstatement and we thank the reviewer for pointing this out.
L.187 bracket missing
RESPONSE: We changed the text to eliminate the need for a bracket and improve clarity. The complete sentence now reads “To complement our analyses of amino acid data, we analyzed the nucleotide sequences for each data type (including analyses conducted after RY-coding, in which the data are encoded as purines or pyrimidines).”
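For readers unfamiliar with RY-coding, a minimal sketch follows. It recodes each nucleotide as a purine (R) or a pyrimidine (Y); the function name and the handling of gaps and ambiguity codes are assumptions made for illustration and do not describe the scripts used in the manuscript.

```python
# Purines (A, G) become R; pyrimidines (C, T, U) become Y.
RY_MAP = {"A": "R", "G": "R", "R": "R", "C": "Y", "T": "Y", "U": "Y", "Y": "Y"}


def ry_code(sequence):
    """Recode a nucleotide sequence into the two-state R/Y alphabet.

    Gaps ('-') are kept as they are. Any other symbol, such as N or an
    ambiguity code that mixes purines and pyrimidines, becomes '?'.
    This treatment of non-standard symbols is an assumption.
    """
    recoded = []
    for base in sequence.upper():
        if base == "-":
            recoded.append("-")
        else:
            recoded.append(RY_MAP.get(base, "?"))
    return "".join(recoded)


example = ry_code("ACGTN-")
```

RY-coding discards the distinction between transitions within each class (A/G and C/T), so only transversions remain visible in the recoded alignment.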
L.194 Indicate how alignment was done, I assume by eye. Were there any insertions or deletions and how were they dealt with?
RESPONSE: Yes, they were aligned by eye. The lengths of avian mitochondrial sequences are highly conserved and indels are relatively easy to place. We have added a sentence to indicate this.
L.205 typo phylogeny?
RESPONSE: Fixed.
L.395 should be figure 5
RESPONSE: Fixed.
Reviewer 3 Report
Gordon et al. present a study of 420 species from 18 orders of birds, comparing the evolution of the transmembrane helices and extramembrane segments of their mitochondrial proteins, with the aim of testing the hypothesis that hydrophobic amino acids in mitochondrially encoded proteins are associated with non-historical signals. In my opinion the analysis is very accurate and thorough, and the resulting conclusions are convincing and relevant for the advancement of phylogenomic studies of birds and of other taxa in general.
My few comments are in the attached file.
Comments for author File: Comments.pdf
Author Response
Gordon et al. present a study of 420 species from 18 orders of birds, comparing the evolution of the transmembrane helices and extramembrane segments of their mitochondrial proteins, with the aim of testing the hypothesis that hydrophobic amino acids in mitochondrially encoded proteins are associated with non-historical signals. In my opinion the analysis is very accurate and thorough, and the resulting conclusions are convincing and relevant for the advancement of phylogenomic studies of birds and of other taxa in general.
My few comments are in the attached file.
RESPONSE: We thank the reviewer for this positive evaluation, and we have addressed the concerns that the reviewer highlighted in the pdf. We chose not to make one specific change: the reviewer suggested that we move the statement “Arguably, mitochondrial sequence data have the greatest potential as sources of information for biodiversity studies near the tips of the vertebrate tree of life [84-86]. Thus, it would be desirable to assess the impact of protein structure on analyses of nucleotide data.” from section 3.4 to the introduction or discussion. We understand the reviewer’s point, but this is a very short statement that clarifies the transition from earlier sections to section 3.4, so we prefer to keep it where it is. We made all of the other changes that the reviewer recommended.