Article

The Roles of Protein Structure, Taxon Sampling, and Model Complexity in Phylogenomics: A Case Study Focused on Early Animal Divergences

1 Department of Biology, University of Florida, Gainesville, FL 32611, USA
2 Kronos Bio, Inc., Cambridge, MA 02142, USA
* Author to whom correspondence should be addressed.
Academic Editor: Luciano A. Abriata
Biophysica 2021, 1(2), 87-105; https://doi.org/10.3390/biophysica1020008
Received: 25 January 2021 / Revised: 18 February 2021 / Accepted: 28 February 2021 / Published: 25 March 2021
Despite the long history of using protein sequences to infer the tree of life, the potential for different parts of protein structures to retain historical signal remains unclear. We propose that it might be possible to improve analyses of phylogenomic datasets by incorporating information about protein structure. We test this idea using the position of the root of Metazoa (animals) as a model system. We examined the distribution of “strongly decisive” sites (alignment positions that support a specific tree topology) in a dataset comprising >1500 proteins and almost 100 taxa. The proportion of each class of strongly decisive sites in different structural environments was very sensitive to the model used to analyze the data when a limited number of taxa were used, but it was stable when taxa were added. As long as enough taxa were analyzed, sites in all structural environments supported the same topology regardless of whether standard tree searches or decisive sites were used to select the optimal tree. However, the use of decisive sites revealed a difference among structural environments in the support for minority topologies: buried sites and sites in sheet and coil environments exhibited equal support for the minority topologies, whereas solvent-exposed and helix sites had unequal numbers of sites supporting the minority topologies. This suggests that the relatively slowly evolving buried, sheet, and coil sites give an accurate picture of the true species tree and of the amount of conflict among gene trees. Taken as a whole, this study indicates that phylogenetic analyses using sites in different structural environments can yield different topologies for the deepest branches in the animal tree of life and that analyzing larger numbers of taxa eliminates this conflict. More broadly, our results highlight the desirability of incorporating information about protein structure into phylogenomic analyses.
Keywords: protein structure; relative solvent accessibility; secondary structure; phylogeny; models of sequence evolution; gene tree–species tree discordance; incomplete lineage sorting; Ctenophora; Porifera
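
The idea of counting decisive sites per structural environment can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that per-site log-likelihoods are already available for each candidate topology, and it treats a site as decisive when the best topology beats the runner-up by more than a cutoff. The cutoff of 0.5, the topology names `T1` to `T3`, the function names, and the toy numbers are all hypothetical choices made for illustration.

```python
from collections import Counter


def classify_sites(site_lnl, threshold=0.5):
    """Label each site with the topology it decisively supports, else None.

    site_lnl maps a topology name to a list of per-site log-likelihoods.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    topologies = list(site_lnl)
    n_sites = len(site_lnl[topologies[0]])
    labels = []
    for i in range(n_sites):
        # Rank topologies from highest to lowest log-likelihood at this site.
        ranked = sorted(topologies, key=lambda t: site_lnl[t][i], reverse=True)
        best, runner_up = ranked[0], ranked[1]
        margin = site_lnl[best][i] - site_lnl[runner_up][i]
        labels.append(best if margin > threshold else None)
    return labels


def count_by_environment(labels, environments):
    """Tally decisive sites per topology within each structural environment."""
    counts = {}
    for label, env in zip(labels, environments):
        if label is not None:
            counts.setdefault(env, Counter())[label] += 1
    return counts


# Toy data: four sites, three hypothetical topologies.
site_lnl = {
    "T1": [-2.0, -3.0, -1.0, -4.0],
    "T2": [-3.0, -3.1, -2.5, -2.0],
    "T3": [-3.5, -3.2, -1.2, -5.0],
}
environments = ["buried", "buried", "exposed", "exposed"]

labels = classify_sites(site_lnl)
counts = count_by_environment(labels, environments)
print(labels)
print(counts)
```

Comparing the per-environment tallies for the two minority topologies is then a matter of reading the relevant counters; whether those counts are equal or unequal is the contrast the abstract describes.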

MDPI and ACS Style

Pandey, A.; Braun, E.L. The Roles of Protein Structure, Taxon Sampling, and Model Complexity in Phylogenomics: A Case Study Focused on Early Animal Divergences. Biophysica 2021, 1, 87-105. https://doi.org/10.3390/biophysica1020008

AMA Style

Pandey A, Braun EL. The Roles of Protein Structure, Taxon Sampling, and Model Complexity in Phylogenomics: A Case Study Focused on Early Animal Divergences. Biophysica. 2021; 1(2):87-105. https://doi.org/10.3390/biophysica1020008

Chicago/Turabian Style

Pandey, Akanksha, and Edward L. Braun 2021. "The Roles of Protein Structure, Taxon Sampling, and Model Complexity in Phylogenomics: A Case Study Focused on Early Animal Divergences" Biophysica 1, no. 2: 87-105. https://doi.org/10.3390/biophysica1020008
