Diversity 2019, 11(7), 109; https://doi.org/10.3390/d11070109

Article
A Phylogenomic Supertree of Birds
1 Department of Biology, University of Florida, Gainesville, FL 32607, USA
2 Department of Biological Sciences, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
3 Department of Ecology & Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
4 Department of Vertebrate Zoology, MRC 163, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013, USA
5 Behavior, Ecology, Evolution and Systematics Program, University of Maryland, College Park, MD 20742, USA
6 Department of Ecology, Evolution and Behavior; and, Bell Museum of Natural History, St. Paul, MN 55108, USA
7 Department of Earth Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EQ, UK
8 Bruce Museum, Greenwich, CT 06830, USA
9 U.S. Geological Survey, Patuxent Wildlife Research Center, 12100 Beech Forest Rd., Laurel, MD 20708, USA
10 Division of Birds, National Museum of Natural History, Smithsonian Institution, Washington, DC 20013, USA
11 Department of Ecology and Evolutionary Biology and Biodiversity Institute, University of Kansas, Lawrence, KS 66045, USA
12 Museum of Natural Science, Louisiana State University, 119 Foster Hall, Baton Rouge, LA 70803, USA
13 Department of Ornithology, American Museum of Natural History, Central Park West at 79th Street, New York, NY 10024, USA
* Author to whom correspondence should be addressed.
Received: 31 May 2019 / Accepted: 7 July 2019 / Published: 10 July 2019

Abstract
It has long been appreciated that analyses of genomic data (e.g., whole-genome sequencing or sequence capture) have the potential to reveal the tree of life, but it remains challenging to move from sequence data to a clear understanding of evolutionary history, in part due to the computational challenges of phylogenetic estimation using genome-scale data. Supertree methods solve that challenge because they facilitate a divide-and-conquer approach for large-scale phylogeny inference by integrating smaller subtrees in a computationally efficient manner. Here, we combined information from sequence capture and whole-genome phylogenies using supertree methods. However, the available phylogenomic trees had limited overlap, so we used taxon-rich (but not phylogenomic) megaphylogenies to weave them together. This allowed us to construct a phylogenomic supertree, with support values, that included 707 bird species (~7% of avian species diversity). We estimated branch lengths using mitochondrial sequence data and we used these branch lengths to estimate divergence times. Our time-calibrated supertree supports radiation of all three major avian clades (Palaeognathae, Galloanseres, and Neoaves) near the Cretaceous-Paleogene (K-Pg) boundary. The approach we used will permit the continued addition of taxa to this supertree as new phylogenomic data are published, and it could be applied to other taxa as well.
Keywords:
supertree; phylogeny; avian; genome; molecular clock; fossil calibrations; timetree; bootstrap support; phylogenomics

1. Introduction

Next generation sequencing (NGS) technologies have revolutionized systematics by permitting the collection of genome-scale datasets that can be used for phylogenetic analyses [1]. The availability of large-scale data has led to the genesis of a novel field of study within evolutionary biology that combines high-throughput data collection with evolutionary analysis: phylogenomics [2,3]. While the earliest phylogenomic studies focused on the use of comparative data to infer gene function [4,5], the focus shifted to efforts that used data from traditional Sanger sequencing combined with high-throughput workflows to understand evolutionary history [6,7,8]. Even more recently, the definition of the term has undergone another change; “phylogenomics” is now most often used to refer to studies that leverage NGS technologies to collect large amounts of data to inform phylogenetic reconstruction. NGS is often used for whole-genome sequencing (WGS) [9,10,11], but factors such as the cost of data collection and the computational burden associated with genome assembly have catalyzed the development of reduced-representation sequencing methods. Those methods economize data collection by reducing the amount of DNA targeted for sequencing to a subsample of the genome enriched for specific loci of interest.
Three reduced-representation sequencing methods have been used extensively in evolutionary biology: (1) transcriptome sequencing; (2) RADseq; and (3) sequence capture. All three methods have benefits and drawbacks in the context of specific studies [12]. Transcriptome sequencing has been used for phylogenetics [13,14] but it requires tissues that have been collected and preserved in an appropriate manner [15]. RADseq, which targets sequences located near specific restriction enzyme cut sites [16], is most useful for examining shallow divergences [17] (but see [18]) and requires relatively high molecular weight DNA samples for restriction enzymes to work efficiently [19]. In contrast to these two methods, sequence capture approaches [20] have unique advantages that make them ideal for projects that need to leverage the many tissue types available in natural history collections. For example, sequence capture techniques can be used with older museum specimens that may only yield small amounts of degraded DNA [21,22,23]. Additionally, sequence capture can target many different regions of interest, including loci originally identified in transcriptome sequencing [24] or RADseq [25,26] studies. Overall, sequence capture methods are potentially the best approach for leveraging natural history collections and achieving the dream of constructing a (nearly) complete tree of life with resources already available [27].
Birds are a group for which NGS approaches have been used with great success: to sequence genomes [28], to study adaptation [29], to understand gene function [30], and to clarify phylogenetic relationships (Table 1). As with many other groups, the evolutionary history of birds includes many rapid radiations that occurred at varying times during their evolution, making the avian tree difficult to resolve with small numbers of loci. Of particular importance is the rapid radiation early in the history of Neoaves (the clade comprising 95% of extant avian species) that has made it difficult to generate a well-supported hypothesis of relationships among extant avian orders (reviewed by [31]). A number of other rapid radiations within avian orders have also proven difficult to resolve using data generated by traditional Sanger sequencing [32,33,34,35]. The much larger datasets that NGS can produce [36,37,38] have provided the first evidence for several superordinal avian clades [39]. Similar progress has been made for difficult nodes within orders: in these cases, sequence capture studies targeting ultraconserved element (UCE) loci have provided most of the data [40,41,42]. The successful use of UCE data in many studies (Table 1) suggests they could be the key to building a well-resolved avian phylogeny that includes all extant bird species.
In parallel with ongoing efforts to collect phylogenomic data from birds, there are efforts to build “megaphylogenies” that include most or all extant avian species [43,44]. Most of those efforts are meta-analyses that build taxon-rich trees by synthesizing information generated in previous studies. These studies use two basic methodologies. Supermatrix methods compile as much raw character evidence (typically sequence data) as possible, generate a large data matrix, and use that matrix for large-scale phylogenetic analyses [45,46,47]. Supertree methods combine trees that were estimated in previous studies to yield a larger, more inclusive tree [48,49,50,51]. Unlike supermatrix methods, supertrees can include taxonomic information along with other source trees, making it possible to generate supertrees that include all named taxa in major clades [44,49]. Supermatrix and supertree approaches represent different ways to generate large, synthetic trees, and aspects of both can be combined [52,53]. In fact, the most extensively used avian megaphylogeny [54] was constructed by constraining a supermatrix analysis and then incorporating data-deficient taxa using simulations and taxonomic information. Thus far, limited amounts of phylogenomic data have been used in megaphylogenies, although it is clear that “phylogenomic megaphylogenies” have the potential to yield a strongly supported tree when genome-scale data become available for a sufficient number of species.
As phylogenomic data are rapidly accumulating, a remaining question is how to best leverage those data to build taxon-rich phylogenies for many taxonomic groups, including birds. The supermatrix approach has obvious appeal since it simply involves combining sequence data and conducting standard phylogenetic analyses. However, supermatrix analyses require at least some overlapping loci among studies (Figure 1a). Producing the supermatrices is conceptually straightforward if WGS data are available, since loci of interest can be extracted from the genome assemblies. This is not the case for data generated by reduced-representation sequencing. At this time, most (but not all) avian phylogenomic studies have used ultraconserved element (UCE) sequence capture (Table 1). However, even within UCE studies, two different probe sets targeting different numbers of UCE loci have been used (Table 1). While continued data collection may permit the eventual construction of large data matrices with many overlapping loci, those datasets are currently unavailable for birds (and many other groups of organisms). An additional problem is that analyzing supermatrices with limited sequence overlap among loci often yields large numbers of equally optimal trees, a phenomenon referred to as “terraces in phylogenetic treespace” [76,77,78]. Finally, the computational burden imposed by phylogenomic supermatrix analyses, both for matrix assembly and for the tree searches, makes it difficult to update supermatrix phylogenies as additional phylogenomic data become available.
Supertrees get around many of the limitations of supermatrices; they do not require overlapping loci and they can be very computationally efficient. The overarching goal of this study is to examine the feasibility of building a “phylogenomic supertree” of birds that can be updated rapidly as additional trees based on large amounts of sequence data are published. To accomplish this goal, we identified source trees generated using sequence capture and WGS (Table 1) and we combined those trees using two computationally efficient supertree methods. Supertrees require overlapping taxa (Figure 1b); since the available phylogenomic studies for birds have limited overlap (Figure 2), for this initial attempt we also included three avian megaphylogenies [43,44,54] to “stitch” the phylogenomic studies together into a single tree. Then we compared the supertree to those megaphylogenies to determine the ways in which phylogenomic data have altered our existing hypotheses about avian evolution. Finally, we used a combination of molecular data and the fossil record to establish a temporal framework for our supertree. Our results show that combining data from sequence capture with other sources of phylogenomic data will allow rapid progress toward the goal of a strongly supported phylogenomic supertree of life.

2. Materials and Methods

2.1. Source Tree Selection and Taxonomic Reconciliation

We identified 30 published phylogenomic trees (Table 1), including two studies [39,67] that extracted large amounts of “legacy” data (loci used for older Sanger sequencing studies [79]) from genome assemblies. Those studies allowed us to include taxa with published genome assemblies that were not included in other source trees such as the extinct elephant birds (order Aepyornithiformes; [67]). When studies were redundant (i.e., based on sequence data that overlap extensively with the data used to generate another tree), we chose the most taxon-rich tree as the source tree. Ultimately, we used 22 source trees (Table 1). Taxon names were converted to those in the IOC World Bird List (v. 7.3) [80]. The species names for extinct taxa included in our supertree (elephant birds and moas) were added to our working taxonomy. There were also two cases in which a species recognized by IOC was not monophyletic based on the inclusion of multiple individuals in the source study; we created a new taxon name for those taxa by appending the subspecies (one case) or geographic region (one case) to the original name. This resulted in 707 taxa sampled in these source trees.
Because the phylogenomic trees included had very few overlapping taxa (Figure 2), we used three avian megaphylogenies [43,44,54] (hereafter called BigBird, Brown, and Jetz) as backbones to link the phylogenomic source trees together. For the Jetz backbone [54], we downloaded 1000 trees with the Hackett et al. [8] constraints (on 27 September 2018) for the 677 taxa that were also present in our source trees from http://birdtree.org and generated the majority rule extended (“greedy”) consensus of those trees. For the BigBird backbone [43], we extracted a subtree of the Burleigh et al. [43] maximum likelihood (ML) tree limited to the 598 taxa that were also in the phylogenomic source trees. For the Brown backbone [44], we extracted the 689 taxa that overlapped with our phylogenomic source trees.

2.2. MRP and MRL Supertree Searches

We used matrix representation with parsimony (MRP) and matrix representation with likelihood (MRL) to generate supertrees. We converted the source trees to binary matrices using the Baum–Ragan coding method [81,82] in CLANN [83] and analyzed those data matrices using PAUP* 4.0a163 [84] for MRP analyses and IQ-TREE 1.6.3 [85] for MRL analyses. We used the parsimony ratchet [86] to search treespace for our MRP analyses; the ratchet is useful when tree searching is slowed by long periods of branch swapping on large islands of equally parsimonious trees. We used “ratchblock” (a program originally used by Yuri et al. [87] that is now available from https://github.com/ebraun68/ratchblock) to generate a PAUP block with instructions for the ratchet searches. The instructions in the PAUP block generated a starting tree by stepwise addition, increased the weights for sets of randomly chosen characters to two, and subjected the reweighted data matrix to branch swapping, holding a single tree at any time (multrees = no). After a short round of branch swapping, the original (equal) weights were restored, another round of branch swapping was conducted, and the tree was saved. The optimal tree from this random reweighting-branch swapping procedure was then used as the starting tree for another cycle of reweighting and branch swapping. The search was completed by conducting a final round of tree bisection and reconnection (TBR) branch swapping (using “set maxtrees = 1000 increase = no”) on the shortest trees identified by all ratchet cycles. For this study, we conducted five searches, each with 100 ratchet replicates, that reweighted different percentages of characters (15%, 20%, 25%, 25%, and 30%). We collapsed branches to form polytomies if the minimum branch length was zero (pset collapse = minBrlen). We viewed the strict consensus of all MP trees as the phylogenomic supertree.
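The Baum–Ragan coding step can be illustrated with a minimal Python sketch. It is not the CLANN implementation used in this study; it assumes rooted source trees written as nested tuples of taxon names (a hypothetical representation chosen for brevity) and ignores tree weights. Each non-root internal node becomes one binary character: taxa inside the clade are scored "1", other taxa in that source tree are scored "0", and taxa absent from the source tree are scored "?".

```python
def leaves(tree):
    """Return the set of taxon names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree, is_root=True):
    """Yield the leaf set of every internal node except the root."""
    if isinstance(tree, str):
        return
    if not is_root:
        yield frozenset(leaves(tree))
    for child in tree:
        yield from clades(child, is_root=False)


def baum_ragan(source_trees):
    """Code a list of source trees as a taxon -> character-string matrix."""
    all_taxa = sorted(set().union(*(leaves(t) for t in source_trees)))
    matrix = {taxon: [] for taxon in all_taxa}
    for tree in source_trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    matrix[taxon].append("?")   # taxon missing from this tree
                elif taxon in clade:
                    matrix[taxon].append("1")   # taxon inside the clade
                else:
                    matrix[taxon].append("0")   # taxon in tree, outside clade
    return {taxon: "".join(chars) for taxon, chars in matrix.items()}


# Two toy source trees with partial taxon overlap (hypothetical taxa A-E).
toy_trees = [(("A", "B"), ("C", "D")), (("B", "C"), "E")]
toy_matrix = baum_ragan(toy_trees)
```

In the toy example, taxon E is absent from the first tree and therefore receives "?" for both of that tree's characters, which is how limited taxon overlap translates into missing data in the matrix.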
MRL [88] is analogous to MRP, but it uses the symmetric two-state model (the Cavender-Farris-Neyman [CFN] model; [89,90,91]) to analyze the Baum–Ragan matrix. For MRL, we used the options “-st BIN” and “-m JC2+FQ+G4” in IQ-TREE (note that the CFN model is called “JC2” in IQ-TREE).
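For readers who wish to script the MRL search, the options named above can be assembled into a command as in the following sketch. The executable name, the file name, and the output prefix are assumptions for illustration only; "-s" (input alignment) and "-pre" (output prefix) are standard IQ-TREE options, and the remaining options are those given in the text.

```python
def mrl_command(matrix_path, prefix, executable="iqtree"):
    """Build (but do not run) an argument list for an MRL search."""
    return [
        executable,          # assumed name of the IQ-TREE binary
        "-s", matrix_path,   # Baum-Ragan matrix (hypothetical file name)
        "-st", "BIN",        # treat the data as binary characters
        "-m", "JC2+FQ+G4",   # CFN model as named in IQ-TREE, per the text
        "-pre", prefix,      # prefix for output files (hypothetical)
    ]


cmd = mrl_command("supertree_matrix.phy", "mrl_all_backbones")
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where IQ-TREE is installed.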
Several methods to examine support for supertrees have been proposed (e.g., the QS method [92], its variants [93], and bootstrapping methods [94,95]). However, only some are useful for the current problem given the limited degree of overlap among our source trees. We used two approaches on the supertree based on all three backbones: (1) a simple bootstrap approach, and (2) MRL analysis with branch support values. For the bootstrap approach, we built 100 Baum–Ragan data matrices that each included a single tree from the set of bootstrap trees distributed for the BigBird backbone and for three phylogenomic trees (Jarvis [37], Prum [38], and Reddy [39]). For Jetz, birdtree.org distributes samples from a Bayesian Markov chain Monte Carlo (MCMC) chain (not bootstrap trees); we used the first 100 trees that we downloaded (thus, our bootstrap analysis actually includes bootstrap trees and trees sampled from an MCMC chain). The optimal tree was used for all other source trees. Then we analyzed the Baum–Ragan matrices using MRP and MRL. For the second approach, we tested two computationally efficient methods to examine ML branch support in our MRL tree: (1) the approximate likelihood ratio test (aLRT; [96]), and (2) the Bayesian-like transformation of the aLRT (aBayes; [97]). Those tests were conducted in IQ-TREE using the “-alrt 0” and “-abayes” options.
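The assembly of the bootstrap replicate tree sets can be sketched as follows. This is an illustration rather than the script used for the analyses: the dictionary layout and source names are hypothetical, and pairing the i-th resampled tree from each source within replicate i is an assumption (the text specifies only that each matrix included a single resampled tree per source).

```python
def bootstrap_tree_sets(optimal, resampled, n_reps=100):
    """Return n_reps dictionaries mapping source name -> tree.

    optimal:   source name -> optimal tree (used when no resampled trees exist)
    resampled: source name -> list of bootstrap or MCMC trees
    """
    for name, trees in resampled.items():
        if len(trees) < n_reps:
            raise ValueError(f"{name} has fewer than {n_reps} resampled trees")
    replicates = []
    for i in range(n_reps):
        tree_set = {}
        for name, tree in optimal.items():
            # Assumption: replicate i takes the i-th resampled tree per source.
            tree_set[name] = resampled[name][i] if name in resampled else tree
        replicates.append(tree_set)
    return replicates


# Toy example with placeholder strings standing in for trees.
optimal = {"BigBird": "bb_ml", "Jarvis": "jarvis_ml", "UCE_study": "uce_ml"}
resampled = {"BigBird": ["bb_0", "bb_1"], "Jarvis": ["j_0", "j_1"]}
reps = bootstrap_tree_sets(optimal, resampled, n_reps=2)
```

Each replicate tree set would then be converted to its own Baum–Ragan matrix and analyzed with MRP or MRL.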
We analyzed seven supertree matrices. All matrices included the phylogenomic trees, but the matrices differed in which backbone trees were included. Three matrices included a single backbone tree, three included two of the three backbone trees, while one included all three backbone trees. Based on initial results, we also generated one constraint tree that enforced monophyly of most IOC orders and uncontroversial groups (e.g., Palaeognathae, Neognathae; the constraint tree is in Supplementary Materials). There are two cases for which monophyly of IOC orders is controversial (Caprimulgiformes and Pelecaniformes); the taxa in these orders were not constrained to be monophyletic in the constraint tree. We used this constraint tree only for an analysis with the BigBird-only backbone (see below). All source trees, including the megaphylogeny backbone trees and the constraint tree, are available in Supplementary Materials.
We weighted the input trees based approximately on the amount of underlying data. We wanted the phylogenomic trees to dominate, so we gave the three backbone trees (Jetz, BigBird, and Brown) a weight of one. All other input trees were weighted more heavily, with input tree weighting scaled to the number of backbone trees in an analysis (i.e., if a matrix contained a single backbone, then the base weights were multiplied by one; if two backbone trees the base weights were doubled, and for the analysis with all three backbone trees the base weights were multiplied by three). The two trees including extensive legacy data (Reddy [39] and Yonezawa [67]) were each given base weights of two, while Prum [38] (which contained 259 loci) was given a base weight of three. The UCE trees were each given a base weight of four, while the WGS trees had a base weight of eight. McCormack et al. [36] reported two UCE trees based on analyses of different (but strongly overlapping) data matrices; both trees were included as source trees, but we assigned each of them a base weight of two (totaling four, equal to other UCE trees). In all cases, we weighted trees by including each source tree the appropriate number of times in the file used as input for CLANN.
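The weighting scheme amounts to simple arithmetic, sketched below. The category labels are hypothetical names introduced for this illustration; the base weights and the scaling rule are those stated above.

```python
# Base weights from the text; category labels are hypothetical.
BASE_WEIGHTS = {
    "legacy": 2,        # Reddy [39] and Yonezawa [67]
    "prum": 3,          # Prum [38], 259 loci
    "uce": 4,           # UCE trees
    "uce_split": 2,     # each of the two McCormack et al. [36] trees
    "wgs": 8,           # whole-genome trees
}


def copies(category, n_backbones):
    """Number of times a source tree is written to the supertree input file."""
    if category == "backbone":
        return 1                       # backbones always have weight one
    return BASE_WEIGHTS[category] * n_backbones


def expand(trees, n_backbones):
    """Repeat each (name, category) entry according to its weight."""
    expanded = []
    for name, category in trees:
        expanded.extend([name] * copies(category, n_backbones))
    return expanded


example = expand([("Jetz", "backbone"), ("Prum", "prum")], n_backbones=3)
```

With all three backbones, for instance, a UCE tree would be included 12 times and a WGS tree 24 times, whereas each backbone tree would appear once.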
We used normalized Robinson-Foulds (RF) distances [98] to examine differences among consensus trees. Briefly, the “treedist” function in PAUP* was used to calculate symmetric tree distances (=2 × RF), which were then divided by the maximum possible symmetric distance for two fully resolved trees with the same number of taxa as the supertree. For comparisons to the megaphylogenies, we pruned the supertrees to overlapping taxa before calculating normalized RF distances. We also used these RF distances to generate a “tree-of-trees” (a cluster analysis showing the similarities among trees). This was accomplished by converting the RF distance matrix to NEXUS format and then clustering the trees by neighbor joining in PAUP*.
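The normalization can be made concrete with a short sketch. It is not the PAUP* procedure itself: it assumes unrooted trees on the same taxon set, each summarized by its non-trivial bipartitions (given as the set of taxa on one side), and it takes the maximum symmetric distance for two fully resolved unrooted trees with n taxa to be 2(n − 3).

```python
def canonical(split, taxa, reference):
    """Represent a bipartition by the side that lacks the reference taxon."""
    side = frozenset(split)
    return side if reference not in side else taxa - side


def normalized_rf(splits_a, splits_b, taxa):
    """Symmetric-difference distance divided by its maximum, 2(n - 3)."""
    taxa = frozenset(taxa)
    reference = min(taxa)
    a = {canonical(s, taxa, reference) for s in splits_a}
    b = {canonical(s, taxa, reference) for s in splits_b}
    symmetric = len(a ^ b)              # bipartitions found in only one tree
    max_symmetric = 2 * (len(taxa) - 3)
    return symmetric / max_symmetric


# Toy example: ((A,B),C,(D,E)) versus ((A,C),B,(D,E)).
taxa = {"A", "B", "C", "D", "E"}
tree_1 = [{"A", "B"}, {"D", "E"}]
tree_2 = [{"A", "C"}, {"D", "E"}]
distance = normalized_rf(tree_1, tree_2, taxa)
```

The two toy trees share the (D,E) bipartition and differ in one bipartition each, giving a symmetric distance of 2 out of a possible 4.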

2.3. Estimating Branch Lengths and A Calibrated Time Tree

We estimated branch lengths in IQ-TREE using data matrices that comprise the mitochondrial gene regions cytochrome b (CYB) and NADH dehydrogenase subunit 2 (ND2) (Supplementary Materials). That matrix was generated by extracting those taxa with CYB and ND2 data from the BigBird [43] data matrix, adding additional sequences from GenBank [99], and supplementing the data with new sequences extracted from a large-scale avian sequence capture dataset we are currently analyzing. These new CYB and ND2 sequences were assembled from “off-target” reads from UCE sequence capture efforts as described previously [65,100,101]. This allowed us to construct a data matrix comprising 655 taxa and 2184 aligned base pairs (bp). Since some taxa were represented by a very limited amount of sequence data, we constructed a second mitochondrial alignment that excluded taxa that had less than 1638 bp (less than 75% of the total region); this resulted in an alignment with 367 taxa. For both datasets, we partitioned the data by gene and codon position and estimated the model parameters and branch lengths using linked branch lengths with partition-specific rates (the “-spp” option) assuming our primary supertree topology (the “-m TESTONLY” option).
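The taxon-filtering step for the reduced alignment can be sketched as below. The function and the set of characters treated as missing are assumptions made for illustration; the 75% threshold and the 2184 bp alignment length are taken from the text (0.75 × 2184 = 1638).

```python
def retained_taxa(alignment, min_fraction=0.75, missing="-?Nn"):
    """Return taxa whose sequences have at least min_fraction of sites scored.

    alignment: taxon name -> aligned sequence (all of equal length)
    missing:   characters counted as absent data (an assumption)
    """
    length = len(next(iter(alignment.values())))
    threshold = min_fraction * length
    kept = []
    for name, seq in alignment.items():
        scored = sum(ch not in missing for ch in seq)
        if scored >= threshold:
            kept.append(name)
    return kept


# Toy alignment of eight sites; the threshold is six scored sites.
toy_alignment = {
    "Taxon_one": "ACGTACGT",     # 8 scored sites, retained
    "Taxon_two": "ACGT--GT",     # 6 scored sites, retained
    "Taxon_three": "AC???-GT",   # 4 scored sites, excluded
}
kept = retained_taxa(toy_alignment)
```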
Molecular branch lengths, even when they exhibit substantial among-lineage variation, can be used to establish divergence times when information from the fossil record is incorporated [102]. Thus, the trees with branch information from mitochondrial data were time-calibrated using autocorrelated penalized likelihood in treePL [103], which implements the penalized likelihood method of Sanderson [104]. We applied 22 fossil calibrations (Appendix A) following best practices proposed by Parham et al. [105]; only 18 of these were used for analysis of the 367 taxa. With both minimum and maximum dates assigned to each calibrated node (i.e., the most recent common ancestor [MRCA] of two chosen species, see Appendix A), we obtained the optimal parameter settings using the “prime” option and dated the tree using the best smoothing value (10 for the analysis using 655 taxa, and 0.01 for the analysis using 367 taxa) determined by random subsample and replicate cross validation (RSRCV).
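A dating run of this kind is driven by a plain-text configuration file. The sketch below writes such a file using keywords from the treePL documentation as we understand it (treefile, numsites, smooth, mrca, min, max, outfile, thorough); readers should check them against the version they use. The file names, node label, taxon names, and ages in the example are hypothetical placeholders, not the calibrations in Appendix A, and the priming and cross-validation runs are not shown.

```python
def treepl_config(treefile, numsites, smooth, calibrations, outfile):
    """Return the text of a treePL configuration file for a dating run.

    calibrations: node label -> (taxon_a, taxon_b, min_age, max_age),
                  where the node is the MRCA of the two named taxa.
    """
    lines = [
        f"treefile = {treefile}",
        f"numsites = {numsites}",
        f"smooth = {smooth}",
    ]
    for label, (taxon_a, taxon_b, min_age, max_age) in calibrations.items():
        lines.append(f"mrca = {label} {taxon_a} {taxon_b}")
        lines.append(f"min = {label} {min_age}")
        lines.append(f"max = {label} {max_age}")
    lines.append(f"outfile = {outfile}")
    lines.append("thorough")
    return "\n".join(lines) + "\n"


# Hypothetical calibration; smoothing value 0.01 as reported for 367 taxa.
config_text = treepl_config(
    treefile="supertree_brlen.tre",
    numsites=2184,
    smooth=0.01,
    calibrations={"NODE_A": ("Taxon_one", "Taxon_two", 50.0, 70.0)},
    outfile="supertree_dated.tre",
)
```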

3. Results

3.1. Meta-Analysis of Phylogenomic Trees Yields A Well-Resolved Supertree

We assessed the impact of each backbone tree on the supertree. Use of BigBird as the sole backbone tree yielded trees that were relatively poorly resolved (Supplementary Materials). BigBird had the fewest overlapping taxa with our phylogenomic trees (598, less than 85% of the total taxa), meaning that some taxa were only present in a single tree and could therefore be placed almost anywhere in the phylogeny. Restricting our analyses to a set of source trees that included only the relatively taxon-poor BigBird backbone resulted in “problematic source trees” (Figure 1b) and the supertree included some radically misplaced taxa (e.g., placement of taxa in Vangidae outside Passeriformes). The relative RF distances from the unconstrained BigBird-only supertree to those supertrees that contained at least one backbone tree other than BigBird ranged from 0.0972 to 0.1433 (Supplementary Materials). In contrast, the relative RF distances for trees with at least one backbone tree other than BigBird did not overlap; they ranged from 0.0071 to 0.0723 (Supplementary Materials). Even when a constraint was applied, the BigBird-only matrix yielded less resolved trees than analyses using other backbones. Use of more taxon-rich backbones (Jetz or Brown), or combinations of backbones, improved resolution (Table 2). Analyses based upon the Jetz backbone had the greatest resolution overall, probably reflecting the fact that the topology we used as the Jetz backbone was fully resolved whereas the Brown backbone included polytomies. The slightly reduced resolution in analyses using two or three backbone trees was likely due to conflict among the backbone trees. Given that using all three backbones emphasized uncertainty among relationships in the megaphylogenies, we focused our remaining analyses on the matrix that included all three backbones even though these trees exhibited slightly lower resolution.
Overall, MRP and MRL trees showed many similarities (Supplementary Materials). All trees contained the seven major higher-level clades identified by Reddy et al. [39], with the exception of the unconstrained analyses that used BigBird as the sole backbone. However, while the higher-level structure of the MRP trees (Figure 3) was identical to the Jarvis et al. [37] “total-evidence nucleotide tree” (TENT), several of the MRL analyses showed a different placement for Gruiformes (cranes, rails, and allies). Despite overall similarities between MRP and MRL supertrees, the optimal MRP trees cluster separately from the optimal MRL trees (Figure 4).
Our use of megaphylogenies (i.e., BigBird, Jetz, and Brown) as backbones to link phylogenomic trees together into a single supertree raises an important question: to what degree does the phylogenomic supertree simply reflect the backbone trees? If the supertrees simply reflect the backbone trees, then any errors shared among the backbones could be propagated to the supertrees. The megaphylogeny backbones could share errors because they are not strictly independent. For example, the Jetz tree used the Hackett et al. [8] topology as a backbone constraint and BigBird included a large amount of data from Hackett et al. [8]; the Brown tree was a synthesis of prior trees, including Hackett et al. [8] and the Jetz tree. After pruning our trees to the 584 taxa common to all three backbone trees, we found that all of our supertrees were quite similar to each other, and that distances between the supertrees and the backbone trees were much larger (Figure 4; a spreadsheet with RF distances is in Supplementary Materials). Although we cannot rule out the possibility that errors common to all three backbone trees are present in our supertrees, it seems clear that the phylogenomic trees dominate the supertree topologies. This likely reflects the high weight that we gave our phylogenomic trees, acknowledging the large amount of sequence data used to generate those trees and thus the greater likelihood that they reflect the underlying species tree.
There was low bootstrap support for higher level relationships in both MRP and MRL supertrees, though there was much higher support within well-sampled clades and for the seven higher level clades identified in Reddy et al. [39]. Interestingly, the bootstrap consensus trees from both MRP and MRL were more similar to each other than to the optimal MRP or MRL trees (Figure 4). One higher-level relationship shared by the MRP and MRL majority rule extended bootstrap consensus trees was a rearrangement in clade IV (Otidimorphae) that placed Musophagiformes (turacos) sister to an Otidiformes-Cuculiformes (bustard-cuckoo) clade. Use of the rapid MRL support metrics (aLRT and aBayes) resulted in high (and likely inflated; see Discussion) support at all nodes.

3.2. Rapid Branch Length Estimation and Divergence Time Estimation

Most supertree methods (including MRP and MRL) are unable to generate meaningful branch length estimates. However, we were able to identify mitochondrial sequence data for 655 (93%) of the taxa in our matrix that we used for branch length estimation, though only 367 taxa (52%) had at least 1638 sites. Branch length estimates were variable (Figure 5), with average branch lengths being slightly shorter when using reduced numbers of taxa with less missing data (see treefiles in Supplementary Materials). Since large amounts of missing data can alter branch length estimates, we focused on the branch lengths estimated from the taxon-reduced dataset.
Using branch lengths estimated with the taxon-reduced matrix, our treePL analysis (Figure 6) corroborated the model, now supported in many other studies [37,38,106,107], in which Neoaves underwent an explosive radiation close to the Cretaceous-Paleogene (K-Pg) boundary. Not surprisingly, use of the more taxon-complete sequence matrix that included more missing data resulted in an older estimated radiation of Neoaves (Supplementary Materials).

4. Discussion

Supertree methods provide a computationally efficient means to integrate published phylogenomic studies of birds into a larger synthetic tree. At this point the overlap among phylogenomic trees is relatively limited, making it difficult to use standard supertree methods that use large numbers of source trees. To solve issues of limited overlap, we used megaphylogenies to unite the input trees. As more phylogenomic studies become available, the use of megaphylogeny backbones may become unnecessary and, if megaphylogenies are used, they are likely to have less influence on the overall topology of the supertrees. When sets of bootstrap trees (or samples from a Bayesian MCMC chain) are available for source trees it is also possible to incorporate those bootstrap trees into the supertree analysis and estimate the support for specific relationships. We also show that it is possible to obtain robust branch length estimates and generate a timetree consistent with other recent studies. Although we have focused on birds, which are extensively studied and therefore have relatively large amounts of data available, the resources we used (megaphylogenies and extensively sampled organellar sequence data) are available throughout the tree of life; if the phylogenomic trees are available, similar approaches could be used in other groups.

4.1. Strengths and Weaknesses of the Phylogenomic Supertree Approach

Much has been written regarding the choice among methods for generating large-scale phylogenies [108]. One major advantage that supertree methods offer is that they are very computationally efficient, while supermatrix analyses intrinsically impose a substantial computational burden [109,110]. The analyses for this study were conducted on a desktop computer (a 2.6 GHz Intel Core i5 Mac mini) and a laptop (3.1 GHz Intel Core i5 MacBook Pro) and they used less than one week of total compute time, as opposed to the >400 years used to analyze the Jarvis et al. data [37] (expressed as the equivalent runtime for a single processor). Analyses using the maximum parsimony criterion (used for MRP) are orders of magnitude faster than analyses using the maximum likelihood (ML) criterion [111,112]. However, supertree analyses using the ML criterion (i.e., the MRL approach) remained quite fast given the sizes of the data matrices used in this study. Given the existence of many other computationally efficient supertree construction algorithms [113,114], we fully expect supertree methods to remain much more computationally efficient than supermatrix methods.
However, the advantages of supertree methods also come with costs (for detailed discussion see [95,115]). Ideally, there should be a direct connection between the results of any phylogenetic analysis and the original character data (e.g., aligned sequences), though using supertrees for meta-analysis (as we have done here) breaks this connection. In addition, supertrees can yield a phylogeny that does not acknowledge “hidden support” for clades. Hidden support is the observation that certain clades can have greater character support in combined analyses than they would in separate analyses of the partitions in the original dataset [116,117].
These arguments might seem to favor the use of supermatrix methods to construct megaphylogenies. However, it is important to emphasize that the sparse supermatrices used in current large-scale supermatrix studies (e.g., Figure 1a) also present analytical difficulties [76]. Moreover, supertree methods have benefits that go beyond their computational efficiency. For instance, supertree approaches can combine trees generated by rigorous phylogenetic analyses with taxonomic information [53]. At this time, there are few large taxonomic groups for which all taxa have associated character data. Thus, the supertree approach is the only method able to build trees including all named taxa in major clades [44,49] or even the entire tree of life [118]. Of course, including taxa using only taxonomic information is a double-edged sword; it can be useful when this is the only available information regarding the placement of a taxon, but there are also many examples of cases where using taxonomies to place data-deficient taxa leads to inaccuracies [62,119,120,121]. The obvious next step in our efforts to understand the tree of life is to collect data (ideally phylogenomic data) for all species that remain data deficient. Yet, collecting the data necessary to construct truly phylogenomic megaphylogenies for all named species will present many challenges; the supertree approach that we have used provides a useful estimate of that phylogeny for now.

4.2. Different Roles for Backbone Trees and Phylogenomic Trees

We note that the supertree approach we used differs in an important way from many others: we explicitly broke our source trees into “backbone trees” and “phylogenomic trees” whereas many supertree studies simply combine as many source trees as possible. Using a large number of source trees should minimize the problems associated with source tree overlap (Figure 1b). Because our goal was to generate a tree that summarizes phylogenomic studies, we leveraged published avian megaphylogenies (whether they reflect supermatrix or supertree analyses) to link our currently limited pool of source trees. However, there were some challenges associated with using backbone trees. Even a relatively taxon-rich backbone, like the BigBird tree (85% of all taxa sampled in source trees), was not sufficient to produce a well-resolved supertree. While using constraints improved analyses that only included the BigBird backbone, the BigBird-only trees still exhibited relatively large RF distances from the other supertrees that we generated (note the relatively long terminal branches for the BigBird trees in Figure 4), suggesting that the BigBird-only supertrees were relatively incongruent with our other trees. Overall, this indicates that one should use backbone trees with as many taxa as possible or use multiple backbone trees, as we did here. However, the need for backbone trees should diminish as more phylogenomic studies are published because this will increase the taxonomic overlap among source trees; we expect the increasing overlap among phylogenomic trees to make the choice of the backbone trees less challenging (or unnecessary) over time.
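To make the RF (Robinson-Foulds) comparison concrete, the following is a minimal sketch of a rooted, clade-based variant of the distance. It is not the software used in this study: the nested-tuple tree encoding, the function names, and the four-taxon toy trees are assumptions made purely for illustration, and the standard RF distance is usually defined on bipartitions of unrooted trees.

```python
def clades(tree):
    """Collect the non-trivial clades of a rooted tree given as nested tuples.

    Leaves are strings; internal nodes are tuples of children.
    Each clade is returned as a frozenset of leaf names.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*(leaves(child) for child in node))
        found.add(tips)
        return tips

    root = leaves(tree)
    found.discard(root)  # the root clade is shared by all trees on these taxa
    return found


def rf_distance(tree_a, tree_b):
    """Count clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical toy trees on the same four taxa.
t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
print(rf_distance(t1, t2))
```

Under this definition two identical trees have a distance of zero, and each clade found in only one tree adds one to the total.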
Establishing the most appropriate weights for the source trees is also challenging. It was clearly appropriate to use low weights for the backbone trees since they were based on relatively limited data. However, the most appropriate weights for the phylogenomic trees are unclear. For this study we used the approximate size of the underlying datasets to generate weights for the source trees. However, one might argue that other factors should be used to determine weights. For example, if taxon sampling beneficially impacts the estimate of phylogeny, one might wish to weight taxon-rich trees such as the Prum et al. [38] tree higher than relatively taxon-poor trees like the Jarvis et al. [37] TENT. On the other hand, if commonly used analytical methods yield more accurate trees when applied to non-coding data (cf. Reddy et al. [39]), it might be more appropriate to down-weight the Prum et al. [38] tree, which is based primarily on coding data, and eschew the Jarvis et al. [37] TENT (which is a mixture of coding and non-coding data) in favor of either the intron tree or the UCE tree from the latter study (both of which reflect analyses of largely non-coding data). Other methods to examine data quality, such as phylogenetic informativeness [122] or various metrics of model adequacy [123], could also be applied. Ultimately, weighting source trees requires judgement regarding their accuracy. For this study, we felt that simply weighting the source trees by the approximate size of the underlying dataset was the most objective way to summarize these published avian phylogenomic studies. As more phylogenomic trees become available, with greater overlap among trees, it may be of interest to test alternative weighting schemes and examine their impact on the supertree estimation.
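As an illustration of how source-tree weights enter an MRP-style analysis, the sketch below encodes source trees as a matrix representation (one binary character per clade, with "?" for taxa absent from a source tree) and attaches a weight to every character. This is a simplified sketch, not the pipeline used here: the nested-tuple trees, the function names, and the weights of 1 and 10 are hypothetical.

```python
def leaf_set(node):
    """Return the leaf names below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def nontrivial_clades(tree):
    """List every clade of the tree except the root clade (post-order)."""
    out = []

    def walk(node):
        if isinstance(node, str):
            return
        for child in node:
            walk(child)
        out.append(leaf_set(node))

    walk(tree)
    root = leaf_set(tree)
    return [clade for clade in out if clade != root]


def mrp_matrix(source_trees, weights):
    """Build a matrix representation with one weight per character.

    Coding: '1' = taxon inside the clade, '0' = taxon in the source tree
    but outside the clade, '?' = taxon not sampled in that source tree.
    """
    taxa = sorted(set().union(*(leaf_set(t) for t in source_trees)))
    rows = {taxon: [] for taxon in taxa}
    char_weights = []
    for tree, weight in zip(source_trees, weights):
        sampled = leaf_set(tree)
        for clade in nontrivial_clades(tree):
            for taxon in taxa:
                if taxon in clade:
                    rows[taxon].append("1")
                elif taxon in sampled:
                    rows[taxon].append("0")
                else:
                    rows[taxon].append("?")
            char_weights.append(weight)
    return {taxon: "".join(states) for taxon, states in rows.items()}, char_weights


# Hypothetical inputs: a low-weight backbone tree and a high-weight phylogenomic tree.
backbone = (("A", "B"), ("C", "D"))
phylogenomic = (("A", "C"), "E")
matrix, char_weights = mrp_matrix([backbone, phylogenomic], weights=[1, 10])
for taxon, states in matrix.items():
    print(taxon, states)
print(char_weights)
```

Because each character inherits the weight of its source tree, a conflict between the backbone clade (A, B) and the phylogenomic clade (A, C) would be resolved in favor of the more heavily weighted tree under parsimony.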

4.3. MRP and MRL Support Values

Bootstrap support values were relatively low for higher-level relationships in the supertree. The basis for these support values differs from those estimated using sequence data. When aligned DNA sequences are used directly, bootstrap values reflect the spectrum of site patterns in the multiple sequence alignments used in the analysis. By contrast, the supertree bootstrapping approach we used is likely to reflect both the impact of limited overlap among source trees and the limited support of clades in the backbone trees. It is encouraging that the simple approach we used yielded support values that reflect the uncertainty observed in other studies.
It can be challenging to use bootstrapping to examine support in meta-analyses because many studies only provide the optimal tree (typically with a support value included as node labels or branch lengths) or may use methods that do not generate a sample of trees [124]. Our approach requires a set of sampled (e.g., bootstrap or MCMC posterior) trees. Once a larger number of phylogenomic studies are available for birds, it may be possible to simply bootstrap the optimal or summary (e.g., maximum clade credibility) source trees (cf. [94]), but that requires a large number of source trees with substantial overlap. An alternative approach for estimating uncertainty would be to conduct a supertree analysis using gene trees from available studies as source trees. The growing popularity of multispecies coalescent (“species tree”) methods for phylogenomic analyses [125] often results in reporting of gene trees; if those gene trees are available electronically, it would be straightforward to bootstrap them during supertree construction. The observation that some supertree methods appear to be useful estimators of the species tree when collections of gene trees are used as input [126] suggests that using gene trees for some (or all) of the source trees in supertree meta-analyses might have an additional benefit: the supertree could be a reasonable estimate of the species tree. ASTRAL might be especially useful in this context; although it is generally viewed as a multispecies coalescent method it is actually a supertree method with two important properties [126]. First, it is a consistent estimator of the species tree as long as the input trees are gene trees [126]. Second, it provides useful support values (local posterior probabilities [124]) when gene trees are used as input trees. These properties would make it an excellent choice for generating a phylogenomic supertree if gene trees are available. 
However, if they are not available, it does not have obvious benefits relative to MRP and MRL.
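The resampling logic discussed above can be sketched as follows, under one simple assumption about how sets of sampled trees might be used: each replicate draws one tree at random from the sample available for each source study, a supertree is then estimated from each replicate (not shown), and support for a clade is the fraction of replicate supertrees that contain it. The function names and the placeholder tree labels are hypothetical, and this is not presented as the exact procedure used in this study.

```python
import random


def bootstrap_replicates(tree_samples, n_replicates, seed=0):
    """Draw one tree per source study for each replicate.

    tree_samples is a list with one entry per source study; each entry is
    the list of bootstrap (or posterior) trees available for that study.
    """
    rng = random.Random(seed)
    return [
        [rng.choice(sample) for sample in tree_samples]
        for _ in range(n_replicates)
    ]


def clade_support(replicate_supertrees, clade):
    """Fraction of replicate supertrees (as sets of clades) containing a clade."""
    hits = sum(clade in supertree for supertree in replicate_supertrees)
    return hits / len(replicate_supertrees)


# Placeholder labels stand in for real bootstrap trees from two source studies.
samples = [["s1_t1", "s1_t2", "s1_t3"], ["s2_t1", "s2_t2"]]
replicates = bootstrap_replicates(samples, n_replicates=5, seed=1)

# Hypothetical supertrees from four replicates, each reduced to its clade set.
ab = frozenset({"A", "B"})
print(clade_support([{ab}, {ab}, set(), {ab}], ab))
```

The supertree search itself (MRP, MRL, or another method) would sit between the two functions; it is omitted because it depends on external software.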
For MRL analyses, the aLRT and aBayes tests [96,97] can also be used to examine support. Both tests are computationally efficient and can be used to rapidly update supertrees as additional phylogenomic trees become available. Support values from both aLRT and aBayes analyses were typically high in our analyses, with most nodes having the maximum support possible (>97% and >94% of nodes have support of 1.0 in the aLRT and aBayes analyses, respectively; Supplementary Materials), failing to reflect the underlying uncertainty in some of these relationships. Branch support in the aLRT and aBayes methods reflects the likelihood difference between the optimal tree and the two nearest neighbor interchanges for that branch. Our hope was that these analyses would provide a rapid means to highlight two different cases: (1) those in which the backbone trees conflict but the phylogenomic trees provide no information, and (2) those in which there is conflict among the phylogenomic trees. However, the small number of nodes with low support relative to those revealed by the bootstrap (>84% and >87% of resolved nodes have support of 100% in the MRP and MRL analyses, respectively; Supplementary Materials) suggests the bootstrap is a better approach. For studies like ours where there is little overlap among trees, there is little potential for conflict, and these support metrics may not yield meaningful results.

4.4. Branch Lengths and Divergence Times

Although methods to assign branch lengths to supertrees without using molecular data have been proposed [127,128,129,130,131], we used mitochondrial data to estimate branch lengths (specifically, ND2 and CYB, which are the best-sampled mitochondrial gene regions [43]). Mitochondrial data have already been collected for many birds [43] and both sequence capture and WGS (even low-coverage WGS) often produce nearly complete mitogenome sequences [65,71,100,101,132,133]. Thus, mitochondrial sequences are available from many taxa for branch length estimation, and additional sequences are expected to accumulate. Our branch lengths were similar (in relative terms) to those estimated by large-scale avian studies using nuclear sequence data [8,37,38,134]. For example, taxa that exhibit exceptionally long branches in our tree include Turnicidae (hemipodes, also known as buttonquails; see Figure 5b) and Tinamiformes (tinamous; see Figure 5c); both of these taxa exhibit long branches in trees based on nuclear data [8,38,107,134]. Our relative branch lengths also align closely with the inferred substitution rates presented by Berv and Field [135] (see their Figure S2), providing additional support for the validity of our approach.
The availability of branch lengths allowed us to estimate divergence times for the major lineages of birds. Since our goal was to explore computationally efficient methods, we used treePL, a fast rate smoothing program that performs well in simulations for as many as 10,000 tips [103]. treePL analysis of the reduced taxon set tree (which had limited missing data) resulted in a timetree that corroborated several recent studies [37,38,106,107], showing that Neoaves underwent an explosive radiation close to the K-Pg boundary. However, our estimates of divergence times appeared to be sensitive to missing data; treePL analysis of the tree including all taxa with mitochondrial sequence yielded estimates for the origin of crown Neoaves that were greater than 70 million years ago (Ma; Supplementary Materials). Missing data appeared to have an even larger impact on the estimated times for the origins of crown Palaeognathae and crown Galloanseres (Supplementary Materials), in sharp contrast to the dates that are tightly clustered near the K-Pg boundary when taxa with limited data are excluded (Figure 6). Estimates of divergence times for avian lineages have varied among studies [37,38,136] and there are indications that divergence time estimates generated using mitochondrial data exhibit systematic differences from those based on nuclear data [136]. However, our divergence time estimates are fairly close to those in other recent publications (e.g., our estimated time for the diversification of Neoaves was very similar to the estimate in Jarvis et al. [37], somewhat older than the estimate in Prum et al. [38], and within the ranges of the estimates from Ksepka and Phillips [136]). Therefore, we do not view our analyses and results, which were chosen for computational efficiency, to be the final word. 
Nevertheless, it is clear that our results are congruent with recent phylogenomic studies [37,38] in that they contradict estimates that place much of the diversification of Neoaves in the Cretaceous [137,138]. The hypothesis that many lineages in Neornithes (extant birds) arose during the Cretaceous requires mass survival across the K-Pg boundary; that hypothesis has become less tenable in light of recent fossil evidence for a mass extinction of birds across the K-Pg boundary [139,140]. Our timetree can be added to the growing body of molecular evidence that contradicts a Cretaceous radiation of Neornithes.

4.5. Taxonomic Flux—A Fundamental Challenge for Supertrees (and Supermatrices)

During the course of this study, the computational efficiency of supertree methods highlighted the time-consuming nature of another step in phylogenetics: taxonomic reconciliation. Our efforts to convert the taxonomies used in our source trees to a common set of names were challenging and required substantial manual intervention. It is important to recognize that taxonomic reconciliation is also necessary for supermatrix studies (for example, taxonomic difficulties presented a major challenge for the BigBird study; E.L.B., personal observation). There are three types of changes to taxonomic names that occur over time at the species level: (1) simple renaming; (2) lumping of previously separate species; and (3) splitting existing species into two or more new species. The first typically reflects splits or reassignments at the generic level. These changes are relatively straightforward to deal with because they are a one-to-one replacement of names. However, generic reassignment can result in changes to the gender of the specific epithet; this can make it challenging to trace names, especially when one also has to consider the possibility of spelling errors. Changes in species circumscriptions (lumps and splits) are more challenging. We are in a historical phase in avian taxonomy where splitting is more common, due to improved recognition of cryptic species using genetic data and information from fieldwork (e.g., improved equipment for sound recordings). There are suggestions that the number of bird species may increase twofold (or possibly even more; [141,142]). These taxonomic splits create a major problem for meta-analyses since they make it necessary to trace provenance of data at the level of individual samples and assess the proper taxonomic assignment for those samples. This emphasizes the need to make genomic data and tips in trees traceable to vouchered specimens in natural history collections. 
Without direct links between sequence data and the names in trees that are derived from those data, proper taxonomic assignment is impossible.

4.6. Moving Forward: OpenWings, B10K, and Other Phylogenomic Efforts

The ultimate goal of ongoing phylogenomic efforts is to estimate a phylogeny that includes all extant species and as many recently extinct species as is feasible given available material. Our ability to construct a phylogenomic supertree of birds with approximately 7% of currently named species is certainly encouraging, although we acknowledge that the taxon sampling of our current tree is highly uneven. This is because our supertree reflects the combination of source trees from global efforts to understand the avian tree of life, such as Jarvis et al. [37] and Prum et al. [38], and taxonomically focused phylogenomic source trees generated by specific laboratories (the overrepresentation of Galliformes and Psittaciformes reflects projects in the R.T.K.-E.L.B. and B.T.S. laboratories, respectively). It is clear that these efforts need to be expanded, both for birds and for other organisms.
Efforts that will aid in the goal of inferring a species-level avian tree are underway, including many ongoing and parallel phylogenomic studies for various taxonomic groups across the tree of life, both at global scales [143,144,145] and in individual laboratories. Here, we highlight two ongoing efforts to collect large-scale data for birds. The B10K project (described by Zhang [28]) plans to generate complete genome assemblies for all named bird species. This ambitious approach is exciting and will revolutionize our understanding of avian biology, but we expect the limited availability of high-quality tissues to present major challenges for WGS efforts. The OpenWings project (www.openwings.org; described by Pennisi [146]) has the potential to achieve rapid resolution of the avian tree of life, using resources already at hand. OpenWings will focus on sequence capture of nuclear and mitochondrial loci, leveraging genetic resources from frozen tissues or well-preserved study skins of vouchered specimens in natural history collections. The overarching project goal is to collect new genomic data from >8000 bird species and to integrate those data with existing, compatible UCE data sets to infer a complete species-level avian tree of life. Importantly, the data will be openly available to other researchers immediately after they have been generated and subjected to quality control. We expect the rapid availability of data to minimize duplicate efforts among groups and allow the broader ornithological community to maximize the utility of this genome-scale data collection effort for their independent projects. Until the OpenWings project completes its data collection goals we plan to update our phylogenomic supertree of birds regularly.

Supplementary Materials

The following are available online at https://www.mdpi.com/1424-2818/11/7/109/s1, Data: Kimball_2019_OW_Supertree_Data.zip, Poster1: Kimball_2019_OW_supertree_poster.pdf.

Author Contributions

Conceptualization, R.T.K. and E.L.B.; methodology, R.T.K., D.J.F., D.T.K., and E.L.B.; software, N.W. and E.L.B.; formal analysis, C.H.O., N.W., and E.L.B.; investigation, R.T.K., C.H.O., N.D.W., F.K.B., D.J.F., D.T.K., B.C.F., B.T.S., and E.L.B.; resources, R.T.C., R.G.M., M.J.B., and R.T.B.; data curation, R.T.K., C.H.O., D.J.F., D.T.K., and E.L.B.; writing—original draft preparation, R.T.K., C.H.O., N.W., D.J.F., D.T.K., and E.L.B.; writing—review and editing, R.T.K., C.H.O., N.W., N.D.W., F.K.B., D.J.F., D.T.K., R.T.C., R.G.M., M.J.B., R.T.B., B.C.F., B.T.S., and E.L.B.; visualization, C.H.O. and E.L.B.; project administration, B.T.S. and E.L.B.; funding acquisition, R.T.K., F.K.B., D.T.K., R.T.C., M.J.B., R.T.B., B.C.F., B.T.S., and E.L.B.

Funding

This research was funded by the US National Science Foundation, grant numbers DEB-1655683 (to R.T.K. and E.L.B.), DEB-1655624 (to B.C.F. and R.T.B.), DEB-1655559 (to F.K.B.), and DEB-1655736 (to B.T.S.; subcontracts to D.T.K. and R.T.C.). M.J.B., N.D.W., R.T.B., E.L.B., and B.C.F. were supported by grants from the Smithsonian Grand Challenges Consortia.

Acknowledgments

We are grateful to Peter Houde for encouraging us to write this manuscript and to Marco A. Rego, members of the Kimball-Braun lab, and two anonymous reviewers for careful reading of this manuscript.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results. Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Appendix A

Information on fossils used to calibrate the timetree (Figure 6). treePL uses two taxa to define each node in the tree; the taxa used to define that node are listed in Supplementary Materials. All taxonomic groups reflect the IOC World Bird List (v. 7.3) [80].
A.1. Calibrated Node: Crown Casuariiformes (Dromaius–Casuarius split)
Fossil Specimen: Emuarius gidju QM F45460
Phylogenetic Justification: Worthy et al. [147] recovered Emuarius as more closely related to Dromaius than to Casuarius in a phylogenetic analysis. Codings for Emuarius were based on multiple specimens, and key synapomorphies occur in the skull, tarsometatarsus, and scapulocoracoid. A scapulocoracoid (QM F45460) is thus specified as the calibrating specimen.
Minimum Age Constraint: 24.5 Ma
Maximum Age Constraint: 58.7 Ma
Age Justification: The calibrating fossil is from Faunal Zone A at the Hiatus South Site of the Riversleigh locality in Queensland, Australia. Based on biocorrelation to the faunas from the Etadunna and Namba Formations in South Australia [148], a minimum age matching the top of Chron 7r is applied, with the numerical date selected from table 28.2 of Gradstein et al. [149]. The maximum is based on the age of the oldest putative palaeognaths, which include middle-late Paleocene lithornithids from North America [150] and the ratite Diogenornis, from the early Eocene of Brazil [151]. While the precise phylogenetic relationships of these taxa are debated, none are plausibly nested within crown Casuariiformes.
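A calibration such as A.1 can be turned into treePL configuration directives along the following lines. This is a sketch only: it assumes the `mrca = LABEL tip1 tip2`, `min = LABEL age`, and `max = LABEL age` directive format that we understand treePL configuration files to use, and the node label and the two tip names are hypothetical placeholders (the tips actually used to define each node are listed in the Supplementary Materials).

```python
def treepl_calibration(label, tip_a, tip_b, min_age, max_age):
    """Format one two-tip node definition with its age bounds (ages in Ma).

    The directive syntax is an assumption about treePL's configuration format.
    """
    return [
        f"mrca = {label} {tip_a} {tip_b}",
        f"min = {label} {min_age}",
        f"max = {label} {max_age}",
    ]


# Ages are those given for A.1; the label and tip names are placeholders.
lines = treepl_calibration(
    "CROWN_CASUARIIFORMES",
    "Dromaius_novaehollandiae",
    "Casuarius_casuarius",
    24.5,
    58.7,
)
print("\n".join(lines))
```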
A.2. Calibrated Node: Stem Phasianidae (Phasianidae–Odontophoridae split)
Fossil Specimen: Palaeortyx cf. gallica PW 2005/5023a-LS
Phylogenetic Justification: Mayr et al. [152] described apomorphies including the well-developed processus intermetacarpalis that support placement of Palaeortyx cf. gallica within crown Galliformes, most likely as a stem group representative of Phasianidae. PW 2005/5023a-LS represents a nearly complete skeleton and thus is selected as the calibrating specimen.
Minimum Age Constraint: 24 Ma
Maximum Age Constraint: 51.81 Ma
Age Justification: The fossil is from a maar lake deposit at Enspel, near Bad Marienberg in Westerwald, Rheinland-Pfalz, Germany. These deposits are assigned to the MP28 biozone [153], the top of which is used for the hard minimum age. The maximum is based on the age of the Green River Formation, from which multiple complete skeletons of the stem galliform Gallinuloides wyomingensis have been collected. This maximum encompasses other strata that have yielded good material of stem galliforms but no convincing crown galliform material, including the Messel Formation, Late Eocene horizons at Quercy, and the London Clay Formation. The maximum also encompasses the ages of taxa that may possibly represent crown galliforms but require additional study, such as Procrax and Schaubortyx.
A.3. Calibrated Node: Stem Mirandornithes (Mirandornithes–Columbiformes split)
Fossil Specimen: Juncitarsus merkeli SMF A 295 (cast)
Phylogenetic Justification: Mayr [154] presented evidence for four synapomorphies linking Juncitarsus to Podicipediformes + Phoenicopteriformes, and also listed primitive characters which rule out placement of this taxon within crown Mirandornithes.
Minimum Age Constraint: 46.6 Ma
Maximum Age Constraint: 61.6 Ma
Age Justification: The fossil is from the Messel Formation. A maximum age for the fossiliferous deposits of the Messel Formation is provided by a 47.8 ± 0.2 Ma 40Ar/39Ar age obtained from the basalt chimney below Lake Messel [155]. This date provides a maximum age for Lake Messel itself, but a minimum age for the fossil must take into account time elapsed between the cooling of the basalt and the deposition of the fossiliferous layers which occur higher in the section. Lacustrine sediments are estimated to have filled in the maar lake that formed above this basalt chimney over a span of approximately 1 Myr [155]. Accounting for sedimentation rate, the layers yielding avian fossils (including SMF-ME 1883a+b) are most likely ~47 Ma in age [155,156]. When both the error range associated with the dating of the basalt (±0.2 Ma) and the estimate of time spanned between this date and deposition of the fossil (1 Myr) are incorporated, the hard minimum age for the fossil is 46.6 Ma. We use the upper age range estimate reported for the oldest aquatic neoavian, Waimanu manneringi, as a maximum age.
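The arithmetic behind the 46.6 Ma hard minimum can be written out explicitly. The symbols below are introduced here for illustration only: the radiometric age of the basalt, its reported error, and the estimated time needed for sediments to fill the maar lake.

```latex
t_{\min} = t_{\text{basalt}} - \sigma_{\text{basalt}} - \Delta t_{\text{fill}}
         = 47.8 - 0.2 - 1.0
         = 46.6~\text{Ma}
```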
A.4. Calibrated Node: Stem Steatornithidae (Steatornithidae–Nyctibiidae split)
Fossil Specimen: Prefica nivea USNM 336278
Phylogenetic Justification: Olson [157] discussed synapomorphies of Prefica and Steatornis, and a sister group relationship between the two was supported by the phylogenetic analysis of Mayr [158].
Minimum Age Constraint: 51.81 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The fossil is from Fossil Butte Member, Green River Formation, Wyoming, USA. These deposits are late early Eocene, and multicrystal analyses (sanidine) from a K-feldspar tuff (FQ-1) at the top of the middle unit of the Fossil Butte Member from Fossil–Fowkes Basin (locality: N41°47′32.2″ W110°42′39.6″) have yielded an age of 51.97 ± 0.16 Ma [159]. The latest Cretaceous is set as the soft maximum, corresponding to the age range of the oldest known neognathous bird Vegavis iaii. No members of Strisores are known from Cretaceous deposits, indicating it is unlikely the highly nested divergence between oilbirds and other Strisores had occurred before the Paleocene.
A.5. Calibrated Node: Crown Apodi (Apodidae–Hemiprocnidae split)
Fossil Specimen: Scaniacypselus wardi NHMUKA5430
Phylogenetic Justification: Phylogenetic analyses have consistently placed Scaniacypselus as the sister taxon to extant Apodidae [160,161,162].
Minimum Age Constraint: 51 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The fossil is from Bed R6 of the Røsnæs Clay Formation of Ølst, Denmark. Thiede et al. [163] assigned the upper calcareous beds of the Røsnæs Clay Formation, including R5 and R6 to nanoplankton biozones NP11 and NP12. Biostratigraphy supports correlation of the Røsnæs Clay Formation to the European mammal reference biozone MP8 [164], which suggests an age >50 Ma [149]. A conservative minimum age of 51 Ma is proposed, based specifically on the estimated age of the upper boundary of NP12, which is dated to 51 Ma [165]. The latest Cretaceous is set as the maximum, corresponding to the age range of the oldest neognathous bird Vegavis iaii. No members of Strisores are known from Cretaceous deposits, indicating it is unlikely the highly nested divergence between swifts and hummingbirds had occurred before the Paleocene.
A.6. Calibrated Node: Crown Gruiformes (Ralloidea–Gruoidea split)
Fossil Specimen: Pellornis mikkelseni MGUH 29278
Phylogenetic Justification: Pellornis has been recovered as a member of Messelornithidae, a clade that has been supported by synapomorphies as sister taxon to Rallidae+Heliornithidae [166] or Rallidae to the exclusion of Heliornithidae [167]. Recent work [168] supports the former position for Pellornis and Messelornithidae, and that is used here.
Minimum Age Constraint: 53.9 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The fossil is from the Fur Formation of Denmark. The minimum age is based on a 54.04+/-0.14 Ma radiometric date reported for layer +19 of the Fur Formation [169]. The latest Cretaceous is set as the maximum, corresponding to the age range of the oldest neognathous bird Vegavis iaii. No reliable records of Gruiformes are known from Cretaceous deposits. This maximum incorporates the possibility that Paleocene taxa such as the poorly known Messelornis russelli or the enigmatic Walbeckornis belong to crown Gruiformes.
A.7. Calibrated Node: Crown Gruidae (Balearicinae–Gruinae split)
Fossil Specimen: Balearica exigua UNSM 53579
Phylogenetic Justification: Balearica exigua is known from a number of specimens that exhibit distinct similarities to extant Balearica across the skeleton, including the skull, beak, femur, tibiotarsus, tarsometatarsus, and humerus [170]. Most diagnostically, B. exigua exhibits inflated frontals, a feature shared with extant Balearica, in contrast to the uninflated condition in Gruinae (the crane subfamily including all other cranes in the genera Leucogeranus, Antigone, and Grus). Balearica is the only clade of extant Neoaves exhibiting this feature [171], further supporting referral of B. exigua to total-clade Balearicinae.
Minimum Age Constraint: 10.3 Ma
Maximum Age Constraint: 53.9 Ma
Age Justification: The specimens derive from the upper Clarendonian Ash Hollow Formation of the Cap Rock Member, near Orchard, Nebraska [170]. Although the specimens derive from a 2 m thick volcanic ash bed, and despite previous work on the age of the Ash Hollow Formation [172], precise constraints on the age of this locality are lacking. Considering this uncertainty, and given the upper Clarendonian age of this locality, we assign an age of 10.3 Ma, corresponding to the lower bound of the Clarendonian inclusive of error.
A.8. Calibrated Node: Stem Laridae (Laridae–Alcidae split)
Fossil Specimen: Laricola elegans NMB s.g.18810
Phylogenetic Justification: De Pietri et al. [173] recovered Laricola as either the sister to Laridae (=Laromorphae) or within Laridae (with Anous the sister taxon to all other Laridae). Smith [174] recommended Laricola as a crown Laromorphae calibration; however, the analysis upon which this was based was conducted before new cranial material was described. We conservatively place it as sister to Laromorphae, reflecting this uncertainty.
Minimum Age Constraint: 20.44 Ma
Maximum Age Constraint: 47.8 Ma
Age Justification: The fossil is from Saint-Gérand-le-Puy, France. Quarries at Saint-Gérand-le-Puy span the Oligocene and Miocene, but De Pietri et al. [173] were unable to confirm or refute whether any of the historically collected Laricola material comes from the Oligocene age deposits. We thus conservatively use the upper bound of the Aquitanian for the hard minimum. The oldest reasonably complete fossil assignable to Charadriiformes is an unnamed Eocene (Lutetian) fossil SMF-ME 2458A+B [175]. The lower bound of the Lutetian is thus used as a maximum.
A.9. Calibrated Node: Stem Phaethontiformes
Fossil Specimen: Lithoptila abdounensis OCP.DEK/GE 1087
Phylogenetic Justification: Phylogenetic analyses by Bourdon et al. [176] and Smith [177] recover Lithoptila abdounensis as a stem representative of Phaethontiformes, and cranial characters preserved in OCP.DEK/GE 1087 support this placement. Although the position of Phaethontidae within Aves has been controversial, the placement of Lithoptila has been stable, which tracks Phaethontidae in phylogenetic analyses regardless of the arrangement of other taxa.
Minimum Age Constraint: 56 Ma
Maximum Age Constraint: 72.1 Ma
Age Justification: The fossil was collected from an unspecified quarry, assigned to Bed IIa of the Ouled Abdoun Basin, near Grand Daoui, Morocco, which in turn can be assigned to the Thanetian based on selachians identified in the matrix [176]. As both the precise numerical age of Bed IIa deposits and the precise horizon from which the fossil was collected remain uncertain, the lower age bound for the Thanetian is used as a hard minimum. More fragmentary records of probable Phaethontiformes are known from slightly older (Danian) deposits in New Zealand [178]. We conservatively rely on Lithoptila, but note that these records are encompassed between the minimum and maximum bounds. The maximum age extends to the base of the Maastrichtian to accommodate the possibility that some of the poorly represented marine birds from the Cretaceous–Paleogene of New Jersey may represent tropicbirds [176].
A.10. Calibrated Node: Stem Phalacrocoracidae (Phalacrocoracidae–Anhingidae split)
Fossil Specimen: Oligocorax (=Borvocarbo) stoeffelensis PW 2005/5022-LS
Phylogenetic Justification: Phylogenetic analyses by Smith [177] and Mayr [179] recover Oligocorax stoeffelensis as more closely related to Phalacrocorax than to Anhinga. PW 2005/5022-LS preserves a substantial portion of the skeleton, including synapomorphy-bearing elements.
Minimum Age Constraint: 24.82 Ma
Maximum Age Constraint: 51.81 Ma
Age Justification: The fossil is from a maar lake deposit at Enspel in Germany. These deposits are assigned to the MP28 biozone [153], the top of which is used for the hard minimum age. Comparable in age is the Late Oligocene Nambashag from the Australian Etadunna and Namba Formations (Worthy, 2011), which also represents a stem member of Phalacrocoracidae [179]. The maximum is based on the age of the Green River Formation, from which members of Aequornithes such as Limnofregata and Vadaravis have been recovered.
A.11. Calibrated Node: Crown Austrodyptornithes (Sphenisciformes–Procellariiformes split)
Fossil Specimen: Waimanu manneringi CM zfa35
Phylogenetic Justification: Phylogenetic analysis supports the placement of Waimanu along the penguin stem lineage [180,181]. CM zfa35 is the only published specimen of Waimanu manneringi.
Minimum Age Constraint: 60.5 Ma
Maximum Age Constraint: 72.1 Ma
Age Justification: Biostratigraphic evidence, specifically the ranges of Hornibrookina teuriensis and Chiasmolithus bidens, indicates that the minimum possible age of the type locality is 60.5 Ma [180,182,183]. The maximum is based on the lower bound of the Maastrichtian Stage. Southern Hemisphere Maastrichtian marine vertebrate sites have yielded diving birds such as Polarornis and hesperornithids, indicating preservation potential for marine diving birds, but no penguin (or procellariiform) remains have been recovered at these sites.
A.12. Calibrated Node: Stem Fregatidae (Fregatidae–Suloidea split)
Fossil Specimen: Limnofregata azygosternon USNM 22753
Phylogenetic Justification: Phylogenetic analysis supports the placement of Limnofregata as the sister taxon to extant Fregata [177], in agreement with longstanding interpretations of this fossil taxon [157]. USNM 22753 is an articulated skeleton preserving most key synapomorphies that place Limnofregata azygosternon on the frigatebird stem lineage.
Minimum Age Constraint: 51.57 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The minimum date of 51.57 Ma incorporates the error associated with an 40Ar/39Ar date of 51.66 ± 0.09 Ma obtained from a potassium-feldspar (K-spar) tuff above the fossiliferous horizon containing USNM 336484 [184]. A few fragmentary records of Limnofregata are known from slightly older (~2 Ma) deposits of the Wasatch Formation [185] and Nanjemoy Formation [186]. We conservatively rely on the complete Fossil Butte skeleton, but we also note that these records are encompassed between the minimum and maximum bounds. The latest Cretaceous is set as the soft maximum, corresponding to the age range of the oldest known crown bird Vegavis. No well-supported material from the core waterbird clade Aequornithes is known from Cretaceous deposits, indicating it is unlikely the highly nested divergence between Fregatidae and Suloidea had occurred before the Paleocene.
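The arithmetic behind a hard minimum of this kind can be sketched as follows. This is a minimal illustration, not the authors' workflow: it assumes the minimum is simply the radiometric date minus its reported error, which reproduces the 51.57 Ma value above and is consistent with the 53.9 Ma and 51.81 Ma minima used elsewhere in this appendix. The function name `hard_minimum` and the dictionary labels are hypothetical.

```python
def hard_minimum(date_ma: float, error_ma: float, ndigits: int = 3) -> float:
    """Return the youngest age (Ma) consistent with a radiometric date.

    Assumption: the hard minimum is the date minus its reported error.
    """
    if error_ma < 0:
        raise ValueError("error must be non-negative")
    # Round to suppress floating-point noise in the subtraction.
    return round(date_ma - error_ma, ndigits)


# (date, error) pairs in Ma, as quoted in this appendix.
radiometric_dates = {
    "K-spar tuff, 40Ar/39Ar (A.12)": (51.66, 0.09),
    "Fur Formation, layer +19 (A.14, A.15)": (54.04, 0.14),
    "Tuff FQ-1, Fossil Butte Member (A.16, A.19, A.21)": (51.97, 0.16),
}

for label, (date, err) in radiometric_dates.items():
    print(f"{label}: hard minimum = {hard_minimum(date, err)} Ma")
```

Not every minimum in this appendix follows this pattern: several are taken from the top of a biozone or magnetochron rather than from a dated tuff, so the subtraction applies only where a radiometric date with an error term is reported.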
A.13. Calibrated Node: Crown Spheniscidae (MRCA extant Spheniscidae)
Fossil Specimen: Madrynornis mirandus MEF-PV 100
Phylogenetic Justification: Madrynornis mirandus was originally considered to be closely related to Eudyptes [187,188] and later considered to possibly represent the sister taxon to crown Spheniscidae [189,190]. Re-study of the holotype has revealed new character evidence and the most recent phylogenetic analysis suggested that Madrynornis is instead more closely related to Spheniscus and Eudyptula, though support was weak for this hypothesis (trees placing the fossil with Eudyptes were only one step longer). Nevertheless, seven synapomorphies support crown status for Madrynornis, most compellingly the widely separated fossa temporalis, elongate processus retroarticularis, and small foramen ilioischiadicum [191]. Given the strong evidence that Madrynornis is a crown penguin and the lingering uncertainty over the precise placement of this taxon, we use Madrynornis as a calibration for the penguin crown group.
Minimum Age Constraint: 9.2 Ma
Maximum Age Constraint: 27 Ma
Age Justification: The single known specimen of Madrynornis mirandus, comprising most of a skeleton, was collected from the “Entrerriense” sequence of the Puerto Madryn Formation [187]. This sequence was deposited at 10.0 ± 0.3 Ma based on 87Sr/86Sr dates obtained from fossil mollusks [192]. The maximum age is based on the age of the Kokoamu Greensand of New Zealand, a unit that together with the overlying Otekaike Limestone has yielded no fewer than ten penguin species. All of these species, as well as those from other Oligocene units in Australia and South America, are stem penguins. The extensive global record of stem penguins demonstrates strong preservation potential for total group penguins, so the lack of any potential crown penguins in the Paleogene is most likely due to true absence rather than being an artifact of the fossil record.
A.14. Calibrated Node: Stem Apodiformes (Apodiformes–Aegothelidae split)
Fossil Specimen: Eocypselus vincenti MGUH 29278
Phylogenetic Justification: Phylogenetic analyses have consistently placed Eocypselus as the sister taxon to extant Apodiformes [160,193]. Eocypselus is supported as a member of Pan-Apodiformes by two unambiguous synapomorphies: an abbreviated humerus and an ossified arcus extensorius of the tarsometatarsus, while monophyly of crown Apodiformes to the exclusion of Eocypselus is supported by eight additional characters [160].
Minimum Age Constraint: 53.9 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The minimum age is based on a 54.04 ± 0.14 Ma radiometric date reported for layer +19 of the Fur Formation [169]. The maximum age is set as the K–Pg boundary. The earliest records of Strisores are all late Eocene in age (e.g., stem podargids, stem nyctibiids, etc.), and the divergence between Apodiformes and Aegothelidae is deeply nested within this clade [37,38]. Although Strisores have been hypothesized to be the sister taxon to all other neoavians [38], the lack of Late Cretaceous Neoaves and the deeply nested position of Pan-Apodiformes within Strisores argue against a Cretaceous origin.
A.15. Calibrated Node: Stem Threskiornithidae (Threskiornithidae–Pelecanidae/Ardeidae split)
Fossil Specimen: Rhynchaeites sp. MGUH 20288
Phylogenetic Justification: Multiple apomorphies support the placement of Rhynchaeites within the total clade Threskiornithidae [194]. Although the characteristic ibis-type bill is not preserved in MGUH 20288, derived characteristics of the hindlimb support assignment to Rhynchaeites as well as placement along the stem lineage of Threskiornithidae for this specimen [167].
Minimum Age Constraint: 53.9 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The minimum age is based on a 54.04 ± 0.14 Ma radiometric date reported for layer +19 of the Fur Formation [169]. The latest Cretaceous is set as the soft maximum, corresponding to the age range of the oldest neognathous bird Vegavis. No members of the core waterbird clade Aequornithes are known from Cretaceous deposits, indicating it is unlikely the highly nested divergence between ibises and other waterbirds occurred before the Paleocene.
A.16. Calibrated Node: Stem Musophagiformes (Musophagiformes–Otidiformes split)
Fossil Specimen: Foro panarium USNM 336261
Phylogenetic Justification: F. panarium was supported as the sister taxon to crown Musophagidae on the basis of phylogenetic analyses employing multiple alternative backbone constraints [195]. Character states resolving as unambiguous synapomorphies of an exclusive Musophagidae+Foro clade include a furcula unfused at its midline, large tubercula praeacetabularia of the pelvis, os carpi ulnare with crus longum greatly abbreviated, and bill short and stout with broad processus maxillaris of os nasale [135].
Minimum Age Constraint: 51.81 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The fossil is from the Fossil Butte Member, Green River Formation, Wyoming, USA. These deposits are late early Eocene in age, and multicrystal analyses (sanidine) from a K-feldspar tuff (FQ-1) at the top of the middle unit of the Fossil Butte Member, from Fossil–Fowkes Basin (locality: N41°47′32.2″ W110°42′39.6″) have yielded an age of 51.97 ± 0.16 Ma [159]. The latest Cretaceous is set as the maximum, corresponding to the age range of the oldest neognathous bird Vegavis. Foro panarium is easily the oldest known well supported member of Otidimorphae, indicating that although an Otidimorphae ghost lineage must extend earlier into the Paleogene, a Cretaceous divergence among crown Otidimorphae is unlikely.
A.17. Calibrated Node: Stem Coliiformes (Coliiformes–Cavitaves split)
Fossil Specimen: Tsidiiyazhi abini NMMNH P-54128
Phylogenetic Justification: Combined analyses by Ksepka et al. [196] recovered Tsidiiyazhi abini as a stem mousebird, regardless of whether only morphological data are considered or the relationships of extant taxa are constrained using various topologies recovered by recent large-scale molecular studies.
Minimum Age Constraint: 62.221 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The fossil was collected from the Ojo Encino Member of the Nacimiento Formation. This horizon falls within magnetochron C27N, constraining the absolute geochronological age to 62.221–62.517 Ma. The latest Cretaceous is set as the maximum, corresponding to the age range of the oldest neognathous bird Vegavis. No members of the “landbird” clade Telluraves are known from Cretaceous deposits, indicating it is unlikely the Coliiformes–Cavitaves divergence had occurred before the Paleocene.
A.18. Calibrated Node: Crown Piciformes (MRCA extant Piciformes)
Fossil Specimen: Rupelramphastoides knopfi SMF Av 500
Phylogenetic Justification: Mayr [197,198] provided evidence from synapomorphic features of the tarsometatarsus and ulna that clearly support placement of this fossil within total clade Pici. However, uncertainty remains over whether this taxon belongs within the crown Pici or is outside this clade. Conservatively, it is used as a calibration for the Pici–Galbulae split.
Minimum Age Constraint: 31 Ma
Maximum Age Constraint: 58.5 Ma
Age Justification: The fossil is from Frauenweiler, Germany. The Frauenweiler locality was considered MP22 (32 Ma) by Micklich and Hildebrandt [199]. In order to set a hard minimum, the top of MP22 at 31 Ma [165] was used. The maximum is based on the oldest described member of Afroaves, the Paleocene owl Ogygoptynx wetmorei.
A.19. Calibrated Node: Stem Coracii (Coracioidea–Meropidae split)
Fossil Specimen: Primobucco mcgrewi USNM 336484
Phylogenetic Justification: Phylogenetic analyses place Primobucco mcgrewi along the stem lineage leading to the clade Coracioidea (rollers and ground rollers) [200,201]. This is consistent with the hypothesis originally proposed by Houde and Olson [202].
Minimum Age Constraint: 51.81 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The fossil is from the Fossil Butte Member, Green River Formation, Wyoming, USA. These deposits are late early Eocene, and multicrystal analyses (sanidine) from a K-feldspar tuff (FQ-1) at the top of the middle unit of the Fossil Butte Member, from Fossil–Fowkes Basin (locality: N41°47′32.2″ W110°42′39.6″) have yielded an age of 51.97 ± 0.16 Ma [159]. The latest Cretaceous is set as the maximum, corresponding to the age range of the oldest neognathous bird Vegavis. No members of the “landbird” clade Telluraves are known from Cretaceous deposits, indicating it is unlikely the highly nested Coracioidea–Meropidae divergence had occurred before the Paleocene.
A.20. Calibrated Node: Stem Todidae (Todidae–Momotidae/Alcedinidae split)
Fossil Specimen: Palaeotodus itardiensis SMF Av505
Phylogenetic Justification: Mayr and Knopf [203] identified derived characters of Todidae including the scapi clavicularum of the furcula being very thin, the proximal end of the humerus reaching far ventrally and being inflected so that almost the entire caput humeri is situated farther ventrally than the ventral margin of the shaft, a carpometacarpus with a large processus intermetacarpalis, a greatly elongated and slender tarsometatarsus measuring almost the length of the humerus, and the plantar surface of trochlea metatarsi III bearing a marked sulcus.
Minimum Age Constraint: 31 Ma
Maximum Age Constraint: 55 Ma
Age Justification: The fossil is from Frauenweiler south of Wiesloch (Baden-Württemberg, Germany), former clay pit of the Bott-Eder GmbH (“Grube Unterfeld”). The Frauenweiler locality was considered MP22 (32 Ma) by Micklich and Hildebrandt [199]. The top of MP22 at 31 Ma [165] was used to set a hard minimum. The oldest reported Coraciiformes (sensu Yuri et al. [87]) are from the early Eocene. Given this limit and the absence of Todidae in Lagerstätten such as the Green River, Messel, London Clay, and Fur Formations which otherwise preserve an abundance of small birds, a maximum of 55 Ma is specified.
A.21. Calibrated Node: Psittacopasserae (Psittaciformes and Passeriformes split)
Fossil Taxon: Eozygodactylus americanus
Specimen: USNM 299821, partial articulated skeleton
Phylogenetic Justification: Phylogenetic analyses have consistently recovered Zygodactylidae (including Eozygodactylus) as stem passerines [204,205]. This relationship is supported by a suite of characters including the presence of a large processus intermetacarpalis of the carpometacarpus, great elongation of the tarsometatarsus (exceeding length of humerus) and presence of a crista plantaris lateralis of the tarsometatarsus [204], and has been recovered both by analyses of morphological data alone and those in which molecular scaffolds are enforced.
Minimum Age Constraint: 51.81 Ma
Maximum Age Constraint: 66.5 Ma
Age Justification: The minimum age is based on the lower bound of the age for the fossiliferous horizons of the Fossil Butte Member of the Green River Formation. These deposits are late early Eocene, and multicrystal analyses (sanidine) from a K-feldspar tuff (FQ-1) at the top of the middle unit of the Fossil Butte Member have yielded an age of 51.97 ± 0.16 Ma [159]. Slightly older potential records of Zygodactylidae from the London Clay Formation and Fur Formation have been referenced [206,207,208], but because the former are isolated bones and the latter are not yet formally described, we conservatively rely on the more complete and well documented Green River Formation specimens for the hard minimum age. The latest Cretaceous is set as the maximum, corresponding to the age range of the oldest confirmed crown bird fossil Vegavis. No members of the Psittacopasserae or the more inclusive clade Telluraves (“higher land birds”) are known from Cretaceous deposits, indicating it is extremely unlikely that the highly nested parrot-songbird divergence had occurred before the Paleocene.
A.22. Calibrated Node: Crown Eupasseres
Fossil Specimen: Suboscines indet. SMF Av 504
Phylogenetic Justification: The presence of a distally protruding finger-like process at the cranial edge of metacarpal III is an apomorphy supporting assignment of SMF Av 504 to at least the stem suboscine lineage [209]. Additionally, the hatchet-shaped phalanx II-1 is similar to suboscines and differs from oscines, Acanthisittidae, and Zygodactylus luberonensis. This feature is potentially another apomorphy for suboscines, though its distribution has not yet been fully documented.
Minimum Age Constraint: 26 Ma
Maximum Age Constraint: 55 Ma
Age Justification: The exact horizon from which this specimen was collected was not specified, but the Luberon fossil deposits are considered to fall within the MP21–MP25 age range according to Manegold [210]. We conservatively use the minimum age of MP25 as a hard minimum date (see Figure 28.10 of [165]). The oldest reported stem Passeriformes are from the early Eocene. Furthermore, no crown Passeriformes of any type are found in Eocene deposits such as the Green River Formation, Messel Formation, London Clay Formation, or Fur Formation, each of which otherwise preserves an abundance of small bird fossils. These deposits are all from the Northern Hemisphere. Eupasseres appear to have originated in the Southern Hemisphere, which has a much poorer fossil record for small birds. Nevertheless, several Oligocene–Miocene fossils of early members of Eupasseres have been described from European deposits, which indicate that the clade was not entirely restricted to the Southern Hemisphere early in its evolutionary history. These include the Oligocene fossil Wieslochia weissi (a possible stem suboscine or perhaps a lineage just outside Eupasseres) [211], a potential Miocene record of the suboscine lineage Eurylaimidae [212], and several Miocene tarsometatarsi that retain plesiomorphic features suggesting they represent an extinct lineage outside of Eupasseres [213]. Thus, the Early Eocene provides a conservative maximum age.
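For readers who want to reuse the bounds listed in A.10–A.22, the following sketch stores them as simple records and checks whether a set of node ages falls inside each minimum–maximum window. It is an illustration only: the class `Calibration`, the helper `out_of_bounds`, and the example node ages are hypothetical, and the sketch does not represent how the constraints were encoded as priors in the dating analysis, which this appendix does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    node: str
    min_ma: float  # minimum age constraint (Ma)
    max_ma: float  # maximum age constraint (Ma)

    def __post_init__(self):
        if not 0 <= self.min_ma < self.max_ma:
            raise ValueError(f"{self.node}: need 0 <= min < max")

    def contains(self, age_ma: float) -> bool:
        """True if age_ma lies within [min_ma, max_ma]."""
        return self.min_ma <= age_ma <= self.max_ma


# Bounds transcribed from entries A.10 to A.22 of this appendix.
CALIBRATIONS = [
    Calibration("Stem Phalacrocoracidae", 24.82, 51.81),
    Calibration("Crown Austrodyptornithes", 60.5, 72.1),
    Calibration("Stem Fregatidae", 51.57, 66.5),
    Calibration("Crown Spheniscidae", 9.2, 27.0),
    Calibration("Stem Apodiformes", 53.9, 66.5),
    Calibration("Stem Threskiornithidae", 53.9, 66.5),
    Calibration("Stem Musophagiformes", 51.81, 66.5),
    Calibration("Stem Coliiformes", 62.221, 66.5),
    Calibration("Crown Piciformes", 31.0, 58.5),
    Calibration("Stem Coracii", 51.81, 66.5),
    Calibration("Stem Todidae", 31.0, 55.0),
    Calibration("Psittacopasserae", 51.81, 66.5),
    Calibration("Crown Eupasseres", 26.0, 55.0),
]
BY_NODE = {c.node: c for c in CALIBRATIONS}


def out_of_bounds(node_ages: dict) -> list:
    """Return the names of nodes whose age falls outside its calibration window."""
    return [
        node for node, age in node_ages.items()
        if not BY_NODE[node].contains(age)
    ]


# Hypothetical node ages (Ma), invented purely to exercise the check.
example_ages = {"Crown Spheniscidae": 12.0, "Stem Todidae": 60.0}
print(out_of_bounds(example_ages))
```

Treating every bound as a closed interval is a simplification: several maxima above are described as soft, so an estimated age slightly older than a maximum would not necessarily contradict the calibration.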

References

  1. McCormack, J.E.; Hird, S.M.; Zellmer, A.J.; Carstens, B.C.; Brumfield, R.T. Applications of next-generation sequencing to phylogeography and phylogenetics. Mol. Phylogenet. Evol. 2013, 66, 526–538.
  2. Delsuc, F.; Brinkmann, H.; Philippe, H. Phylogenomics and the reconstruction of the tree of life. Nat. Rev. Genet. 2005, 6, 361–375.
  3. Philippe, H.; Delsuc, F.; Brinkmann, H.; Lartillot, N. Phylogenomics. Annu. Rev. Ecol. Evol. Syst. 2005, 36, 541–562.
  4. Eisen, J.A. Phylogenomics: Improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res. 1998, 8, 163–167.
  5. Eisen, J.A.; Kaiser, D.; Myers, R.M. Gastrogenomic delights: A movable feast. Nat. Med. 1997, 3, 1076–1078.
  6. Philippe, H.; Snell, E.A.; Bapteste, E.; Lopez, P.; Holland, P.W.H.; Casane, D. Phylogenomics of eukaryotes: Impact of missing data on large alignments. Mol. Biol. Evol. 2004, 21, 1740–1752.
  7. Dunn, C.W.; Hejnol, A.; Matus, D.Q.; Pang, K.; Browne, W.E.; Smith, S.A.; Seaver, E.; Rouse, G.W.; Obst, M.; Edgecombe, G.D.; et al. Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 2008, 452, 745–749.
  8. Hackett, S.J.; Kimball, R.T.; Reddy, S.; Bowie, R.C.K.; Braun, E.L.; Braun, M.J.; Chojnowski, J.L.; Cox, W.A.; Han, K.L.; Harshman, J.; et al. A phylogenomic study of birds reveals their evolutionary history. Science 2008, 320, 1763–1768.
  9. Shen, X.X.; Zhou, X.F.; Kominek, J.; Kurtzman, C.P.; Hittinger, C.T.; Rokas, A. Reconstructing the backbone of the Saccharomycotina yeast phylogeny using genome-scale data. G3 Genes Genomes Genet. 2016, 6, 3927–3939.
  10. Ascunce, M.S.; Huguet-Tapia, J.C.; Ortiz-Urquiza, A.; Keyhani, N.O.; Braun, E.L.; Goss, E.M. Phylogenomic analysis supports multiple instances of polyphyly in the oomycete peronosporalean lineage. Mol. Phylogenet. Evol. 2017, 114, 199–211.
  11. Wu, G.A.; Terol, J.; Ibanez, V.; Lopez-Garcia, A.; Perez-Roman, E.; Borreda, C.; Domingo, C.; Tadeo, F.R.; Carbonell-Caballero, J.; Alonso, R.; et al. Genomics of the origin and evolution of citrus. Nature 2018, 554, 311.
  12. Harvey, M.G.; Smith, B.T.; Glenn, T.C.; Faircloth, B.C.; Brumfield, R.T. Sequence capture versus restriction site associated DNA sequencing for shallow systematics. Syst. Biol. 2016, 65, 910–924.
  13. Misof, B.; Liu, S.L.; Meusemann, K.; Peters, R.S.; Donath, A.; Mayer, C.; Frandsen, P.B.; Ware, J.; Flouri, T.; Beutel, R.G.; et al. Phylogenomics resolves the timing and pattern of insect evolution. Science 2014, 346, 763–767.
  14. Wickett, N.J.; Mirarab, S.; Nguyen, N.; Warnow, T.; Carpenter, E.; Matasci, N.; Ayyampalayam, S.; Barker, M.S.; Burleigh, J.G.; Gitzendanner, M.A.; et al. Phylotranscriptomic analysis of the origin and early diversification of land plants. Proc. Natl. Acad. Sci. USA 2014, 111, E4859–E4868.
  15. Gayral, P.; Weinert, L.; Chiari, Y.; Tsagkogeorga, G.; Ballenghien, M.; Galtier, N. Next-generation sequencing of transcriptomes: A guide to RNA isolation in nonmodel animals. Mol. Ecol. Resour. 2011, 11, 650–661.
  16. Andrews, K.R.; Good, J.M.; Miller, M.R.; Luikart, G.; Hohenlohe, P.A. Harnessing the power of RADseq for ecological and evolutionary genomics. Nat. Rev. Genet. 2016, 17, 81–92.
  17. Rubin, B.E.R.; Ree, R.H.; Moreau, C.S. Inferring phylogenies from RADsequence data. PLoS ONE 2012, 7, e33394.
  18. Eaton, D.A.R.; Spriggs, E.L.; Park, B.; Donoghue, M.J. Misconceptions on missing data in RAD-seq phylogenetics with a deep-scale example from flowering plants. Syst. Biol. 2017, 66, 399–412.
  19. Tin, M.M.Y.; Rheindt, F.E.; Cros, E.; Mikheyev, A.S. Degenerate adaptor sequences for detecting PCR duplicates in reduced representation sequencing data improve genotype calling accuracy. Mol. Ecol. Resour. 2015, 15, 329–336.
  20. Jones, M.R.; Good, J.M. Targeted capture in evolutionary and ecological genomics. Mol. Ecol. 2016, 25, 185–202.
  21. Hosner, P.A.; Faircloth, B.C.; Glenn, T.C.; Braun, E.L.; Kimball, R.T. Avoiding missing data biases in phylogenomic inference: An empirical study in the landfowl. Mol. Biol. Evol. 2016, 33, 1110–1125.
  22. McCormack, J.E.; Tsai, W.L.E.; Faircloth, B.C. Sequence capture of ultraconserved elements from bird museum specimens. Mol. Ecol. Resour. 2016, 16, 1189–1203.
  23. Ruane, S.; Austin, C.C. Phylogenomics using formalin-fixed and 100+ year-old intractable natural history specimens. Mol. Ecol. Resour. 2017, 17, 1003–1008.
  24. Bi, K.; Vanderpool, D.; Singhal, S.; Linderoth, T.; Moritz, C.; Good, J.M. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales. BMC Genom. 2012, 13, 403.
  25. Ali, O.A.; O’Rourke, S.M.; Amish, S.J.; Meek, M.H.; Luikart, G.; Jeffres, C.; Miller, M.R. RAD capture (Rapture): Flexible and efficient sequence-based genotyping. Genetics 2016, 202, 389–400.
  26. Hoffberg, S.L.; Kieran, T.J.; Catchen, J.M.; Devault, A.; Faircloth, B.C.; Mauricio, R.; Glenn, T.C. RADcap: Sequence capture of dual-digest RADseq libraries with identifiable duplicates and reduced missing data. Mol. Ecol. Resour. 2016, 16, 1264–1278.
  27. Glenn, T.C.; Faircloth, B.C. Capturing Darwin’s dream. Mol. Ecol. Resour. 2016, 16, 1051–1058.
  28. Zhang, G.J. Bird sequencing project takes off. Nature 2015, 522, 34.
  29. Lamichhaney, S.; Berglund, J.; Almen, M.S.; Maqbool, K.; Grabherr, M.; Martinez-Barrio, A.; Promerova, M.; Rubin, C.J.; Wang, C.; Zamani, N.; et al. Evolution of Darwin’s finches and their beaks revealed by genome sequencing. Nature 2015, 518, 371–375.
  30. Seki, R.; Li, C.; Fang, Q.; Hayashi, S.; Egawa, S.; Hu, J.; Xu, L.H.; Pan, H.L.; Kondo, M.; Sato, T.; et al. Functional roles of Aves class-specific cis-regulatory elements on macroevolution of bird-specific features. Nat. Commun. 2017, 8, 14229.
  31. Thomas, G.H. Evolution: An avian explosion. Nature 2015, 526, 516–517.
  32. Moyle, R.G.; Filardi, C.E.; Smith, C.E.; Diamond, J. Explosive Pleistocene diversification and hemispheric expansion of a “Great speciator”. Proc. Natl. Acad. Sci. USA 2009, 106, 1863–1868.
  33. Jonsson, K.A.; Fabre, P.H.; Ricklefs, R.E.; Fjeldsa, J. Major global radiation of corvoid birds originated in the proto-Papuan archipelago. Proc. Natl. Acad. Sci. USA 2011, 108, 2328–2333.
  34. Kimball, R.T.; Braun, E.L. Does more sequence data improve estimates of galliform phylogeny? Analyses of a rapid radiation using a complete data matrix. PeerJ 2014, 2, e361.
  35. Provost, K.L.; Joseph, L.; Smith, B.T. Resolving a phylogenetic hypothesis for parrots: Implications from systematics to conservation. Emu 2018, 118, 7–21.
  36. McCormack, J.E.; Harvey, M.G.; Faircloth, B.C.; Crawford, N.G.; Glenn, T.C.; Brumfield, R.T. A phylogeny of birds based on over 1,500 loci collected by target enrichment and high-throughput sequencing. PLoS ONE 2013, 8, e54848.
  37. Jarvis, E.D.; Mirarab, S.; Aberer, A.J.; Li, B.; Houde, P.; Li, C.; Ho, S.Y.W.; Faircloth, B.C.; Nabholz, B.; Howard, J.T.; et al. Whole-genome analyses resolve early branches in the tree of life of modern birds. Science 2014, 346, 1320–1331.
  38. Prum, R.O.; Berv, J.S.; Dornburg, A.; Field, D.J.; Townsend, J.P.; Lemmon, E.M.; Lemmon, A.R. A comprehensive phylogeny of birds (Aves) using targeted next-generation DNA sequencing. Nature 2015, 526, 569–573.
  39. Reddy, S.; Kimball, R.T.; Pandey, A.; Hosner, P.A.; Braun, M.J.; Hackett, S.J.; Han, K.L.; Harshman, J.; Huddleston, C.J.; Kingston, S.; et al. Why do phylogenomic data sets yield conflicting trees? Data type influences the avian tree of life more than taxon sampling. Syst. Biol. 2017, 66, 857–879.
  40. Sun, K.P.; Meiklejohn, K.A.; Faircloth, B.C.; Glenn, T.C.; Braun, E.L.; Kimball, R.T. The evolution of peafowl and other taxa with ocelli (eyespots): A phylogenomic approach. Proc. R. Soc. B Biol. Sci. 2014, 281, 20140823.
  41. Moyle, R.G.; Oliveros, C.H.; Andersen, M.J.; Hosner, P.A.; Benz, B.W.; Manthey, J.D.; Travers, S.L.; Brown, R.M.; Faircloth, B.C. Tectonic collision and uplift of Wallacea triggered the global songbird radiation. Nat. Commun. 2016, 7, 1–7.
  42. Hosner, P.A.; Tobias, J.A.; Braun, E.L.; Kimball, R.T. How do seemingly non-vagile clades accomplish trans-marine dispersal? Trait and dispersal evolution in the landfowl. Proc. R. Soc. B Biol. Sci. 2017, 284, 20170210.
  43. Burleigh, J.G.; Kimball, R.T.; Braun, E.L. Building the avian tree of life using a large-scale sparse supermatrix. Mol. Phylogenet. Evol. 2015, 84, 53–63.
  44. Brown, J.W.; Wang, N.; Smith, S.A. The development of scientific consensus: Analyzing conflict and concordance among avian phylogenies. Mol. Phylogenet. Evol. 2017, 116, 69–77.
  45. Driskell, A.C.; Ane, C.; Burleigh, J.G.; McMahon, M.M.; O’Meara, B.C.; Sanderson, M.J. Prospects for building the tree of life from large sequence databases. Science 2004, 306, 1172–1174.
  46. de Queiroz, A.; Gatesy, J. The supermatrix approach to systematics. Trends Ecol. Evol. 2007, 22, 34–41.
  47. Goloboff, P.A.; Catalano, S.A.; Miranda, J.M.; Szumik, C.A.; Arias, J.S.; Kallersjo, M.; Farris, J.S. Phylogenetic analysis of 73 060 taxa corroborates major eukaryotic groups. Cladistics 2009, 25, 211–230.
  48. Sanderson, M.J.; Purvis, A.; Henze, C. Phylogenetic supertrees: Assembling the trees of life. Trends Ecol. Evol. 1998, 13, 105–109.
  49. Bininda-Emonds, O.R.P.; Cardillo, M.; Jones, K.E.; MacPhee, R.D.E.; Beck, R.M.D.; Grenyer, R.; Price, S.A.; Vos, R.A.; Gittleman, J.L.; Purvis, A. The delayed rise of present-day mammals. Nature 2007, 446, 507–512.
  50. Cotton, J.A.; Wilkinson, M. Supertrees join the mainstream of phylogenetics. Trends Ecol. Evol. 2009, 24, 1–3.
  51. Warnow, T. Supertree construction: Opportunities and challenges. arXiv 2018, arXiv:1805.03530.
  52. Smith, S.A.; Beaulieu, J.M.; Donoghue, M.J. Mega-phylogeny approach for comparative biology: An alternative to supertree and supermatrix approaches. BMC Evol. Biol. 2009, 9, 37.
  53. Thomas, G.H.; Hartmann, K.; Jetz, W.; Joy, J.B.; Mimoto, A.; Mooers, A.O. Pastis: An R package to facilitate phylogenetic assembly with soft taxonomic inferences. Methods Ecol. Evol. 2013, 4, 1011–1017.
  54. Jetz, W.; Thomas, G.H.; Joy, J.B.; Hartmann, K.; Mooers, A.O. The global diversity of birds in space and time. Nature 2012, 491, 444–448.
  55. Faircloth, B.C.; McCormack, J.E.; Crawford, N.G.; Harvey, M.G.; Brumfield, R.T.; Glenn, T.C. Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Syst. Biol. 2012, 61, 717–726.
  56. Baker, A.J.; Haddrath, O.; McPherson, J.D.; Cloutier, A. Genomic support for a moa-tinamou clade and adaptive morphological convergence in flightless ratites. Mol. Biol. Evol. 2014, 31, 1686–1696.
  57. Bryson, R.W.; Faircloth, B.C.; Tsai, W.L.E.; McCormack, J.E.; Klicka, J. Target enrichment of thousands of ultraconserved elements sheds new light on early relationships within new world sparrows (Aves: Passerellidae). Auk 2016, 133, 451–458.
  58. Hosner, P.A.; Braun, E.L.; Kimball, R.T. Rapid and recent diversification of curassows, guans, and chachalacas (Galliformes: Cracidae) out of Mesoamerica: Phylogeny inferred from mitochondrial, intron, and ultraconserved element sequences. Mol. Phylogenet. Evol. 2016, 102, 320–330.
  59. Manthey, J.D.; Campillo, L.C.; Burns, K.J.; Moyle, R.G. Comparison of target-capture and restriction-site associated DNA sequencing for phylogenomics: A test in cardinalid tanagers (Aves, genus: Piranga). Syst. Biol. 2016, 65, 640–650.
  60. Meiklejohn, K.A.; Faircloth, B.C.; Glenn, T.C.; Kimball, R.T.; Braun, E.L. Analysis of a rapid evolutionary radiation using ultraconserved elements: Evidence for a bias in some multispecies coalescent methods. Syst. Biol. 2016, 65, 612–627.
  61. Ottenburghs, J.; Megens, H.J.; Kraus, R.H.S.; Madsen, O.; van Hooft, P.; van Wieren, S.E.; Crooijmans, R.P.M.A.; Ydenberg, R.C.; Groenen, M.A.M.; Prins, H.H.T. A tree of geese: A phylogenomic perspective on the evolutionary history of true geese. Mol. Phylogenet. Evol. 2016, 101, 303–313.
  62. Persons, N.W.; Hosner, P.A.; Meiklejohn, K.A.; Braun, E.L.; Kimball, R.T. Sorting out relationships among the grouse and ptarmigan using intron, mitochondrial, and ultra-conserved element sequences. Mol. Phylogenet. Evol. 2016, 98, 123–132.
  63. Zarza, E.; Faircloth, B.C.; Tsai, W.L.E.; Bryson, R.W.; Klicka, J.; McCormack, J.E. Hidden histories of gene flow in highland birds revealed with genomic markers. Mol. Ecol. 2016, 25, 5144–5157.
  64. Burga, A.; Wang, W.G.; Ben-David, E.; Wolf, P.C.; Ramey, A.M.; Verdugo, C.; Lyons, K.; Parker, P.G.; Kruglyak, L. A genetic signature of the evolution of loss of flight in the Galapagos cormorant. Science 2017, 356, eaal3345.
  65. Wang, N.; Hosner, P.A.; Liang, B.; Braun, E.L.; Kimball, R.T. Historical relationships of three enigmatic phasianid genera (Aves: Galliformes) inferred using phylogenomic and mitogenomic data. Mol. Phylogenet. Evol. 2017, 109, 217–225.
  66. White, N.D.; Mitter, C.; Braun, M.J. Ultraconserved elements resolve the phylogeny of potoos (Aves: Nyctibiidae). J. Avian Biol. 2017, 48, 872–880.
  67. Yonezawa, T.; Segawa, T.; Mori, H.; Campos, P.F.; Hongoh, Y.; Endo, H.; Akiyoshi, A.; Kohno, N.; Nishida, S.; Wu, J.Q.; et al. Phylogenomics and morphology of extinct paleognaths reveal the origin and evolution of the ratites. Curr. Biol. 2017, 27, 68–77.
  68. Andersen, M.J.; McCullough, J.M.; Mauck, W.M.; Smith, B.T.; Moyle, R.G. A phylogeny of kingfishers reveals an Indomalayan origin and elevated rates of diversification on oceanic islands. J. Biogeogr. 2018, 45, 269–281.
  69. Bruxaux, J.; Gabrielli, M.; Ashari, H.; Prys-Jones, R.; Joseph, L.; Mila, B.; Besnard, G.; Thebaud, C. Recovering the evolutionary history of crowned pigeons (Columbidae: Goura): Implications for the biogeography and conservation of New Guinean lowland birds. Mol. Phylogenet. Evol. 2018, 120, 248–258.
  70. Campillo, L.C.; Oliveros, C.H.; Sheldon, F.H.; Moyle, R.G. Genomic data resolve gene tree discordance in spiderhunters (Nectariniidae, Arachnothera). Mol. Phylogenet. Evol. 2018, 120, 151–157.
  71. Chen, D.; Braun, E.L.; Forthman, M.; Kimball, R.T.; Zhang, Z.W. A simple strategy for recovering ultraconserved elements, exons, and introns from low coverage shotgun sequencing of museum specimens: Placement of the partridge genus Tropicoperdix within the Galliformes. Mol. Phylogenet. Evol. 2018, 129, 304–314.
  72. Musher, L.J.; Cracraft, J. Phylogenomics and species delimitation of a complex radiation of Neotropical suboscine birds (Pachyramphus). Mol. Phylogenet. Evol. 2018, 118, 204–221.
  73. Smith, B.T.; Mauck III, W.M.; Benz, B.; Andersen, M.J. Uneven missing data skews phylogenomic relationships within the lories and lorikeets. bioRxiv 2018. [Google Scholar] [CrossRef]
  74. Younger, J.L.; Strozier, L.; Maddox, J.D.; Nyari, A.S.; Bonfitto, M.T.; Raherilalao, M.J.; Goodman, S.M.; Reddy, S. Hidden diversity of forest birds in Madagascar revealed using integrative taxonomy. Mol. Phylogenet. Evol. 2018, 124, 16–26. [Google Scholar] [CrossRef] [PubMed]
  75. Sackton, T.; Grayson, P.; Cloutier, A.; Hu, Z.; Liu, J.; Wheeler, N.; Gardner, P.; Clarke, J.; Baker, A.; Clamp, M.; et al. Convergent regulatory evolution and loss of flight in paleognathous birds. Science 2019, 364, 74. [Google Scholar] [CrossRef] [PubMed]
  76. Dobrin, B.H.; Zwickl, D.J.; Sanderson, M.J. The prevalence of terraced treescapes in analyses of phylogenetic data sets. BMC Evol. Biol. 2018, 18, 46. [Google Scholar] [CrossRef] [PubMed]
  77. Sanderson, M.J.; McMahon, M.M.; Stamatakis, A.; Zwickl, D.J.; Steel, M. Impacts of terraces on phylogenetic inference. Syst. Biol. 2015, 64, 709–726. [Google Scholar] [CrossRef]
  78. Sanderson, M.J.; McMahon, M.M.; Steel, M. Terraces in phylogenetic tree space. Science 2011, 333, 448–450. [Google Scholar] [CrossRef]
  79. Kimball, R.T.; Braun, E.L.; Barker, F.K.; Bowie, R.C.K.; Braun, M.J.; Chojnowski, J.L.; Hackett, S.J.; Han, K.L.; Harshman, J.; Heimer-Torres, V.; et al. A well-tested set of primers to amplify regions spread across the avian genome. Mol. Phylogenet. Evol. 2009, 50, 654–660. [Google Scholar] [CrossRef]
  80. Gill, F.; Donsker, D. IOC World Bird List, 7.3. Available online: https://www.worldbirdnames.org/ (accessed on 5 August 2017).
  81. Baum, B.R. Combining trees as a way of combining data sets for phylogenetic inference, and the desirability of combining gene trees. Taxon 1992, 41, 3–10. [Google Scholar] [CrossRef]
  82. Ragan, M.A. Matrix representation in reconstructing phylogenetic-relationships among the eukaryotes. Biosystems 1992, 28, 47–55. [Google Scholar] [CrossRef]
  83. Creevey, C.J.; McInerney, J.O. Clann: Investigating phylogenetic information through supertree analyses. Bioinformatics 2005, 21, 390–392. [Google Scholar] [CrossRef] [PubMed]
  84. Swofford, D.L. PAUP*. Available online: http://paup.phylosolutions.com/ (accessed on 2 August 2018).
  85. Nguyen, L.T.; Schmidt, H.A.; von Haeseler, A.; Minh, B.Q. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 2015, 32, 268–274. [Google Scholar] [CrossRef] [PubMed]
  86. Nixon, K.C. The parsimony ratchet, a new method for rapid parsimony analysis. Cladistics 1999, 15, 407–414. [Google Scholar] [CrossRef]
  87. Yuri, T.; Kimball, R.T.; Harshman, J.; Bowie, R.C.; Braun, M.J.; Chojnowski, J.L.; Han, K.L.; Hackett, S.J.; Huddleston, C.J.; Moore, W.S.; et al. Parsimony and model-based analyses of indels in avian nuclear genes reveal congruent and incongruent phylogenetic signals. Biology 2013, 2, 419–444. [Google Scholar] [CrossRef] [PubMed]
  88. Nguyen, N.; Mirarab, S.; Warnow, T. MRL and superfine plus MRL: New supertree methods. Algorithms Mol. Biol. 2012, 7, 3. [Google Scholar]
  89. Cavender, J.A. Taxonomy with confidence. Math. Biosci. 1978, 40, 271–280. [Google Scholar] [CrossRef]
  90. Farris, J.S. Probability model for inferring evolutionary trees. Syst. Zool. 1973, 22, 250–256. [Google Scholar] [CrossRef]
  91. Neyman, J. Molecular studies of evolution: A source of novel statistical problems. In Statistical Decision Theory and Related Topics; Gupta, S.S., Yackel, J., Eds.; Academic Press: New York, NY, USA, 1971; pp. 1–27. [Google Scholar]
  92. Bininda-Emonds, O.R.P. Novel versus unsupported clades: Assessing the qualitative support for clades in MRP supertrees. Syst. Biol. 2003, 52, 839–848. [Google Scholar] [PubMed]
  93. Wilkinson, M.; Pisani, D.; Cotton, J.A.; Corfe, I. Measuring support and finding unsupported relationships in supertrees. Syst. Biol. 2005, 54, 823–831. [Google Scholar] [CrossRef] [PubMed]
  94. Burleigh, J.G.; Driskell, A.C.; Sanderson, M.J. Supertree bootstrapping methods for assessing phylogenetic variation among genes in genome-scale data sets. Syst. Biol. 2006, 55, 426–440. [Google Scholar] [CrossRef] [PubMed]
  95. Moore, B.R.; Smith, S.A.; Donoghue, M.J. Increasing data transparency and estimating phylogenetic uncertainty in supertrees: Approaches using nonparametric bootstrapping. Syst. Biol. 2006, 55, 662–676. [Google Scholar] [CrossRef] [PubMed]
  96. Anisimova, M.; Gascuel, O. Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst. Biol. 2006, 55, 539–552. [Google Scholar] [CrossRef] [PubMed]
  97. Anisimova, M.; Gil, M.; Dufayard, J.F.; Dessimoz, C.; Gascuel, O. Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. Syst. Biol. 2011, 60, 685–699. [Google Scholar] [CrossRef] [PubMed]
  98. Robinson, D.F.; Foulds, L.R. Comparison of phylogenetic trees. Math. Biosci. 1981, 53, 131–147. [Google Scholar] [CrossRef]
  99. Benson, D.A.; Cavanaugh, M.; Clark, K.; Karsch-Mizrachi, I.; Ostell, J.; Pruitt, K.D.; Sayers, E.W. GenBank. Nucleic Acids Res. 2018, 46, D41–D47. [Google Scholar] [CrossRef] [PubMed]
  100. Meiklejohn, K.A.; Danielson, M.J.; Faircloth, B.C.; Glenn, T.C.; Braun, E.L.; Kimball, R.T. Incongruence among different mitochondrial regions: A case study using complete mitogenomes. Mol. Phylogenet. Evol. 2014, 78, 314–323. [Google Scholar] [CrossRef]
  101. Tamashiro, R.A.; White, N.D.; Braun, M.J.; Faircloth, B.C.; Braun, E.L.; Kimball, R.T. What are the roles of taxon sampling and model fit in tests of cyto-nuclear discordance using avian mitogenomic data? Mol. Phylogenet. Evol. 2019, 130, 132–142. [Google Scholar] [CrossRef]
  102. Ho, S.Y.W.; Duchene, S. Molecular-clock methods for estimating evolutionary rates and timescales. Mol. Ecol. 2014, 23, 5947–5965. [Google Scholar] [CrossRef]
  103. Smith, S.A.; O’Meara, B.C. treePL: Divergence time estimation using penalized likelihood for large phylogenies. Bioinformatics 2012, 28, 2689–2690. [Google Scholar] [CrossRef]
  104. Sanderson, M.J. Estimating absolute rates of molecular evolution and divergence times: A penalized likelihood approach. Mol. Biol. Evol. 2002, 19, 101–109. [Google Scholar] [CrossRef] [PubMed]
  105. Parham, J.F.; Donoghue, P.C.J.; Bell, C.J.; Calway, T.D.; Head, J.J.; Holroyd, P.A.; Inoue, J.G.; Irmis, R.B.; Joyce, W.G.; Ksepka, D.T.; et al. Best practices for justifying fossil calibrations. Syst. Biol. 2012, 61, 346–359. [Google Scholar] [CrossRef] [PubMed]
  106. Claramunt, S.; Cracraft, J. A new time tree reveals earth history’s imprint on the evolution of modern birds. Sci. Adv. 2015, 1, 13. [Google Scholar] [CrossRef] [PubMed]
  107. Cracraft, J.; Houde, P.; Ho, S.Y.W.; Mindell, D.P.; Fjeldsa, J.; Lindow, B.; Edwards, S.V.; Rahbek, C.; Mirarab, S.; Warnow, T.; et al. Response to comment on “Whole-genome analyses resolve early branches in the tree of life of modern birds”. Science 2015, 349, 3. [Google Scholar] [CrossRef] [PubMed]
  108. Bininda-Emonds, O.R.P. Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life; Kluwer Academic Publishers: Dordrecht, The Netherlands, 2004; Volume 4. [Google Scholar]
  109. Hinchliff, C.E.; Smith, S.A. Some limitations of public sequence data for phylogenetic inference (in plants). PLoS ONE 2014, 9, e98986. [Google Scholar] [CrossRef]
  110. Philippe, H.; de Vienne, D.M.; Ranwez, V.; Roure, B.; Baurain, D.; Delsuc, F. Pitfalls in supermatrix phylogenomics. Eur. J. Taxon. 2017, 283, 1–25. [Google Scholar] [CrossRef]
  111. Goloboff, P.A. Parsimony, likelihood, and simplicity. Cladistics 2003, 19, 91–103. [Google Scholar] [CrossRef]
  112. Sanderson, M.J.; Kim, J. Parametric phylogenetics? Syst. Biol. 2000, 49, 817–829. [Google Scholar] [CrossRef]
  113. Redelings, B.D.; Holder, M.T. A supertree pipeline for summarizing phylogenetic and taxonomic information for millions of species. PeerJ 2017, 5, e3058. [Google Scholar] [CrossRef]
  114. Swenson, M.S.; Suri, R.; Linder, C.R.; Warnow, T. SuperFine: Fast and accurate supertree estimation. Syst. Biol. 2012, 61, 214–227. [Google Scholar] [CrossRef]
  115. Gatesy, J.; Matthee, C.; DeSalle, R.; Hayashi, C. Resolution of a supertree/supermatrix paradox. Syst. Biol. 2002, 51, 652–664. [Google Scholar] [CrossRef] [PubMed]
  116. Gatesy, J.; Baker, R.H. Hidden likelihood support in genomic data: Can forty-five wrongs make a right? Syst. Biol. 2005, 54, 483–492. [Google Scholar] [CrossRef] [PubMed]
  117. Gatesy, J.; O’Grady, P.; Baker, R.H. Corroboration among data sets in simultaneous analysis: Hidden support for phylogenetic relationships among higher level artiodactyl taxa. Cladistics 1999, 15, 271–313. [Google Scholar] [CrossRef]
  118. Hinchliff, C.E.; Smith, S.A.; Allman, J.F.; Burleigh, J.G.; Chaudhary, R.; Coghill, L.M.; Crandall, K.A.; Deng, J.; Drew, B.T.; Gazis, R.; et al. Synthesis of phylogeny and taxonomy into a comprehensive tree of life. Proc. Natl. Acad. Sci. USA 2015, 112, 12764–12769. [Google Scholar] [CrossRef] [PubMed]
  119. Barker, F.K.; Burns, K.J.; Klicka, J.; Lanyon, S.M.; Lovette, I.J. New insights into new world biogeography: An integrated view from the phylogeny of blackbirds, cardinals, sparrows, tanagers, warblers, and allies. Auk 2015, 132, 333–348. [Google Scholar] [CrossRef]
  120. Hosner, P.A.; Braun, E.L.; Kimball, R.T. Land connectivity changes and global cooling shaped the colonization history and diversification of New World quail (Aves: Galliformes: Odontophoridae). J. Biogeogr. 2015, 42, 1883–1895. [Google Scholar] [CrossRef]
  121. Wang, N.; Kimball, R.T.; Braun, E.L.; Liang, B.; Zhang, Z.W. Ancestral range reconstruction of Galliformes: The effects of topology and taxon sampling. J. Biogeogr. 2017, 44, 122–135. [Google Scholar] [CrossRef]
  122. Townsend, J.P. Profiling phylogenetic informativeness. Syst. Biol. 2007, 56, 222–231. [Google Scholar] [CrossRef] [PubMed]
  123. Duchêne, D.A.; Duchêne, S.; Ho, S.Y. Differences in performance among test statistics for assessing phylogenomic model adequacy. Genome Biol. Evol. 2018, 10, 1375–1388. [Google Scholar] [CrossRef] [PubMed]
  124. Sayyari, E.; Mirarab, S. Fast coalescent-based computation of local branch support from quartet frequencies. Mol. Biol. Evol. 2016, 33, 1654–1668. [Google Scholar] [CrossRef] [PubMed]
  125. Edwards, S.V.; Xi, Z.X.; Janke, A.; Faircloth, B.C.; McCormack, J.E.; Glenn, T.C.; Zhong, B.J.; Wu, S.Y.; Lemmon, E.M.; Lemmon, A.R.; et al. Implementing and testing the multispecies coalescent model: A valuable paradigm for phylogenomics. Mol. Phylogenet. Evol. 2016, 94, 447–462. [Google Scholar] [CrossRef] [PubMed]
  126. Mirarab, S.; Bayzid, M.S.; Warnow, T. Evaluating summary methods for multilocus species tree estimation in the presence of incomplete lineage sorting. Syst. Biol. 2016, 65, 366–380. [Google Scholar] [CrossRef] [PubMed]
  127. Bininda-Emonds, O.R.P.; Gittleman, J.L.; Purvis, A. Building large trees by combining phylogenetic information: A complete phylogeny of the extant Carnivora (Mammalia). Biol. Rev. 1999, 74, 143–175. [Google Scholar] [CrossRef] [PubMed]
  128. Purvis, A. A composite estimate of primate phylogeny. Philos. Trans. R. Soc. Lond. 1995, 348, 405–421. [Google Scholar]
  129. Moles, A.T.; Ackerly, D.D.; Webb, C.O.; Tweddle, J.C.; Dickie, J.B.; Westoby, M. A brief history of seed size. Science 2005, 307, 576–580. [Google Scholar] [CrossRef] [PubMed]
  130. Torices, R. Adding time-calibrated branch lengths to the Asteraceae supertree. J. Syst. Evol. 2010, 48, 271–278. [Google Scholar] [CrossRef]
  131. Webb, C.O.; Ackerly, D.D.; Kembel, S.W. Phylocom: Software for the analysis of phylogenetic community structure and trait evolution. Bioinformatics 2008, 24, 2098–2100. [Google Scholar] [CrossRef]
  132. do Amaral, F.R.; Neves, L.G.; Resende, M.F.R.; Mobili, F.; Miyaki, C.Y.; Pellegrino, K.C.M.; Biondo, C. Ultraconserved elements sequencing as a low-cost source of complete mitochondrial genomes and microsatellite markers in non-model amniotes. PLoS ONE 2015, 10, e0138446. [Google Scholar] [CrossRef]
  133. Barker, F.K.; Oyler-McCance, S.; Tomback, D.F. Blood from a turnip: Tissue origin of low-coverage shotgun sequencing libraries affects recovery of mitogenome sequences. Mitochondrial DNA 2015, 26, 384–388. [Google Scholar] [CrossRef]
  134. Reddy, S. What’s missing from avian global diversification analyses? Mol. Phylogenet. Evol. 2014, 77, 159–165. [Google Scholar] [CrossRef]
  135. Berv, J.S.; Field, D.J. Genomic signature of an avian Lilliput effect across the K-Pg extinction. Syst. Biol. 2018, 67, 1–13. [Google Scholar] [CrossRef] [PubMed]
  136. Ksepka, D.T.; Phillips, M.J. Avian diversification patterns across the K-Pg boundary: Influence of calibrations, datasets, and model misspecification. Ann. Mo. Bot. Gard. 2015, 100, 300–328. [Google Scholar] [CrossRef]
  137. Cooper, A.; Penny, D. Mass survival of birds across the Cretaceous-Tertiary boundary: Molecular evidence. Science 1997, 275, 1109–1113. [Google Scholar] [CrossRef] [PubMed]
  138. Mitchell, K.J.; Cooper, A.; Phillips, M.J. Comment on “Whole-genome analyses resolve early branches in the tree of life of modern birds”. Science 2015, 349, 1460. [Google Scholar] [CrossRef] [PubMed]
  139. Field, D.J.; Bercovici, A.; Berv, J.S.; Dunn, R.; Fastovsky, D.E.; Lyson, T.R.; Vajda, V.; Gauthier, J.A. Early evolution of modern birds structured by global forest collapse at the end-Cretaceous mass extinction. Curr. Biol. 2018, 28, 1825–1831. [Google Scholar] [CrossRef]
  140. Longrich, N.R.; Tokaryk, T.; Field, D.J. Mass extinction of birds at the Cretaceous-Paleogene (K-Pg) boundary. Proc. Natl. Acad. Sci. USA 2011, 108, 15253–15257. [Google Scholar] [CrossRef]
  141. Barrowclough, G.F.; Cracraft, J.; Klicka, J.; Zink, R.M. How many kinds of birds are there and why does it matter? PLoS ONE 2016, 11, e0166307. [Google Scholar] [CrossRef]
  142. Gill, F.B. Species taxonomy of birds: Which null hypothesis? Auk 2014, 131, 150–161. [Google Scholar] [CrossRef]
  143. Matasci, N.; Hung, L.H.; Yan, Z.X.; Carpenter, E.J.; Wickett, N.J.; Mirarab, S.; Nguyen, N.; Warnow, T.; Ayyampalayam, S.; Barker, M.; et al. Data access for the 1,000 plants (1KP) project. Gigascience 2014, 3, 17. [Google Scholar] [CrossRef]
  144. Robinson, G.E.; Hackett, K.J.; Purcell-Miramontes, M.; Brown, S.J.; Evans, J.D.; Goldsmith, M.R.; Lawson, D.; Okamuro, J.; Robertson, H.M.; Schneider, D.J. Creating a buzz about insect genomes. Science 2011, 331, 1386. [Google Scholar] [CrossRef]
  145. Sun, Y.; Huang, Y.; Li, X.F.; Baldwin, C.C.; Zhou, Z.C.; Yan, Z.X.; Crandall, K.A.; Zhang, Y.; Zhao, X.M.; Wang, M.; et al. Fish-T1K (transcriptomes of 1,000 fishes) project: Large-scale transcriptome data for fish evolution studies. Gigascience 2016, 5, 18. [Google Scholar] [CrossRef] [PubMed]
  146. Pennisi, E. Bigger, better bird tree of life will soon fly into view. Available online: https://www.sciencemag.org/news/2018/04/bigger-better-bird-tree-life-will-soon-fly-view/ (accessed on 16 April 2018).
  147. Worthy, T.H.; Hand, S.J.; Archer, M. Phylogenetic relationships of the Australian Oligo-Miocene ratite Emuarius gidju Casuariidae. Integr. Zool. 2014, 9, 148–166. [Google Scholar] [CrossRef] [PubMed]
  148. Woodburne, M.O.; MacFadden, B.J.; Case, J.A.; Springer, M.S.; Pledge, N.S.; Power, J.D.; Woodburne, J.M.; Springer, K.B. Land mammal biostratigraphy and magnetostratigraphy of the Etadunna Formation (late Oligocene) of South Australia. J. Vertebr. Paleontol. 1994, 13, 483–515. [Google Scholar] [CrossRef]
  149. Gradstein, F.; Ogg, J.; Smith, A. A Geologic Time Scale 2004; Cambridge University Press: Cambridge, UK, 2004. [Google Scholar]
  150. Houde, P.W. Paleognathous Birds from the Early Tertiary of the Northern Hemisphere; Nuttall Ornithological Club: Cambridge, UK, 1988; Volume 22. [Google Scholar]
  151. Alvarenga, H.M.F. Uma ave ratite do Paleoceno brasileiro: Bacia calcária de Itaboraí, estado do Rio de Janeiro, Brasil. Bol. Mus. Nac. Rio J. 1983, 41, 1–47. [Google Scholar]
  152. Mayr, G.; Poschmann, M.; Wuttke, M. A nearly complete skeleton of the fossil galliform bird Palaeortyx from the late Oligocene of Germany. Acta Ornithol. 2006, 41, 129–135. [Google Scholar]
  153. Storch, G.; Engesser, B.; Wuttke, M. Oldest fossil record of gliding in rodents. Nature 1996, 379, 439–441. [Google Scholar] [CrossRef]
  154. Mayr, G. The Eocene Juncitarsus—Its phylogenetic position and significance for the evolution and higher-level affinities of flamingos and grebes. C. R. Palevol 2014, 13, 9–18. [Google Scholar] [CrossRef]
  155. Mertz, D.F.; Harms, F.-J.; Gabriel, G.; Felder, M. Arbeitstreffen in der Forschungsstation Grube Messel mit neuen Ergebnissen aus der Messel-Forschung. Nat. Und Mus. 2004, 134, 289–290. [Google Scholar]
  156. Franzen, J.F. The implications of the numerical dating of the Messel fossil deposit (Eocene, Germany) for mammalian biochronology. Ann. Paléontol. 2005, 91, 329–335. [Google Scholar] [CrossRef]
  157. Olson, S.L. A lower Eocene frigatebird from the Green River formation of Wyoming (Pelecaniformes, Fregatidae). Smithson. Contrib. Paleontol. 1977, 35, 1–33. [Google Scholar] [CrossRef]
  158. Mayr, G. The Palaeogene Old World potoo Paraprefica Mayr, 1999 (Aves, Nyctibiidae): Its osteology and affinities to the New World Preficinae Olson, 1987. J. Syst. Palaeontol. 2005, 3, 359–370. [Google Scholar] [CrossRef]
  159. Smith, M.E.; Chamberlain, K.R.; Singer, B.S.; Carroll, A.R. Eocene clocks agree: Coeval Ar-40/Ar-39, U-Pb, and astronomical ages from the Green River formation. Geology 2010, 38, 527–530. [Google Scholar] [CrossRef]
  160. Ksepka, D.T.; Clarke, J.A.; Nesbitt, S.J.; Kulp, F.B.; Grande, L. Fossil evidence of wing shape in a stem relative of swifts and hummingbirds (Aves, Pan-Apodiformes). Proc. R. Soc. B Biol. Sci. 2013, 280, 20130580. [Google Scholar] [CrossRef] [PubMed]
  161. Mayr, G. A new Eocene swift-like bird with a peculiar feathering. Ibis 2003, 145, 382–391. [Google Scholar] [CrossRef]
  162. Mayr, G. A new cypselomorph bird from the middle Eocene of Germany and the early diversification of avian aerial insectivores. Condor 2005, 107, 342–352. [Google Scholar] [CrossRef]
  163. Thiede, J.; Nielsen, O.B.; Perch-Nielsen, K. Lithofacies, mineralogy and biostratigraphy of Eocene sediments in northern Denmark (Deep test Viborg 1). Neues Jahrb. Geol. Paläontol. Abh. 1980, 160, 149–172. [Google Scholar]
  164. Mlikovsky, J. Tertiary avian localities of Denmark. Acta Univ. Carol. Geol. 1996, 39, 559–562. [Google Scholar]
  165. Gradstein, F.M.; Ogg, J.G.; Hilgen, F.J. On the geologic time scale. Newsl. Stratigr. 2012, 45, 171–188. [Google Scholar] [CrossRef]
  166. Mayr, G. Phylogenetic relationships of the early tertiary Messel rails (Aves, Messelornithidae). Senckenberg. Lethaea 2004, 84, 317–322. [Google Scholar] [CrossRef]
  167. Bertelli, S.; Chiappe, L.M.; Mayr, G. A new Messel rail from the early Eocene Fur Formation of Denmark (Aves, Messelornithidae). J. Syst. Palaeontol. 2011, 9, 551–562. [Google Scholar] [CrossRef]
  168. Musser, G.; Ksepka, D.T.; Field, D.J. New material of Palaeocene-Eocene Pellornis (Aves: Gruiformes) clarifies pattern and timing of the extant gruiform radiation. Diversity 2019, 11, 102. [Google Scholar] [CrossRef]
  169. Chambers, L.; Pringle, M.; Fitton, G.; Larsen, L.M.; Pedersen, A.K.; Parrish, R. Recalibration of the Palaeocene-Eocene boundary (P-E) using high precision U-Pb and Ar-Ar isotopic dating. In Proceedings of the EGS-AGU-EUG Joint Assembly, Abstracts from the meeting, Nice, France, 6–11 April 2003.
  170. Feduccia, A.; Voorhies, M.R. Crowned cranes (Gruidae: Balearica) in the Miocene of Nebraska. Nat. Hist. Mus. Los Angel. Cty. Sci. Ser. 1992, 36, 239–248. [Google Scholar]
  171. Mayr, G. A survey of casques, frontal humps, and other extravagant bony cranial protuberances in birds. Zoomorphology 2018, 137, 457–472. [Google Scholar] [CrossRef]
  172. Boellstorf, J. Chronology of some late Cenozoic deposits from the central United States and the ice ages. Trans. Neb. Acad. Sci. 1978, 6, 35–49. [Google Scholar]
  173. De Pietri, V.L.; Costeur, L.; Guntert, M.; Mayr, G. A revision of the Lari (Aves: Charadriiformes) from the early Miocene of Saint-Gérand-le-Puy (Allier, France). J. Vertebr. Paleontol. 2011, 31, 812–828. [Google Scholar] [CrossRef]
  174. Smith, N.A. Sixteen vetted fossil calibrations for divergence dating of Charadriiformes (Aves, Neognathae). Palaeontol. Electron. 2015, 18, 1–18. [Google Scholar] [CrossRef]
  175. Mayr, G. Charadriiform birds from the early Oligocene of Cereste (France) and the middle Eocene of Messel (Hessen, Germany). Geobios 2000, 33, 625–636. [Google Scholar] [CrossRef]
  176. Bourdon, E.; Mourer-Chauvire, C.; Amaghzaz, M.; Bouya, B. New specimens of Lithoptila abdounensis (Aves, Prophaethontidae) from the lower Paleogene of Morocco. J. Vertebr. Paleontol. 2008, 28, 751–761. [Google Scholar] [CrossRef]
  177. Smith, N.D. Phylogenetic analysis of Pelecaniformes (Aves) based on osteological data: Implications for waterbird phylogeny and fossil calibration studies. PLoS ONE 2010, 5, e13354. [Google Scholar] [CrossRef]
  178. Mayr, G.; Scofield, R.P. New avian remains from the Paleocene of New Zealand: The first early Cenozoic Phaethontiformes (tropicbirds) from the Southern Hemisphere. J. Vertebr. Paleontol. 2016, 36, e1031343. [Google Scholar] [CrossRef]
  179. Mayr, G. A new skeleton of the late Oligocene “Enspel cormorant” from Oligocorax to Borvocarbo, and back again. Palaeobiodivers. Palaeoenviron. 2015, 95, 87–101. [Google Scholar] [CrossRef]
  180. Slack, K.E.; Jones, C.M.; Ando, T.; Harrison, G.L.; Fordyce, R.E.; Arnason, U.; Penny, D. Early penguin fossils, plus mitochondrial genomes, calibrate avian evolution. Mol. Biol. Evol. 2006, 23, 1144–1155. [Google Scholar] [CrossRef] [PubMed]
  181. Ksepka, D.; Bertelli, S.; Giannini, N.P. The phylogeny of living and fossil Sphenisciformes (penguins). Cladistics 2006, 22, 412–441. [Google Scholar] [CrossRef]
  182. Cooper, R.A. The New Zealand Geological Timescale; Institute of Geological and Nuclear Sciences: Lower Hutt, New Zealand, 2004. [Google Scholar]
  183. Ogg, J.G.; Ogg, G.; Gradstein, F.M. The Concise Geological Time Scale; Cambridge University Press: Cambridge, UK, 2008. [Google Scholar]
  184. Smith, M.E.; Singer, B.S.; Carroll, A.R.; Fournelle, J.H. Precise dating of biotite in distal volcanic ash: Isolating subtle alteration using 40Ar/39Ar laser incremental heating and electron microprobe techniques. Am. Mineral. 2008, 93, 784–795. [Google Scholar] [CrossRef]
  185. Stidham, T.A. A new species of Limnofregata (Pelecaniformes: Fregatidae) from the early Eocene Wasatch Formation of Wyoming: Implications for palaeoecology and palaeobiology. Palaeontology 2015, 58, 239–249. [Google Scholar] [CrossRef]
  186. Mayr, G. The world’s smallest owl, the earliest unambiguous charadriiform bird, and other avian remains from the early Eocene Nanjemoy Formation of Virginia (USA). Palaeontol. Z. 2016, 90, 747–763. [Google Scholar] [CrossRef]
  187. Acosta Hospitaleche, C.; Tambussi, C.; Donato, M.; Cozzuol, M. A new Miocene penguin from Patagonia and its phylogenetic relationships. Acta Palaeontol. Pol. 2007, 52, 299–314. [Google Scholar]
  188. Ksepka, D.T.; Clarke, J.A. The basal penguin (Aves: Sphenisciformes) Perudyptes devriesi and a phylogenetic evaluation of the penguin fossil record. Bull. Am. Mus. Nat. Hist. 2010, 337, 1–77. [Google Scholar] [CrossRef]
  189. Chavez Hoffmeister, M. Phylogenetic characters in the humerus and tarsometatarsus of penguins. Pol. Polar Res. 2014, 35, 469–496. [Google Scholar] [CrossRef]
  190. Chavez Hoffmeister, M.; Briceno, J.D.C.; Nielsen, S.N. The evolution of seabirds in the Humboldt Current: New clues from the Pliocene of central Chile. PLoS ONE 2014, 9, e90043. [Google Scholar] [CrossRef]
  191. Degrange, F.J.; Ksepka, D.T.; Tambussi, C.P. Redescription of the oldest crown clade penguin: Cranial osteology, jaw myology, neuroanatomy, and phylogenetic affinities of Madrynornis mirandus. J. Vertebr. Paleontol. 2018, 38, e1445636. [Google Scholar] [CrossRef]
  192. Scasso, R.A.; McArthur, J.M.; del Rio, C.J.; Martinez, S.; Thirlwall, M.F. 87Sr/86Sr late Miocene age of fossil molluscs in the ‘Entrerriense’ of the Valdés Peninsula (Chubut, Argentina). J. S. Am. Earth Sci. 2001, 14, 319–329. [Google Scholar] [CrossRef]
  193. Mayr, G. Phylogenetic relationships of the paraphyletic ‘caprimulgiform’ birds (nightjars and allies). J. Zool. Syst. Evol. Res. 2010, 48, 126–137. [Google Scholar] [CrossRef]
  194. Mayr, G.; Bertelli, S. A record of Rhynchaeites (Aves, Threskiornithidae) from the early Eocene Fur Formation of Denmark, and the affinities of the alleged parrot Mopsitta. Palaeobiodivers. Palaeoenviron. 2011, 91, 229–236. [Google Scholar] [CrossRef]
  195. Field, D.J.; Hsiang, A.Y. A North American stem turaco, and the complex biogeographic history of modern birds. BMC Evol. Biol. 2018, 18. [Google Scholar] [CrossRef] [PubMed]
  196. Ksepka, D.T.; Stidham, T.A.; Williamson, T.E. Early Paleocene landbird supports rapid phylogenetic and morphological diversification of crown birds after the K-Pg mass extinction. Proc. Natl. Acad. Sci. USA 2017, 114, 8047–8052. [Google Scholar] [CrossRef] [PubMed]
  197. Mayr, G. A tiny barbet-like bird from the lower Oligocene of Germany: The smallest species and earliest substantial fossil record of the Pici (woodpeckers and allies). Auk 2005, 122, 1055–1063. [Google Scholar] [CrossRef]
  198. Mayr, G. First fossil skull of a Palaeogene representative of the Pici (woodpeckers and allies) and its evolutionary implications. Ibis 2006, 148, 824–827. [Google Scholar] [CrossRef]
  199. Micklich, N.; Hildebrandt, L. The Frauenweiler clay pit (“Grube Unterfeld”). Kaupia Darmstädter Beiträge Nat. 2005, 14, 113–118. [Google Scholar]
  200. Clarke, J.A.; Ksepka, D.T.; Smith, N.A.; Norell, M.A. Combined phylogenetic analysis of a new North American fossil species confirms widespread Eocene distribution for stem rollers (Aves, Coracii). Zool. J. Linn. Soc. 2009, 157, 586–611. [Google Scholar] [CrossRef]
  201. Mayr, G.; Mourer-Chauvire, C.; Weidig, I. Osteology and systematic position of the Eocene Primobucconidae (Aves, Coraciiformes sensu stricto), with first records from Europe. J. Syst. Palaeontol. 2004, 2, 1–12. [Google Scholar] [CrossRef]
  202. Houde, P.; Olson, S.L. Small arboreal nonpasserine birds from the early Tertiary of western North America. In Acta XIX Congressus Internationalis Ornithologici; Ouellet, H., Ed.; University of Ottawa Press: Ottawa, ON, Canada, 1989; pp. 2030–2036. [Google Scholar]
  203. Mayr, G.; Knopf, C.W. A tody (Alcediniformes: Todidae) from the early Oligocene of Germany. Auk 2007, 124, 1294–1304. [Google Scholar] [CrossRef]
  204. Mayr, G. A reassessment of Eocene parrotlike fossils indicates a previously undetected radiation of zygodactyl stem group representatives of passerines (Passeriformes). Zool. Scr. 2015, 44, 587–602. [Google Scholar] [CrossRef]
  205. Mayr, G. Phylogenetic affinities of the enigmatic avian taxon Zygodactylus based on new material from the early Oligocene of France. J. Syst. Palaeontol. 2008, 6, 333–344. [Google Scholar] [CrossRef]
  206. Harrison, C.J.O.; Walker, C.A. Birds of the British Lower Eocene; Tertiary Research Special Paper; BRILL: Leiden, The Netherlands, 1977; pp. 1–52. [Google Scholar]
  207. Kristoffersen, A.V. The Avian Diversity in the Latest Paleocene Earliest Eocene Fur Formation, Denmark: A Synopsis; University of Copenhagen: Copenhagen, Denmark, 2002. [Google Scholar]
  208. Mayr, G. Paleogene Fossil Birds; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
  209. Mayr, G.; Manegold, A. A small suboscine-like passeriform bird from the early Oligocene of France. Condor 2006, 108, 717–720. [Google Scholar] [CrossRef]
  210. Manegold, A. Passerine diversity in the late Oligocene of Germany: Earliest evidence for the sympatric coexistence of suboscines and oscines. Ibis 2008, 150, 377–387. [Google Scholar] [CrossRef]
  211. Mayr, G.; Manegold, A. The oldest European fossil songbird from the early Oligocene of Germany. Naturwissenschaften 2004, 91, 173–177. [Google Scholar] [CrossRef]
  212. Ballmann, P. Die Vögel aus der altburdigalen Spaltenfüllung von Wintershof (West) bei Eichstätt in Bayern. Zitteliana 1969, 1, 5–60. [Google Scholar]
  213. Manegold, A.; Mayr, G.; Mourer-Chauvire, C. Miocene songbirds and the composition of the European passeriform avifauna. Auk 2004, 121, 1155–1160. [Google Scholar] [CrossRef]
Figure 1. Challenges for megaphylogeny construction. (a) Supermatrix analyses often use sparse data matrices. The box shows part of the BigBird data matrix (Burleigh et al. 2015) with taxa plotted on the x-axis and loci on the y-axis. Sampled loci are black and missing data are white. Because the loci overlap imperfectly, some taxa will be placed using only one or two loci, and some sister taxa may not share any loci. (b) Supertree analyses do not require overlapping loci, but specific patterns of taxon overlap among source trees are necessary. The source trees shown are sufficient to yield the parts of the true species tree drawn with solid lines (assuming the source trees are accurate). Problematic source trees are of three types: (1) type I trees have only one taxon present in the other source trees, making it impossible to place the non-overlapping taxa; (2) type II trees include a taxon (J′) that is absent from other source trees but is a close relative of a taxon (J) in other informative source trees, so J and J′ may not emerge as sister taxa; and (3) type III trees (not shown) do not share any taxa with any other source tree. Including additional source trees with appropriate taxa (if they are available) will render type I and type III trees informative. Type II trees often reflect taxonomic changes that split species or cases where different studies use closely related but distinct taxa (e.g., congeners). Enforcing constraints that include a J + J′ clade can solve the problems caused by type II trees.
Figure 2. Taxon sampling for phylogenomic studies. (a) Histogram showing the number of taxa in each source tree. (b) Matrix occupancy graph. Lines indicate taxa present in each source tree. Taxa are sorted taxonomically, first by major clade (bar at the bottom of the graph) and then by order. Most taxa are present in at least one of the backbone trees (gray, top of the graph). In contrast, many phylogenomic source trees (black and red) have limited taxonomic overlap; this is especially true for trees based on whole-genome sequencing data (red). It is also clear that many phylogenomic trees are limited to specific taxonomic groups.
Figure 3. Phylogenomic supertree generated by matrix representation with parsimony (MRP) analysis using all three backbone trees. (a) Large-scale structure with support from the MRP bootstrap analysis. All nodes received 100% bootstrap support except as noted. The line for Apodiformes is dashed to indicate that, assuming the supertree topology is correct, the ordinal circumscriptions in the IOC World Bird List (v. 7.3) nest Apodiformes within a paraphyletic Caprimulgiformes. Superordinal groups (the “magnificent seven”) are numbered following Reddy et al. [39]. (b) Relationships among families within Apodiformes and Caprimulgiformes, emphasizing paraphyly of the latter. A complete supertree with all species labeled is available in Supplementary Materials.
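For readers unfamiliar with matrix representation, the following is a minimal sketch of the standard binary coding that underlies MRP. Each clade in each source tree becomes one character, scored 1 for taxa inside the clade, 0 for taxa in that source tree but outside the clade, and ? for taxa absent from that source tree. The nested-tuple tree format, the function names, and the toy trees are assumptions made for illustration; this is not the software used in the study, and the all-zero outgroup row that MRP analyses usually add is omitted.

```python
def leaf_set(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clades(tree, is_root=True):
    """List the clades defined by internal nodes below the root."""
    if isinstance(tree, str):
        return []
    found = [] if is_root else [leaf_set(tree)]
    for child in tree:
        found.extend(clades(child, is_root=False))
    return found


def mrp_matrix(source_trees):
    """Encode source trees as a taxon-by-character matrix of '0', '1', and '?'."""
    all_taxa = sorted(set().union(*(leaf_set(t) for t in source_trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for tree in source_trees:
        present = leaf_set(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon].append("?")  # taxon missing from this source tree
                elif taxon in clade:
                    rows[taxon].append("1")  # taxon inside the clade
                else:
                    rows[taxon].append("0")  # taxon sampled but outside the clade
    return {taxon: "".join(states) for taxon, states in rows.items()}


# Two toy source trees that share taxa B and D.
trees = [((("A", "B"), "C"), "D"), (("B", "E"), "D")]
matrix = mrp_matrix(trees)
```

The resulting matrix can then be analyzed with parsimony (MRP) or with a binary likelihood model (MRL); the question marks are what allow source trees with different taxon samples to be combined.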
Figure 4. Clustering diagram emphasizing the similarities and differences among supertrees and backbone megaphylogenies. This “tree-of-trees” was generated by clustering Robinson-Foulds distances [98] among trees by neighbor joining. The tree-of-trees emphasizes the clustering of supertrees by method (i.e., there are three major groups: matrix representation with parsimony [MRP], matrix representation with likelihood [MRL], and extended majority rule consensus of bootstrap supertrees) and the distance between the supertrees and the backbone trees. The MRP tree for all backbones is emphasized because much of our discussion uses this tree; all trees are available in Supplementary Materials.
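The distance underlying this tree-of-trees can be sketched as follows. The example computes a clade-based (rooted) Robinson-Foulds distance, i.e., the number of clades found in one tree but not the other, for trees on the same taxon set, and fills a pairwise distance matrix. The nested-tuple tree format, the function names, and the toy trees are assumptions for illustration only; the unrooted bipartition-based variant and the neighbor-joining step are not implemented here, and the resulting matrix would need to be passed to a separate neighbor-joining implementation.

```python
def leaf_set(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clade_set(tree, is_root=True):
    """Return the set of clades defined by internal nodes below the root."""
    if isinstance(tree, str):
        return set()
    found = set() if is_root else {leaf_set(tree)}
    for child in tree:
        found |= clade_set(child, is_root=False)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: size of the symmetric difference of clade sets."""
    return len(clade_set(tree_a) ^ clade_set(tree_b))


def rf_matrix(trees):
    """Pairwise distance matrix, suitable as input to a clustering method."""
    return [[rf_distance(a, b) for b in trees] for a in trees]


# Three toy trees on the same five taxa.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
t3 = (("A", "B"), ("C", ("D", "E")))
distances = rf_matrix([t1, t2, t3])
```

Here `t1` differs from each of the other two trees by one clade on each side (distance 2), whereas `t2` and `t3` differ by two clades on each side (distance 4), so a clustering of these distances would place `t1` between the other two.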
Figure 5. Estimates of branch lengths for the matrix representation with parsimony (MRP) phylogenomic supertree. Branch lengths reflect analysis of the mitochondrial genes cytochrome b and NADH dehydrogenase subunit 2 using the GTR+I+Γ model and six partitions (corresponding to the three codon positions within each gene). (a) Unlabeled tree based on a reduced data matrix (taxa with >25% missing data removed) with silhouettes to indicate major lineages. Colors are identical to those in Figure 3; a key to the color scheme is also available in Supplementary Materials. To emphasize the long branches for Turnicidae (hemipodes, also known as buttonquails) and Tinamiformes (tinamous), we include labeled subtrees for Charadriiformes (b) and Palaeognathae (c) extracted from the tree based on all data. A comparison of the Charadriiformes and Palaeognathae subtrees to similar trees based on nuclear sequence data is available in Supplementary Materials.
Figure 6. Timetree generated by penalized likelihood (as implemented in treePL), using the matrix representation with parsimony (MRP) phylogenomic supertree with branch lengths based on mitochondrial data. The timescale is presented below the tree (the Quaternary period is omitted). Colors are identical to those in Figure 3.
Table 1. Phylogenomic studies of birds considered for use in this supertree 1.
| Study | Focal Group | # of Species | Used as Source Tree? | Loci Targeted 2 |
|---|---|---|---|---|
| Faircloth et al. [55] | NEORNITHES | 9 | | 2.5K UCE probe set |
| McCormack et al. [36] | NEOAVES | 33 | YES | 2.5K UCE probe set |
| Baker et al. [56] | PALAEOGNATHAE | 7 | | Subset of Faircloth et al. [55] loci |
| Jarvis et al. [37] | NEORNITHES | 48 | YES | Whole genomes |
| Sun et al. [40] | Phasianidae (peafowl) | 15 | | 5k UCE probe set |
| Prum et al. [38] | NEORNITHES | 197 | YES | AHE probe set |
| Bryson et al. [57] | Passerellidae | 30 | YES | 5k UCE probe set |
| Hosner et al. [58] | Cracidae | 23 | | 5k UCE probe set |
| Hosner et al. [21] | Phasianidae | 90 | | 5k UCE probe set |
| Manthey et al. [59] | Piranga | 11 | YES | 5k UCE probe set |
| McCormack et al. [22] | Aphelocoma | 1 (3) | | 5k UCE probe set |
| Meiklejohn et al. [60] | Phasianidae (gallopheasants) | 18 | | 5k UCE probe set |
| Ottenburghs et al. [61] | Anatidae–Anserini | 19 | YES | Whole genomes |
| Persons et al. [62] | Phasianidae (grouse) | 11 | | 5k UCE probe set |
| Zarza et al. [63] | Aphelocoma | 3 | YES | 5k UCE probe set |
| Burga et al. [64] | Phalacrocorax | 7 | YES | Whole genomes |
| Hosner et al. [42] | Phasianidae | 115 | YES | 5k UCE probe set |
| Reddy et al. [39] | NEORNITHES | 235 | YES | legacy with data mining |
| Wang et al. [65] | Phasianidae | 20 | YES | 5k UCE probe set |
| White et al. [66] | Nyctibiidae | 12 | YES | 5k UCE probe set |
| Yonezawa et al. [67] | PALAEOGNATHAE | | YES | legacy with data mining |
| Andersen et al. [68] | Alcedinidae | 21 | YES | 5k UCE probe set |
| Bruxaux et al. [69] | Goura | 6 | YES | Subset of UCE and AHE loci |
| Campillo et al. [70] | Arachnothera | 17 | YES | 5k UCE probe set |
| Chen et al. [71] | Phasianidae | 27 | YES | 5k UCE probe set |
| Musher & Cracraft [72] | Pachyramphus | 18 | YES | 2.5K/5k UCE probe set |
| Smith et al. 2018 [73] | Psittaculidae–Loriini | 54 | YES | 5k UCE probe set |
| Younger et al. [74] | Newtonia | 4 | YES | 5k UCE probe set |
| Sackton et al. [75] | PALAEOGNATHAE | 15 | YES | Whole genomes |
1 Redundant trees were omitted. 2 UCE (Ultraconserved Element) probe sets are described at https://www.ultraconserved.org/. The AHE (Anchored Hybrid Enrichment) probe set was used by Prum et al. (2015).
Table 2. Supertree resolution given different backbone trees 1.
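| Method | Backbone: BigBird | Backbone: Brown | Backbone: Jetz | Resolved Branches | % Branches Collapsed |
|---|---|---|---|---|---|
| MRP | + | + | + | 696 | 1.28% |
| MRP | + | | | 642 | 8.94% |
| MRP | | + | | 687 | 2.55% |
| MRP | | | + | 698 | 0.99% |
| MRP | + | + | | 689 | 2.27% |
| MRP | + | | + | 691 | 1.99% |
| MRP | | + | + | 694 | 1.56% |
| MRL | + | + | + | 704 | 0.14% |
| MRL | + | | | 690 | 2.13% |
| MRL | | + | | 698 | 0.99% |
| MRL | | | + | 704 | 0.14% |
| MRL | + | + | | 703 | 0.28% |
| MRL | + | | + | 705 | 0.00% |
| MRL | | + | + | 703 | 0.28% |
| MRP bootstrap | + | + | + | 703 | 0.28% |
| MRL bootstrap | + | + | + | 705 | 0.00% |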
1 Continued on the next page.
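The two numeric columns of Table 2 appear to be related by simple arithmetic, which the sketch below makes explicit. It assumes that a fully resolved rooted tree of the 707 species in the supertree has 707 − 2 = 705 internal branches, which matches the rows reporting 705 resolved branches and 0.00% collapsed. That denominator is an inference from the table values rather than something stated in the text, and the function name `percent_collapsed` is hypothetical.

```python
def percent_collapsed(resolved_branches, n_taxa=707):
    """Percentage of internal branches collapsed relative to a fully resolved tree.

    Assumes a rooted binary tree with n_taxa tips has n_taxa - 2 internal
    branches (an inference from Table 2, not a value given in the text).
    """
    max_internal = n_taxa - 2
    return 100.0 * (max_internal - resolved_branches) / max_internal


# Values taken from the "Resolved Branches" column of Table 2.
mrp_all_backbones = round(percent_collapsed(696), 2)
mrp_bigbird_only = round(percent_collapsed(642), 2)
mrl_bigbird_jetz = round(percent_collapsed(705), 2)
```

Under this assumption, 696 resolved branches corresponds to 9 collapsed branches out of 705, which agrees with the percentage reported for the MRP supertree built from all three backbones.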

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Diversity, EISSN 1424-2818. Published by MDPI AG, Basel, Switzerland.