Tropical rain forests present unique challenges to identifying plant species. They are characteristically diverse, with an area as small as two hectares potentially containing more than 300 vascular woody species of trees, shrubs, and lianas [1]. Many of these species are extremely rare and/or poorly known [2], and their most diagnostic characters for identification (e.g., leaves, fruits, and flowers) often occur high in the canopy, out of sight and reach. In many cases fruits and/or flowers are required for accurate identification, which can delay the identification of tropical species in remote localities for years or even decades. This combination of factors makes the identification of species in tropical forests characteristically slow and dependent on trained experts in taxonomy.
Taxonomy, the scientific discipline of identifying and assigning names to species, may be one of the world’s oldest professions, yet today it is undoubtedly an uncommon trade. Although there has been much recent discussion about the discipline of taxonomy being in a state of decline [3], a recent paper [4] suggested the opposite is true. Between 1864 and 2010, the number of authors describing new species, the number of articles describing new species, and the total number of new species described in the zoological record all increased. This may be attributed to the development of new research methods and approaches, or to a changing “publication landscape” as science publications have become more interdisciplinary in nature. This trend, together with the rise of the international DNA barcoding initiative, is shifting the ways new species are discovered and expanding the traditional horizons of the discipline of taxonomy. The International Barcode of Life Project (iBOL) was established in 2009 and uses standardized portions of a species’ genome to identify species, with the aim of aiding traditional taxonomy and species identification. Species can now be identified from plant leaf fragments and cambium [5], roots [6], herbal medicine preparations bought in stores [7], meat bought at the market [8], fecal remains of animals [9], and even digested plant tissues inside the guts of insects [10]. These new and innovative methods hold much promise for addressing some of the challenges of identifying plant species in the tropics and are expected to accelerate the discovery of new species.
The primary mission of iBOL is to provide a platform for the construction of a global DNA barcode reference library of species, the Barcode of Life Data Systems (BOLD) [11], and to promote increased geographic and taxonomic coverage of the species currently represented. BOLD is an online resource developed by the Canadian Centre for DNA Barcoding (CCDB), which stores the DNA barcode records, all supporting trace file sequence data, and the data on the voucher collections for each DNA sample. BOLD also enables global community access to the data with online tools for visualization, species validation, and analysis. It is now a central data repository and informatics hub for DNA barcoding projects worldwide.
This project aims to construct a DNA barcode library for the tropical flora of Australia in collaboration with iBOL and the CCDB. As a starting point we present analyses of 1572 DNA barcodes from 848 species from Queensland. In addition to providing a broad coverage of the species that occur in the region, the project sampled multiple individuals per species for approximately half of the sampled species (473) to investigate the occurrence of infraspecific molecular variation in DNA barcode loci.
The Australian state of Queensland is the world’s sixth largest sub-national political region, spanning more than 1,850,000 km². Over half of Queensland lies in the southern tropics. A great diversity of landscapes and climates has allowed a large number of plant species to evolve. The 2014 Census of the Queensland Flora [12] lists 14,174 native plant species, making Queensland the most species-rich state of Australia, and the state contains three of the 12 major Australian centres of plant endemism [13]. This project focuses on plants occurring in the tropical northern part of the state, specifically within the Wet Tropics Bioregion and the Iron Range-McIlwraith Range region of Cape York Peninsula. However, it also includes some species with ranges that extend outside this area into Queensland’s western monsoonal and arid regions and the southern subtropical zone (Figure 1).
The global importance of the Wet Tropics Bioregion for biodiversity is well recognized and most of the region is included within the Queensland Wet Tropics World Heritage Area [14]. The region is considered one of the best-preserved living museums, containing assemblages of species representing multiple different eras of the Earth’s evolutionary history, including lineages of relict and recently radiated origins. Thus progress towards a complete DNA barcode library for this bioregion may be considered a priority and a valuable asset to Australia and to the world.
The current DNA barcode library comprises three plastid loci (rbcLa, matK, and the trnH-psbA intergenic spacer region) and is hosted on the BOLD online database. The research was initiated through a project to generate DNA barcodes for 500 Australian tropical tree species. Additional species were subsequently added through contributions from postgraduate students and research collaborations with the Australian Tropical Herbarium. This project lays the foundation for a more complete plant DNA barcode library for the Australian tropical flora, to be completed over time. It is expected that the compilation of this genetic resource will accelerate both academic research in the region and applied uses of DNA barcode data in fields such as quarantine, forensics, ecological restoration, climate change impact assessment, and citizen science.
DNA samples were obtained from a combination of freshly collected leaves dried and stored on silica gel and herbarium specimens. For most species, fresh leaf material was obtained from field research plots, biodiversity survey expeditions, and local arboreta. Multiple samples (two to six) were collected for 473 species. For each collection, one leaf was stored on silica gel. For species that could not be located in the field, herbarium specimens from the Australian Tropical Herbarium were destructively sampled. Specimens no more than 20 years of age were selected, with preference given to specimens collected within the last ten years. Leaf tissue fragments were loaded into 96-well plates using sterilized forceps and sent to the Canadian Centre for DNA Barcoding (CCDB) for DNA extraction following their protocols [15]. All DNA samples are vouchered by herbarium specimens held at the Australian Tropical Herbarium (CNS). Voucher specimen data were submitted to BOLD via their standard submission template, following the standard BOLD data submission protocol for specimen data entry [16]. A total of 773 specimens in the dataset include photographs of the voucher specimens. Specimen data were linked to each sequence via the BOLD web platform under the project folder titled Barcoding Australia’s Tropical Flora (BATF). PCR amplifications for the two official DNA barcode loci, rbcLa and matK, plus a third, trnH-psbA, which is popular for its high PCR amplification success, were conducted jointly by the CCDB and the Australian Tropical Herbarium following the CCDB amplification protocols for plants and fungi [17], and sequencing was conducted at the CCDB following their protocols [18]. DNA samples of all specimens are held at the Australian Tropical Herbarium, with aliquots of some species also held at the CCDB and the Smithsonian Institution’s National Museum of Natural History. Aliquots are available to the scientific community on request.
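The herbarium sampling rule described above (specimens no more than 20 years old, with preference for those collected within the last ten years) can be illustrated with a small filter. This is only a sketch under stated assumptions: the function name, the voucher identifiers, and the choice to satisfy the preference by simply taking the most recently collected eligible specimen are hypothetical and are not taken from the project's actual workflow.

```python
from datetime import date

MAX_AGE_YEARS = 20  # upper age limit stated in the text


def pick_herbarium_specimen(specimens, today):
    """Return the voucher id of the preferred specimen, or None.

    specimens: list of (voucher_id, collection_date) tuples (hypothetical layout).
    A specimen is eligible if it is no more than MAX_AGE_YEARS old. Among
    eligible specimens the most recently collected one is returned, which is
    one simple way (an assumption) to favour the last ten years.
    """
    def age_years(collected):
        return (today - collected).days / 365.25

    eligible = [(vid, d) for vid, d in specimens if age_years(d) <= MAX_AGE_YEARS]
    if not eligible:
        return None
    return max(eligible, key=lambda item: item[1])[0]


# Hypothetical voucher ids and dates, for illustration only.
candidates = [
    ("V1", date(1990, 5, 1)),
    ("V2", date(2000, 3, 1)),
    ("V3", date(2008, 6, 1)),
]
chosen = pick_herbarium_specimen(candidates, today=date(2015, 1, 1))
```

Here `chosen` is the youngest eligible voucher; a species whose specimens are all older than 20 years yields `None`.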
The accuracy of the DNA barcodes was assessed using the BOLD taxon ID tree function, neighbor-joining analyses, and BLAST searches against GenBank. Species that displayed molecular variation among samples were further investigated. These included: (A) species that showed discordant results between the DNA sequence-derived phylogeny and morphologically recognized species; and (B) species that showed variation within species but no discordance with other closely related species.
For species that fell into category (A), voucher specimens were carefully checked, and the raw sequence data (trace and contig files) re-examined. This process discovered a small number of vouchers that were incorrectly identified, and resolved some instances of incorrect base calls. For species that fell into category (B), the raw sequence data (trace and contig files) were re-examined. Species showing confirmed infraspecific variation were then investigated further to identify any geographic and/or ecological patterns.
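The two categories of variation described above can be approximated in code. The following is a minimal sketch, not the procedure used in this study, which relied on BOLD taxon ID trees, neighbor-joining analyses, and BLAST searches. It assumes pre-aligned sequences of equal length and uses a simple pairwise-distance heuristic: a species with variation among its samples is flagged as category (A) if any of its samples is at least as close to another species as its two most divergent conspecific samples are to each other, and as category (B) otherwise. All function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Only sites where both sequences have an unambiguous base (A, C, G, T)
    are compared. Returns 0.0 if no sites are comparable (a simplification).
    """
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def classify_species(records):
    """Classify species sampled more than once.

    records: list of (species_name, aligned_sequence) tuples.
    Returns a dict mapping species to "A", "B", or "none"
    ("none" = no variation detected among samples).
    """
    by_species = {}
    for species, seq in records:
        by_species.setdefault(species, []).append(seq)

    result = {}
    for species, seqs in by_species.items():
        if len(seqs) < 2:
            continue  # a single sample cannot show infraspecific variation
        max_intra = max(p_distance(a, b) for a, b in combinations(seqs, 2))
        if max_intra == 0:
            result[species] = "none"
            continue
        others = [s for sp, ss in by_species.items() if sp != species for s in ss]
        min_inter = min((p_distance(a, b) for a in seqs for b in others),
                        default=float("inf"))
        # Heuristic stand-in for tree discordance: no gap between the
        # within-species and between-species distances.
        result[species] = "A" if min_inter <= max_intra else "B"
    return result


# Toy aligned sequences, for illustration only.
toy = [
    ("sp1", "ACGTACGT"), ("sp1", "ACGTACGT"),
    ("sp2", "ACGTACGT"), ("sp2", "ACGTTCGT"),
    ("sp3", "TTTTACGT"), ("sp3", "TTTTACGA"),
    ("sp4", "GGGGGGGG"),
]
categories = classify_species(toy)
```

In this toy data, `sp2` shares a haplotype with `sp1` and so lands in category (A), `sp3` varies internally but stays distinct from all other species and lands in category (B), and the singly sampled `sp4` is not classified. Species flagged either way would still need the voucher and trace-file checks described above.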