Bioassessment of a Drinking Water Reservoir Using Plankton : High Throughput Sequencing vs . Traditional Morphological Method

Drinking water safety is increasingly perceived as one of the top global environmental issues. Plankton has been commonly used as a bioindicator for water quality in lakes and reservoirs. Recently, DNA sequencing technology has been applied to bioassessment. In this study, we compared the effectiveness of the 16S and 18S rRNA high throughput sequencing method (HTS) and the traditional optical microscopy method (TOM) in the bioassessment of drinking water quality. Five stations reflecting different habitats and hydrological conditions in Danjiangkou Reservoir, one of the largest drinking water reservoirs in Asia, were sampled May 2016. Non-metric multi-dimensional scaling (NMDS) analysis showed that plankton assemblages varied among the stations and the spatial patterns revealed by the two methods were consistent. The correlation between TOM and HTS in a symmetric Procrustes analysis was 0.61, revealing overall good concordance between the two methods. Procrustes analysis also showed that site-specific differences between the two methods varied among the stations. Station Heijizui (H), a site heavily influenced by two tributaries, had the largest difference while station Qushou (Q), a confluence site close to the outlet dam, had the smallest difference between the two methods. Our results show that DNA sequencing has the potential to provide consistent identification of taxa, and reliable bioassessment in a long-term biomonitoring and assessment program for drinking water reservoirs.


Introduction
Clean freshwater resources are becoming increasingly scarce globally [1][2][3][4].Due to climate change, economic development and population growth, approximately four billion persons of the world's population are facing severe water scarcity, with nearly half of them living in India and China [1,5].Drinking water security may substantially hamper the sustainable development of humanity, particularly in developing countries.To better protect and manage diminishing freshwater resources, researchers have developed and tested multiple biological indices for assessing water quality and the ecological integrity of aquatic ecosystems since biological assemblages provide a direct measure of the aquatic ecosystems' conditions [6].Multimetric indices of fish, macroinvertebrates and periphyton have been effectively used to assess surface water quality in the USA [7].For example, in an integrated biosurvey as a tool for evaluation of aquatic life use attainment and impairment in Ohio surface waters, Yoder (1991) [8] reported that biological monitoring and assessment indicated that approximately 50% of 645 Ohio stream/river segments were impaired while chemical monitoring and assessment showed no signs of impairment.Biological assessment has become an integral part of water resource protection and management.
One of the key components for a successful bioassessment program is to accurately characterize the composition of biological assemblages.Traditional bioassessment methods rely on specialized analysts to identify taxonomic groups, a time-consuming process.The quality of taxonomic analysis largely depends on analysts training, experience, and interpretation of the taxonomic literature.For instance, several studies on inter-analysts comparison of the diatom identification showed that inconsistency among independent analysts can contribute uncertainty to the bioassessment [9,10].The inconsistency among the analysts may be more problematic for long-term monitoring programs, particularly with climate change.With the development of DNA sequencing technology [11][12][13][14], DNA sequencing has been applied to bioassessment, particularly in freshwater [15][16][17][18][19][20][21].High throughput sequencing (HTS) was developed to characterize biological assemblages in environmental samples.The method is faster in terms of sampling processing, and may become cheaper as the technology improves, and more importantly with the ability to provide more reliable and richer biological information than the traditional morphology-based method [22][23][24].Different plankton assemblages as bioindicators were characterized using DNA sequencing including bacterioplankton [18,[24][25][26], and phytoplankton [27][28][29][30][31][32][33].Baird and Hajibabaei (2012) [17] envisioned a new paradigm (i.e., Biomonitoring 2.0) in ecosystem assessment based on a HTS platform, though a complete paradigm shift may require more research.
Several researchers have assessed the effectiveness of water quality assessment using the traditional optic microscopic method and the DNA sequencing method [21,23,34,35].Due to several limitations, such as incomplete taxonomic coverage in DNA reference libraries and biases related to molecular procedures, the two methods may not generate identical compositional data, particularly at the species level.Despite the difference, biotic indices based on the data generated by the two methods were highly consistent.Multiple researchers suggested that DNA sequencing generates a richer amount of information on biotic diversity with consistent and increased taxonomic resolution and thus has a great potential to improve the effectiveness of current bioassessment.Not surprisingly, most of these studies focused on benthic diatoms in streams and rivers, since diatoms are commonly used as bioindicators in lentic ecosystems.The studies that systematically compare the plankton assemblages characterized by traditional optical microscopy (TOM) and HTS sequencing methods in drinking water reservoirs are still limited.
China is facing challenges in both water quality and quantity.With uneven water distribution, the North China Plain is highly water-stressed, while water resources are relatively more abundant in the south [5,36,37].Based on the strategic demands for China's regional sustainable development, the Chinese government has launched the South-North Water Diversion (SNWD) Project to transfer water from the Yangtze River and its tributaries to the more arid and industrialized North China Plain [36][37][38].As the largest drinking water source in China [38], affecting more than 53 million people in Beijing and other receiving-water regions, the water quality in the Danjiangkou Reservoir, source water for the middle route of the SNWD, is required to be in good quality, be stable long-term, be continuously improved, and be able to adapt to further changes such as climate, and hence the establishment of a continuous ecological monitoring database on the reservoir is particularly important.Such a database can provide scientific data support for best and adaptive management practices.
In this study, we compared the effectiveness of the TOM and HTS methods in assessing water quality in the Danjiangkou Reservoir.

Study Area
As one of the largest river impoundments in the Yangtze River basin, the Danjiangkou Reservoir (32 • 36 ~33 • 48 N; 110 • 59 ~111 • 49 E), with a maximum depth of about 80 m, is located at the juncture of Hubei and Henan provinces of central China.Its drainage area includes the upper Hanjiang and Danjiang rivers, with a total area of 95,000 km 2 (Figure 1).The Danjiangkou Reservoir has a variety of functions, such as flood control, electricity generation, irrigation, shipping and drinking water supply [36][37][38].practices.In this study, we compared the effectiveness of the TOM and HTS methods in assessing water quality in the Danjiangkou Reservoir.

Study Area
As one of the largest river impoundments in the Yangtze River basin, the Danjiangkou Reservoir (32°36′~33°48′ N; 110°59′~111°49′ E), with a maximum depth of about 80 m, is located at the juncture of Hubei and Henan provinces of central China.Its drainage area includes the upper Hanjiang and Danjiang rivers, with a total area of 95,000 km 2 (Figure 1).The Danjiangkou Reservoir has a variety of functions, such as flood control, electricity generation, irrigation, shipping and drinking water supply [36][37][38].The reservoir is located in the north sub-tropic monsoon climate region with an annual mean temperature of 15~16 • C and annual precipitation 800~1000 mm, of which 80% is concentrated in the period from May to October.The monthly maximum precipitation is 193.7 mm and the minimum is 31.0mm.Soil types include alluvial soil, lime concretion black soil, yellow brown loam and purple soil.The watershed around the reservoir is dominated by mountainous areas (85%) and forest coverage (76%) [39].

Field Sampling
Based on the previous studies [40,41], we selected the five most representative sampling stations in the reservoir.In each station, water samples were collected at the 0~50 cm below the water surface for HTS analysis of planktonic bacterial and eukaryotic assemblages and TOM analysis of the phytoplankton and physico-chemical variables in May 2016 (rainy season).Qushou (Q) station is 100 m upstream of the water outlet dam of the water conveyance canal of the Middle Route Project of SNWD.Heijizuo (H) station is close to the confluence of the Danjiang and Guan rivers, two major tributaries to the reservoir in Henan province.Songgang (S) station is located in a reservoir bay which was influenced by a shipping dock and tourists.Kuxin (K) station is located in the middle of the reservoir while Taizishan (T) station is in the confluence of the two sub-basins of the reservoir (Figure 1).There were three replicates for each sample.

Morphological Identification of Phytoplankton Assemblages
Phytoplankton samples were preserved immediately with 1% Lugol's solution.The water samples for phytoplankton analysis were stored in a 2 L glass sedimentation utensil.After 48 h sedimentation, each sample was condensed to about 30 mL, and then stored in darkness at 4 • C until analysis.Phytoplankton were identified under a microscope (Nike E200) according to Hu and Wei (2006) [42].Phytoplankton analysis followed the standard method [43].Before counting, each sample was gently mixed.Phytoplankton cells were counted in a Fuchs-Rosental counting chamber at 400 × in Z line microscope fields.Algal biovolumes were calculated from measured morphometric characteristics (diameter, length and width).Conversion to biomass was based on 1 mm 3 of algal volume being equivalent to 1 mg of fresh weight biomass.A minimum of 500 individual units were counted with a counting error of less than 10%.Three subsamples were analyzed for each sample.

DNA Extraction and Sequencing
Samples were first filtered through a 0.45 µm diameter filter for planktonic eukaryote sequencing and the filtrates were then collected through a 0.22 µm diameter filter for bacterioplankton sequencing.Plankton genomic DNA in the stored filters was extracted using the E.Z.N.A. ® Water DNA Kit (OMEGA bio-tek, Norcross, GA, USA), following the manufacturer's instructions.Electrophoresis and Nano Drop ND 2000 (Thermo Scientific, Pittsburgh, PA, USA) were used to estimate the quantity of extracted DNA.The V3-V4 region of bacterial 16S rRNA gene was amplified with 338F and 806R [18,46] and the V5-V7 region of eukaryote 18S rRNA gene was amplified with 0817F and 1196R with sample-identifying barcodes [47].High throughput sequencing (Illumina MiSeq PE300 platform, Illumina, Inc., CA, USA) was performed by Shanghai Majorbio Bio-Pharm Technology Co., Ltd.(Shanghai, China).
In both cases, the MiSeq sequencing data were processed using the QIIME Pipeline [11].The operational taxonomic units (OTUs) were determined (at 97% similarity level) using USEARCH.The OTU number of each sample was used to represent taxa richness.Rarefaction curves and a Shannon-Wiener index (H') were generated, and the ACE, Shannon, Simpson, and Chao1 estimators were calculated to compare the richness and diversity of plankton.Taxonomic classification at the phylum and genus levels was performed using the ribosome database project (RDP) algorithm.

Physico-Chemical Variables
Physico-chemical variables were measured using standard methods [43].Water temperature (T), pH, electric conductivity (EC), and dissolved oxygen (DO) were measured in situ using a YSI 6920 (YSI Inc., Yellow Springs, OH, USA).Secchi depth (SD) was determined with a 30 cm diameter Secchi disk.Water samples for chemical analysis were transported to the laboratory within 24 h, stored in a refrigerator at 4 • C, and analyzed within one week after sample collection.
Permanganate index (COD Mn ) was calculated using the potassium permanganate index method and chemical oxygen demand (COD) was measured by the potassium dichromate method.Total phosphorus (TP) was determined with acidified molybdate to form reduced phosphor-molybdenum blue and measured spectro-photometrically.Total nitrogen (TN) was assayed with alkaline persulfate digestion and UV spectro-photometry, and ammonia nitrogen (NH 4 + -N) was measured with Nessler's reagent spectro-photometric method.Chlorophyll a (Chla) concentration was estimated spectro-photometrically after extraction in 90% ethanol [43].
The evaluation for the water trophic state was based on the five variables of TN, TP, COD Mn , SD and Chla [48].Using the linear interpolation method, we converted the value of each environmental variable into the assigned value (E), and then used the arithmetic mean of each assigned value as the trophic state index (EI).EI values between 60~100 indicate hyper-eutrophic conditions, between 50~60 mildly eutrophic conditions and between 20~50 moderately eutrophic conditions [48].

Data Analysis
Datasets between HTS and TOM were compared at the phylum/genus level.All genera belonging to the concept of phytoplankton, including Cyanobacteria (from the 16S sequencing dataset), Bacillariophyta, Chlorophyta, and other groups of eukaryotic phytoplankton (from the 18S sequencing dataset) were determined from the cleaned HTS datasets.The numbers of OTUs were compared with their corresponding number of phytoplankton species (or genera) found by TOM.
The per-mutational multivariate analysis of variance (PERMANOVA) and analysis of similarity (ANOSIM) were performed using R software with the "vegan" package to assess the significant differences in assemblage structure among the sampling stations.Non-metric multi-dimensional scaling (NMDS) was also conducted with the R function "metaMDS" in the same package, and the first two NMDS axis scores (NMDS I and NMDS II) for each station were used as reduced multi-dimensional data for the plankton assemblages.The relative abundance of biomass data was used for the NMDS analysis.Generalized procrustes analysis (GPA) was used to compare the ordination based on phytoplankton assemblages and DNA barcoding [49].To assess the differences in the environmental variables among stations, one-way analysis of variance (ANOVA) was applied with statistical significance set prior at p < 0.05.All data analyses were performed in R (Ver.3.4.0,R Development Core Team, 2017).

Evaluation of Water Quality and Trophic Status
The physico-chemical variables indicated that the water quality of the reservoir was overall good as drinking water according to the environmental quality standards of surface water of China (GB38382-2002).However, TN was higher than the Class III surface water standard.TN at H station was 54.08% higher than that of other stations.Both COD Mn and TP met Class II surface water standards while other indicators met Class I surface water standards (Table 1).There were significant differences in the environmental variables among the stations except TP (p < 0.05).According to TN, the five stations could be divided into three different groups (i.e., H station, K and T stations, Q and S stations, Table 1).
The mean E value of TN was 60.05.The E values in the H and T stations were higher, over 60.00.The E values of the other four environmental variables were between 20.00~49.05.The mean EI values, ranged between 38.78~41.16,were significantly different among the five stations (p < 0.05).[48], we converted the value of each environmental variable into the assigned value (E) using the linear interpolation method, and then used the arithmetic mean of each assigned value as the trophic state index (EI).

Phytoplankton Assemblages Characterized by TOM
A total of 39 taxa were recorded, belonging to five divisions, 17 families and 26 genera.Bacillariophyta, Chlorophyta and Cyanophyta accounted for 51.28%, 23.08% and 15.38% of the total genus/species and 72.53%, 22.15% and 3.78% of the total biomass, respectively (Figure 2).H' values varied from the highest of 5.66 at Q station to the lowest of 0.67 at T station (Table A1), suggesting that the reservoir was under a moderate trophic condition.

Phytoplankton Assemblages Characterized by TOM
A total of 39 taxa were recorded, belonging to five divisions, 17 families and 26 genera.Bacillariophyta, Chlorophyta and Cyanophyta accounted for 51.28%, 23.08% and 15.38% of the total genus/species and 72.53%, 22.15% and 3.78% of the total biomass, respectively (Figure 2).H' values varied from the highest of 5.66 at Q station to the lowest of 0.67 at T station (Table A1), suggesting that the reservoir was under a moderate trophic condition.The NMDS analysis showed that phytoplankton assemblages substantially varied among the five stations (Figure 3a).The NMDS axis I mainly reflected the change of Chlorophyta and Bacillariophyta in the riverine and reservoir areas.The biomass of Chlorophyta, dominated by Pandorina sp.(56.34% of the total biomass) at H station, was much higher than that of the other four stations.In contrast, the biomass of Bacillariophyta at H station was the lowest.The NMDS axis II was negatively correlated with the total phytoplankton biomass.Aulacoseira granulata (Ehr.)Simonsen accounted for 53.70% and 36.81% of the total biomass in K and T stations, respectively, while Navicula sp.accounted for 35.25% and 30.84% in Q and S stations, respectively.The NMDS analysis showed that phytoplankton assemblages substantially varied among the five stations (Figure 3a).The NMDS axis I mainly reflected the change of Chlorophyta and Bacillariophyta in the riverine and reservoir areas.The biomass of Chlorophyta, dominated by Pandorina sp.(56.34% of the total biomass) at H station, was much higher than that of the other four stations.In contrast, the biomass of Bacillariophyta at H station was the lowest.The NMDS axis II was negatively correlated with the total phytoplankton biomass.Aulacoseira granulata (Ehr.)Simonsen accounted for 53.70% and 36.81% of the total biomass in K and T stations, respectively, while Navicula sp.accounted for 35.25% and 30.84% in Q and S stations, respectively.

Phytoplankton Assemblages Characterized by TOM
A total of 39 taxa were recorded, belonging to five divisions, 17 families and 26 genera.Bacillariophyta, Chlorophyta and Cyanophyta accounted for 51.28%, 23.08% and 15.38% of the total genus/species and 72.53%, 22.15% and 3.78% of the total biomass, respectively (Figure 2).H' values varied from the highest of 5.66 at Q station to the lowest of 0.67 at T station (Table A1), suggesting that the reservoir was under a moderate trophic condition.The NMDS analysis showed that phytoplankton assemblages substantially varied among the five stations (Figure 3a).The NMDS axis I mainly reflected the change of Chlorophyta and Bacillariophyta in the riverine and reservoir areas.The biomass of Chlorophyta, dominated by Pandorina sp.(56.34% of the total biomass) at H station, was much higher than that of the other four stations.In contrast, the biomass of Bacillariophyta at H station was the lowest.The NMDS axis II was negatively correlated with the total phytoplankton biomass.Aulacoseira granulata (Ehr.)Simonsen accounted for 53.70% and 36.81% of the total biomass in K and T stations, respectively, while Navicula sp.accounted for 35.25% and 30.84% in Q and S stations, respectively.

Plankton Assemblages Characterized by HTS
The HTS results of 16S rRNA and 18S rRNA gene showed that the library coverage and Simpson index of 18S rRNA planktonic eukaryotic assemblages (18S assemblages) were higher than those of 16S rRNA bacterioplankton assemblages (16S assemblages) (Tables 2 and A1, and Figure 4).In contrast, the ACE index, Chao1 index and H' values of the 18S assemblages were lower than those of the 16S assemblages.The OTUs number, ACE index, Chao1 index, H' and Simpson index of the 18S assemblages were significantly different among the stations (p < 0.05).

Plankton Assemblages Characterized by HTS
The HTS results of 16S rRNA and 18S rRNA gene showed that the library coverage and Simpson index of 18S rRNA planktonic eukaryotic assemblages (18S assemblages) were higher than those of 16S rRNA bacterioplankton assemblages (16S assemblages) (Tables 2 and A1, and Figure 4).In contrast, the ACE index, Chao1 index and H' values of the 18S assemblages were lower than those of the 16S assemblages.The OTUs number, ACE index, Chao1 index, H' and Simpson index of the 18S assemblages were significantly different among the stations (p < 0.05).
The NMDS analysis of the DNA sequencing data showed that 16S and 18S assemblages varied among the stations (Figures 3b, 5, 6 and A1).The spatial variation of plankton assemblages among the stations was greater than the measurement error among the replicates of each station (Figures 3b  and 5).The difference in bacterioplankton assemblages among the stations was lower than that of eukaryote assemblages (Figures 4 and A1).The bacterioplankton assemblages along the NMDS axis I from left to right may reflect different habitat conditions, and along the NMDS axis II, the differences among bacterioplankton assemblages were less evident (Figure A1a).The OTU percentages of Cyanobacteria_norank and SubsectionI_Family I_norank in H station were higher than those of the other four stations while the OTUs of Synechococcus were much lower.The relative abundance of dominant Cryptomonadales_uncultured along the NMDS axis I from left to right decreased from 43.86% to 14.56%, and along the NMDS axis II from bottom to top, the OTUs of Choanozoa_incertae_sedis gradually dropped from 3.76% to 1.53% (Figure A1b).
The NMDS analysis of the DNA sequencing data showed that 16S and 18S assemblages varied among the stations (Figures 3b, 5, 6 and A1).The spatial variation of plankton assemblages among the stations was greater than the measurement error among the replicates of each station (Figures 3b  and 5).The difference in bacterioplankton assemblages among the stations was lower than that of eukaryote assemblages (Figures 4 and A1).The bacterioplankton assemblages along the NMDS axis I from left to right may reflect different habitat conditions, and along the NMDS axis II, the differences among bacterioplankton assemblages were less evident (Figure A1a).The OTU percentages of Cyanobacteria_norank and SubsectionI_Family I_norank in H station were higher than those of the other four stations while the OTUs of Synechococcus were much lower.The relative abundance of dominant Cryptomonadales_uncultured along the NMDS axis I from left to right decreased from 43.86% to 14.56%, and along the NMDS axis II from bottom to top, the OTUs of Choanozoa_incertae_sedis gradually dropped from 3.76% to 1.53% (Figure A1b).

Comparison of Phytoplankton Assemblages by TOM and HTS
The plankton assemblages identified by the two methods were substantially different (Tables 2  and A1; Figures 2-6 and A2).At the phylum level, the number of the phyla identified by HTS was more than that by TOM.However, the genera number with almost the same taxa was similar (Figures 2 and 4).Cyanophyta, Chlorophyta, Bacillariophyta, Chrysophyta and Cryptophyta were detected by both methods with less of the same genera, followed by Microcystis, Anabaena, Navicula and Cryptomonas.Bacillariophyta was dominant with a mean of 51.28% of the total genera/species by TOM while Cryptophyta was dominant with mean 98.27% of the total OTUs by HTS.Dinokaryota, Charophyta, Euglenophyta and the other rare groups were only identified by HTS.
The NMDS analysis showed the substantial spatial difference of the plankton assemblages among the stations using either method (Figures 3, 5, 6 and A1).The plankton assemblages of H station were completely different from those of the other four stations (Figures 3, 5 and 6).The stations could be divided into three group types such as H station, Q and S stations, K and T stations (Figure 3).
Based on the two datasets, GPA was used to compare the difference in assemblages by the two methods.The correlation between TOM and HTS in a symmetric Procrustes rotation was 0.61 (Figure 6), revealing good concordance between the two datasets.The residuals varied among the sampling stations.Q station, located just above the outlet dam, had the smallest residual while H station, located in the confluence of two tributaries with strong river impact, had the largest residual (Figure 6).

Discussion
Our study showed that the plankton spatial pattern in the reservoir was consistent using both the TOM and HTS datasets (Figure 3).The reservoir in general has good water quality and thus the spatial differences among the sampled stations may largely reflect the variation of habitat and hydrological conditions.For instance, station H, a strong riverine site located at the confluence of two major tributaries, was consistently different from the rest of the stations in the NMDS plots.Our results are consistent with findings by other researchers [23,50,51].For instance, both DNA sequencing and TOM analyses generally captured frequency shifts of abundant taxa over the seasonal samples [50].Vasselon et al. (2017) [23] compared HTS and morphological water quality index values in 33 river sites in Island Mayotte and found that the water quality status was congruent between the two methods.Eiler et al. (2013) [29] reported that DNA-sequencing-derived phytoplankton composition differed significantly among lakes with different trophic status, showing that DNA sequencing could resolve phytoplankton assemblages at a level relevant for ecosystem management.

Comparison of Phytoplankton Assemblages by TOM and HTS
The plankton assemblages identified by the two methods were substantially different (Tables 2  and A1; Figures 2-6 and Figure A2).At the phylum level, the number of the phyla identified by HTS was more than that by TOM.However, the genera number with almost the same taxa was similar (Figures 2 and 4).Cyanophyta, Chlorophyta, Bacillariophyta, Chrysophyta and Cryptophyta were detected by both methods with less of the same genera, followed by Microcystis, Anabaena, Navicula and Cryptomonas.Bacillariophyta was dominant with a mean of 51.28% of the total genera/species by TOM while Cryptophyta was dominant with mean 98.27% of the total OTUs by HTS.Dinokaryota, Charophyta, Euglenophyta and the other rare groups were only identified by HTS.
The NMDS analysis showed the substantial spatial difference of the plankton assemblages among the stations using either method (Figures 3, 5, 6 and A1).The plankton assemblages of H station were completely different from those of the other four stations (Figures 3, 5 and 6).The stations could be divided into three group types such as H station, Q and S stations, K and T stations (Figure 3).
Based on the two datasets, GPA was used to compare the difference in assemblages by the two methods.The correlation between TOM and HTS in a symmetric Procrustes rotation was 0.61 (Figure 6), revealing good concordance between the two datasets.The residuals varied among the sampling stations.Q station, located just above the outlet dam, had the smallest residual while H station, located in the confluence of two tributaries with strong river impact, had the largest residual (Figure 6).

Discussion
Our study showed that the plankton spatial pattern in the reservoir was consistent using both the TOM and HTS datasets (Figure 3).The reservoir in general has good water quality and thus the spatial differences among the sampled stations may largely reflect the variation of habitat and hydrological conditions.For instance, station H, a strong riverine site located at the confluence of two major tributaries, was consistently different from the rest of the stations in the NMDS plots.Our results are consistent with findings by other researchers [23,50,51].For instance, both DNA sequencing and TOM analyses generally captured frequency shifts of abundant taxa over the seasonal samples [50].Vasselon et al. (2017) [23] compared HTS and morphological water quality index values in 33 river sites in Island Mayotte and found that the water quality status was congruent between the two methods.Eiler et al. (2013) [29] reported that DNA-sequencing-derived phytoplankton composition differed significantly among lakes with different trophic status, showing that DNA sequencing could resolve phytoplankton assemblages at a level relevant for ecosystem management.
Procrustes analysis showed that consistency between the two methods varied among the stations in the reservoir (Figure 6).The two methods generated the best agreement at station Q.The station, located just above the outlet dam, may best integrate spatial variability in the reservoir.Both TOM and 18S sRNA results indicated that the sample from this station had the highest taxa richness.In contrast, the least agreement between the two methods was at station H. Located in the downstream of the two tributaries, the station may be heavily influenced by the rivers and consequently its plankton composition with more benthic algae and less true planktonic algae was substantially different from the rest of the stations.The site-specific discrepancy between the two methods may also be due to the incomplete taxonomic coverage in the reference library, particularly for benthic diatoms.For instance, Vasselon et al. (2017) [23] reported that only 13% of the benthic diatom species was shared by the two methods in 33 river sites in Island Mayotte.
It was expected that the plankton assemblages identified by TOM and HTS would be substantially different terms of their composition.Compared to TOM, HTS can detect a wide variety of plankton, including nanoplankton and rare taxa.Other studies also found similar results in freshwater ecosystems [32,34,50,52].Xiao et al. (2014) [32] found that the species compositions detected by TOM and 454 HTS did not always match at the taxa level after analyzing 300 weekly water samples over 20 years in Lake Gjersjøen, Norway, a drinking water source.Studies in Lake Tegel, Germany, showed that because the 480 used diatom sequences of the 18S region were generated from world-wide occurrences, only a small number of individuals precisely matched on the species level [52].A deep-branching taxonomically unclassified cluster was frequently detected by DNA sequencing, but could not be linked to any group identified by microscopy.DNA sequencing allows approximately three orders of magnitude larger SSU rDNA sequencing [16].Deleting rare species can affect the sensitivity of biotic indices to detect environmental degradation [53].In the absence of other nuclear markers less susceptible to copy number variation, rDNA-based diversity studies need to be adjusted for confounding effects of copy number variation [50].Evans et al. (2007) [54] assessed the effectiveness of several genes (cox1, rbcL, 18S and ITS rDNA) to distinguish cryptic species within the model "morphospecies", Sellaphora pupula agg., and found that tree topologies were very similar although support values were generally lower for cox1.To assess the proportional biomass of diatoms and dinoflagellates along the Swedish west marine coast, Godhe et al. (2008) [27] designed two real-time PCR assays with special primers to find that the linear regression of the proportion of SSU rDNA copies of dinoflagellate and diatom origin versus the proportion of dinoflagellate and diatom biovolumes or biomass per liter was significant.Thus, for diatoms, linear regression of the number of small subunit ribosomal DNA (SSU rDNA) copies versus biovolume or biomass per liter was significant, but no such significant correlation was detected in the field samples for dinoflagellates.Exploring the alternative markers (e.g., ~1400 bp of rbcL; 748 bp at the 3_end of rbcL (rbcL-3P); large ribosomal subunit (LSU D2/D3) and UPA) to diatom barcoding (e.g., COI-5P, rbcL-3P) should be used as the primary marker for diatom barcoding, while LSU D2/D3 should be sequenced as a secondary marker to facilitate bioassessment [55].
The high reproducibility and potential for standardization and parallelization makes the high throughput sequencing approach an excellent candidate for the simultaneous monitoring of plankton assemblages in drinking water quality, mainly including both phytoplankton and bacterioplankton.For a long-term biomonitoring and assessment program in the Danjiangkou Reservoir, a critical drinking water source for northern China, our study shows that the DNA barcoding method has great potential.DNA sequencing can provide a rapid and consistent identification of taxa.The next step is to design a specific assay with a specific DNA extraction method and a specific primer for the local flora in the reservoir.

Conclusions
In summary, the plankton spatial pattern among the stations in the Danjiangkou Reservoir was consistently detected by both TOM and HTS methods, reflecting the variation of habitat and hydrological conditions in the reservoir.To develop a consistent and accurate long-term ecological monitoring database for drinking water quality, DNA sequencing may serve as a promising alternative.Relative abundance (%) of taxa and OTUs of phytoplankton using traditional optical microscopy method (TOM) and high throughput sequencing method (HTS).OTUs percentage was only that the number of OTUs of one assemblages accounted for the total OTUs number of phytoplankton identified from the 16S and 18S sequencing datasets.

Figure 1 .
Figure 1.Locations of the five sampling stations in the Danjiangkou Reservoir and the water conveyance canal of the Middle Route Project of South-North Water Division in China.Station codes represent the first letter of their names: Q: Qushou, K: Kuxin, S: Songgan, H: Heijizuo, T: Taizishan.The reservoir is located in the north sub-tropic monsoon climate region with an annual mean temperature of 15~16 °C and annual precipitation 800~1000 mm, of which 80% is concentrated in the period from May to October.The monthly maximum precipitation is 193.7 mm and the minimum is 31.0mm.Soil types include alluvial soil, lime concretion black soil, yellow brown loam and purple

Figure 1 .
Figure 1.Locations of the five sampling stations in the Danjiangkou Reservoir and the water conveyance canal of the Middle Route Project of South-North Water Division in China.Station codes represent the first letter of their names: Q: Qushou, K: Kuxin, S: Songgan, H: Heijizuo, T: Taizishan.

Figure 4 .
Figure 4. Relative abundance (%) of 16S (a) and 18S (b) assemblage structures at phylum levels in different stations.Only taxa with 0.05% abundance or more in the total dataset were identified in the legend.Station codes represented the first letter of their names: Q: Qushou, K: Kuxin, S: Songgan, H: Heijizuo, T: Taizishang.

Figure 4 .
Figure 4. Relative abundance (%) of 16S (a) and 18S (b) assemblage structures at phylum levels in different stations.Only taxa with 0.05% abundance or more in the total dataset were identified in the legend.Station codes represented the first letter of their names: Q: Qushou, K: Kuxin, S: Songgan, H: Heijizuo, T: Taizishang.

Figure 5 .
Figure 5. NMDS plot showing both spatial variation among five stations and measurement errors for each station based on the relative abundance of 16S and 18S sequencing barcoding at the genus level.Station codes represented the first letter of their names: Q: Qushou, K: Kuxin, S: Songgan, H: Heijizuo, T: Taizishang.Three identical letters represent three repetitions of each station.There were three repetitions for each sample.

Figure 5 .
Figure 5. NMDS plot showing both spatial variation among five stations and measurement errors for each station based on the relative abundance of 16S and 18S sequencing barcoding at the genus level.Station codes represented the first letter of their names: Q: Qushou, K: Kuxin, S: Songgan, H: Heijizuo, T: Taizishang.Three identical letters represent three repetitions of each station.There were three repetitions for each sample.

Figure A2 .
Figure A2.Relative abundance (%) of taxa and OTUs of phytoplankton using traditional optical microscopy method (TOM) and high throughput sequencing method (HTS).OTUs percentage was only that the number of OTUs of one assemblages accounted for the total OTUs number of phytoplankton identified from the 16S and 18S sequencing datasets.

Figure A2.
Figure A2.Relative abundance (%) of taxa and OTUs of phytoplankton using traditional optical microscopy method (TOM) and high throughput sequencing method (HTS).OTUs percentage was only that the number of OTUs of one assemblages accounted for the total OTUs number of phytoplankton identified from the 16S and 18S sequencing datasets.

Table 1 .
Main physico-chemical characteristics of water samples.Station codes represented the first letter of their names: Q: Qushou, K: Kuxin, S: Songgan, H: Heijizuo, T: Taizishang.Mean ± standard error; (2) Different lowercase letters in the same column showed that the indicator was significant among stations at p < 0.05 level; (3) According to China SL395-2007 evaluation of the water trophic state
Note: (1) Mean ± standard error; (2) Different lowercase letters in the same column indicated that the indicator was significant among stations in p < 0.05.