Assessing the Ecological Status of European Rivers and Lakes Using Benthic Invertebrate Communities: A Practical Catalogue of Metrics and Methods

The Water Framework Directive requires that the ecological status of surface waters be monitored and, if necessary, managed. The Biological Quality Elements (organisms inhabiting surface waters) are central to ecological status assessment because they indicate human impact on their habitat. For benthic invertebrates, a wide array of national methods is used, but to date no comprehensive summary of metrics and methods is available. In this study, we summarize the benthic invertebrate community metrics used in national systems to assess the ecological status of rivers, (very) large rivers, and lakes. Currently, benthic invertebrate assemblages are used in 26 national assessment systems for rivers, 13 assessment systems for very large rivers, and 21 assessment systems for lakes in the EU. In the majority of systems, the same metrics and modules are used. In the Red Queen’s race of ecosystem management this may be a disadvantage, as these same metrics and modules likely depict the same stressors, whereas there is growing evidence that aquatic ecosystems are subject to highly differentiated, complex multiple-stressor impacts. Method development should be fostered to identify and rank impacts in multi-stressor environments. DNA-based biomonitoring 2.0 offers to detect stressors with greater accuracy, provided that new tools are calibrated.


Introduction
Protecting the integrity of the biodiversity and functioning of an ecosystem is key to the continuous supply of ecosystem services [1]. In freshwater habitats, these services are most importantly associated with the supply of safe food and drinking water, self-purification, transportation, and recreation opportunities, and they are at the core of the Sustainable Development Goals [2,3].
The European Union implemented several laws to sustain natural resources and ensure environmental protection. As human impact on both biodiversity and ecosystem services increases, such efforts are a primary concern of development and law-making [4][5][6][7]. The European initiatives and efforts are a good example of how collaborative governance can help to overcome significant environmental challenges.
The Water Framework Directive (WFD, Directive 2000/60/EC) is among the most prominent pieces of legislation that pertain to EU freshwater and coastal habitats and prescribes monitoring the chemical status (CS) and ecological status (ES) of surface waters, both lakes and rivers, in each EU member state. ES reflects the quality of the ecosystem structure and functioning of any surface water and is defined based on the deviation of observed communities of Biological Quality Elements (BQEs) from pristine or near-natural reference conditions. In particular, the river-specific assessment of ES is to be undertaken by assessing the composition and abundance of aquatic flora, or composition and abundance of benthic invertebrate fauna, or composition, abundance, and age structure of fish fauna. For lakes, assessment of ES also includes composition, abundance, and biomass of phytoplankton.
In line with the WFD, each EU member state implemented water body type-specific methods and tools to assess Ecological Status Class (ESC), and these approaches were intercalibrated to generate comparable results across the EU [8,9]. Assessing the ESC of any water body follows a standard line of action [9][10][11]. In a first step, adequate and standardized sampling procedures are used to obtain a sample of the BQE community at a designated site. To obtain the relevant parameters of the BQE community, sampling focuses on measures of composition, abundance, biomass, or age structure. Following this data generation step, specialized software solutions, hereafter called Ecological Status Class Assessment Tools (ESCATs), are used to calculate values describing the community, and to relate these values to reference conditions and threshold values delimiting the different ESCs [8,9]. Based on the deviation from reference and on the threshold values, a site is assigned to one of five ESCs: high, good, moderate, poor, or bad. A high ESC is defined as showing no to minimal deviations from a reference condition that is theoretically pristine but in reality mostly minimally disturbed (sensu [12]), while a good ESC may reflect human activity but only to a slight extent. The other ESCs harbour communities that are significantly more disturbed than those observed in habitats of good ESC.
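The final classification step can be illustrated with a minimal Python sketch. It assumes that a metric value is expressed as an ecological quality ratio (observed value divided by reference value) and that the metric decreases with degradation. The class boundaries (0.2, 0.4, 0.6, 0.8) and all function names are hypothetical placeholders; actual boundaries are method-specific and result from intercalibration.

```python
from bisect import bisect_right

# Hypothetical, equidistant class boundaries on a 0-1 scale.
# Real boundaries are method-specific and set during intercalibration.
BOUNDARIES = [0.2, 0.4, 0.6, 0.8]
CLASSES = ["bad", "poor", "moderate", "good", "high"]


def ecological_quality_ratio(observed, reference):
    """Observed metric value relative to the reference value, clipped to [0, 1]."""
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return max(0.0, min(1.0, observed / reference))


def assign_esc(eqr):
    """Map a ratio in [0, 1] to one of the five status classes."""
    if not 0.0 <= eqr <= 1.0:
        raise ValueError("ratio must lie between 0 and 1")
    # bisect_right counts how many boundaries are less than or equal to eqr,
    # which is exactly the index of the class in CLASSES.
    return CLASSES[bisect_right(BOUNDARIES, eqr)]


# Example: an observed index value of 4.9 against a reference value of 7.0.
status = assign_esc(ecological_quality_ratio(4.9, 7.0))
```

In this sketch a value that lies exactly on a boundary is assigned to the better class; a real ESCAT may handle boundaries differently.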
Naturally, a variety of different options were pursued to develop and ultimately intercalibrate ESC estimation tools, following monitoring traditions and available expertise [13,14]. However, a particularly prominent and frequently used BQE group is the benthic invertebrate fauna, and with excellent reason: benthic invertebrate assemblages are not only relatively easy to identify, but they also have narrow ecological niches which render them highly sensitive to changes in their environment, including anthropogenic disturbance [15][16][17]. Further, there is a strong tradition of using benthic invertebrates in biomonitoring, as their value as indicators of habitat conditions was recognized early (e.g., [18,19]). For benthic invertebrate assemblages, composition and abundance are to be measured and used for ESC assessment.
To quantify and compare these community parameters in ESCATs, different modules (focusing on sensitivity/tolerance) and metrics are used. Modules are usually constructed based on taxon-specific indicator values or combinations of metrics that relate to the probability of a particular BQE community succession along a disturbance gradient. Based on the composition and/or abundance of an observed BQE community, all indicator values can then be summed or averaged, optionally including abundances as weights, to obtain a single numerical descriptor of the sampled habitat. Examples of modules include the Average Score Per Taxon index (ASPT), the Biological Monitoring Working Party index (BMWP), and the Saprobic index sensu Zelinka and Marvan [20][21][22]. Metrics usually are single numerical descriptors that are obtained by simple enumeration, via an alpha-diversity index (such as Margalef's index [23] or Shannon diversity [24]), or by calculating the proportion of a certain functional group observed in the BQE community (e.g., the number of sensitive or filter-feeding taxa), and are often used in combination as multimetric indices [11,25].
However, there is surprisingly little information available on how ESCATs actually use benthic invertebrate assemblage data for WFD-compliant ESC estimation. In particular, there is a lack of comparative summaries for methods applied in rivers and lakes, and no attempt has yet been made to catalogue modules and metrics that are used in different ESCATs. Here, we provide a first summary of benthic invertebrate-based ESCATs used in rivers and lakes. We moreover present a catalogue of modules and metrics constituting ESCATs and discuss advantages and shortcomings of different modules and metrics for biomonitoring in general and specifically in respect to future biomonitoring approaches.

Data Acquisition and Access
Data on the construction of national ESCATs were compiled from all primary articles (i.e., peer-reviewed articles and technical reports) that were submitted to the European Union in accordance with WFD regulations (see Supplementary Information), detailing metrics, indices, and modules used for ESC estimation in rivers, (very) large rivers, and lakes. Based on this database, we assessed which types of modules and metrics form the basis of the respective assessment system. We did not include methods targeting acidification, as these were not implemented in each EU member state. Further, river-specific systems are used in (very) large rivers as well; this approach is shown separately for the purpose of this contribution. Finally, different ESCATs are in use in a number of countries, reflecting geographical differentiation.
We tabulated modules and metrics and assessed how frequently these are used in the diversity of ESCATs. For the purpose of this study, we consider as "modules" (usually used to refer to sensitivity/tolerance metrics [11]) tools that directly return an assessment result: an integrated index value from taxon-specific indicator values or metric combinations. Likewise, we consider as metrics numerical descriptors of bioindicator communities that deliver single values and can be integrated to a multimetric index. If a single module is used for assessments we treat it as depicting general degradation, as no further information on stressors is integrated. Based on our initial assessment, we developed a comparative framework in which the different national river ESCATs are grouped according to the number of shared modules and metrics.
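The grouping criterion described above, the number of shared modules and metrics, can be sketched in a few lines of Python. The ESCAT names and their components below are hypothetical and serve only to show the counting; they do not reproduce any national system.

```python
from itertools import combinations


def shared_components(escats):
    """Count the modules and metrics shared by each pair of ESCATs.

    escats maps an ESCAT name to the set of modules/metrics it uses.
    Returns {(name_a, name_b): number of shared components}.
    """
    return {
        (a, b): len(escats[a] & escats[b])
        for a, b in combinations(sorted(escats), 2)
    }


# Hypothetical ESCATs, described only by the components they use.
example = {
    "ESCAT_A": {"ASPT", "Shannon diversity", "EPT richness"},
    "ESCAT_B": {"ASPT", "Shannon diversity", "RETI", "LZI"},
    "ESCAT_C": {"DSFI"},
}
pairs = shared_components(example)
```

ESCATs that share many components would end up in the same group, while a single-module system such as the hypothetical ESCAT_C shares nothing with the others.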

Ecological Status Class Assessment across Europe
Currently, there are 26 assessment systems using benthic invertebrate assemblages for ESC estimation in rivers, 13 assessment systems for very large rivers, representing 38 ESCATs, and 21 assessment systems for lakes that represent 19 ESCATs.
For the assessment of rivers, three countries make use of decision tables: Denmark (Danish Stream Fauna Index, DSFI), and Bulgaria and Ireland (Q-value tables). Decision tables do not require computation of module or metric values, but rather assess ESC based on decision-table-guided expert judgement. A total of six ESCATs are based on a single module only and are used in Bulgaria (Q-value tables), Denmark (Danish Stream Fauna Index, DSFI), Greece (Hellenic Evaluation Score, HES), Ireland (Q-value tables), Spain (Iberian BMWP), and Sweden (ASPT). All other ESCATs rely on the combination of at least one module and at least one metric. Of these, 18 are true multimetric ESCATs that integrate several metrics for ESC assessment (Table 1).
Table 1. Summary of national ESCATs for wadeable rivers, grouped according to similarities in modules and metrics used. Biomonitoring strategies differ among EU member states and associated countries, which is reflected in the application of different modules and metrics. Bulgaria and Ireland use the Q-value approach, while Norway, Spain, Greece, Luxembourg, and Denmark use a single module. In the majority of EU member states, general degradation modules like the ASPT, the BMWP, or the DSFI are complemented with additional metrics on diversity, functional ecology, and sensitivity/tolerance of benthic invertebrates. A large minority of ESCATs rely on a combination of organic pollution and general degradation modules with additional metrics. Abbreviations: O.P., organic pollution module; G.D., general degradation module; SI, Saprobic index; GDI, general degradation index; DI, diversity index; TD, taxonomic diversity metrics; CC, community composition metrics; FE, feeding ecology metrics; HM, hydromorphology metrics; LC, life cycle metrics; ST, sensitive taxa metrics. All other abbreviations as listed in the glossary.
In very large rivers, all existing methods integrate a saprobic and/or general degradation index with other metrics (Table 2).
Table 2. Summary of ESCATs used to assess ecological status specifically in (very) large rivers. Biomonitoring in very large rivers employs a similar range of metrics and modules as are used in wadeable rivers. All abbreviations and comments as in Table 1.
For the assessment of lakes, two ESCATs are based on a single module only (used in Finland and Sweden, respectively), while the remainder of assessment approaches integrate at least one module and several metrics (Table 3).
Table 3. Summary of ESCATs used to assess ecological status of lakes in the European Union. Ecological status assessment in lakes focuses on modules and metrics detecting general degradation (GDI) as well as deviations in taxonomic diversity via diversity indices (DI), direct measurements of taxonomic diversity (TD), and community composition (CC). Further, functional metrics focusing on feeding ecology traits (FE) or habitat requirements concerning hydromorphology (HM) or life cycle traits (LC) of the observed benthic invertebrate assemblages are used. Additionally, sensitive taxa (ST) are used in ecological status assessment. All abbreviations as in Tables 1 and 2.

Types of Approaches: Decision Tables, Modules and Metrics
Decision tables present conditions that describe the status of an observed BQE community. These typically build on the occurrence and abundance of taxa and provide if-then solutions to assign ESC. Examples are the DSFI and the Q-value tables. The DSFI is also used outside of Denmark, in two other ESCATs (Estonia and Latvia).

Saprobic Indices
Saprobic indices (SI) were developed early on and are amongst the oldest approaches used to assess the status of aquatic ecosystems. They are based on the niche spaces occupied by different taxa, which can be expressed in ecological competence/preference points that serve as taxon-specific indicator values. These reflect the occurrence probability of indicator taxa along an ecological gradient of organic load, and, to a lesser degree, hydromorphology. Indicator values are available for a range of taxa, including not only benthic invertebrates but also aquatic flora. SIs are calibrated according to the ecological gradient observed in a specific region and describe the fit of the observed community to specific saprobic conditions; thus, various national adaptations of indicator values exist. SIs are currently used in seven national systems for rivers and large rivers each, but not in lakes. The most commonly used approaches were introduced by Pantle and Buck [18] (hereafter referred to as SI_PB) and Zelinka and Marvan [22] (SI_ZM). SI_PB uses the abundance of genus-level identified BQE in combination with taxon-specific indicator values to infer a saprobic index. SI_PB is used in two river and two very large river ESCATs. By contrast, SI_ZM relies on species-level identification and indicator weights in addition to indicator values for each taxon and integrates these values with the observed abundances to infer a saprobic index. SI_ZM is used in six river and five very large river ESCATs. In both approaches, the observed SI is related to threshold values to infer a saprobic quality class or an ecological quality ratio based on saprobic conditions.
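The difference between the two approaches can be made concrete with a short Python sketch. It uses the commonly cited weighted-average forms: for Pantle and Buck, the sum of abundance times saprobic value divided by the sum of abundances; for Zelinka and Marvan, the same average with each taxon additionally multiplied by its indicator weight. Whether a given national ESCAT uses exactly these forms is an assumption, and the abundances, saprobic values, and weights in the example are invented for illustration.

```python
def saprobic_index(records, use_indicator_weights=False):
    """Abundance-weighted mean of taxon-specific saprobic values.

    records: iterable of (abundance, saprobic_value, indicator_weight).
    With use_indicator_weights=False the indicator weight is ignored
    (Pantle-Buck-style average); with True, each taxon is additionally
    weighted by its indicator weight (Zelinka-Marvan-style average).
    """
    numerator = 0.0
    denominator = 0.0
    for abundance, saprobic_value, indicator_weight in records:
        weight = abundance * (indicator_weight if use_indicator_weights else 1)
        numerator += saprobic_value * weight
        denominator += weight
    if denominator == 0:
        raise ValueError("sample contains no scoring taxa")
    return numerator / denominator


# Invented sample: a narrow-niche taxon (high indicator weight) typical of
# clean water and a tolerant taxon (low indicator weight), equally abundant.
sample = [(10, 1.0, 4), (10, 3.0, 1)]
si_pb = saprobic_index(sample)                              # 2.0
si_zm = saprobic_index(sample, use_indicator_weights=True)  # 1.4
```

In the example the indicator weight pulls the index towards the value of the taxon with the narrower niche, which is the intended effect of weighting.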

General Degradation Indices
General degradation indices (GDIs) follow the same principles as SIs, i.e., taxon-specific indicator values are developed based on the occurrence of taxa along a disturbance gradient. The most commonly applied GDIs are the BMWP and the ASPT indices, which rely on family-level identification of indicator taxa and require no abundance data [20,21]. This makes a rapid and versatile application of these indices possible but comes with a trade-off concerning specificity and accuracy. Regionally specific GDIs were developed and calibrated to detect human-induced impairment with greater efficacy. National variants of river GDIs that comprise waterbody type-specific variants and threshold values exist in Austria, Belgium (Flanders), France, Germany, Greece, and Slovenia; adaptations of the French GDI are also used in other countries.
BMWP and ASPT: The Biological Monitoring Working Party and the Average Score Per Taxon indices are based on occurrence of families of benthic invertebrates. For each family, the assigned indicator value reflects occurrence probability in minimally disturbed or, ideally, pristine conditions. The BMWP is calculated as the sum of all indicator values and the ASPT is calculated as the BMWP value divided by the number of scoring (observed) families. The ASPT is used in 13 river, four very large river and six lake ESCATs, while the BMWP is used in three river, and two very large river and lake ESCATs each.
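As a minimal sketch of this calculation, the following Python functions compute both indices from a list of observed families. The score table passed in the example is a small illustrative excerpt; score tables differ between national adaptations, so the values should be read as placeholders rather than as an authoritative list.

```python
def scoring_families(observed_families, scores):
    """Observed families that have an indicator value in the score table."""
    return set(observed_families) & set(scores)


def bmwp(observed_families, scores):
    """Sum of indicator values over all scoring families (presence only)."""
    return sum(scores[f] for f in scoring_families(observed_families, scores))


def aspt(observed_families, scores):
    """BMWP divided by the number of scoring families."""
    families = scoring_families(observed_families, scores)
    if not families:
        raise ValueError("no scoring families in sample")
    return bmwp(observed_families, scores) / len(families)


# Illustrative excerpt of a family-level score table (placeholder values).
example_scores = {"Heptageniidae": 10, "Baetidae": 4, "Chironomidae": 2}
observed = ["Heptageniidae", "Baetidae", "Chironomidae", "Unscored family"]

bmwp_value = bmwp(observed, example_scores)  # 16
aspt_value = aspt(observed, example_scores)  # 16 / 3
```

Because the ASPT is an average, it is less dependent on sampling effort than the BMWP, whose value grows with every additional scoring family found.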
National GDIs: National GDIs can emulate the BMWP/ASPT approach but may also have higher specificity for particular water body types and may also include information on abundances of benthic invertebrates. Effectively, the various national GDIs follow the same principle in assigning indicator values to sets of taxa associated with specific habitat conditions, but usually rely on higher taxonomic resolution. GDIs are usually calibrated to detect impairment of habitats rather than specific stressors; exceptions are the GDIs employed in Slovenia and Croatia, which specifically take hydromorphological alteration into account [26].

Synopsis of Single Metrics Used as Benthic Invertebrate Community Descriptors
Diversity indices are calibrated against reference conditions for use in biomonitoring. To this end, communities at sites along a disturbance gradient are sampled and their alpha diversity described by means of a diversity index. Margalef's index (D') [23,27], Shannon diversity (H') [24,27], and the corresponding Evenness (L') calculated as a derivative of Shannon diversity [27], are most commonly used. Alternatively, the First Hill Number calculated as the exponential function of Shannon diversity may be used. Calculation of diversity indices should be based on species-level identification and properly assessed abundances. Diversity indices follow different functions, according to their construction: Margalef's index follows a relatively linear function, while Shannon diversity follows a logarithmic function (but can be linearized by calculating its exponential function, as is done for the First Hill Number). For ESC estimation in rivers, Shannon diversity is most commonly used (7 ESCATs), followed by Margalef's index (4 ESCATs), and Evenness (2 ESCATs). Very large river methods may rely on Shannon diversity (4 ESCATs), Margalef's Index (1 ESCAT) or Shannon diversity computed based on a preselected set of sensitive EPT-taxa (1 ESCAT). In lakes, Shannon diversity is used in 10 ESCATs, Simpson's and Margalef's index in one ESCAT each and the First Hill Number in two ESCATs.
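The indices named above can be computed from a vector of taxon abundances, as in the following Python sketch. It assumes the usual textbook definitions: Margalef's index as (S - 1) / ln N, Shannon diversity with natural logarithms, the First Hill Number as exp(H'), and evenness as H' / ln S (Pielou's form, which is an assumption here because the text only states that evenness is derived from Shannon diversity).

```python
import math


def _positive(counts):
    """Keep only taxa that were actually observed."""
    kept = [c for c in counts if c > 0]
    if not kept:
        raise ValueError("sample is empty")
    return kept


def margalef(counts):
    """(S - 1) / ln(N), with S taxa and N individuals."""
    kept = _positive(counts)
    total = sum(kept)
    if total < 2:
        raise ValueError("at least two individuals are required")
    return (len(kept) - 1) / math.log(total)


def shannon(counts):
    """H' = -sum(p_i * ln(p_i)) over all observed taxa."""
    kept = _positive(counts)
    total = sum(kept)
    return -sum((c / total) * math.log(c / total) for c in kept)


def evenness(counts):
    """H' / ln(S); set to 0.0 for a single taxon in this sketch."""
    kept = _positive(counts)
    if len(kept) < 2:
        return 0.0
    return shannon(kept) / math.log(len(kept))


def first_hill_number(counts):
    """exp(H'): the effective number of equally common taxa."""
    return math.exp(shannon(counts))


# Four equally abundant taxa: H' = ln(4), evenness = 1, Hill number = 4.
abundances = [10, 10, 10, 10]
```

The example illustrates the linearization mentioned above: doubling the number of equally common taxa doubles the First Hill Number, whereas Shannon diversity increases only by ln 2.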
Raw taxa numbers (taxon richness) may be employed in addition to or as an alternative to diversity indices. Here, total diversity is expressed as the number of taxa at a predefined taxonomic resolution encountered at a designated sampling site. Further, the number of taxa recorded in one or several groups can be used as a metric. To this end, large-bodied taxa are selected, and their diversity is recorded at predefined taxonomic levels. Usually, the numbers of Ephemeroptera, Plecoptera, Trichoptera, Diptera, Coleoptera, Bivalvia, Odonata, Oligochaeta, Gastropoda, or Chironomidae (a group of Diptera) taxa are employed, either singly or in combination. Combinations of taxa are constructed to indicate ecosystem integrity or impairment, and to make the metric more robust to changes in community composition. The number of bioindicator taxa encountered at a site usually reaches its peak in pristine or minimally disturbed conditions where microhabitat diversity and structure are unperturbed. Typically, sets of taxa are summarized to obtain values for such metrics.
Composition of the bioindicator community is an important metric, and usually focuses on sets of indicator taxa. These usually are the same relatively large-bodied taxa as targeted for a taxa-numbers metric, due to their relatively predictable occurrence in pristine/minimally disturbed or degraded conditions. In most cases, community composition metrics are constructed taking abundances into account (either as raw abundances or abundance classes), so that proportions of indicator groups are compared. The response of this metric is a shift in the proportions of taxa along a disturbance gradient and is assessed by calculating the proportions that sets of taxa contribute to a particular benthic invertebrate community.

• Proportion of Ephemeroptera, Plecoptera, and Trichoptera specimens (P_EPT): used in nine river, two very large river, and one lake ESCAT.
A beta-diversity index can be used to quantify differences in community composition: the Bray-Curtis dissimilarity (β_Bray-Curtis) is used in two river ESCATs to characterize how well an observed benthic invertebrate community and the reference community match.
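Both community composition measures can be sketched briefly in Python. The first function computes the share of EPT specimens in a sample; the second computes the Bray-Curtis dissimilarity in its standard form, the sum of absolute abundance differences divided by the sum of all abundances in both samples. Taxon names and abundances in the example are invented, and the function names are placeholders.

```python
EPT_ORDERS = {"Ephemeroptera", "Plecoptera", "Trichoptera"}


def proportion_ept(sample):
    """Share of EPT specimens; sample is a list of (order, abundance) pairs."""
    total = sum(abundance for _, abundance in sample)
    if total == 0:
        raise ValueError("sample is empty")
    ept = sum(abundance for order, abundance in sample if order in EPT_ORDERS)
    return ept / total


def bray_curtis(observed, reference):
    """Bray-Curtis dissimilarity between two {taxon: abundance} mappings.

    0 means identical composition, 1 means no shared abundance at all.
    """
    taxa = set(observed) | set(reference)
    difference = sum(abs(observed.get(t, 0) - reference.get(t, 0)) for t in taxa)
    total = sum(observed.values()) + sum(reference.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return difference / total


# Invented example data.
site = [("Ephemeroptera", 30), ("Trichoptera", 20), ("Diptera", 50)]
p_ept = proportion_ept(site)  # 0.5

dissimilarity = bray_curtis(
    {"Taxon A": 10},
    {"Taxon A": 5, "Taxon B": 5},
)  # (5 + 5) / (10 + 10) = 0.5
```

A site whose community matches the reference closely yields a dissimilarity near 0, so an ESCAT using this index would typically treat larger values as stronger deviation from reference conditions.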
Functional metrics based on benthic invertebrate assemblages are less frequently considered in EU assessments, and when used usually focus on feeding ecology or the hydromorphological niche of an observed community. Specific indices have been developed to describe integrated feeding guilds, but also simple proportions of single feeding types (e.g., predators) are used. Proportions of feeding guilds are assumed to follow a specific succession along the river continuum, and deviation therefrom can be quantified as signal of human impact. In particular, deviations in the proportions of feeding guilds may relate to organic input, increased sediment load, or changes in hydromorphology. Bioindicator communities likewise assemble according to natural hydromorphological gradients along rivers, where sections typically exhibit dominance of certain taxa. Quantifying changes in these dominance patterns can support identification of pressures related to hydromorphological alterations. In particular, functional metrics can be grouped in feeding guild metrics, hydromorphology metrics, life cycle metrics, and sensitive taxa metrics:

1. Feeding guild metrics take the prevalence of certain feeding strategies of benthic invertebrate assemblages into account and include:
• Rhithron feeding type index (RETI) [28]: proportion of grazer and shredder taxa in the total share of specimens of grazer, shredder, filter-feeding, and detritivorous taxa; used in four river and two very large river ESCATs.
• Proportion of predators (P_Pre): share of specimens of predatory taxa; used in one river ESCAT.
• Proportion of grazers (P_Gra): share of specimens of grazer taxa; used in one river ESCAT.
• Proportion of detritivorous taxa (P_Det): share of specimens of detritivorous taxa (feeding on detritus); used in one river and one lake ESCAT.
• Proportion of gatherers (P_Gat): share of specimens of gathering taxa (feeding on benthic fine particulate organic matter); used in two lake ESCATs.
• Proportion of xylal-feeding, shredder, active filter feeders and passive filter feeders (P_XSAP): share of specimens of xylal-feeding taxa (i.e., taxa feeding on wood), shredder taxa, active filter feeders (feeding on fine particulate organic matter that is actively filtered from the water body), and passive filter feeders (feeding on fine particulate organic matter that is passively filtered from the water body); used in one river ESCAT.
2. Hydromorphology metrics include:
• Longitudinal Zonation Index (LZI) [29]: analogous to the SI, where calculation may follow Pantle and Buck [18] or Zelinka and Marvan [22], the LZI describes the fit of the observed community to particular hydromorphological conditions by using taxon-specific ecological competence/preference points that describe the occurrence probability of a taxon along a hydromorphological gradient from spring to estuary; used in three river and two very large river ESCATs.
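The feeding guild metrics listed above are all shares of specimens, so they can be expressed with one generic function, as in the following Python sketch. It follows the RETI definition given in the text (grazers and shredders relative to grazers, shredders, filter feeders, and detritivores) and makes the simplifying assumption that each specimen is assigned to exactly one guild; the guild labels and abundances are invented.

```python
def guild_share(guild_abundances, numerator, denominator=None):
    """Share of specimens in the numerator guilds.

    guild_abundances: {guild: number of specimens}.
    denominator: guilds forming the base; all guilds if omitted.
    """
    base = denominator if denominator is not None else list(guild_abundances)
    total = sum(guild_abundances.get(g, 0) for g in base)
    if total == 0:
        raise ValueError("no specimens in the selected guilds")
    return sum(guild_abundances.get(g, 0) for g in numerator) / total


# Invented sample with one guild per specimen.
sample = {
    "grazer": 30,
    "shredder": 20,
    "filter feeder": 25,
    "detritivore": 25,
    "predator": 50,
}

# RETI as defined in the text above.
reti = guild_share(
    sample,
    numerator=["grazer", "shredder"],
    denominator=["grazer", "shredder", "filter feeder", "detritivore"],
)  # 50 / 100 = 0.5

# Simple proportion of predators relative to all specimens.
p_pre = guild_share(sample, numerator=["predator"])  # 50 / 150
```

Keeping the numerator and denominator guilds as parameters means the other proportions in the list can be obtained by changing the arguments rather than the code.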

Comparing ESCATs between Countries Based on the Most Frequently Used Modules and Metrics
In rivers, the most commonly used modules for organic pollution or general degradation comprise the SI_ZM, and the ASPT and BMWP. Among diversity indices, Shannon diversity is used most often in river ESCATs, followed by Margalef's index. Taxonomic diversity is usually assessed based on taxonomic richness and EPT richness metrics. Proportions of EPT taxa are also frequently used as a community composition metric. The most frequently used feeding guild metric is the RETI; the corresponding hydromorphological metrics are the LZI and P_Lit. Further, the number of sensitive taxa is commonly used in river ESCATs.
In (very) large rivers, a very similar set of modules and metrics is frequently used: the SI_ZM, the ASPT, Shannon diversity, overall taxonomic richness, and EPT richness as well as the proportion of EPT taxa and the LZI. Additionally, proportions of Oligochaeta are used as community composition metrics, and proportions of akal-, littoral-, and psammal-inhabiting taxa as hydromorphological metrics.
In lakes, the most frequently used metrics differ slightly. Similar to river assessments, ASPT and BMWP are used as well as Shannon diversity, Margalef's index, and taxonomic richness. However, lake assessments also frequently use proportions of Odonata and gatherers and may include proportions of r-selected and K-selected taxa.
In an attempt to generalize patterns of river ESCAT construction, we propose that, based on these patterns, four main groups can be distinguished: First, ESCATs relying exclusively on decision tables as used in Bulgaria, Ireland, Luxembourg, and Denmark. Second, ESCATs using a single module only as currently used in Norway, Spain, or Greece. Third, ESCATs relying on a combination of modules and metrics comprising at most the ASPT or a similar index, Shannon diversity, taxonomic richness and EPT richness and few if any other ecological metrics. ESCATs of the third group are used to assess river ecological status in Norway, Spain, Belgium (Wallonia), Belgium (Flanders), Latvia, Lithuania, Estonia, Cyprus, Italy, Poland, Portugal, and France. Fourth, ESCATs extensively using ecological metrics or pursuing altogether different strategies are used in ecological status assessments of rivers in Spain, Romania, Sweden, Slovakia, Croatia, Slovenia, Germany, the Czech Republic, and Austria. In this group, distinct indices were often developed to account for large ecological gradients represented by many different river types.
Concerning (very) large rivers and lakes, making coarse generalizations is challenging. Similarities of ESCATs exist, and in some cases (e.g., Austria and Slovenia) the same ESCATs are used, but the diversity of approaches developed for these systems as compared to river ESCATs is much greater.

Advantages of Different Module and Metric Types
The ways modules and metrics are defined follow different philosophies [10,11]. Modules and metrics can either be designed to allow for a rapid and robust assessment of an ecosystem, or to detect specific stressors that are particularly relevant for a habitat or set of habitats at high resolution to simplify management decisions.
For instance, fast and versatile application, reflected in coarse taxonomic resolution and limited integration of ecological parameters, may be favored over other, more resource-demanding approaches. Modules such as the ASPT or the BMWP are prime examples of this approach, as they do not require abundance data, are based on family-level taxonomic resolution, and can readily be applied to a broad spectrum of aquatic habitats [20,21,31]. Due to this ease of use, ESC boundaries can be defined and reference conditions established speedily in typology-based approaches for estimating reference conditions. However, it should be noted that model-based estimates of reference conditions can outperform typology-based approaches if typology classification is not biologically meaningful [32][33][34][35]. Likewise, metrics based on taxon richness are robust and easily adopted [36]. Depending on the focal indicator taxa group, taxon richness may either decrease (e.g., total number of taxa, number of EPT taxa) or increase (e.g., number of Diptera taxa) with increasing stressor impact (e.g., [11,37,38]). However, such metrics cannot be trained to a particularly high specificity when using coarse taxonomic resolution: if they deviate from the reference benchmark, their informativeness concerning the stressor is relatively limited [39][40][41]. Such modules and metrics can therefore be important tools when establishing the ecological status of a hitherto unassessed habitat or when ESC estimation is to be conducted under resource-limited conditions. However, an ordination or modelling-based approach to define reference conditions a priori (and to select metrics and modules) usually provides better resolution than the simple use of a taxonomically coarse index such as ASPT [42][43][44][45][46].
Diversity indices and community composition metrics take an intermediate position between rapid and high-resolution modules and metrics, requiring abundance or relative abundance data but not ecological information for metric calculation [44]. Both approaches quantify shifts in proportions of taxa under stressor impact that stem from differences in niche space occupied by individual taxa and result in clear deviance from reference conditions. In particular, changes in the relative abundance of taxa associated with specific habitat conditions can be used to identify habitat modification. For instance, an increase in the relative abundance of Oligochaeta and Diptera may indicate an accumulation of fine sediments and organic matter at a sampling site [47]. Conversely, a decrease in the proportion of, e.g., Ephemeroptera, Plecoptera, and/or Trichoptera taxa may signal habitat homogenization (i.e., a man-made simplification of habitat conditions resulting in the loss of microhabitats), changes in food resource composition, or organic pollution [41,47,48,49].
Alternatively, modules and metrics may aim at resolving stressor impact at a high level of detail and focus on ecological characteristics of indicator species as well as abundance. As concerns modules, type-specific GDIs are especially useful for detecting and characterizing impairment of rivers but have not yet been established for lakes. Precisely calibrated GDI modules are highly relevant in many river ESCATs, and often are key in detecting stressor impact. Further, SIs are robust at quantifying the degree of anthropogenic organic load in rivers and can be calibrated to a high degree of specificity and accuracy [37]. Using SIs for ESC estimation in lakes is not as common, mostly because lake assemblages do not respond predictably to organic pollution (potentially due to a relatively greater proportion of benthic invertebrate taxa breathing atmospheric oxygen in these habitats) [17,50]. Other ecologically based modules have been developed following the SI example, with substantial effort placed on acquiring the autecological characterization of species used as bioindicator taxa; this effort culminated in a database now detailing ecological preferences of all major bioindicator species [17]. The significance of such data for biomonitoring is tremendous: ecological metrics such as the RETI, or proportions of certain feeding guilds or taxa associated with specific hydromorphological conditions, are widely used and enable differentiation of stressors [39,46,49]. In combination with properly defined reference conditions, these high-resolution modules and metrics can be used to detect the impact of organic pollution, hydromorphological alteration, or changes in land use relating to allochthonous matter input (e.g., large woody debris) [41,48,51,52].
In addition, metrics focusing on phenology of aquatic insects or reproductive strategies of the bioindicator communities can be used to assess long-term stability of an ecosystem and can be calibrated to detect impact of unrecorded disturbance events or the relatively slow response to climate change [53,54].
Ultimately, all of these different approaches have their advantages: either by providing rapid and easy assessment options or by providing precise information on the prevalent stressors. From a management perspective, both qualities are desirable and support decision-making. In light of the growing body of evidence for the complex interplay of multiple stressors in aquatic ecosystems [55][56][57], having precise information may, however, finally prove more important than getting that information quickly.

The Way Forward, Part 1: Improving ESC Assessment
To construct ESCATs, a combination of both rapid and high-resolution modules and metrics can be selected. Usually, however, only one of the two approaches is followed because of national assessment traditions and ambitions. A significant challenge for ESC assessment is the combination of multiple stressors, which all play distinct roles in different habitats depending on their combination and magnitude [58,59]. In addition to this challenge, many of the currently used ESCATs lack information on stressor-response relationships, and thus may fail to identify stressors or to accurately rank stressor importance [59,60]. This is particularly true for rapidly applied modules and metrics with a long history of use and impedes designing and implementing best management measures. Improving ESC assessment will therefore require a shift towards modules and metrics based on ecological characteristics of the bioindicator communities as well as calibration of new and better ESCATs targeting the most important stressors [6,61,62]. Indeed, many of the currently used ESCATs were designed to depict the impact of organic pollution (which, due to the implementation of the EU Directive on Urban Wastewater Treatment (91/271/EEC) and the WFD, is no longer the most pressing stressor in large areas of Europe) and do not cover emerging or multiple stressors, including pollution by microplastics [6,61,62,63,64]. Naturally, assessment systems need to be adapted to reflect environmental changes brought about by the prevalent stressors. For aquatic ecologists, this is the Red Queen's race of ecosystem management: providing, in a timely manner, tools that serve to maintain the integrity of ecosystems at a given stage of societal development.

The Way Forward, Part 2: Development of Future Biomonitoring Tools
Improving ESC assessment can only be achieved by calibrating ESCATs to detect stressors and quantify the magnitude of their impact on aquatic ecosystems. An especially promising approach for this purpose is offered by the integration of modern molecular tools, such as DNA metabarcoding, in ESC assessment, which is effectively the implementation of biomonitoring 2.0 [65,66].
Implementing novel tools like DNA metabarcoding will require novel, adapted ESCATs. This is because DNA metabarcoding and other molecular techniques cannot deliver exactly the same data on BQE communities as are currently used for assessment [67,68]. At present, taxa lists for benthic invertebrates at various levels of taxonomic identification are used in connection with (mostly) abundance data. The standard sampling and assessment protocols allow for an area-standardized estimate of taxon richness and number of individuals at a sampling site, producing a taxa × abundance matrix. Based on these data, community composition and abundance can be described and compared to reference conditions. A generic biomonitoring 2.0 workflow can make use of samples obtained following these sampling protocols but may also be applied to environmental samples [69]. In the latter case, no voucher material is available for later quality control. Standard samples may be sorted to obtain the specimens, or the preservative ethanol may be decanted and filtered to obtain material for DNA extraction. Following DNA extraction, PCR or a bait-capture approach may be used to enrich target gene fragments, which are subsequently sequenced using high-throughput sequencing (HTS). Next, bioinformatic analyses resolve the raw sequencing data (usually containing several replicates of each sample) into molecular operational taxonomic units (MOTUs; groups of sequences derived by, e.g., a threshold-based approach that are treated as taxa), which can be assigned to true taxa by use of reference libraries. MOTUs assigned to the same true taxon can then be summarized, and their individual HTS read numbers combined, to allow for an estimate of taxon-specific read numbers in the sample.
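The final summarizing step of this workflow can be illustrated with a minimal Python sketch. It assumes that MOTU read counts per replicate and MOTU-to-taxon assignments are already available from upstream bioinformatics. The function name, the data layout, the example values, and the replicate filter (`min_replicates`) are hypothetical choices made for illustration and are not part of any standard protocol.

```python
def summarize_reads_by_taxon(motu_reads, motu_to_taxon, min_replicates=2):
    """Combine read counts of all MOTUs assigned to the same taxon.

    motu_reads:     dict, MOTU id -> list of read counts (one per replicate)
    motu_to_taxon:  dict, MOTU id -> taxon name; MOTUs absent from this
                    mapping are treated as unassigned and skipped
    min_replicates: assumed filter; a taxon is kept only if it was
                    detected (reads > 0) in at least this many replicates
    """
    per_taxon = {}
    for motu, reads in motu_reads.items():
        taxon = motu_to_taxon.get(motu)
        if taxon is None:
            continue  # no reference-library match for this MOTU
        if taxon not in per_taxon:
            per_taxon[taxon] = [0] * len(reads)
        # Add this MOTU's reads to the taxon total, replicate by replicate
        per_taxon[taxon] = [a + b for a, b in zip(per_taxon[taxon], reads)]

    summary = {}
    for taxon, reads in per_taxon.items():
        detected_in = sum(1 for r in reads if r > 0)
        if detected_in >= min_replicates:
            summary[taxon] = sum(reads)
    return summary


# Invented example data: four MOTUs, two replicates each
example_reads = {
    "MOTU_1": [120, 98],
    "MOTU_2": [30, 0],
    "MOTU_3": [0, 5],
    "MOTU_4": [14, 11],  # unassigned: not in the mapping below
}
example_assignment = {
    "MOTU_1": "Baetis rhodani",
    "MOTU_2": "Baetis rhodani",
    "MOTU_3": "Gammarus pulex",
}
taxon_reads = summarize_reads_by_taxon(example_reads, example_assignment)
```

In this example, the two MOTUs assigned to the same taxon are merged, the unassigned MOTU is ignored, and the taxon found in only one replicate is dropped by the assumed filter. The resulting values are read numbers, not counts of individuals.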
Throughout such a biomonitoring 2.0 workflow, critical and well-founded decisions must be made to adopt the most suitable molecular and bioinformatic methods, to control potential sources of error, and to generate data for ESC assessment reliably and repeatably. Still, some limitations of molecular methods remain: they do not produce abundance data identical to those used in existing ESCATs, and often deliver only occurrence data with reasonably high plausibility. Owing to stochastic and choice-induced processes, taxa lists produced by molecular methods are also not identical to those delivered by the currently used standard methods [70–73].
Acknowledging these differences between standard and molecular data, we expect that some modules and metrics may still be used in a biomonitoring 2.0 framework following re-calibration (i.e., re-definition of reference conditions using molecular data). This is particularly pertinent to taxa number metrics and to modules using occurrence data only, such as various GDIs. Stringent re-definition of reference conditions and re-calibration of modules and metrics will be necessary for other metrics, particularly those integrating ecological characteristics of bioindicator taxa in assessment. The list of the most frequently used modules and metrics presented here may serve as a target for optimizing the performance of molecular tools for use in biomonitoring.
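As a hedged illustration of what re-calibrating a taxa number metric could look like, the following Python sketch derives the reference value from molecular data of reference sites and expresses a test site relative to it. Using the median of reference sites and capping the ratio at 1 are assumptions made for this example, as are all function names and data values; actual ESCATs define their own reference values and normalization.

```python
from statistics import median


def taxa_number(read_counts):
    """Number of taxa detected at a site.

    Only occurrence is used: a taxon counts as present if it has at
    least one read, and read numbers are not treated as abundance.
    """
    return sum(1 for reads in read_counts.values() if reads > 0)


def recalibrated_richness_ratio(site, reference_sites):
    """Taxa number of a site relative to a molecular reference value.

    Assumption: the reference value is the median taxa number of the
    reference sites, and the ratio is capped at 1.0.
    """
    reference_value = median(taxa_number(s) for s in reference_sites)
    if reference_value == 0:
        raise ValueError("reference sites contain no detected taxa")
    return min(1.0, taxa_number(site) / reference_value)


# Invented taxon-specific read numbers for three reference sites
reference_sites = [
    {"A": 10, "B": 4, "C": 7, "D": 1},
    {"A": 3, "B": 8, "C": 2, "D": 5, "E": 9},
    {"A": 6, "B": 0, "C": 2, "D": 3},
]
# Test site: taxon "C" has no reads and is therefore not counted
test_site = {"A": 50, "B": 12, "C": 0}

ratio = recalibrated_richness_ratio(test_site, reference_sites)
```

Here the reference sites yield 4, 5, and 3 detected taxa, so the assumed reference value is 4, and the test site with 2 detected taxa scores 0.5. Because the metric depends only on occurrence, it is unaffected by the lack of true abundance data in molecular results.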
However, because ecological status class assessment is itself being developed further at the same time (e.g., [74]), the purpose of biomonitoring 2.0 should be to develop a comprehensive, novel toolbox to win the Red Queen's race of ecosystem management rather than to follow in the same steps.
Supplementary Materials: The following are available online at https://www.mdpi.com/2073-4441/13/3/346/s1, Table S1: Supplementary information detailing references for national ESCATs and intercalibrated approaches as used for the assessment of ecological status according to the Water Framework Directive in lakes, rivers and very large rivers in the European Union and associated countries.

Acknowledgments: S.V. acknowledges the patience of his family (including an elderly dog) as this manuscript was entirely written while on paternity leave. We gratefully acknowledge the reviews of four academic referees whose comments improved this manuscript.

Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations
Proportion of epipotamal-associated taxa
P EpR: Proportion of epirhithral-associated taxa
P EPT: Proportion of Ephemeroptera, Plecoptera and Trichoptera specimens
P EPTCD: Proportion of Ephemeroptera, Plecoptera, Trichoptera, Coleoptera and Diptera specimens
P EPTD: Proportion of Ephemeroptera, Plecoptera, Trichoptera and Diptera specimens
P ET: Proportion of Ephemeroptera and Trichoptera specimens
P ETD: Proportion of Ephemeroptera, Trichoptera and Diptera specimens
P G: Proportion of Gastropoda specimens
P Gat: Proportion of gathering taxa
P GOlD: Proportion of Gastropoda, Oligochaeta and Diptera specimens
P HyR: Proportion of hyporhithral-associated taxa
P Limno: Proportion of limnophilic taxa
P Lit: Proportion of littoral-associated taxa
P Lith: Proportion of lithal-associated taxa
P MeR: Proportion of metarhithral-associated taxa
P Neg: Proportion of «negative» taxa
P Neo: Proportion of Neozoa specimens
P O: Proportion of Odonata specimens
P OCh: Proportion of Oligochaeta and Chironomidae specimens
P OD: Proportion of Oligochaeta and Diptera specimens
P Ol: Proportion of Oligochaeta specimens
P OrCh: Proportion of Orthocladiinae (Chironomidae) specimens
P ovp: Proportions of ovoviviparous taxa
P P: Proportion of Plecoptera specimens
P Pel: Proportions of pelal-inhabiting taxa
P Pos: Proportions of «positive» taxa
P Pre: Proportions of predators
P r-strat: Proportions of r-selected taxa
P r/K-strat: Proportions of r- and K-selected taxa
P Rheo: Proportions of rheophilic taxa
P Sens: Proportions of sensitive taxa
P T: Proportion of Trichoptera specimens
PTI: Potamon-Typie Index
P TP: Proportion of Trichoptera and Plecoptera specimens
P Typ: Proportions of typical taxa
P upv: Proportions of uni- and polyvoltine taxa
P XSAP: Proportions of xylal-feeding, shredder, active filter feeders and passive filter feeders
Q-Value: Q-Value tables
RETI: Retention Feeding Type Index
Rheo: Rheo Index
SI PB: Saprobic Index sensu Pantle and Buck
SI ZM: Saprobic Index sensu Zelinka and Marvan
SPEAR organic: Species-at-risk by organic pollution