A Need for Standardized Reporting: A Scoping Review of Bioretention Research 2000–2019

Bioretention cells are a type of low-impact development technology that, over the past two decades, have become a critical component of urban stormwater management. Research into bioretention has since proliferated, with disparate aims, intents and metrics used to assess the “performance” of bioretention cells. We conducted a comprehensive, systematic scoping review to answer the question of “How is the field performance of bioretention assessed in the literature?”, with the aim of understanding (1) how is the performance of bioretention defined in the literature? (2) what metrics are used to assess actual and theoretical performance? A review of 320 studies (mostly peer reviewed articles) found that performance was defined in terms of hydrologic controls, while investigations into water quality pathways and mechanisms of contaminant transport and fate and the role of vegetation were lacking; additionally, long term field and continuous modelling studies were limited. Bioretention field research was primarily conducted by a small number of institutions (26 institutions were responsible for 50% of the research) located mainly in high income countries, particularly Australia and the United States. We recommend that the research community (I) provide all original data when reporting results, (II) prioritize investigating the processes that determine bioretention performance and (III) standardize the collection, analysis and reporting of results. This dissemination of information will ensure that gaps in bioretention knowledge can be found and allow for improvements to the performance of bioretention cells around the world.


Introduction
Stormwater management (SWM) has traditionally focussed on flood control through conveyance and water quantity control, with the goal of protecting human life and property during extreme events. Increased understanding of the importance of maintaining surface water resources and of the significant adverse impacts caused by urbanization has shifted this conveyance-centred approach towards one that aims to mimic a catchment's natural, pre-urbanization hydrologic patterns, which promote infiltration and evapotranspiration. The goal of this type of development is to improve

Materials and Methods
The methodology for this scoping review followed established protocols [9,12–14] of five core steps and one optional step: first, identify the research question; second, identify relevant studies; third, select studies; fourth, extract data (referred to as "charting the data" by some sources); fifth, collate and report results. A graphical representation of the review team's methodology is shown in a PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) diagram (Figure 1).

Identifying Relevant Studies
Relevant studies were identified from peer-reviewed literature. Eleven databases (see Supplementary Information File S1, Table S1), spanning the fields of engineering, urban planning, architecture, earth sciences and biology, were searched between 25 September and 31 October 2018 and again on 4 April 2020. The citation lists of ninety-three "review articles" were subsequently reviewed, and any additional citations were screened and added. An intentionally broad range of search terms based on variants of the name "bioretention" was used, as well as terms referring to stormwater and hydrology. Search parameters can be found in Supplementary Information File S1. The search yielded 8095 citations across the 11 databases, of which 4360 were removed by a deduplication process using Endnote [15].
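The deduplication itself was done in Endnote. Purely as an illustration of the idea, a minimal Python sketch is given here; the record fields (`title`, `doi`), the matching rule and the sample records are assumptions made for this example and are not taken from the review protocol.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation/whitespace so near-identical copies match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each DOI, or for each normalized title when no DOI exists."""
    seen = set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        key = ("doi", doi) if doi else ("title", normalize_title(rec["title"]))
        if key in seen:
            continue  # already have a copy of this citation
        seen.add(key)
        unique.append(rec)
    return unique


# Hypothetical records exported from two databases (not real citations).
records = [
    {"title": "Bioretention Cell Hydrology", "doi": "10.1000/example.1"},
    {"title": "Bioretention cell hydrology.", "doi": "10.1000/EXAMPLE.1"},
    {"title": "Rain Garden Nutrient Removal", "doi": None},
    {"title": "Rain garden  nutrient removal", "doi": ""},
]
unique = deduplicate(records)
print(len(records) - len(unique), "duplicates removed;", len(unique), "records kept")

# Counts reported in the text: 8095 citations downloaded, 4360 removed as duplicates.
remaining_after_dedup = 8095 - 4360
print(remaining_after_dedup, "citations remained for screening")
```

A rule this simple will miss a duplicate pair in which only one copy carries a DOI; reference managers typically combine several fields for that reason.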

Inclusion and exclusion criteria

Criterion: Must meet the definition of a bioretention cell, even if it is referred to by another name
Inclusion: A bioretention cell was defined as a site-specific water quality and water quantity control device, containing vegetation and engineered soil media, and receiving urban stormwater runoff. Stormwater systems defined as biofiltration, bioinfiltration, bioswale, were included if the description of the system matched the definition of a bioretention cell given above.
Exclusion: Drainage systems that were not bioretention cells. Any stormwater systems or methods that function similarly to a bioretention cell, but were not designed as such, were also not included, such as a vegetated drainage ditch in high-infiltration soils, or a clogged infiltration basin that vegetated naturally.

Criterion: Bioretention cell must treat urban stormwater runoff
Inclusion: The bioretention cell(s) being studied must have received urban stormwater runoff, or an approximation of urban stormwater runoff (e.g., simulated runoff).
Exclusion: Any stormwater system that treated a type of water other than urban stormwater (e.g., wastewater, agricultural runoff).

Criterion: Independently assesses bioretention
Inclusion: Only studies that assessed a bioretention cell independently. Assessment primarily concerned hydrologic and/or contaminant transport and fate.
Exclusion: Studies that evaluated a bioretention cell in combination with other systems, and where the effect of the bioretention system on the measurements taken were not separated, e.g., studies where the data was collected at the river/stream/watershed level, and where the specific effect of the bioretention cell(s) could not be separated from other variables such as land-use practices. Studies were also excluded if the objective was to assess where to place LID/GI measures and did not include any performance data of bioretention.

Criterion: Type of study
Inclusion: Studies that evaluated bioretention cells in the field or through a conceptual model. A field study was defined as a study where the bioretention cell(s) being studied was (were) built into the ground in an outdoor setting. Conceptual models incorporated an element of computer simulation in the study.
Exclusion: Studies that evaluated bioretention cells in highly controlled environments, such as a lab or a mesocosm. "Case studies" or any type of article that did not provide metrics or criteria for evaluating bioretention cells.

Criterion: Types of publication
Inclusion: Only peer-reviewed journal articles that generated original research findings.
Exclusion: Conference proceedings, theses, review articles.

Criterion: Language
Inclusion: Only articles published in English.
Exclusion: Articles written in all other languages than English.

Criterion: Accessibility of full-text publications
Inclusion: Publications required full text articles.
Exclusion: Articles that could not be accessed using institutional access or direct correspondence with authors or abstracts without full-text articles.

Data Extraction
Each article was assigned one reviewer for data extraction. To ensure consistency among reviewers, 18 out of 320 articles (5.6%) were extracted by two reviewers. The extracted data included study information such as who was performing the research, the location and specific design parameters of the bioretention cell being studied, how the study was conducted, and the type of results that were reported. The data extraction template, including explanations for each data extraction field, is included in Supplementary Information File S3. The extracted data are also included as a downloadable spreadsheet in Supplementary Information File S4. Data extraction included four broad categories ( Figure 2). The publication information within the study characteristics category was automatically extracted from the citation metadata by Endnote. The author affiliation (academic, government, consulting, or other) was extracted from the listed affiliation of the first author at the time of publication. A collaboration was defined as a publication in which the authors had different affiliation types. Information about the methodology, study aim and definitions was extracted from the full-text, such as information on the duration of the study, its stated aims, etc.
Water 2020, 12, 3122
The system characteristics category compiled a field bioretention system's location, dimensions and configuration (e.g., presence of an underdrain). Data were extracted by the reviewers and analyzed either in Excel or with GIS software [17,18]. Bioretention cell locations were charted with data on their administrative area [19] and Köppen-Geiger climate zone [20]. Characteristics were recorded on a per-cell basis; as field sites frequently include multiple bioretention cells, the denominator when reporting was much higher than the number of studies (602 cells for which data were extracted, across 320 studies). Some field locations included multiple systems, while others were researched multiple times as the system aged. A field location that was studied by five different researchers or included five independent bioretention cells generated the equivalent amount of academic output. Thus, the variable "cell-times studied" was defined as the sum of the number of times a specific bioretention cell was investigated across all studies.
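The "cell-times studied" variable can be illustrated with a short Python sketch. The (study, cell) pairs below are hypothetical, and treating each unique pair as one count is an assumption about how repeat mentions within a single study would be handled.

```python
from collections import Counter


def cell_times_studied(observations):
    """Return (number of unique cells, total cell-times studied, per-cell counts).

    `observations` is an iterable of (study_id, cell_id) pairs; a cell that
    appears in several studies is counted once per study.
    """
    pairs = set(observations)  # a given study counts a given cell only once
    per_cell = Counter(cell for _, cell in pairs)
    return len(per_cell), sum(per_cell.values()), per_cell


# Hypothetical extraction records: cell_1 was investigated in three studies.
observations = [
    ("study_A", "cell_1"),
    ("study_A", "cell_2"),
    ("study_B", "cell_1"),
    ("study_C", "cell_1"),
    ("study_C", "cell_3"),
]
n_cells, n_cell_times, per_cell = cell_times_studied(observations)
print(n_cells, "unique cells;", n_cell_times, "cell-times studied")
```

Under this definition, one cell studied five times and five cells each studied once both contribute five cell-times.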
The model characteristics category compiled information on the models used, the processes the models represented and their computational basis. The majority of data in the model characteristics category were extracted by one reviewer experienced in modeling, subject to the same 5% overlap with other reviewers for quality assurance and control.
The key findings category compiled the results presented in the studies, including whether the study focused on hydrology, contaminant transport and fate processes or other types of findings, how the authors defined "performance", the protocols used for lab and field analysis, etc. The specific contaminants under investigation were tracked and classified as metals, nitrogen, phosphorus, total suspended solids (TSS), organic contaminants (e.g., polyaromatic hydrocarbons), pathogens (e.g., E. coli, viruses), and/or general. The "General" category included measurements such as pH, chemical oxygen demand (COD), temperature, chloride ion concentration and electrical conductivity. A study was identified as using an "established protocol" whenever the methodology referenced a prior researcher or standard for how to report or calculate a parameter. A "researcher-developed protocol" was defined as reporting methods or calculations that were shown but did not reference any precedents or standards. "Black-Box" studies were defined as those that did not specify pathways and only measured results at the inlet and outlet.

Study Characteristics
Bioretention research in academia has grown steadily since 2000. The Prince George's County Maryland Low-Impact Design Manual [21], published in 1993, is the first known guidance document that describes a bioretention system. The first known academic research on bioretention was presented at the American Society of Civil Engineers' Annual Water Resources Planning and Management Conference in 1999 [22]. As shown in Figure 3
The literature has been dominated by academics in engineering, with 82% of authors (263 out of 320) having their discipline in civil, environmental, chemical, or biological engineering. All articles published before 2006 were authored by researchers in engineering disciplines. Since then, the field has diversified, with an average of 12% of articles published between 2006 and 2015 and 23% of articles published between 2016 and 2019 having authors in fields other than engineering such as environmental science, chemistry, horticultural science, and more. More than 90% of publications (296 out of 320) had a first author affiliated with an academic institution, usually within civil engineering. Only 3% of publications included an author from a government agency and 3% of publications included an author from the private sector. Less than 20% of publications with a first author from an academic institution also included an author from outside of academia, with only 15% (47/320) publications
Although 163 different institutions led research in bioretention field or model performance (defined by the home institution of the lead author), research has mostly originated from a few institutions in clusters in the northeastern United States, eastern Australia, and more recently, China. Notable centres of high research output include North Carolina State University (USA), the University of Maryland (USA), Monash University (Australia), Villanova University (USA) and Xi'an University of Technology (China). These five institutions have each published more than ten peer-reviewed studies and account for 24% (77/320) of reviewed articles. Twenty-six institutions, publishing more than three studies apiece, accounted for 50% (159/320). Universities in the northeastern United States were the dominant publishers from 2000 to 2019, with no single other geographic area accounting for significant research in 2000-2005. Australian and Chinese research clusters appeared in the 2006-2010 and the 2011-2019 periods, respectively, and these countries accounted for the second- and third-most publications overall, after researchers from the United States.
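A concentration figure of this kind (how few institutions account for a given share of publications) can be computed by ranking institutions by output and accumulating counts. The sketch below is illustrative only: the function name and the per-institution counts are hypothetical, and only the two ratios at the end come from the text.

```python
def institutions_for_share(pub_counts, share=0.5):
    """Smallest number of top-publishing institutions whose combined output
    reaches `share` of all publications."""
    total = sum(pub_counts.values())
    running = 0
    ranked = sorted(pub_counts.values(), reverse=True)
    for n, count in enumerate(ranked, start=1):
        running += count
        if running >= share * total:
            return n
    return len(pub_counts)


# Hypothetical publication counts by lead institution.
pub_counts = {"Univ A": 8, "Univ B": 5, "Univ C": 3, "Univ D": 2, "Univ E": 2}
n_half = institutions_for_share(pub_counts, share=0.5)
print(n_half, "institutions account for at least half of the publications")

# Shares quoted in the text, recomputed from the reported counts.
top_five_pct = round(100 * 77 / 320)       # five most prolific institutions
top_26_pct = round(100 * 159 / 320)        # twenty-six institutions
print(top_five_pct, top_26_pct)
```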

Bioretention System Location
The location of a bioretention cell was a key variable influencing both the design of and results from bioretention field studies. The 237 field articles in our dataset examined a total of 953 bioretention systems across 457 unique locations (some locations contained more than one cell). These 953 cells were studied an average of 1.6 times each, with 44% reported in more than one study. Overall, performance data were published from a total of 1289 cell-times studied. Studied bioretention cells were located in 19 countries from every populated continent except for Africa (Figure 4).
Bioretention field research originated in the United States and was then rapidly adopted in Australia; these two countries accounted for the majority of bioretention field research output with 59% (727/1289) and 22% (310/1289) of cell-times studied, respectively (Figure 4). Other early adopters were Canada and countries in Western Europe, which accounted for 3% (37/1289) and 8% (108/1289) of cell-times studied, respectively. Beginning in 2015, the dominance of Australia and the United States dropped from >90% of cell-times studied before 2016 to approximately 65% afterwards. Much of this change was due to a rapid increase in research originating in China, which accounted for 8% (103/1289) of cell-times studied overall but 15% (98/668) after 2016.
Unsurprisingly, field studies have tended to be conducted in cities close to lead research institutions. American field research was clustered on the Eastern Seaboard from Washington D.C. to Middlesex, Connecticut [23][24][25][26], while there are three Australian clusters centered close to or within Melbourne [27][28][29], Brisbane [27,28,30], and Sydney [28,31]. The rapid increase in research from China nearly all came from the province of Shaanxi [32][33][34][35], which developed as a cluster after 2015. The median distance between a bioretention cell and a research institution was 18 km. Most studies, i.e., 90%, were within 230 km of a research institution, and 75% were within 60 km.
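The text does not state how the distances between cells and institutions were measured. As one plausible approach, the sketch below uses the haversine great-circle distance and then summarizes a list of distances by its median and by the share falling within a limit. The distance values are hypothetical, and the great-circle method is an assumption, not the authors' documented procedure.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def share_within(distances_km, limit_km):
    """Fraction of distances that are at most `limit_km`."""
    return sum(d <= limit_km for d in distances_km) / len(distances_km)


# Hypothetical cell-to-institution distances (km), for illustration only.
distances_km = [5, 12, 18, 40, 300]
median_km = statistics.median(distances_km)
within_60 = share_within(distances_km, 60)
print("median:", median_km, "km; share within 60 km:", within_60)

# One degree of longitude along the equator is roughly 111 km.
one_degree_km = haversine_km(0.0, 0.0, 0.0, 1.0)
```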
Generally, we observed that the spread of bioretention field research between countries followed a pattern where investigation first occurred in institutions and areas of similar climate to established research clusters and then spread to institutions and areas in more diverse climate zones. The clearest example of this is in Australia, where between 2006 and 2015, all 127 of the cell-times studied were in temperate climate zones without a dry season (Cfa or Cfb), the same climate zone as the original American cluster on the Eastern Seaboard, but after 2016, 12% (22/183) were in regions with a dry season (Csa or Csb). In China, a similar pattern has emerged as most of the first (2014-2015) domestically published results on full-scale bioretention cells were in temperate climate zones around Guangdong and Hong Kong. At the same time, the now-larger research cluster near Xi'an University first built several pilot and prototype-scale systems before expanding to investigations of full-scale systems and publishing research findings in 2017. One exception to this pattern is a bioretention cell that was studied in Beijing [35] (located on the border of the temperate and arid-steppe climate zones) in 2014, which supports the contention that research locations are being driven by proximity to institutions as well as by the climate. More recent research in tropical climate zones has explicitly noted the lack of research on bioretention in tropical regions as a barrier to implementation [36]. These regions face climate-related challenges such as high rainfall intensity and long dry seasons. Interestingly, the first bioretention cells in our dataset from Brazil were from regions with a Cfa climate zone, for which there has been a significant amount of research in Australia, while more recently Brazilian research has been performed in an As (tropical savanna, dry summer) climate zone. Researchers from Colombia have also reported good results in translating bioretention practices to tropical conditions [37], indicating that a community of practice is developing for bioretention research in tropical areas. Two publications [38,39] from Ferdowsi University (Iran), in collaboration with researchers from Australia, provide examples of knowledge transfer between two geographically separated arid regions.
The rise of bioretention research has often followed environmental policy within the country of research institutions. We found either guidelines on LID technologies or governmental policies promoting their use [2,40–47] in all countries with studied field bioretention cells. However, the existence of policy alone was insufficient in spurring or accelerating bioretention research. For example, the US EPA Clean Water Act established regulations targeting non-point source pollution in 1992 [48], and guidelines and research for Low Impact Development followed suit [21,49]. These guidelines were followed by bioretention research starting in the year 2000. Similarly, in Australia, Water Sensitive Urban Drainage came into usage in the 1990s to protect water quality and conserve water [50]; however, research in Australia only began to be published after 2005. Several studies and guidelines on source control of urban drainage were published in European countries between 1990 and 2006 [42]. The expansion of publications on bioretention in China in 2015-2018 closely follows the Sponge City initiatives launched by the Chinese government in response to increased surface water flooding in rapidly urbanizing areas [2]. However, in the UK, no studies have followed from policy and guidelines that were published in 2007 and 2015 for stormwater urban drainage systems [1]. Similarly, in Brazil, a Sustainable Urban Drainage Manual was published in 2004 [51] and further legislation passed in 2007 [45], but output in English-language journals only began in 2017.
Overall, we observed that the locations where bioretention field research occurred depended on policy, climate similar to that of established research clusters, and the presence of research institutions with prolific researchers in the field. All but three of the 19 countries in which bioretention cells were located in our study are members of the Organisation for Economic Co-operation and Development (OECD), a group of mostly high-income countries [52], indicating a gap in applications of bioretention for less affluent countries. Although less well studied than in Global North countries, water scarcity and polluted water supplies caused by untreated stormwater are identified problems in South Africa [53,54] and in India [55], and in both regions a combination of urbanization and climate change is further threatening water supplies [56,57]; these are all problems that bioretention can help to ameliorate. As mentioned in Section 2.2, this scoping review only included research published in English, so any articles published in regional journals in other languages are not captured in the reviewed literature. This may explain the lack of results from South and Central America, where there were only eight cell-times studied from Chile (1) [58], Colombia (1) [37] and Brazil (6) [59], and the absence of studied cells in Africa, Eastern Europe, the Middle East, Central Asia and South Asia.

Terminology and Definitions
Bioretention systems have been called many different names in the literature, including bioretention, biofilter, bioinfiltration, bioswale, and rain garden. Sixty percent of articles (214/320) used more than one term to define the system, though bioretention was often the first mentioned term. While these terms are used interchangeably in most cases, they can also refer to other water treatment technologies that incorporate filtration through natural media. "Bioretention" is the preferred term for the technology by most researchers globally, as 70% of the publications in this scoping review (224/320) used the term "bioretention" and only 30% of publications used terms such as bioswale, biofilter, bioinfiltration, and rain garden. In 32% of the articles (102/320), the authors defined the bioretention system themselves (i.e., no citation included). Davis, Hunt (two north-eastern American researchers) and Prince George's County are the next most cited sources for definitions, at 14% (46/320), 7% (22/320) and 4% (14/320), respectively, again highlighting the influence this region has had on the field of study. Though solutions to stormwater issues generally evolve locally [42], there is some international agreement on the use of the term "bioretention", suggesting an international community is coalescing around this shared technology.
The choice and frequency of words used within the literature provided insight into how performance is defined and assessed by researchers. Bioretention performance was defined by authors in terms of hydrology (e.g., reducing runoff volume and peak flow) and reduction of pollutants (also often referred to as treatment of water quality). In definitions of performance, "runoff" was the most frequently used word (102 instances), and "pollutant" and "reduce/reducing" were the next most frequently used words, with 77 and 70 instances, respectively. Other frequently used words in the definition of performance include: water quality, peak flow, volume, infiltration, filtration and urban.
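A word-frequency tally of this kind can be sketched in a few lines of Python. The stopword list and the three sample definitions below are invented for illustration and are not drawn from the reviewed articles. The sketch also counts exact word forms only, so variants such as "reduce" and "reducing" would need to be grouped separately.

```python
import re
from collections import Counter

# Minimal stopword list chosen for this example only.
STOPWORDS = {"the", "and", "of", "to", "a", "in", "by", "is", "as", "from"}


def word_frequencies(definitions):
    """Count how often each non-stopword appears across a list of text definitions."""
    counts = Counter()
    for text in definitions:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts


# Hypothetical performance definitions.
definitions = [
    "Performance is the reduction of runoff volume and peak flow.",
    "Performance is defined as pollutant removal from runoff.",
    "The ability to reduce runoff and pollutant loads.",
]
freq = word_frequencies(definitions)
print(freq.most_common(3))
```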

Physical Characteristics
A schematic of a typical bioretention cell composed of the most frequently reported components is shown in Figure 5. The most commonly identified aspects of a bioretention cell (i.e., vegetation, organic matter, media and underdrains) aligned well with system descriptions provided in early publications [60,61] and guidelines [21]. Bioretention components that directly affect the system's hydrologic performance (e.g., media, underdrain) were often described in more detail than components that affect the system's biological or ecological performance, such as plant species names, plant traits and selection of vegetation. Although bioretention cells are almost always described as vegetated, researchers tend to have a limited understanding of the impact of vegetation on performance.
The engineered soil media in the bioretention cell is the primary element that determines its hydrologic performance. Researchers typically followed local or commonly cited guidelines in selecting media. The high proportion of sand in the bioretention soil mix, as well as the inclusion of organic matter, was reflective of design guidelines [21,62] that recommend primarily sandy materials and low clay content for bioretention media. Fifty-four bioretention cells studied in 29 field studies incorporated a novel amendment in their soil mix, which is commonly used to enhance the hydraulic and/or chemical characteristics. For example, zeolite and granular activated carbon have been studied for metal sorption [63], alumina or ferric oxide based water treatment residuals for phosphorus sorption [64,65], and fly-ash for microbial and heavy metal sorption [66].
The presence of vegetation was frequently reported, but little detail was given on vegetation characteristics or the role vegetation plays in achieving hydrologic or ecological performance objectives. Overall, the vast majority of papers did not identify plants according to their scientific names or have any discussion of specific plant traits (see Table 2 for more details).
Bioretention research was often conducted on single systems, including ones purpose-built for research or demonstration. Fifty-five articles (out of 320 studied, 17%) published results from one bioretention cell, 37 articles (12%) using results from two bioretention cells, and 21 articles (7%) from three cells. Fifty-three articles (17%) did not provide a clear indication of how many cells were installed, while 81 studies (57%) with a modelling component either based their models on theoretical bioretention cells or compared their models to catchments without bioretention. Aside from one modelling study and one life-cycle assessment study (classified as Study Type: "Other"), all journal articles that assessed more than 20 cells were field surveys, which generally studied the long-term performance of existing cells. While the greatest number of cells studied within one article was 78 [67], the greatest number of cells studied in pilot-scale and full-scale field studies were 18 [68] and 12 [69,70], respectively. The engineered soil media in the bioretention cell is the primary element that determines its hydrologic performance. Researchers typically followed local or commonly cited guidelines in selecting media. The high proportion of sand in the bioretention soil mix, as well as the inclusion of organic matter, was reflective of design guidelines [21,62] that recommend primarily sandy materials and low clay content for bioretention media. Fifty-four bioretention cells studied in 29 field studies incorporated a novel amendment in their soil mix, which is commonly used to enhance the hydraulic and/or chemical characteristics. For example, zeolite and granular activated carbon have been studied for metal sorption [63], alumina or ferric oxide based water treatment residuals for phosphorus sorption [64,65], and fly-ash for microbial and heavy metal sorption [66].
Bioretention research was often conducted on single systems, including ones purpose-built for research or demonstration. Fifty-five articles (out of 320 studied, 17%) published results from one bioretention cell, 37 articles (12%) used results from two bioretention cells, and 21 articles (7%) from three cells. Fifty-three articles (17%) did not provide a clear indication of how many cells were installed, while 81 studies (57%) with a modelling component either based their models on theoretical bioretention cells or compared their models to catchments without bioretention. Aside from one modelling study and one life-cycle assessment study (classified as Study Type: "Other"), all journal articles that assessed more than 20 cells were field surveys, which generally studied the long-term performance of existing cells. While the greatest number of cells studied within one article was 78 [67], the greatest numbers of cells studied in pilot-scale and full-scale field studies were 18 [68] and 12 [69,70], respectively.
The studied bioretention cells were predominantly small systems, receiving runoff from a small drainage area, and not often connected to other bioretention cells or other stormwater management technologies. Most bioretention cells (299/329 reported) were not part of a treatment train where two or more stormwater management practices were placed in series with each other. The system size was defined as the ground footprint of the bioretention cell at the bottom of the ponding area. The system size distribution was skewed towards small systems (see Figure 6a), with cells less than 5 m² accounting for 25% of all cells that reported a system size (103/413). Similarly, the catchment areas supplying stormwater to these systems were also skewed towards small areas, with 65% of reported catchment areas (214/253) being less than 0.25 hectares, and areas > 500 m² being the most commonly represented size, as shown in Figure 6b.
Based on the reported system sizes and catchment areas, the ratio of drainage area to system size was calculated, with the histogram of the ratios shown in Figure 6c.
Although our study's exclusion criteria may have partially contributed to the low number of bioretention systems incorporated in treatment trains (see Table 1 and criteria "independently assesses bioretention"), the lack of data on bioretention as part of a stormwater treatment train appears to be a significant gap in the literature, as we were careful to include all results where the performance of the bioretention cell could be independently assessed. More information and study are needed to improve our understanding of how to effectively combine bioretention systems with additional stormwater management technologies to achieve the full spectrum of hydrologic targets.

Bioretention Modelling
Models are an essential tool to overcome the site-specific limitations of field-based research and to allow researchers to investigate a broader spectrum of questions than can be answered through measurement alone. Models are used to transfer results from the lab to the field scale and to allow researchers to understand often-complex internal dynamic processes involved in the movement of water, fate, transport and retention of contaminants within a bioretention cell. Overall, 41% (130/320) of studies had a model component, of which 24% (31/130) contained both model and field results. Considering different versions of a given model together (e.g., grouping a two- and a one-dimensional version of the same model), 14 models were used more than once and 53 models were used a single time in a total of 141 model instances (defined as independent model parameterizations) (Figure 7). The number of model instances has increased rapidly from a single publication with a model component in 2004 [71] to 24 model instances in 2019.
As with field-based research, modelling efforts have emphasized the hydrologic and hydraulic functions of bioretention over contaminant transport and fate. Only 25% (35/141) of modelling articles addressed contaminant transport and fate, while 75% (106/141) of modelling articles addressed bioretention hydrology or hydraulics. The US EPA's Storm Water Management Model (SWMM) was the most commonly used model, at 31% (44/141 model instances), followed by RECARGA at 7% (10/141) and HYDRUS-1D or 2D at 5% (7/141). Seven models were used three or more times in this study's dataset (Table 3).
Models were used to study a variety of different hydraulic and contaminant transport problems, and had varying levels of complexity from empirical equations describing bioretention as a "black-box" to fully mechanistic models describing bioretention contaminant transport and fate, or hydrology [72], using time- and space-dependent partial differential equations [35]. Mechanistic models were used in most studies that looked at specific processes in bioretention, representing 88% (104/118) and 61% (20/33) of mechanistically-explicit hydrologic and contaminant transport and fate studies, respectively.
The processes captured during data extraction can be divided into those dealing primarily with hydrology and those pertaining to contaminant transport and fate, and are shown graphically in Figure 8. The 141 model instances explicitly investigated 34 unique processes. The most commonly investigated processes were the fundamental hydrologic processes of a bioretention cell, including infiltration (87%, 122/141), overflow (72%, 102/141), and drainage through an underdrain (65%, 92/141). Evapotranspirative flows were considered by approximately half (49%, 69/141) of model instances, reinforcing the idea that plant dynamics are under-represented in bioretention field research.
General modelling best-practices would be to determine whether an existing tool could be used before developing a new tool. However, single-use models were common, representing 38% (53/141) of model instances. Single-use models were often used to investigate more specialized processes (e.g., filter cake formation [98], metal complexation [99]) that were not found in any of the more commonly applied models. To aid researchers in identifying applicable modelling tools for different applications, Table 3 shows the focuses and processes included in the most common modelling tools found in this study, and Supplementary Information File S4 Table S3 has a complete list of all models evaluated and the processes they included.
In some cases, it was difficult to tell which processes were considered, due to inconsistencies in reporting information regarding model parameter selection, calibration and evaluation. More rigorous attention is needed when describing and reporting model development and evaluation to ensure the transparency and reproducibility of modelled results. Moreover, publicly available models should be used over developing custom-built models, if the current tools adequately describe the processes of interest, and open-source models should be preferred to those with proprietary software, as open-source models provide full transparency by allowing users to access and edit the code. The US EPA SWMM program is particularly noteworthy in this respect since it is widely used, user-friendly and completely open-source.

Temporal Characteristics
Long-term performance of established systems is inadequately examined in the existing bioretention literature. Most field studies monitored fewer than 20 rain events on systems that had been operational for less than two years. As shown in Figure 9a, 72% (135/187) of studies that reported a duration lasted 0-2 years, with 6 months to 1 year being the most common duration. The distribution of the number of events observed per publication is shown in Figure 9b. More than half of the studies (55% or 87/154) that reported the number of events monitored collected data from fewer than 20 rainfall events. Overall, only 8% of studies (13/154) included more than 100 events. The highest number of storms studied in a publication was 966 [100].
The operational life of bioretention cells has been estimated to be anywhere between 35 and 50 years [101,102], so there is a need for a more critical examination of the performance of mature (>10 years) systems. Of the 320 articles included in this review, 17 studies examined bioretention cells at 10 or more years of maturity. The oldest bioretention was 34 years old and was assessed as part of a field survey studying the impacts of heavy metal concentration in topsoil [103]. Seventy-one percent (239/370) of the bioretention cells reported a system age of less than 5 years (see Figure 9c). Ninety-one sites were less than one year old, suggesting that they were purpose-built for the study or that the study took advantage of new construction to gather data. Older bioretention cells were often studied as part of field surveys, where data was collected over a short period of time (usually at a single point in time) across many bioretention cells.
Studying mature bioretention systems can be difficult for a range of reasons, for example, lack of access, as well as land-use and property ownership changes. Modelling tools are one way to overcome these obstacles. Models can be built based on detailed results taken from a few bioretention cells, and then evaluated through their application in other systems. Models can also use field data to contribute to long term or continuous simulations to determine long term performance changes.
The plurality of modelled studies relied upon event-based analysis (38%, 54/141) or modelled over the seasonal or annual scales (37%, 53/141), with only 14% (20/141) investigating multi-annual processes. Wadzuk, Lewellyn, Lee and Traver [100] had the longest modelling study duration of 12 years, while the longest studies in a field setting were 7 years long and were performed by Guo, et al. [104] and Komlos and Traver [105]. Komlos and Traver's study was also notable in that it provided a rare combination of a relatively mature system (greater than 2 years maturity at the start of the study) studied for a long time (7 years) [105].
Many modelling tools enabled practitioners to explore system processes at different spatial scales, which could be difficult to evaluate using field monitoring programs alone. Most modelling work focused solely on the bioretention system (53%, 75/141) or catchment (23%, 32/141) scale; 20% (28/141) of model instances explicitly modelled both the bioretention system and the catchment together, and 4.2% (6/141) modelled aquifer or subsurface hydrologic response to a bioretention cell.

Processes and Results
Data extracted on the key findings of each article provided insight into the metrics used to assess the performance of bioretention cells. In this section, the data were classified into four categories: hydrologic processes and metrics, contaminant transport and fate processes and metrics, the methodology for results reporting, and other key findings. Hydrologic or water quantity processes (327 publications) were significantly better represented in published results than water quality or contaminant transport and fate processes (185 publications). Collection of water quality data is more expensive and time-consuming than collection of hydrologic data, which likely contributes to the dominance of hydrologic studies in the field. Within the modelling research, the hydrology of bioretention systems was also better understood than the processes of contaminant transport and fate.

Hydrologic Processes and Metrics
Studied or modelled bioretention water balances typically included infiltration, drainage and overflow but often ignored or omitted evapotranspiration. Twenty-five percent of field studies reported on infiltration (89 out of 356 pathways reported), 10% on drainage (37/356) and 13% on overflow (46/356), representing the movement of water through the media, through the underdrain, and overtop of the ponding zone, respectively. Only 8% of field studies reported or estimated evapotranspiration rates from the studied bioretention system (27/356). Field researchers occasionally measured evapotranspiration via weighing lysimeter [106-108] but otherwise estimated evapotranspiration through water balances or evapotranspiration models [25,109-111].
Similarly to field research, hydrological modelling processes included infiltration (92%, 98/106), overflow (79%, 84/106), drainage (72%, 76/106), and evapotranspiration (58%, 61/106). Of the water quantity models that neglected to model infiltration, three [112-114] were fully empirical and so implicitly considered infiltration, one [115] considered only detention storage and overflow from bioretention systems and two were from a paper that only modelled evapotranspirative flows [116]. Models that did not consider overflow investigated the dynamics of water movement within the subsoil only (e.g., the model instances of R2D and Hydrus-2D from Aravena and Dussaillant [58]), while those that did not consider underdrain drainage simulated full exfiltration bioretention systems. Evapotranspiration is commonly neglected in event-scale models, with the implicit or explicit assumption that evapotranspiration losses are negligible during and after a single event.
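As a rough illustration of how evapotranspiration can be estimated from a water balance, the sketch below treats evapotranspiration as the residual of the other measured terms. It is a minimal sketch under stated assumptions and is not the method of any cited study: the function name, the argument names, and the choice to include direct rainfall, exfiltration to native soil and a storage-change term are all hypothetical choices made for illustration.

```python
def et_by_water_balance(inflow, rainfall, drainage, overflow,
                        exfiltration, storage_change=0.0):
    """Estimate evapotranspiration as the residual of a water balance.

    All terms are volumes in the same unit (e.g. m3) over the same period.
    Hypothetical helper for illustration; the term names are assumptions.
    """
    inputs = inflow + rainfall                    # water entering the cell
    outputs = drainage + overflow + exfiltration  # measured losses
    # Whatever is not accounted for by outputs or storage is assigned to ET.
    return inputs - outputs - storage_change


# Illustrative numbers only (not taken from any study).
et = et_by_water_balance(inflow=10.0, rainfall=1.0, drainage=2.5,
                         overflow=1.0, exfiltration=6.0, storage_change=0.5)
print(f"Estimated ET: {et:.2f} m3")
```

Because evapotranspiration is computed as a residual, measurement errors in every other term accumulate in the estimate, which is one reason direct measurement (e.g., by lysimeter) may be preferred where feasible.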
Volume reduction and peak flow reduction were the most commonly used hydrologic metrics (see Table 4), but there was a lack of consistency in how volume reduction was calculated. Davis [117] defined the volume discharge ratio (outflow divided by inflow) and suggested plotting it against exceedance probability to evaluate bioretention performance. While this approach has been employed by a few authors in the United States [109,118,119], most report volume reduction as a percentage change, or (inflow-outflow) divided by inflow (e.g., [104,120-122]), with no associated criteria for acceptable percent change. Consequently, synthesizing results from multiple papers in a systematic review to evaluate performance against criteria would require transforming the data to a common basis. A black-box approach, i.e., reporting flow and volume reductions between inlet and outlet only, was employed in 28% of field studies (62/221) and used exclusively (i.e., no other pathways reported) in 20% (44/221) of field studies.
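The two reporting conventions above are algebraically related, so results can be converted between them when both inflow and outflow volumes refer to the same event. The following is a minimal sketch of that transformation; the function names are hypothetical and the example volumes are illustrative only.

```python
def volume_discharge_ratio(v_in, v_out):
    """Volume discharge ratio: outflow volume divided by inflow volume."""
    if v_in <= 0:
        raise ValueError("inflow volume must be positive")
    return v_out / v_in


def volume_reduction_percent(v_in, v_out):
    """Volume reduction as a percentage: (inflow - outflow) / inflow * 100."""
    if v_in <= 0:
        raise ValueError("inflow volume must be positive")
    return 100.0 * (v_in - v_out) / v_in


def reduction_percent_from_ratio(ratio):
    """Convert a volume discharge ratio to a percent volume reduction."""
    return 100.0 * (1.0 - ratio)


# Illustrative event: 20 m3 in, 5 m3 out (made-up values).
ratio = volume_discharge_ratio(20.0, 5.0)
reduction = volume_reduction_percent(20.0, 5.0)
print(ratio, reduction, reduction_percent_from_ratio(ratio))
```

The conversion only holds event by event; averages of ratios across events are not generally equal to the ratio of total volumes, so the aggregation method would still need to be reported.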

Contaminant Transport and Fate, Processes and Metrics
The most commonly used performance metrics for contaminant transport and fate were effluent concentration and mass loading, followed by calculated metrics such as a reduction in concentration, reduction in mass, and comparisons with regulatory criteria or water quality standards (Table 5). Field research has moved from primarily looking at the bioretention system as a black-box (defined here as looking solely at the inlet and outlet contaminant concentration or mass) to distinguishing the pathways by which contaminants are reduced. More studies have recently separated pathways into degradation, sorption, and loss to infiltration, as shown in Figure 10. Advances in measurement technology may account for the increase in field studies that examine removal pathways, such as Chen, et al. [125], who studied the precise location of nitrification and denitrification via DNA extraction from soil samples.
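To make the distinction between concentration-based and mass-based metrics concrete, the sketch below computes a flow-weighted event mean concentration (EMC), a mass load, and the corresponding percent removals from paired concentration and volume samples. It is a minimal sketch, assuming discrete samples that each represent a known flow volume; the function names, units (mg/L and L) and sample values are hypothetical.

```python
def mass_load(concentrations, volumes):
    """Total mass (mg) from paired concentrations (mg/L) and volumes (L)."""
    if len(concentrations) != len(volumes):
        raise ValueError("concentrations and volumes must have equal length")
    return sum(c * v for c, v in zip(concentrations, volumes))


def event_mean_concentration(concentrations, volumes):
    """Flow-weighted mean concentration (mg/L): total mass / total volume."""
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return mass_load(concentrations, volumes) / total_volume


def percent_removal(value_in, value_out):
    """Percent reduction between an inlet and an outlet value."""
    return 100.0 * (value_in - value_out) / value_in


# Made-up samples: the cell retains water, so outflow volume is smaller.
c_in, v_in = [2.0, 4.0], [100.0, 300.0]
c_out, v_out = [1.0, 1.0], [100.0, 100.0]

emc_in = event_mean_concentration(c_in, v_in)
emc_out = event_mean_concentration(c_out, v_out)
removal_by_concentration = percent_removal(emc_in, emc_out)
removal_by_mass = percent_removal(mass_load(c_in, v_in), mass_load(c_out, v_out))
print(round(removal_by_concentration, 1), round(removal_by_mass, 1))
```

In this made-up example the mass-based removal is higher than the concentration-based removal because part of the inflow volume never reaches the outlet, which illustrates why the two metrics are not interchangeable when results are compared across studies.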

Table 4. Performance metrics used for evaluating hydrologic and contaminant transport and fate performance in bioretention cells. The percentage of total metrics gives the number of times each metric was used divided by the total number of metrics used in all hydrology studies (372).

Performance Metric: Volume Reduction ($V_R$)
Description: Reduction in effluent volume between the inlet and the outlet, typically for the course of a storm event or 24-h period, or monthly or annually. Used as a regulatory metric in some jurisdictions.
Limitations: Does not discriminate between removal pathways. In some cases the time period for reduction or the reduction metric is not clearly defined.
% of Total Metrics: (129/372)

Performance Metric: Peak Flow Reduction ($R_p$)
Description: Reduction in peak flow caused by the bioretention cell, typically over the course of a storm event. Used as a regulatory metric in some jurisdictions.
Limitations: Does not give information on total volumes. Over time it has been realized that many of the adverse outcomes from high peak flows are more associated with the total volume than the flow [123], so this metric is generally being phased out in favour of volume-based metrics in a regulatory context [2,42,47,124].
% of Total Metrics: (60/372)

Performance Metric: Flow Rate ($Q$, m³ s⁻¹)
Description: Flow rate of the system. Often reported in comparison to a threshold flow rate determined by regulators.
Formula/Data Requirements: Requires flow rate monitoring in at least one point in the bioretention cell.
Limitations: Non-normalized value, and therefore difficult to compare across sites.
% of Total Metrics: 16% (58/372)

Performance Metric: Hydraulic Conductivity ($k$, m s⁻¹)
Description: Hydraulic conductivity of soil. Often given as the saturated conductivity ($k_{sat}$). Design manuals will frequently have the acceptable $k_{sat}$ of the engineered media specified.
Formula/Data Requirements: Varies based on monitoring equipment and methodology, but generally follows Darcy's Law, $K = J/i$.
Limitations: Highly variable spatially and temporally, so not always comparable between cells or in different places in the same cell. Hydraulic conductivity of the native soil is important for determining overall efficacy along with the conductivity of the cell.

Performance Metric: Lag Time
Description: Change in the time of flow caused by the bioretention cell or other LID system.
Limitations: Inconsistent calculation or usage. The lag time may measure the delay in the timing of the peak flow between the inlet ($t_{qp,in}$) and the outlet ($t_{qp,out}$) or may measure the time between the start of inflow ($t_{1,in}$) and the start of outflow ($t_{1,out}$).
% of Total Metrics: (26/372)

Performance Metric: Drain-down Time ($T_D$, hr)
Description: Time it takes for the ponding zone in a bioretention cell to drain. Typically, a required maximum value is included in design manuals, to prevent leaving stagnant water.
Formula/Data Requirements: Requires water level and/or the effluent flow rate to be monitored.
Limitations: Inconsistently utilized by researchers. Often a threshold maximum value is given in a design manual, so attention is only paid if the drain-down time exceeds that threshold.
% of Total Metrics: (12/372)

Performance Metric: Hydraulic Retention Time ($\tau_H$)
Description: Amount of time water spends in the system. Frequently calculated at steady-state so the time period needs to be specified.
Limitations: The $\tau_H$ will change over the course of a storm event, so this metric may give erroneous results or be difficult to compare when different normalization times are used.
% of Total Metrics: (11/372)

Notes: $Q_P$ = Peak flow (m³/s).

Table 5. Performance metrics used for evaluating contaminant transport and fate performance in bioretention cells. The percentage of total metrics gives the number of times each metric was used divided by the total number of metrics used in all contaminant fate and transport studies (1580).

Performance Metric: Effluent Concentration ($C_{out}$)
Description: Concentration of the contaminant of interest in the effluent from the bioretention cell. We have also included general water quality parameters, such as temperature, in this category where they are measured at the effluent. Concentrations can also be expressed as the Event Mean Concentration (EMC), which is a flow-weighted concentration metric.

Performance Metric: Removal by Concentration
Description: Bulk reduction in contaminant concentration between the influent and the effluent.
Limitations: As with the effluent concentration, it is not always clear what pathways are considered in the effluent concentration ($C_{out}$). Additionally, the contaminant may be accumulated in the bioretention cell and released later or redirected towards groundwater. Concentrations also change across a hydrograph, so removal by concentration may not give a relevant result; if EMC are used this is less important.
% of Total Metrics: (393/1580)

Performance Metric: Removal by Mass or Summation of Loads ($R_M$)
Description: Bulk reduction in contaminant concentration between the influent and the effluent.
Limitations: It is not always clear what pathways are considered in removal, and whether this represents accumulation in the system, removal by hydraulic processes (such as infiltration), transformation or mineralization.

Performance Metric: Mass Loading
Description: Integral of the total mass entering or leaving the bioretention cell. Similar to the EMC but expressed as a mass rather than a concentration.
Formula/Data Requirements: $M_T = \int M_{in}\,dt$
Limitations: Shows mass loadings, but not removal quantities or processes. Typically used to support mass removal calculations.

Performance Metric: Comparison with Regulatory or Target Value
Description: A measure of when a metric is larger than the regulated or target value, typically on release from the system.
Formula/Data Requirements: Typically but not exclusively expressed on a concentration basis. Can also be used for hydrologic parameters. $C_{out} \geq C_{target}$
Limitations: When expressed on a concentration basis, the overall mass loading from the system can still be high even if the concentration remains low.
% of Total Metrics: (42/1580)

Performance Metric: Equilibrium Partitioning
Description: Measure of equilibrium partitioning between a contaminant and soil. Used to investigate accumulation and transport through porous media.
Limitations: Can be expressed using different isotherms depending on sorption dynamics. Desorption can follow a different isotherm.

The contaminants studied have remained relatively constant over time (Figure 11). In field studies, metals and nitrogen accounted for the largest share of results reported overall, at 25% (218/864) and 25% (215/864), respectively. Modelling studies focused most on nutrients (51%, 18/35), followed by suspended solids (34%, 12/35) and metals (29%, 10/35). Zinc, copper and lead were the three most commonly studied metals with 50, 46, and 33 results reported in studies with a field component (out of 218), respectively, whereas nitrate/nitrite, total nitrogen, and ammonia/ammonium were the three most commonly reported nitrogen species with 74, 52, and 39 (out of 218), respectively. Phosphorus, total suspended solids (TSS), water quality characteristics (noted as "General" in Figure 11, such as pH and conductivity) and organic contaminants each comprised 10-13% of field results reported. In modelling studies, organic contaminants represented ~5% (2/35) while the "General" characteristics were less frequently modelled. Pathogens comprised 4% (32/864) of field results and 5% (2/35) of model results reported, and all other types of results (e.g., temperature and ecological indicators) comprised 3% (27/864) of total field results reported. The focus on metals and nutrients reflects societal concerns with contaminants that are harmful to aquatic species (metals, solids), drinking water quality (nitrogen, pathogens), and surface water ecological status such as eutrophication (phosphorus).
The different target pollutants investigated also required modelling different processes. Most of the contaminant transport and fate models included processes that diverted water from the bioretention system's effluent through infiltration (69%, 24/35) or overflow (51%, 18/35) while models investigating other compounds with more complex fates, such as organic contaminants, considered sorption (34%, 12/35) and degradation (26%, 9/35). Models investigating chemical transport and fate often failed to account for the role of vegetation. Only one model instance [126] investigated plant uptake, while many either ignored plants entirely or only looked at their impact on hydrology through evapotranspiration. This indicates a gap among researchers evaluating bioretention performance, where the focus has largely been on hydraulic and hydrologic functions as opposed to ecologic and environmental details.
Unlike the hydrologic modelling work, where researchers have favoured established programs like SWMM, RECARGA or HYDRUS, no single model dominated the contaminant transport and fate modelling efforts. Versions of SWMM that investigated water quality issues were used in 17% (6/35) of model instances, the Model for Urban Stormwater Improvement Conceptualisation (MUSIC) was used in 11% (4/35) of model instances while MicroPollutants In RaingardEns (MPiRe) (the flow module of which is based on the one used in MUSIC) was used in 9% (3/35) of model instances, leaving most water quality modelling done with either custom-built models used to investigate specific processes or models used only by a single author. Depending on the research questions being asked, contaminant transport and fate models can fall on a continuum between black-box or empirical representation of contaminant removal [127] to process-based models based on the mechanistic understanding of chemical behaviour in a bioretention cell. A promising mechanistically-based model is MPiRe, which was developed for modelling micropollutants such as pesticides [72] but has since been applied to the more traditional water quality metric of faecal microorganisms.

Reporting of Results
In both field and modelling bioretention studies, the collection, analysis and reporting of results do not follow any standardized or established practices. The lack of standardization presents difficulties for meta-analyses or systematic reviews, as results from each study must be transformed to allow for comparison. In field studies, guidelines for reporting results do exist (e.g., the International Stormwater Best Management Practice Database [128] and the US EPA 2009 publication of guidelines for urban stormwater BMP monitoring [129]), but they have not been taken up widely. Similarly, in modelling, protocols for evaluating models exist [130], but these recommendations were rarely incorporated.
In contaminant transport and fate studies (see Figure 12a), researchers often used established protocols for ensuring the reproducibility of laboratory techniques used for water sampling in field studies [27,131,132]. In hydrologic studies (see Figure 12b), researcher-developed protocols (such as the calculation of volume reduction) were used exclusively until 2008. In 2009 and after, a few studies began using established protocols [100,124,133–141], which appears to correspond to the publication of the US EPA guidance on standardized reporting [129]. Standardization in collecting and reporting results for stormwater management technologies has been increasing, but there is no dominant guideline or standard that researchers are following.
In modelling practice, standardization and quality assurance can be achieved by evaluating models against measured or observed results and by sensitivity analysis [130]. The rigour used in evaluating models varied, however, and 45% (64/141) of studies did not include any quantitative evaluation of model accuracy or fit. More robust modelling exercises tended to use either the r² value or the Nash-Sutcliffe coefficient to compare measured and modeled results; together, these were used in 33% (47/141) of model instances.
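As a minimal sketch of the two fit statistics named above, the Python below computes the Nash-Sutcliffe coefficient and the squared Pearson correlation (r²) for paired observed and modelled series. The function names and the example values are illustrative assumptions and are not taken from any reviewed study.

```python
def nash_sutcliffe(observed, modelled):
    """Nash-Sutcliffe coefficient: 1 - sum((o - m)^2) / sum((o - mean(o))^2)."""
    if len(observed) != len(modelled) or len(observed) < 2:
        raise ValueError("need two paired series with at least two values")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed series has no variance")
    return 1.0 - residual / spread


def r_squared(observed, modelled):
    """Squared Pearson correlation between observed and modelled values."""
    if len(observed) != len(modelled) or len(observed) < 2:
        raise ValueError("need two paired series with at least two values")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_mod = sum(modelled) / n
    cov = sum((o - mean_obs) * (m - mean_mod) for o, m in zip(observed, modelled))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_mod = sum((m - mean_mod) ** 2 for m in modelled)
    if var_obs == 0 or var_mod == 0:
        raise ValueError("both series must have non-zero variance")
    return cov ** 2 / (var_obs * var_mod)


# Hypothetical outflow values: the "model" doubles every observation.
observed = [1.0, 2.0, 3.0]
biased_model = [2.0, 4.0, 6.0]
nse_biased = nash_sutcliffe(observed, biased_model)  # -6.0
r2_biased = r_squared(observed, biased_model)        # 1.0
```

Because r² measures only linear association, the biased series still scores r² = 1 while its Nash-Sutcliffe coefficient is strongly negative, which is one reason to report more than a single statistic when evaluating a model.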
Most models also did not explicitly consider their sensitivity to input parameters, with 80% (113/141) not mentioning sensitivity analysis in any way. Global sensitivity analysis techniques were used in 6% (8/141) of model instances, while local or one-at-a-time techniques were used in 16% (22/141). More thorough sensitivity analysis methods provide confidence that the results of a model will be valid across a range of assumptions [142]. Thorough sensitivity analysis can also aid in improving models by suggesting areas where more accurate measurements would increase the precision of results [142]. Using established, previously evaluated models where the sensitivity is well understood can reduce the need for full sensitivity analysis with every model instance, although the impact of the model sensitivity on particular model applications still needs to be considered. The uncertainty of model predictions was rarely provided, which means decision makers attempting to use these results will lack a critical understanding of the confidence with which the model predictions were made. Following best practices in rigorously testing and evaluating the models used (e.g., Jakeman et al. [143]) is a prerequisite for understanding how well these models can represent bioretention performance.
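The difference between local and global approaches can be made concrete with a small sketch of a one-at-a-time analysis. The water-balance function, parameter names and values below are hypothetical and chosen only for illustration; they do not represent any model in the reviewed literature.

```python
def toy_outflow(params):
    """Hypothetical event water balance in mm: inflow minus infiltration minus storage."""
    infiltrated = params["infiltration_rate_mm_h"] * params["duration_h"]
    return max(0.0, params["inflow_mm"] - infiltrated - params["storage_mm"])


def one_at_a_time(model, base, fraction=0.10):
    """Normalized local sensitivity: (dY / Y) / (dX / X) for each parameter in turn."""
    base_out = model(base)
    if base_out == 0:
        raise ValueError("base output is zero; normalized sensitivity is undefined")
    coefficients = {}
    for name, value in base.items():
        perturbed = dict(base)                    # all other inputs stay at base values
        perturbed[name] = value * (1.0 + fraction)
        relative_change = (model(perturbed) - base_out) / base_out
        coefficients[name] = relative_change / fraction
    return coefficients


# Hypothetical base case: 100 mm inflow, 10 mm/h infiltration for 4 h, 20 mm storage.
base_case = {
    "inflow_mm": 100.0,
    "infiltration_rate_mm_h": 10.0,
    "duration_h": 4.0,
    "storage_mm": 20.0,
}
sensitivities = one_at_a_time(toy_outflow, base_case)
```

Each coefficient is the relative change in output divided by the relative change in one input while the others are held at their base values. Interactions between parameters are therefore not captured, which is the limitation that global techniques are meant to address.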
Standardized results reporting, monitoring and performance assessment are common in other disciplines, and stormwater managers and researchers can learn from practice outside their own field. A move to standardization has been seen in other environmental fields, such as river restoration [144] and wetlands assessment [145], and these efforts should be reviewed as guides for the standardization of stormwater monitoring.

Other Types of Key Findings
Hydrologic and contaminant processes are not the only performance measures and findings reported in the literature. The next most common category of results (234 out of 507 results) published by researchers was soil characteristics of the bioretention cell. These findings often include soil pH, moisture content, conductivity, and contamination levels. Other findings that are frequently reported include design and maintenance recommendations (92 results), findings related to the vegetation (68 results), and findings related to the habitats and ecosystems of either the bioretention cell or downstream areas (33 results). Figure 13 shows a breakdown of topics relating to bioretention systems that have been studied in the reviewed articles. Overall, the diversity of findings beyond hydrology and contaminant transport and fate has increased since 2010. Studies including design and operational recommendations increased the most in 2013 and continue to be a regular part of the study findings. Studies using life cycle assessment and other methods to assess greenhouse gas emissions have increased since 2013 as well. Multidisciplinary research teams could improve the variety and types of studies, which could lead to a better understanding of bioretention co-benefits and drawbacks. Clearly, the least studied portion of a bioretention cell has been the vegetation, and broader collaboration could help to understand the role of vegetation in long-term maintenance, reduction of pollutants, or how to best design bioretention to function as landscape features.

Conclusions and Recommendations
This scoping review has provided an overview of bioretention field and modeling studies to determine gaps in the literature and to answer the overall research question: "How is the field performance of bioretention assessed in the literature?" Under that primary research question are two sub-questions: (1) How is the performance of bioretention defined in the literature? (2) What metrics are used to assess actual and theoretical performance?
Bioretention field performance is being assessed mainly by researchers in engineering disciplines located in temperate regions, especially the Eastern Seaboard of the United States and large urban areas of Australia. Bioretention research has emerged in places with strong policy and research cluster interest, and then spread gradually to locations with other climates. Research is primarily conducted in Global North countries, and the literature is missing perspectives from locations with water stresses in the Global South that could benefit from this technology.
Field research in bioretention was conducted through the investigation of systems in the field and through modeling studies that translate research on removal pathways from laboratory or bench-scale studies to the field scale. Although bioretention cells are expected to be operational for 35 to 50 years, research on bioretention cells has focused primarily on systems that were purpose-built for research and has been conducted over short durations. Modeling studies suffer from similar limitations, with event-based simulations outnumbering continuous simulations throughout the literature.
The definitions of "performance" used by researchers consistently emphasized hydrology and reduction of pollutants in runoff; however, in practice more studies looked at hydrology than contaminant transport and fate. Many hydrology-focussed studies used "volume reduction" to assess performance, although the methods of deriving volume reduction were inconsistent and not always adequately described. Studies that focussed on contaminant transport and fate often reported overall "removal" by either concentration or mass; again, definitions were inconsistent and dependent on the context.
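To illustrate why these metrics are not interchangeable, the sketch below computes volume reduction, concentration-based removal and mass-based removal for a single event. The formulas are one common formulation assumed here for illustration (the reviewed studies did not share a single definition), and the event values are hypothetical.

```python
def volume_reduction(inflow_volume, outflow_volume):
    """Fraction of inflow volume that does not leave as outflow."""
    if inflow_volume <= 0:
        raise ValueError("inflow volume must be positive")
    return (inflow_volume - outflow_volume) / inflow_volume


def concentration_removal(inlet_conc, outlet_conc):
    """Fractional drop in concentration between inlet and outlet."""
    if inlet_conc <= 0:
        raise ValueError("inlet concentration must be positive")
    return (inlet_conc - outlet_conc) / inlet_conc


def mass_removal(inflow_volume, inlet_conc, outflow_volume, outlet_conc):
    """Fractional drop in pollutant load, where load = volume x concentration."""
    load_in = inflow_volume * inlet_conc      # g when volume is in m3 and conc in mg/L
    load_out = outflow_volume * outlet_conc
    if load_in <= 0:
        raise ValueError("inlet load must be positive")
    return (load_in - load_out) / load_in


# Hypothetical event: 10 m3 in at 2.0 mg/L, 4 m3 out at 1.5 mg/L.
vr = volume_reduction(10.0, 4.0)
cr = concentration_removal(2.0, 1.5)
mr = mass_removal(10.0, 2.0, 4.0, 1.5)
```

With these hypothetical numbers the same event yields 60% volume reduction, 25% removal by concentration and 70% removal by mass, so a reported "removal" cannot be compared across studies unless the definition and the underlying flows and concentrations are given.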
We have three main recommendations from this review:
I. Provide all original data on inlet/outlet flows or concentrations along with calculated values, such as volume reduction or removal. As many performance metrics can depend on the specific context of a bioretention cell or a study, reporting the underlying data is essential in allowing results to be generalized to other locations and applications. In addition, detailed information on the locations where bioretention is practiced (e.g., latitude/longitude) and on the physical characteristics of studied bioretention cells, such as the year they were built, vegetation species and characteristics, and characteristics of the bioretention media and of the native soil, is necessary to ensure that lessons learned in one location can be used to inform researchers in other parts of the world.
II. Prioritize investigating the processes that determine bioretention performance. As a profession, we need to better understand the underlying mechanisms (biological, physical and chemical) that lead to volume reduction and water treatment. Experiments that combine modeling and field monitoring results have been successfully used to investigate different aspects of performance, allowing the complex processes to be estimated. Additionally, research methods allowing for the investigation of specific processes, such as lysimetric data for investigating the role of plants in the bioretention water balance, are critical tools in understanding the ultimate efficacy of bioretention systems. More detailed research into the role of plants would be particularly warranted, as they are a key feature of bioretention design that has been somewhat neglected by the civil-engineering dominated bioretention community of practice.
III. Standardize the collection, analysis and reporting of results for stormwater management best practices, including bioretention systems. We recommend that researchers follow the reporting standards outlined in the US EPA 2009 publication of guidelines for urban stormwater BMP monitoring [129], as they are the most current standards available. Researchers should continue to use the word "bioretention" to refer to these systems, and other terms should be phased out or provided as secondary names. Harmonization of investigative methods will allow for the optimization of bioretention cell performance in a way that is currently difficult due to problems translating research from one system to another.
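One way to act on recommendations I and III is to publish each monitored event in a machine-readable record alongside the site description. The sketch below is a hypothetical layout: the class names, field names and example values are assumptions for illustration and are not drawn from the US EPA guidance [129] or the International Stormwater Best Management Practice Database [128].

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class EventRecord:
    """Raw measurements for one monitored event (hypothetical field names)."""
    date: str                                  # ISO 8601 date of the event
    inflow_volume_m3: float
    outflow_volume_m3: float
    inlet_conc_mg_l: Optional[float] = None    # None when water quality was not sampled
    outlet_conc_mg_l: Optional[float] = None


@dataclass
class BioretentionSiteRecord:
    """Site description plus the raw event data behind any reported metric."""
    site_name: str
    latitude: float
    longitude: float
    year_built: int
    vegetation_species: List[str]
    media_description: str
    native_soil_description: str
    events: List[EventRecord] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict converts nested dataclasses, including those inside lists
        return json.dumps(asdict(self), indent=2)


# Entirely hypothetical site and event, used only to show the layout.
site = BioretentionSiteRecord(
    site_name="Example cell",
    latitude=0.0,
    longitude=0.0,
    year_built=2015,
    vegetation_species=["example species"],
    media_description="example engineered media",
    native_soil_description="example native soil",
)
site.events.append(EventRecord("2019-06-01", 10.0, 4.0, 2.0, 1.5))
site_json = site.to_json()
```

Keeping the raw volumes and concentrations in the record means that any derived metric, however a given study defines it, can be recalculated by later readers.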