First Flush Stormwater Runoff in Urban Catchments: A Bibliometric and Comprehensive Review

Abstract: First flush is a phenomenon in stormwater runoff that has been considered a topic of great interest in the field of nonpoint source pollution. Despite several attempts to define the first flush quantitatively, the specified characteristics of the phenomenon vary among sources. To address these uncertainties, a bibliometric and comprehensive review of published articles related to first flush was conducted. A corpus of 403 research articles was obtained from the Scopus database, which was then parsed using the CorText Manager for the bibliometric analysis. The study examined quantitative definitions of first flush from various sources; the climate and topographic characteristics of monitoring and experimental sites where the studies on first flush were performed; the sample collection methods applied; the first flush values obtained in the studies; and how first flush influenced nonpoint source pollution in urban watersheds. A network map, two contingency matrices, and a Sankey diagram were created to visualize the relationship of significant keywords related to first flush, as well as their co-occurrences with journals, countries, and years. It was found that the strength of the first flush effect could vary depending on the geographical location of the site, climatic conditions, and the pollutants being analyzed. Therefore, initial rainfall monitoring, runoff sampling, and water quality testing were seen as critical steps in characterizing the first flush in urban catchments. Furthermore, the characterization of first flush was found to be significant to the selection of best management practices and the design of low-impact development (LID) technologies for stormwater runoff management and nonpoint source pollution control.


Introduction
More than half of the world's population live in urban areas based on the report published by the United Nations, and by 2050, it is projected that around 68% of the global population will be residing in urban areas. The population of urban dwellers was also projected to increase by 2.5 billion individuals between 2018 and 2050, approximately 90% of which will be concentrated in the Asian and African regions [1]. Urban areas only constitute a small portion of the earth's total surface; however, these areas can present the largest environmental issues among different forms of land use. The expansion of urban areas can lead to loss of wildlife habitat and agricultural lands, disruption of the natural water cycle, and greater pollutant generation [2,3]. Between 1992 and 2015, 3.3 Mha of forest lands and 24.3 Mha of croplands were lost due to urban expansion [4]. Apart from the significant changes in natural landscapes and the productivity of agricultural areas, hydrological patterns in urban settings are greatly affected by the changes in land surface characteristics. Impermeable surfaces reduce the infiltration potential of water and consequently increase rainfall-to-runoff conversion. Densely populated urban areas also pose greater environmental concerns due to the high rate of pollutant generation that can adversely affect the environment.
One of the major concerns arising from the expansion of urban areas is the rapid degradation of the urban water environment. Wastewater outfalls are the primary point sources (PS) of pollution in densely populated areas. Urban wastewaters contain a multitude of pollutants (e.g., organics, nutrients, and pathogens) that can be directly discharged into receiving water bodies. Aside from PS, urban areas also contribute to water quality degradation by means of non-point source (NPS) pollutant discharges. Both PS and NPS pollution in urban areas raise environmental concerns; however, NPS pollutants originate from diffuse sources and are prone to pulse discharges, making them difficult to manage or eliminate [5]. Continuous land use change in urban areas was also found to have an effect on the quality of stormwater runoff during precipitation. Previous studies suggest that land use change contributes to an increase in runoff volume during rainfall events [6,7].
Stormwater runoff is considered a major source of NPS pollution that degrades water quality in urban areas [8,9]. Urban stormwater runoff usually exhibits the 'first flush' effect, wherein the discharge in the initial stage of a runoff event contains considerably higher amounts of pollutants compared with the latter phase of the event [10]. The first flush phenomenon can be classified as a 'concentration' first flush or a 'mass' first flush. A concentration first flush mainly deals with the concentration of pollutants, whereas a mass first flush is a phenomenon that is both concentration- and flow-dependent [11]. The concept of first flush originated from the first NPS program conducted in Florida, United States, in the mid-1970s. The observed trend of pollutant loads and stormwater runoff volume throughout the storm event verified the occurrence of first flush. Apart from the distinct pattern of pollutant transport throughout an event, the concept of first flush was vaguely defined and lacked specific criteria. In view of these uncertainties, several studies attempted to define the first flush phenomenon quantitatively. However, the quantitative definitions formed from these studies also differed in terms of what cumulative mass to cumulative volume ratio would be considered a first flush. These definitions, as well as emerging attempts to universally define first flush, are investigated in the comprehensive review section of this study.
The characteristics of first flush emerged as one of the primary factors for designing best management practices (BMPs) when stormwater treatment became mandatory [12]. Publications regarding the characterization, patterns, affecting factors, and remediation technologies for the treatment of urban stormwater runoff exhibiting first flush have been composed to assess the environmental effects of first flush [13][14][15]. In a study conducted in South Korea, first flush was used as a significant basis for the design and optimization of low-impact development (LID) facilities, particularly bioretention, green roof, infiltration trench, porous pavement, rain barrel, and vegetated swale [16]. In addition, another study suggested the use of both the maximum effluent concentration and pollutant removal target, which can be determined through the characterization of first flush, in determining the size and design of bioretention and detention facilities [17,18].
Due to the large number of publications regarding the first flush phenomenon, it is necessary to synthesize all currently available information and published literature to direct future studies and research initiatives. Therefore, this study was conducted to determine the current research trends regarding the first flush phenomenon in urban stormwater runoff and urban stormwater management through the analysis of an extensive collection of bibliographic information. Through a comprehensive review, this study also examined quantitative definitions of first flush from different sources, the climatic and topographic characteristics of monitoring and experimental sites where the studies on first flush were performed, the sample collection methods applied, the first flush values obtained in the studies, and the applications of first flush in the design of BMPs and LID technologies for the management of stormwater runoff. The results of the bibliometric and comprehensive study were also used to identify the challenges and future research directions regarding the first flush phenomenon in stormwater runoff.

Bibliometric Analysis
The research articles used for the bibliometric analysis were collected from the Scopus database [19]. As shown in the methodology framework in Figure 1, inputting the keyword "first flush" with the search string "TITLE-ABS-KEY ("first flush")" generated a total of 1202 research articles. To further filter results and generate research articles that are more relevant to urban catchments, specific keywords were added and the search string "TITLE-ABS-KEY ("first flush" AND "stormwater" OR "runoff" AND "urban")" was entered in the advanced search. This search string resulted in 422 research articles. The resulting documents were then further limited to articles published from 1990 to 2022, resulting in 403 documents. The articles collected were downloaded as a Research Information System (RIS) file on 28 February 2022, which was then parsed into a corpus using the CorText Manager. The CorText platform was used to perform the bibliometric analysis, in particular the quantitative assessment of the scientific articles gathered [20]. Various scripts, including the number of articles per journal, the heterogeneous network map of keywords, countries, and years, contingency matrices (keyword and journal; keyword and countries), and a river network or Sankey diagram, were run to analyze terms, references, and trends of the specific topics extracted from the collected research articles associated with the first flush related keyword search.
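The export-and-filter step described above can be illustrated with a minimal sketch. It assumes the standard RIS layout (a two-letter tag, two spaces, a hyphen, then the value, with `ER` closing each record) and that the publication year sits in the `PY` tag; the sample records and the function names `parse_ris` and `filter_by_year` are hypothetical and are not part of the CorText workflow used in the study.

```python
from typing import Dict, List

# Hypothetical two-record export, used only to exercise the functions below.
SAMPLE = """TY  - JOUR
TI  - Example article A (hypothetical)
PY  - 1988
KW  - first flush
ER  -
TY  - JOUR
TI  - Example article B (hypothetical)
PY  - 2015
KW  - first flush
KW  - urban runoff
ER  -
"""


def parse_ris(text: str) -> List[Dict[str, List[str]]]:
    """Split RIS text into records; each record maps a tag to its list of values."""
    records: List[Dict[str, List[str]]] = []
    current: Dict[str, List[str]] = {}
    for line in text.splitlines():
        # A tagged RIS line has the tag in columns 0-1 and "  -" in columns 2-4.
        if len(line) < 5 or line[2:5] != "  -":
            continue
        tag, value = line[:2], line[5:].strip()
        if tag == "ER":  # end of the current record
            if current:
                records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    return records


def filter_by_year(records, start=1990, end=2022):
    """Keep records whose PY field falls inside the inclusive year window."""
    kept = []
    for rec in records:
        year = rec.get("PY", [""])[0][:4]
        if year.isdigit() and start <= int(year) <= end:
            kept.append(rec)
    return kept


records = parse_ris(SAMPLE)
recent = filter_by_year(records)
print(len(records), len(recent))
```

Applied to a real export, the same year window would correspond to the reduction from 422 to 403 documents reported above, provided the export uses the assumed tags.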

To identify the most productive journals publishing in the field of first flush research, the list builder feature of CorText was used. For the script parameters, "Journal" was selected under "field", and the list length was set to 500. A contingency matrix was also generated to visualize the relationship and magnitude of co-occurrence between keywords and journals. In generating the contingency matrix, keywords were set as the first field, while journals were selected as the second field. The number of nodes was set at 10 and chi-square (chi2) was selected as the contingency analysis measure.
A network (heterogeneous) map and a Sankey diagram were generated to analyze the relationship between keywords and the evolution of terms related to first flush throughout the years. For the script parameters of the network map, keywords were selected as the first field and time steps were selected as the second field. Under the Network Analysis and Layout tab, country was selected as the third variable, with chi-square (chi2) being the specificity measure. The Sankey diagram was generated by setting the number of slices to 5 under the Dynamics Table. To visualize the relationship and magnitude of co-occurrence between keywords and countries, another contingency matrix was generated. In generating the matrix, keywords were set as the first field, while country was selected as the second field. As with the contingency matrix of keywords and journals, the number of nodes for this matrix was set at 10 and chi-square (chi2) was selected as the contingency analysis measure.
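To make the idea of a chi-square contingency measure concrete, the sketch below computes, for each keyword and journal pair, the observed co-occurrence count, the count expected if the two fields were independent, and a signed chi-square deviation. This is a generic textbook formulation offered as an assumption; the source does not specify how CorText normalizes its chi2 score, and the keyword and journal labels in the example are hypothetical.

```python
from collections import Counter


def signed_chi2(pairs):
    """Return {(row, col): (observed, expected, signed deviation)}.

    pairs: list of (keyword, journal) tuples, one per co-occurrence.
    The deviation is (O - E)^2 / E, made negative when O < E.
    """
    observed = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    col_totals = Counter(col for _, col in pairs)
    n = len(pairs)
    result = {}
    for row in row_totals:
        for col in col_totals:
            o = observed.get((row, col), 0)
            e = row_totals[row] * col_totals[col] / n  # expected under independence
            sign = 1 if o >= e else -1
            result[(row, col)] = (o, e, sign * (o - e) ** 2 / e)
    return result


# Hypothetical co-occurrences: journal labels J1 and J2 are placeholders.
pairs = (
    [("first flush", "J1")] * 3
    + [("first flush", "J2")] * 1
    + [("heavy metals", "J1")] * 1
    + [("heavy metals", "J2")] * 3
)
matrix = signed_chi2(pairs)
for key, (o, e, dev) in sorted(matrix.items()):
    print(key, o, e, dev)
```

Under this reading, a positive deviation marks a pair that co-occurs more often than expected and a negative deviation marks one that co-occurs less often than expected.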

Comprehensive Review
Before conducting the comprehensive review, the research articles collected from the Scopus database were further filtered by adding relevant keywords. A preliminary term analysis was performed using CorText Manager to identify keywords to add to those used in the bibliometric analysis. Upon generating a homogeneous map of keywords, "water quality" and "runoff pollut*" were added to the initial set of keywords entered in the advanced search string of Scopus. The resulting documents were then limited to articles written in English that were open access or downloadable. The resulting number of documents after the filtering process was 75. Microsoft Excel was used to synthesize the site characteristics, sampling details, rainfall data, and first flush values found in the collected research articles. Additionally, Origin Pro 8.5 [21] was used to create box plots summarizing the event mean concentration (EMC) and first flush (FF) values.

Publication Journals and Contingency Matrix
The breakdown of the number of documents per journal for the keywords input for the bibliometric analysis is shown in Table 1. A total of 156 journals were found in the list, wherein articles from the top ten journals constitute about half of the total number of articles. The three journals with the most published articles related to first flush research are Water Science and Technology with 43 documents, Science of the Total Environment with 21 documents, and Environmental Science with 18 documents.
Table 1. The number of documents per journal for the keyword search "first flush" or "stormwater" or "storm water" or "runoff" and "urban". Columns: Rank, Journal Name, Number of Articles.
A contingency matrix highlighting the co-occurrence of keywords, journal names and countries is shown in Figure 2. A red cell indicates a strong relationship between the two fields and denotes that the magnitude of co-occurrence between the two fields is high. A blue cell, on the other hand, indicates a weak relationship between the two fields and denotes a low magnitude of co-occurrence between them. Moreover, a white cell denotes neutrality between the two fields. It was observed that a high co-occurrence is present between the keyword "first flush effect" and Acta Scientiae Circumstanciae. A strong relationship was also found between the keyword "water quality" and Journal of Environmental Engineering. The keyword "heavy metals" was also found to have a high magnitude of co-occurrence with Water Science and Technology. Water (Switzerland) was found to have a high co-occurrence with the keywords "stormwater" and "water quality".
Another contingency matrix (Figure 2b) was generated to visualize the relationship and magnitude of co-occurrence between significant keywords and countries that produced research articles on first flush from 1990 to 2022. It was observed that a high co-occurrence is present between the keyword "first flush effect" and China. This strong co-occurrence contrasts sharply with other keywords such as "storm water", "heavy metals", and "urban runoff", which showed a weak relationship with the country. A strong co-occurrence between the keyword "event mean concentration" and South Korea is noticeable, indicating that a significant number of research articles from the country involved the calculation of the event mean concentration (EMC) of various water quality parameters in studying the first flush phenomenon. The contingency matrix also suggests that the United States has the greatest number of high co-occurrences with the keywords generated in the script. The country was found to have a strong relationship with the keywords "runoff", "heavy metals", "urban runoff", and "stormwater". However, a weak co-occurrence was observed between the same country and the keywords "first flush" and "first flush effect".

Network Mapping and Evolution of Keywords and Terms
A heterogeneous network map (Figure 3) was generated using the CorText Manager platform to identify the association of keywords, countries, and years related to first flush research. Keywords within and near a solid circle have high co-occurrences, while keywords surrounding a particular year indicate a high magnitude of co-occurrence between the keywords and the year. On the other hand, countries near each other denote a high co-occurrence. It can be observed in the figure that first flush was represented by varying terms such as "first flush effect", "first-flush", and "first flush runoff". It is also seen that various terms describing the first flush phenomenon are distributed over a wide range of years, indicating that studies on the phenomenon have been constantly conducted throughout the years. The term "first-flush" was found to have peaked in 2003, while in more recent years such as 2018 and 2020, the terms "first flush runoff" and "first flush effect" were found to be more relevant. Another observation drawn from the network map is that countries with a high co-occurrence did not necessarily come from the same continent, which indicates that the first flush phenomenon continues to be a global focus of study. The keywords "low impact development" and "BMP" were also found in the network map, which indicates that the studies on first flush continue to be applied in the development of LID and BMP technologies worldwide. The high co-occurrence between the mentioned keywords and years also denotes the significance of first flush to the design of LID and BMP facilities. The keywords "best management practices" and "best management practices (BMPs)" had a high co-occurrence with the years 2003 and 2004. The keyword "low impact development", meanwhile, had a high co-occurrence with the more recent year of 2019.
Hydrology 2022, 9, x FOR PEER REVIEW 8 of 26 Figure 3. Heterogeneous network map of keywords, countries, and years of the research articles collected from the Scopus database using the keywords "first flush" and "stormwater" or "storm water" or "runoff" and "urban".
The evolution of terms related to first flush from 1990 to 2021 was visualized by creating a Sankey diagram, also called a river network. The Sankey diagram of the documents from the keyword search on "first flush" and "stormwater" or "storm water" or "runoff" and "urban" is shown in Figure 4. A darker colored tube connecting two keywords indicates a strong relationship between the two keywords. Moreover, the thickness of the tube denotes the level of co-occurrence between the two keywords. It was observed that the keyword "runoff & stormwater management" was split into three keywords: "stormwater management & heavy metals", "pollutant loads & pollutant graphs", and "denitrification & modeling". This indicates that the concept of runoff and stormwater management has transformed into specific research areas such as pollutant loads, heavy metals, and modeling from 2008 to 2012. Another significant transformation of keywords observed was the transformation of the keyword "first flush & particulate matter" to "best management practices & heavy metals" from 2012 to 2014, which indicates that studies on first flush have high relevance to the design of LID and BMP facilities.
Figure 4. The Sankey diagram for keywords "first flush" and "stormwater" or "storm water" or "runoff" and "urban" based on the Scopus database.

Definitions of 'First Flush'
Since it was first studied in the 1970s, first flush has been described as a common phenomenon in urban runoff pollution [22]. Studies explained first flush as the occurrence of significantly high pollutant concentrations during an early stage of a single rainfall event [23]. It can also be described as either a concentration first flush or a mass first flush. A concentration first flush generally occurs when the early runoff exhibits a higher concentration or load than the later portion of a storm event, while a mass first flush occurs when the concentration and the runoff flow at the early stage together yield a mass emission rate significantly higher than that of the later runoff [24].
Generally, the definition of first flush as the concentration of the initial phase of the runoff being higher than that of the later phases in a storm event has been confirmed by numerous studies [25–27]. However, a vast difference in the quantitative descriptions of the phenomenon in terms of its cumulative mass to cumulative volume ratio has been observed. The studies of Geiger (1987) and Barco et al. (2008) suggested that the occurrence of first flush can be confirmed when the slope of the cumulative pollutant mass fraction versus the cumulative volume fraction plot is greater than 45° [28,29]. The definition proposed by Deletic (1998) and Bach et al. (2010) introduced the 40/20 ratio, stating that first flush can be observed when more than 40% of the cumulative pollutant load was transported within the first 20% of the cumulative runoff volume [30,31]. Bertrand-Krajewski et al. (1998) [32] devised the 80/30 ratio as a criterion for evaluating first flush. This ratio suggested that a significant first flush occurred when ≥80% of the total pollutant mass was transported within the first 30% of the runoff volume.
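The criteria above can be checked against a monitored event with a short sketch. It assumes paired flow and concentration samples taken at a fixed time step, builds the normalized cumulative mass versus cumulative volume curve, and interpolates linearly between samples. The function names and the sample values are hypothetical, and reading the 45° criterion as "the first segment of the curve has a slope greater than 1" is an interpretation rather than something stated in the source.

```python
def mv_curve(flows, concs, dt=1.0):
    """Normalized cumulative volume and mass fractions, both starting at 0."""
    volumes = [q * dt for q in flows]
    masses = [q * c * dt for q, c in zip(flows, concs)]
    total_v, total_m = sum(volumes), sum(masses)
    cum_v, cum_m = [0.0], [0.0]
    v = m = 0.0
    for dv, dm in zip(volumes, masses):
        v += dv
        m += dm
        cum_v.append(v / total_v)
        cum_m.append(m / total_m)
    return cum_v, cum_m


def mass_fraction_at(cum_v, cum_m, v_frac):
    """Linearly interpolate the cumulative mass fraction at a volume fraction."""
    for i in range(1, len(cum_v)):
        if cum_v[i] >= v_frac:
            v0, v1 = cum_v[i - 1], cum_v[i]
            m0, m1 = cum_m[i - 1], cum_m[i]
            if v1 == v0:
                return m1
            return m0 + (m1 - m0) * (v_frac - v0) / (v1 - v0)
    return 1.0


def check_criteria(cum_v, cum_m):
    """Evaluate the 40/20 and 80/30 criteria and the initial-slope test."""
    return {
        "40/20": mass_fraction_at(cum_v, cum_m, 0.20) > 0.40,
        "80/30": mass_fraction_at(cum_v, cum_m, 0.30) >= 0.80,
        # Assumed reading of the 45-degree rule: initial slope above 1.
        "initial_slope_gt_45deg": cum_m[1] / cum_v[1] > 1.0,
    }


# Hypothetical event: constant flow, concentration decaying over ten samples.
flows = [1.0] * 10
concs = [50, 30, 20, 10, 8, 6, 5, 4, 3, 2]
cum_v, cum_m = mv_curve(flows, concs)
print(check_criteria(cum_v, cum_m))
```

In this hypothetical event about 58% of the mass leaves in the first 20% of the volume but only about 72% in the first 30%, so the same event passes the 40/20 test and fails the 80/30 test, which illustrates how the choice of definition changes the verdict.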
Earlier definitions of first flush utilized plots of pollutant mass against runoff volume or M(V) curves to define the first flush phenomenon, whereas recent studies use computational algorithms and mathematical models to provide new criteria for the characterization of first flush. For instance, in a study conducted by Perera et al. (2021), a Monte Carlo simulation was undertaken to conceptualize the first flush phenomenon in urban areas. The simulation results showed that the previously developed 40/20 and 80/30 criteria cannot be applied universally. Furthermore, it was observed that the first flush characteristics varied greatly within the initial 30% to 50% of the total runoff volume [33].
Considering quantitatively different definitions of first flush from various sources, a more recent study conducted by Kayhanian and Stenstrom in 2020 proposed a mass first flush ratio definition sketch, in which the mass first flush for any constituent based on a selected normalized cumulative mass and runoff volume can be determined. Using the said definition sketch, a particular first flush ratio can be classified as low, medium, or high. Moreover, whether a first flush has occurred or not can also be determined using the mentioned definition sketch [24].
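For orientation, the mass first flush ratio is commonly written as the normalized cumulative mass divided by the normalized cumulative volume. The notation below is an assumption, since the source does not reproduce the formula: C(t) is concentration, Q(t) is flow, T is the event duration, and t_n is the time at which n% of the runoff volume has been discharged.

```latex
\mathrm{MFF}_n
  = \frac{\left.\int_0^{t_n} C(t)\,Q(t)\,dt \right/ M}
         {\left.\int_0^{t_n} Q(t)\,dt \right/ V},
\qquad
M = \int_0^{T} C(t)\,Q(t)\,dt,
\qquad
V = \int_0^{T} Q(t)\,dt

% Earlier criteria restated in this notation:
%   40/20 criterion:  MFF_{20} > 0.40 / 0.20 = 2.0
%   80/30 criterion:  MFF_{30} \ge 0.80 / 0.30 \approx 2.67
%   mass discharged in proportion to volume:  MFF_n = 1
```

Written this way, the 40/20 and 80/30 criteria become two thresholds on the same quantity evaluated at different volume fractions, which is what allows a single definition sketch to grade a first flush rather than only confirm or reject it.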

Characteristics of the Study Sites
Previous studies have shown that the occurrence of first flush can be largely influenced by a wide range of variables including type of pollutant, site, and other hydrologic and climate-related factors [14,34]. Since pollutant emission is linked to continued urbanization and development, a majority of the investigations on the occurrence of first flush were conducted in urban areas with varying land uses such as residential, commercial, industrial, agricultural, and transportation [16,35–37]. Figure 5 shows the frequencies of each land use in the reviewed research articles with respect to the average area of the site being studied. A common finding of the experimental studies is that first flush is generally noticed in small catchments. For instance, more studies were conducted in residential areas with small catchments (i.e., less than 10 ha), which can be linked to the previously reported correlation that first flush is present in areas with relatively high population density. Meanwhile, although there are studies conducted on large catchment sites such as commercial and agricultural areas, the first flush phenomenon has been found to be uncertain there due to the dilution and delay in transport of pollutants [23]. Previous studies recommend locally focused stormwater monitoring to aid data-driven decision making by city planners with respect to land use and water-quality regulations [38]. Several studies also examined the runoff quality in both urban areas and sites near streams or rivers [39–41]. These studies examined the water quality of streams after the generation of runoff on impervious surfaces. It was also noticeable that older articles focused on combined sewer overflows (CSO) for their respective first flush studies [42–44].

The imperviousness of the study sites was also investigated, with a total of 57 sites considered in obtaining the imperviousness data from the articles reviewed. The mean imperviousness rate was 52.1%, with a maximum value of 100% and a minimum value of 1.1%. Factors that affected the mean value of the imperviousness of the sites reviewed include the differences in land use and location of the study sites. Imperviousness was found to be a critical factor in selecting sites for studying first flush, as increased imperviousness was shown to increase the amounts of pollutants in the runoff [45,46].

Characteristics of Monitored Storm Events
Available rainfall data based on the monitored storm events gathered from the reviewed research articles related to first flush are summarized in Table 2. A total of 26 studies specified the climate type of their respective study sites, of which 46.2% are in a subtropical climate, 26.9% are in a dry climate, 15.4% are in a continental climate, and 11.5% are in a temperate climate. For studies conducted in dry climates, the build-up and wash-off components were found to have significant effects on the characteristics of first flush [47]. Additionally, pollutant build-up and wash-off were found to be popular topics of interest for studies on first flush [48–52].
As can be observed in Table 2, the rainfall parameters (rainfall depth, duration, and intensity) varied widely, with coefficients of variation between 1.0 and 1.6. On average, around six to ten storm events were monitored in the included studies. The minimum rainfall depth was 0.3 mm, with a maximum of 141 mm. The mean monitored rainfall depth was 20.1 mm, slightly greater than the median depth of 14.2 mm, with a skewness of 2.2. The mean and median rainfall durations were 5.4 h and 3.8 h, respectively. Peak rainfall intensity was 28.5 ± 33.8 mm/h (mean ± SD), about 2.9 times the mean rainfall intensity of 9.8 ± 15.8 mm/h (mean ± SD); equivalently, the mean intensity was roughly 66% lower than the peak. In a study conducted by Wong et al. (2016), no correlation was found between increasing storm intensity and the likelihood of a strong first flush effect [53]. The wide range of rainfall data is attributed to the differences in location and climate type and to the unpredictable nature of rainfall specified in the reviewed papers. It is apparent, however, that the values for rainfall depth and duration obtained from articles published in 2016 were higher compared to those from articles published in other years.
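The summary statistics quoted above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the coefficient of variation from the reported mean ± SD intensities and shows a moment-based skewness on a small sample; the sample depths are hypothetical, and the population (biased) skewness formula is an assumption, since the source does not say which estimator was used.

```python
import statistics


def coefficient_of_variation(mean, sd):
    """CV = standard deviation divided by the mean (dimensionless)."""
    return sd / mean


def population_skewness(values):
    """Third central moment divided by the 1.5 power of the second."""
    mu = statistics.fmean(values)
    m2 = sum((x - mu) ** 2 for x in values) / len(values)
    m3 = sum((x - mu) ** 3 for x in values) / len(values)
    return m3 / m2 ** 1.5


# Reported values (mm/h): peak 28.5 +/- 33.8, mean 9.8 +/- 15.8.
cv_peak = coefficient_of_variation(28.5, 33.8)
cv_mean = coefficient_of_variation(9.8, 15.8)
ratio = 28.5 / 9.8                    # peak relative to mean intensity
drop = (28.5 - 9.8) / 28.5            # mean relative to peak intensity
print(round(cv_peak, 2), round(cv_mean, 2), round(ratio, 1), round(drop, 2))

# Hypothetical right-skewed depths (mm): the mean exceeds the median.
depths = [1.0, 2.0, 3.0, 4.0, 10.0]
print(statistics.fmean(depths) > statistics.median(depths),
      population_skewness(depths) > 0)
```

Both recomputed coefficients of variation fall inside the reported 1.0 to 1.6 range after rounding, and a mean larger than the median is the pattern expected for a positively skewed rainfall depth distribution.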
Despite the extreme variability, rainfall depth and intensity were determined to be significant factors in the prediction and occurrence of first flush. Upon ranking variables in terms of relative importance in the occurrence of first flush, Perera et al. (2019) revealed that rainfall depth was the highest ranked factor, followed by rainfall intensity and rainfall duration [14]. Although ranked relatively low in the mentioned study, the antecedent dry-weather period or antecedent dry days (ADD) was found to have a significant effect on pollutant concentrations in other studies. Liu et al. (2019) revealed that event mean concentration values for total suspended solids (TSS), chemical oxygen demand (COD), ammonium (NH4), and total phosphorus (TP) were higher in residential areas during events with a long antecedent dry-weather period [54]. Other water quality parameters were also found to have a strong association with rainfall-related parameters. Jeung et al. (2019) suggested that TSS and TP were closely related to rainfall intensity and rainfall duration [10].
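Because several of the findings above are expressed as event mean concentrations, a minimal sketch of the calculation may help. The EMC is the total pollutant mass discharged during an event divided by the total runoff volume, which for discrete samples becomes a flow-weighted average. The function name, the fixed time step, and the sample values are assumptions for illustration only.

```python
def event_mean_concentration(flows, concs, dt=1.0):
    """Flow-weighted mean concentration: sum(C*Q*dt) / sum(Q*dt).

    With flows in L/s, concentrations in mg/L and dt in s, the result is in mg/L.
    """
    total_mass = sum(q * c * dt for q, c in zip(flows, concs))
    total_volume = sum(q * dt for q in flows)
    if total_volume == 0:
        raise ValueError("total runoff volume is zero")
    return total_mass / total_volume


# Hypothetical three-sample event.
flows = [2.0, 4.0, 2.0]       # L/s
concs = [100.0, 50.0, 20.0]   # mg/L
emc = event_mean_concentration(flows, concs)
simple_mean = sum(concs) / len(concs)
print(emc, round(simple_mean, 2))
```

The flow-weighted value differs from the plain average of the sample concentrations whenever flow varies during the event, which is why EMC rather than an arithmetic mean is used to compare events and sites.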
With the variable nature of rainfall and related parameters, it would be difficult to predict the effect of a particular rainfall-related parameter on the characteristics of the first flush. Related studies suggest that the occurrence and magnitude of the first flush depend on numerous factors, particularly land use, the climate type, and rainfall characteristics in the site area [55,56]. Table 3 provides the stormwater runoff monitoring and sampling collection scheme on various land uses based on the 36 reviewed articles related to the study of first flush. The stormwater runoff sampling interval time was observed to be mostly between five min and one hour. A majority of the studies had reported longer intervals over the sampling time, while 25% had a fixed time interval throughout the duration of the storm and 19% had flow-based interval time.

Monitoring and Sampling Methodologies
It was observed that, regardless of the sampling method (i.e., grab or automatic sampling), more samples were generally collected at the beginning of the storm than in the latter part. This more frequent early sampling was attributed to the concept of first flush, wherein high variability in pollutant concentrations is observed during the initial phase of stormwater runoff [45,57].
Among the research articles that specified the type of sampling used in the study, 30 studies employed grab or manual sampling, while 29 studies utilized automatic samplers. One study reported the use of both grab and automatic sampling. There was no clear evidence that the type of sampling affects the observed first flush; however, automatic sampling was adopted in more recent studies in the United States and South Korea [10,38,57,58], while grab sampling was mostly preferred in China [54,59–62]. The selection of the sample collection method was found to be associated with factors related to the implementation of the study, such as funding, available manpower, and other site conditions [60,63].

Water Quality Parameters
A first flush characterization study typically involves both qualitative and quantitative analysis to determine the extent of NPS pollution. Thus, the selection of water quality parameters is integral both to analyzing the occurrence of first flush and to selecting the appropriate stormwater runoff treatment methods. Figure 6 shows the typical water quality parameters monitored and analyzed in stormwater runoff and first flush studies. The most commonly monitored parameters were TSS, TP, COD, TN, pH, heavy metals, and BOD, each of which appeared in more than half of the reviewed articles. TSS was often the primary water quality parameter analyzed because of the simplicity of the associated analytical method and its association with the particulate matter and sediments washed off by stormwater runoff [73,74]. The figure also indicates that certain nitrogen forms (e.g., NO3-N, ammonia, NH3-N, TKN, and NO2-N) were typically monitored. On the other hand, the least monitored water quality parameters include hydrocarbons, PAHs, TDS, DOC, and some forms of phosphorus (i.e., particulate P, dissolved P, and PO4-P). Microbial parameters such as fecal coliforms and Escherichia coli were also rarely monitored. Studies revealed that the magnitude of contamination is influenced by continued watershed development, streamflow, and antecedent precipitation [72].

Table 3. Stormwater runoff monitoring and sampling collection scheme on various land uses. Table notes: x indicates that a sample was collected at a particular point in the runoff duration; - indicates that no sample was collected at that particular point; a indicates a sampling interval of 2 min; b indicates that 1 sample was collected before the rainfall.

Event Mean Concentration (EMC) and First Flush Values
Several studies have examined the relationship between land-use types and storm runoff pollution characteristics in a catchment. For instance, it was confirmed that rapid urbanization of a catchment may result in significant spatial variations of runoff pollution, which can eventually increase the difficulty of managing the overall quality of the stormwater [75]. Because pollutant load varies greatly throughout a given stormwater event, the EMC is used as a key analytical parameter; it is defined as the total pollutant mass (M) discharged during an event divided by the total runoff volume (V) of the storm event, i.e., EMC = M/V [76].
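The EMC definition above can be illustrated with a minimal Python sketch. It assumes that discrete samples of concentration and flow are available and that each value holds constant over its sampling interval (a simple rectangular approximation). The function name, argument names, units, and example values are hypothetical and are not taken from the reviewed studies.

```python
from typing import Sequence


def event_mean_concentration(
    conc_mg_per_l: Sequence[float],
    flow_l_per_s: Sequence[float],
    dt_s: Sequence[float],
) -> float:
    """Return EMC = M / V in mg/L for one storm event.

    Assumes each sample's concentration and flow hold constant over its
    interval dt_s (rectangular approximation; an illustrative choice).
    """
    if not (len(conc_mg_per_l) == len(flow_l_per_s) == len(dt_s)):
        raise ValueError("all input series must have the same length")
    # Runoff volume (L) represented by each sample.
    volumes_l = [q * dt for q, dt in zip(flow_l_per_s, dt_s)]
    total_volume_l = sum(volumes_l)
    if total_volume_l <= 0:
        raise ValueError("total runoff volume must be positive")
    # Pollutant mass (mg) is concentration times volume, summed over the event.
    total_mass_mg = sum(c * v for c, v in zip(conc_mg_per_l, volumes_l))
    return total_mass_mg / total_volume_l


# Hypothetical event: three samples taken at 10 min (600 s) intervals.
emc = event_mean_concentration([200.0, 100.0, 50.0], [1.0, 2.0, 1.0], [600.0] * 3)
print(f"EMC = {emc:.1f} mg/L")
```

For these hypothetical values, the flow-weighted EMC is 112.5 mg/L, whereas the simple arithmetic mean of the three concentrations is about 116.7 mg/L. The difference shows why the EMC weights each sample by the runoff volume it represents.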
A summary of the EMC values of the most predominantly measured water quality parameters (in reference to Figure 7), namely TSS, COD, TN, and TP, for each land use type gathered from the reviewed research articles on first flush is presented in the box plots in Figure 8. The box plots show the 25th and 75th percentiles (edges of the box), the median (notch of the box), confidence intervals (5%, upper and lower knees), fences, and outliers. The TSS EMC values were particularly high in the residential, commercial, industrial, mixed residential/commercial, and agricultural land uses, with mean values ranging from 270 to 900 mg/L. A peak TSS EMC of almost 2000 mg/L was recorded in the residential and agricultural land uses, and the high TSS EMC values were attributed to potential pollution sources such as atmospheric dry and wet deposition, road-deposited sediments, and vehicle emissions [68]. TP EMC values were high for residential, industrial, agricultural, and mixed residential/commercial land uses. Studies conducted in residential land use recorded a peak TP EMC value of almost 20 mg/L, which can be attributed to anthropogenic activities in densely populated residential areas. Meanwhile, studies on TN EMC values focused only on residential, agricultural, and mixed residential/commercial land uses because these land uses were presumed to be major sources of TN compared to other land uses. The peak TN EMC value of almost 60 mg/L was found in mixed residential/commercial land use, which suggests that nitrogen is washed or scoured from the combined sewer and surfaces early in the storm. A similar trend was found with COD EMC values, where residential, agricultural, and mixed residential/commercial land uses recorded the highest values. Notably, high EMC values were recorded mostly in residential and mixed residential/commercial land uses where combined sewer overflow (CSO) is present. This finding suggests that strategies for CSO pollution management require a rigorous understanding of the sources and trends of pollutants, as well as their mobilization and transport in a specific catchment area.
Figure 8 shows the box plots of FF values gathered from the reviewed research articles on first flush. Higher concentrations during the early stage of a storm event were indicative of a concentration first flush. When a first flush occurs, the FFn can be considered as the split-flow control criterion. In the examined papers, the FF30 values for TSS, COD, and TP varied significantly, whereas those for TN did not. For instance, the FF30 values for TSS and COD from the gathered studies ranged from 5% to almost 90%, while the FF30 values for TP varied from 10% to a peak of 80%.
A study by [62] confirmed that the magnitude of the first flush of TSS can be correlated with the antecedent dry-weather period. Moreover, rainfall events preceded by longer dry-weather periods were more likely to result in a higher first flush. Meanwhile, the varying FFn values of various pollutants from the selected papers suggest that while first flush exists, its magnitude differs among pollutants due to many factors (site, climate, catchment size, land use, etc.). This parallels a previous finding that EMCs and first flush are complex and site-specific. Furthermore, this means that other catchment characteristics play an important role in pollutant wash-off, rather than the impervious area fraction alone [54]. It is important to emphasize that first flush may occur more frequently within smaller catchment areas, which would generate more accurate first flush values. However, FFn values generated from previous studies may be used as a reference for further studies but should not be treated as absolute values in understanding NPS pollution in stormwater runoff. Another study [24] also highlighted that the insights obtained from first flush characterization studies can be applied to several practical stormwater runoff management issues.
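As a rough illustration of how an FFn value such as FF30 could be computed from monitoring data, the Python sketch below returns the fraction of the event pollutant mass carried by the first n% of the cumulative runoff volume. It assumes that concentration and flow are constant within each sampling interval and interpolates linearly inside the interval that crosses the target volume. The function name, arguments, and example values are hypothetical, and the reviewed studies may have used other calculation procedures.

```python
from typing import Sequence


def mass_first_flush(
    conc_mg_per_l: Sequence[float],
    flow_l_per_s: Sequence[float],
    dt_s: Sequence[float],
    volume_fraction: float = 0.30,
) -> float:
    """Fraction of event pollutant mass carried by the first
    `volume_fraction` of the cumulative runoff volume (e.g., FF30).
    """
    if not (len(conc_mg_per_l) == len(flow_l_per_s) == len(dt_s)):
        raise ValueError("all input series must have the same length")
    if not 0.0 <= volume_fraction <= 1.0:
        raise ValueError("volume_fraction must lie in [0, 1]")

    volumes = [q * dt for q, dt in zip(flow_l_per_s, dt_s)]
    masses = [c * v for c, v in zip(conc_mg_per_l, volumes)]
    total_volume, total_mass = sum(volumes), sum(masses)
    if total_volume <= 0 or total_mass <= 0:
        raise ValueError("event must have positive volume and mass")

    target = volume_fraction * total_volume
    cum_volume = cum_mass = 0.0
    for v, m in zip(volumes, masses):
        if v > 0 and cum_volume + v >= target:
            # Interpolate inside the interval that crosses the target volume
            # (assumes constant concentration within the interval).
            partial = (target - cum_volume) / v
            return (cum_mass + partial * m) / total_mass
        cum_volume += v
        cum_mass += m
    return 1.0


# Hypothetical event with high early concentrations and constant flow.
ff30 = mass_first_flush([300.0, 100.0, 50.0, 50.0], [1.0] * 4, [600.0] * 4)
print(f"FF30 = {ff30:.0%}")

# With a uniform concentration, the first 30% of volume carries 30% of mass.
ff30_uniform = mass_first_flush([100.0] * 4, [1.0] * 4, [600.0] * 4)
print(f"FF30 (uniform) = {ff30_uniform:.0%}")
```

In this hypothetical event, 64% of the pollutant mass is delivered in the first 30% of the runoff volume. A uniform concentration yields exactly 30%, so values well above n% indicate front-loaded pollutant delivery.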

Application to the Design of BMPs/LID Technologies
Several studies relevant to the design and development of LIDs and BMPs are listed in Table 4. The BMPs and LID technologies discussed in these studies include bioretention, infiltration trenches, permeable pavement, and other GITs. The major findings of these studies are summarized below. Lee et al. (2010) concluded that the first flush runoff criterion is an important factor in designing BMPs to control and manage NPS pollution, especially in urban areas. To design BMPs as economically and effectively as possible, the authors suggested that using the average rainfall depth was unnecessary and that it would be best to design the BMP with 7.5 mm of rainfall as the lower limit [77].
A novel methodology to optimize the sizes of different types of LIDs was proposed in the study of Baek et al. (2015) by conducting intensive stormwater monitoring and numerical modeling at a commercial site in South Korea. Using the mass first flush values obtained and through SWMM simulations, the authors suggested a runoff depth between 1.2 mm and 3.0 mm that quantified the first flush effect and recommended design criteria for LID facilities [16]. Shen et al. (2016) investigated the spatial variation and particle-size distributions of surface dust as well as the characteristics of runoff water first-flush effects (FF30) and the EMC values of ten common constituents including heavy metals and nutrients. It was concluded that coarser particles contributed to the coarse-phase pollutants found in the runoff, while finer particles constituted the dissolved pollutants. It was suggested that considering the coarse and dissolved pollutants in the runoff is significant to the design of LIDs/BMPs [59]. Morgan et al. (2017) studied the FF of SS from the outlet of an urban residential drainage system in Kimmage, Ireland. Their findings revealed that 11 out of 14 monitored storm events showed a moderate FF effect, while a significant FF effect occurred in a small proportion of events and was more likely to occur for finer solids, particularly the <10 µm fraction. It was emphasized that stormwater BMPs need to be designed in order to treat the entire rain event volume [78]. Seo et al. (2017) developed an integrated management system that was tested against field data collected in a sub-basin of the Gwanpyung-cheon stream in Daejeon, Republic of Korea. Continuous monitoring results indicated that the first four hours of runoff exhibited higher concentrations than normal levels at the study site, which could be useful in determining the necessary volume for efficient stormwater runoff treatment [79].
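Depth-based first flush criteria such as those reported above can be translated into a storage volume once a catchment area is known. The Python sketch below shows only this unit conversion. The 1 ha catchment area and the runoff coefficient of 0.9 are hypothetical values chosen for illustration, and the rainfall-to-runoff conversion by a single coefficient is a simplifying assumption that is not taken from the reviewed studies.

```python
def runoff_depth_from_rainfall_mm(rainfall_mm: float, runoff_coefficient: float) -> float:
    """Convert rainfall depth to runoff depth with a single volumetric
    runoff coefficient (a simplifying assumption for illustration)."""
    if not 0.0 <= runoff_coefficient <= 1.0:
        raise ValueError("runoff_coefficient must lie in [0, 1]")
    return rainfall_mm * runoff_coefficient


def capture_volume_m3(runoff_depth_mm: float, catchment_area_m2: float) -> float:
    """Storage volume (m3) needed to capture a runoff depth over a catchment."""
    if runoff_depth_mm < 0 or catchment_area_m2 < 0:
        raise ValueError("depth and area must be non-negative")
    # 1 mm of runoff over 1 m2 equals 0.001 m3 (1 L).
    return runoff_depth_mm / 1000.0 * catchment_area_m2


AREA_M2 = 10_000.0  # hypothetical 1 ha catchment

# Runoff-depth range of 1.2 to 3.0 mm applied to the hypothetical catchment.
for depth_mm in (1.2, 3.0):
    print(f"{depth_mm} mm runoff -> {capture_volume_m3(depth_mm, AREA_M2):.1f} m3")

# A 7.5 mm rainfall lower limit with a hypothetical runoff coefficient of 0.9.
depth_mm = runoff_depth_from_rainfall_mm(7.5, 0.9)
print(f"7.5 mm rainfall -> {capture_volume_m3(depth_mm, AREA_M2):.1f} m3")
```

Under these assumptions, the runoff-depth range corresponds to roughly 12 to 30 m3 of storage per hectare. Actual sizing would depend on site-specific monitoring, as the reviewed studies emphasize.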
The study of Zhao et al. (2018) focused on the role of LID in the generation and control of urban diffuse pollution within paired urban micro-drainage units in China. Findings revealed that the urban sediment accumulation process, the characteristics of its removal by rainfall runoff, water quality of the surface runoff, and discharge loads were comparable between the paired urban drainage units. Results indicated that LID practices could influence the urban sediment dynamic build-up processes, wash-off, and transport processes [65].
Castañer and Pellegrino (2020) investigated the efficiency of Green Infrastructure Technologies (GIT) and vegetated systems that recover natural functions in impaired watersheds in reducing storm water diffuse contamination as part of the Jaguaré's Watershed Restoration Pilot Program initiative in São Paulo, Brazil. Results reported that the water quality volumes (WQVs) increased in storms of medium-to-high initial rainfall intensities. The study suggested the use of both the maximum effluent concentration and pollutant removal targets for bioretention facility sizing in humid sub-tropical and tropical rainfall distribution settings [17].
Khan et al. (2021) characterized TP and TS particle size distributions and associated fluxes from urban catchments in the City of Cambridge, Massachusetts, and found that TP and TS in stormwater were not uniformly distributed among particle size fractions. It was therefore suggested that one feasible BMP design strategy is to combine both flow and particle size criteria [58].
Gustafson et al. (2021) studied stormwater quality in three urban watersheds in Denver that had undergone rapid infill redevelopment for about a decade. Results showed no significant differences in local, neighborhood-scale EMCs among the three sites. However, the EMC values were significantly different from established city-wide values. This finding suggested that local-scale stormwater sampling provides a more accurate representation of NPS pollution coming from urban areas [38].
According to Tirpak et al. (2020), permeable interlocking concrete pavement provided significant reductions of sediment and particulate nutrients during a 15-month monitoring period. Results demonstrated that the pollutant removal capacity of permeable pavement remained effective years after construction. However, the authors highlighted that timely maintenance, which removes seasonally deposited pollutants throughout the year, could further improve performance [63].
In general, based on the reviewed articles on the application of BMPs and LID, their efficiency varies due to several factors including land use, climatic conditions, size of catchment area, selection of appropriate LID technologies, and maintenance strategies.

Challenges and Future Perspectives
The first flush phenomenon poses a grave environmental challenge, especially to areas with highly developed or urbanized catchments. The bibliographic information retrieved from published literature revealed that the first flush phenomenon remains a relevant topic in NPS or diffuse pollution research; however, 60% of the publications in this subject area originated from the United States, China, and South Korea. This indicates that further advancements in first flush research can still be achieved by promoting international collaboration among researchers and institutions from different countries. One of the major challenges identified in this study is the relatively low scientific productivity of developing countries. Yang et al. (2017) emphasized that developing countries have limited international research cooperation due to socioeconomic constraints [80]. Given these financial limitations, the provision of additional research funding for developing nations can be a major contribution to stormwater management research initiatives.
Land use and land-use changes significantly affect the types and amount of pollutants in stormwater runoff [81]. Despite the known effects of catchment characteristics in runoff water quality, a majority of the studies conducted for analyzing the effects or confirming the occurrence of the first flush phenomenon only focused on urban catchments. The occurrence of the first flush phenomenon in other land use types remains relatively unexplored, and thus, future research endeavors focusing on the runoff characteristics from various land use compositions are recommended. Studies that focus on the presence of emerging pollutants (e.g., pharmaceuticals and personal care products, industrial additives, microplastics, etc.) in the first flush are also scarce. Since these compounds are detrimental to both runoff quality and human health, further research works are still necessary to understand their behavior in stormwater runoff.
Lastly, the articles reviewed indicated that the characteristics of first flush could be used as a criterion in designing stormwater runoff treatment systems for NPS pollution control, and that quantifying the first flush effect is critical to the design, sizing, and optimization of LID/BMP facilities. Overall, the bibliometric and comprehensive analyses conducted in this study led to the identification of existing knowledge gaps that can be addressed through further research.

Conclusions
The bibliometric analysis conducted in this study determined the trends and significant research associated with first flush in urban catchments. The generated network map of keywords, years, and countries revealed that different terms were used for the first flush phenomenon, including "first-flush", "first flush effect", and "first flush runoff". The network map also showed the emergence of LIDs and BMPs as a focus of study with regard to first flush. Although most studies on first flush were conducted in countries with temperate climates, such as the United States, China, South Korea, and several European countries, growing collaboration among countries from different regions was also observed through the network map. The Sankey diagram revealed that the concept of first flush has been continuously evolving. The strong relationship between stormwater management and LID optimization, denitrification, and modeling showed the continuous growth of first flush as a concept.
Upon conducting the comprehensive review, the significant characteristics of first flush in urban catchments were determined. Nonetheless, the review was limited to research articles that were downloadable from the Scopus database. Articles that matched the keyword search but were not downloadable were excluded. This comprehensive review identified differences in the quantitative definition of first flush among several sources. However, a definition sketch has recently been developed as an attempt to characterize the occurrence of first flush in past studies and future research initiatives. Upon reviewing research articles with available data on FF ratios, the presence of first flush was seen in numerous studies. The generated box plot of FF values showed that in the reviewed studies, a large percentage of the cumulative pollutant mass was present in the initial 20%, 25%, and 30% of the cumulative runoff volume, indicating the occurrence of first flush. Nevertheless, the values for FF20, FF25, and FF30 each had a wide range, indicating that the characteristics of first flush in terms of flow and concentration are site- and climate-specific.
It was revealed in the comprehensive review that the characteristics of first flush vary greatly depending on the geographical location of the site, climatic conditions, and the pollutants being quantified. Thus, it can be concluded that designing efficient LIDs and selecting BMPs for a specific land use would require initial monitoring, runoff sampling, and first flush analysis in the study or project site. Furthermore, a set of guidelines on the characteristics of first flush in different locations is seen as a potential major contribution to the development of land use-specific stormwater management techniques, which can be beneficial to the improvement of water quality in different regions across the globe.