Exploring the Interactions of Physical, Chemical and Biological Variables of an Urban River Using Network Analysis

Network analysis was used as a method to investigate the relationship between benthic macroinvertebrates in an urban river watershed and physicochemical variables. The measured physicochemical variables were the dissolved oxygen, temperature, nutrients, conductivity, pH, total organic matter, biochemical oxygen demand and river discharge. The metrics applied in the study were the degree of connections between nodes, the number of edges identified for each study location and the functional feeding groups. The river sampling took place over 14 months and sampling took place at five sites, two of which were upstream of a major wastewater treatment works and three sites were downstream of the works. A biological and environmental (BIOENV) analysis was included as part of the overall analysis to compare the variables that influenced the river ecosystem. This study shows that the relationships between benthic macroinvertebrates were stronger at the upstream locations of the watershed, while the downstream locations were controlled by the physicochemical relationships. From this analysis, the river quality and biodiversity were mainly controlled by the discharge, conductivity and availability of relevant organic matter suitable for organisms. Through the network, the degree of connections between the variables revealed the status of the urban river and provided insight into the possible management of vegetation cover across the urban watershed.


Introduction
Physical and chemical environmental factors, such as temperature, discharge, conductivity, nutrients and organic matter, can influence the biological components of rivers, including benthic macroinvertebrates. The metrics used to measure stream health include functional feeding groups [1][2][3], community diversity and the degree of connection between the physicochemical variables and invertebrates [4]. However, increasing human population growth and the associated agricultural and urban demands put large amounts of pressure on freshwater resources in terms of both the quality and quantity needed to satisfy human needs [5][6][7][8][9][10][11][12]. These pressures on rivers modify and sometimes limit the biodiversity abundance, which consequently impacts on the ecosystem system services provided by these rivers [10,13,14]. The impact of point and non-point source pollution on rivers' biodiversity, richness and abundance have been widely studied [15][16][17][18][19]. Due to climate change, freshwater resources have become more stressed and most parts of the world [20,21], including the U.K. [11], will be challenged by changing environmental conditions. For these reasons, freshwater resources are continually influenced by multiple stressors, which means it is important to understand these stressors from an ecological perspective. To understand these multiple stressors, interactions between benthic macroinvertebrates and physicochemical variables have been studied using multivariate analysis [22][23][24]. In recent years, network analysis has been applied to understand complex interactions in diverse fields. For example, the method has been applied to animal social networks and disease transmission networks [25][26][27] in genomics as they allow for complex gene transcription factor and protein component interactions to be identified [28][29][30]. In psychology, network analysis has been used as an important concept and analytical approach [31], where social network analysis has been applied [32] to explain mutualisms and correlations. Thus, constituent variables can mutually influence other variables without the need to hypothesise the existence of causal latent variables [33]. In environmental management, social network analysis had been utilised to explain natural resources governance and to understand the circulation and local management of plant propagative materials. The results have shown that there are connectors involved in natural resource management, which allow for predictions to be made that might affect resource management [34].
In ecological studies, network analysis has been applied in the assessment of microbial communities using genomic indicators, e.g., [35,36]. In plant science, network analysis has been explored to progress research in above-ground and below-ground ecological interactions [37]. Network analysis has been applied in marine ecology to study the structure of energy flows between ecosystem components. Network analysis has been used to reveal major exchanges and dependencies between the major marine subsystems, and as a means to quantify the architecture of ecological complexity [4].
As a new source analysis tool in ecology, networks can link numerous components that interact with one another in a system. The combination of edges and nodes can be applied to the process of finding a solution that will be optimized such that it is interpretable via clearer physical meanings [4,31]. However, applications of this model to identify the interactions between variables in river environments have rarely been reported.
The study aimed to identify the interactions between water quality variables and benthic macroinvertebrate communities by using network analysis and to determine which environmental variable(s) influenced the urban river organisms. The results provided insight into the characteristics of the pollution sources at the Medlock River catchment and can contribute to the development of effective water quality protection strategies.  56.544 ). The river has a catchment size of 65.9 km 2 , an annual mean flow of 0.86 m 3 /s and a maximum altitude of 376 m. The major land cover patterns in the catchment include grassland, woodland and an urban extent (transport networks, commercial and residential areas). The river has a continuously operational wastewater treatment works (WwTW) at Failsworth EPSG:4326 WGS 84 (latitude: 53 • 29 41.282 , longitude: 2 • 9 25.615 ), as well as numerous combined sewer and stormwater overflows (United Kingdom's Environment Agency, personal communication) and surface water drains. As with other urban rivers in the Greater Manchester area, human settlements in this catchment date back as far as Roman times [38]. With increased urbanisation and the need to control water flow due to flood events, most sections of the river have been modified, with a resultant effect on the river quality [39]. The River Medlock is recorded as one of the rivers in England that have not fully complied with the "good ecological status" required by the European Union's Water Framework Directive [40] for surface waters. A high concentration of phosphate from combined sewer overflows and wastewater treatment works is a major challenge [23]. Other challenges include channel modifications, canalisation, urban runoff, leakages Water 2020, 12, 2578 3 of 17 and agricultural effluent, which impact the river's diversity and abundance [39]. Further studies on the land cover for this case study can be found in [14].

Samples
Water samples were obtained from March 2013 to April 2014 from 14 field campaigns carried out upstream and downstream of combined sewer overflows and the wastewater treatment works of the River Medlock ( Figure 2). The sites were numbered with increasing site number based on the distance from the source following the direction of the river; S1 and S2 were above the WwTW, while S3 to S5 were below the works as the river flowed towards Manchester City Centre. The sites were also chosen by considering the temporal variation with spring (March to May), summer (June to August), autumn (September to November) and winter (from December to February). Table 1 shows the characteristics of the sites examined in the study. River substrates obtained from the river were also included in the analysis. For further information on the study locations, see [23]. Sites 1 and 2 had a lower urban cover than those located downstream of the Wastewater Treatment Works (WwTW), where sites 3-5 were considered to have increasingly urbanized catchment areas in that order. These secondary data contained 32 benthic macro-invertebrate families, which constituted 13,664 invertebrate sample counts and physicochemical variables made up of 770 samples (14 months × 5 sites × 11 variables) obtained from S1 to S5. Data was measured for physicochemical variables, including pH; dissolved oxygen (percentage saturation); conductivity (µScm −1 ); river discharge (m 3 s −1 ); temperature ( • C); concentrations (mg/L) of nitrate-N (nitrate), phosphate-P (phosphate), total organic matter (TOM), ammonia-N (ammonia), biological oxygen demand (BOD) and suspended solids. All variables were analysed according to the standard analytical requirements [41]. The benthic macroinvertebrate samples were obtained through the standard three-minute kick sampling techniques and identified at the taxonomical family level [42].
Other physical variables in Table 1, including the catchment area, altitude and slope, were determined using information obtained from the ArcGIS Geoprocessing toolbox [43].
up of 770 samples (14 months × 5 sites × 11 variables) obtained from S1 to S5. Data was measured for physicochemical variables, including pH; dissolved oxygen (percentage saturation); conductivity (µScm −1 ); river discharge (m 3 s −1 ); temperature (°C); concentrations (mg/L) of nitrate-N (nitrate), phosphate-P (phosphate), total organic matter (TOM), ammonia-N (ammonia), biological oxygen demand (BOD) and suspended solids. All variables were analysed according to the standard analytical requirements [41]. The benthic macroinvertebrate samples were obtained through the standard three-minute kick sampling techniques and identified at the taxonomical family level [42]. Other physical variables in Table 1, including the catchment area, altitude and slope, were determined using information obtained from the ArcGIS Geoprocessing toolbox [43].

Data Analysis
The physicochemical variables and benthic macroinvertebrates for each taxonomical family were normalised to allow for a comparison of the counts between sample dates and sites. The abundance values were normalised by dividing the number of organisms counted for one family by the total number of all organisms found at one site for one date.
To produce an interpretable and ecologically informative network graph, a minimum correlation coefficient (ρ) threshold required to remove lower-scoring associations was set and established by testing a range of correlation coefficients, namely, ρ < 0.4, ρ > 0.6 and ρ = 0.9. At ρ < 0.4, the threshold value was found to be too low and showed many edges in the merged network; at ρ = 0.9, it was observed that the threshold value was too high and returned a merged network with only 10 edges, and thus, the graph was uninterpretable. A final threshold value of ρ > 0.6 was chosen to obtain the edges recorded in the final merged network between and within each site. The mean percentage composition and standard deviation of each species for each sampling date and site were

Data Analysis
The physicochemical variables and benthic macroinvertebrates for each taxonomical family were normalised to allow for a comparison of the counts between sample dates and sites. The abundance values were normalised by dividing the number of organisms counted for one family by the total number of all organisms found at one site for one date.
To produce an interpretable and ecologically informative network graph, a minimum correlation coefficient (ρ) threshold required to remove lower-scoring associations was set and established by testing a range of correlation coefficients, namely, ρ < 0.4, ρ > 0.6 and ρ = 0.9. At ρ < 0.4, the threshold value was found to be too low and showed many edges in the merged network; at ρ = 0.9, it was observed that the threshold value was too high and returned a merged network with only 10 edges, and thus, the graph was uninterpretable. A final threshold value of ρ > 0.6 was chosen to obtain the edges recorded in the final merged network between and within each site. The mean percentage composition and standard deviation of each species for each sampling date and site were also calculated. All initial data processing and analysis were performed using RStudio Team (2015) (RStudio: Integrated Development for R. RStudio, Inc., Boston, MA, USA (Computer Software v0.98.1074) http://www.rstudio.com/).

Creation of the Network Using Ecolo_Works Pipeline
The Ecolo_works pipeline is composed of two novel command-line Python programs, namely, NetworkCreate and NetworkMerger, which were designed for network creation and visualisation. NetworkCreate produces a network file based on a Spearman's rank correlation coefficient matrix between and within variables, and the NetworkMerger combines multiple network files that contain information related to the number of repeated edges between networks. For access to the package programs and full usage instructions, please visit https://github.com/charlesjban/ecolo_works. These files were loaded into the network visualisation software Cytoscape [44,45] to produce a network graph.

NetworkCreate and NetworkMerger
The program NetworkCreate.py takes as input both the correlation coefficient matrix (-cfile argument) and the corresponding p-value matrix (-pfile argument). The program outputs a network file in .csv format, which is a tab-separated text file with each line representing a network edge. While the nodes (physicochemical and biological variables) were represented using rectangles, the edges were represented using solid lines. Users have the option to define a minimum threshold value (cvalue argument, default = 0.5) for both the correlation coefficient (ρ) and p-value (-p value argument, default = 0.05). An additional feature of the program allows users to assign a "type" label for each variable in an additional text file (-varfile argument). NetworkCreate uses this file to label the "interaction type" for each edge and creates an output "node type" file containing a tab-separated table of each node and the respective variable type. Each interaction must surpass thresholds for both the correlation coefficient and p-values to be included in the output network and shows whether the correlation is positive or negative. NetworkCreate was used to build a network for each study site based on the coefficient matrix and the corresponding p-value matrix for each study site.
NetworkMerger.py was designed to take up to nine "network" files (created using NetworkCreate.py) that are obtained as aggregates for the sampling duration, which are then merged into a single network. This network shows the interaction type and the direction of the correlation by indicating where conflicting positive or negative correlations are found and shows where they are different in the networks. The nodes were highlighted in blue for the physicochemical variables and green for biological variables. The edges were coloured grey and red for positive and negative correlations, respectively. In the merged network, the edge line thickness increased for variables with repeated numbers of interactions at different sites. The resulting network files of both NetworkCreate and NetworkMerger (Ecolo_works) were designed to be loaded into network visualisation software, such as Cytoscape [44] and used for visualisation, where variables, such as nodes, edges, shapes and colours can be changed. The loading of the "node type" file assists in the conditional visual formatting based on the node variable type. All networks produced by NetworkCreate for each site were then merged using NetworkMerger to create a single network.

Biological and Environmental (BIOENV) Analysis
The biological and environmental (BIOENV) analysis was used to determine which variable(s) affected the benthic invertebrate communities' abundance and distribution. The BIOENV analysis was based on a weighted Spearman rank correlation coefficient (ρ) between the physicochemical variables and benthic invertebrate communities. Both the environmental and biological variables were transformed and normalised to allow for a comparison at the same scale. The rank with the largest ρ was taken to identify the best match with the benthic macroinvertebrate communities. The BIOENV analysis was performed using Primer v6 [46] and the outcome was used to provide a comparison with the network analysis. The river quality was classified under the European Union Water Framework Directive (EU WFD standards [47]).

Benthic Macroinvertebrates and Physicochemical Variables
A total of 32 benthic macroinvertebrate families were identified at the River Medlock. The dominant families included Baetidae, Chironomidae and Tubificidae. Other taxonomical families identified in limited abundance were Heptageniidae, Gammaridae, Simuliidae, Lumbriculidae, Tipulidae and Paediciidae. Table 2 shows that the mean number of families found in the river was the highest at site 1 (7.769 ± 3.270) and the lowest was at site 5 (4.50 ± 3.28). The mean organism count per site sampling peaked at S3. Five major functional feeding groups (FFGs) were identified for the taxa [1,48,49], including collector-filterers (CFs), predators (Prs), shredders (Shs), collector-gatherers (CGs) and scrapers (Scs). The river was dominated by CFs, with the highest numbers recorded at S1 and S3; scrapers were found only at S1 and S2. Table 2. Benthic macroinvertebrate assemblages identified at S1 to S5, including the mean ± SD for each family identified per sample, total organisms identified per sample and the dominant group for each study site. The families were also classified based on their functional feeding groups (FFGs): collector-filterer (CF), collector-gatherer (CG), predator (Pr), shredder (Sh) and scraper (Sc).

Physicochemical Variables and Benthic Macroinvertebrates
The number of interactions between the biological-biological, biological-physicochemical, physicochemical-physicochemical variables are presented in Table 3, including the number of nodes and edges. Figure 3 shows the individual networks per site and Figure 4 shows the merged networks. At S1, the interactions between the physicochemical variables and the benthic macroinvertebrates were both negative and positive (Figure 3), but at S2 to S5, these were mainly positive. Given these relationships, S1 had the highest number of interactions (34 edges) between the networks, while S5 had the lowest number of interactions (22 edges). The average number of edges for the interactions between the benthic invertebrates-benthic invertebrates was 12 edges, the benthic invertebrates-physicochemical interactions had 7.2 edges and the physicochemical-physicochemical interactions had 6.2 edges. For the merged networks, a total of 43 nodes and 106 edges were recorded in the merged network ( Figure 4). Overall, the network consisted of a highly connected component of benthic invertebrates-benthic invertebrates' interactions with an average of 12 edges. These interactions showed that the organisms tended to interact with each other in the food chain based on the availability (concentrations) of physicochemical variables within the river. These interactions could be derived from taxa sharing similar ecological niches. Table 3. Network node and edge counts for each site's network based on Spearman's correlation coefficient of ρ > 0.6 and where p < 0.05. The counts include the total number of nodes, the total number of edges between each variable and the total number of edges for each site. The positive (+ve) and negative (−ve) correlations are given, as well as the types of edges found at the study sites (B-B: biological-biological, B-PC: biological-physicochemical and P-P: physicochemical-physicochemical). BOD: biological oxygen demand, TOM: total organic matter.

Sites
Total Nodes   Table 4 shows the physicochemical variables that correlated with the benthic macroinvertebrate families based on the number of correlations in the following descending order: discharges (5 families), dissolved oxygen (5), BOD (4), temperature (4), pH (3), conductivity (2), phosphate (2), ammonia (1) and suspended solids (1). After analysing the functional composition of the families, it was found that the collector-filterers were the most abundant, with a 40% contribution dominated by Oligochaetes, Simuliidae, Hydropsychidae and Polycentropodidae. The predators ranked second with 27% represented by the following families: Dytiscidae, Gammaridae and Erpobdellidae. Shredders and collector-gatherers both ranked third with a 13% contribution each, while scrapers contributed the least to the group with an abundance of 7%. Small variations were found between sites, as shown in Table 2. Predators were found at all sites except site 1; scrapers were found at sites 1, 2 and 5; shredders were found at sites 2 and 4. Table 4. Nodes (benthic macroinvertebrates) and the correlated edges (physicochemical variables) for the studied river. The numbers in brackets signify the connections between the nodes and the individual edges. The families were also classified based on their functional feeding groups (FFGs): collector-filterer (CF), collector/gatherer (CG), predator (Pr), shredder (Sh) and scraper (Sc).  . Visualised network of the Spearman's rank correlation coefficients for sites 1 to 5 aggregated over the sampling period. Each node represents a variable at the site, where blue nodes represent physicochemical variables and green nodes represent macroinvertebrate counts. Each edge represents a significant (p < 0.05) correlation coefficient value between the variables, where ρ > 0.6. The networks were created with the novel NetworkCreate.py program using a correlation matrix for each site, which was created using the R rcorr function [51]. . Visualised network of the Spearman's rank correlation coefficients for sites 1 to 5 (a merged network). Each node represents a variable at the site, where blue nodes represent physicochemical variables and green nodes represent macroinvertebrate counts. The grey and red edges represent positive and negative interactions, respectively. Each edge represents a significant (p < 0.05) correlation coefficient value between the variables, where ρ > 0.6. Edges that were found at more than one site have thicker lines. Figure 3. Visualised network of the Spearman's rank correlation coefficients for sites 1 to 5 aggregated over the sampling period. Each node represents a variable at the site, where blue nodes represent physicochemical variables and green nodes represent macroinvertebrate counts. Each edge represents a significant (p < 0.05) correlation coefficient value between the variables, where ρ > 0.6. The networks were created with the novel NetworkCreate.py program using a correlation matrix for each site, which was created using the R rcorr function [51].

S5
Water 2020, 12, x FOR PEER REVIEW 11 of 17 Figure 3. Visualised network of the Spearman's rank correlation coefficients for sites 1 to 5 aggregated over the sampling period. Each node represents a variable at the site, where blue nodes represent physicochemical variables and green nodes represent macroinvertebrate counts. Each edge represents a significant (p < 0.05) correlation coefficient value between the variables, where ρ > 0.6. The networks were created with the novel NetworkCreate.py program using a correlation matrix for each site, which was created using the R rcorr function [51]. . Visualised network of the Spearman's rank correlation coefficients for sites 1 to 5 (a merged network). Each node represents a variable at the site, where blue nodes represent physicochemical variables and green nodes represent macroinvertebrate counts. The grey and red edges represent positive and negative interactions, respectively. Each edge represents a significant (p < 0.05) correlation coefficient value between the variables, where ρ > 0.6. Edges that were found at more than one site have thicker lines. Figure 4. Visualised network of the Spearman's rank correlation coefficients for sites 1 to 5 (a merged network). Each node represents a variable at the site, where blue nodes represent physicochemical variables and green nodes represent macroinvertebrate counts. The grey and red edges represent positive and negative interactions, respectively. Each edge represents a significant (p < 0.05) correlation coefficient value between the variables, where ρ > 0.6. Edges that were found at more than one site have thicker lines.

S5
The interactions displayed in Figure 4 show thick, stronger edges for biological-biological variables, including Simuliidae and Hydropsychidae, and between Rhyacophilidae, Limnephillidae and Ephemerellidae. Among these organisms, Hydropsychidae and Simuliidae were the only organisms with repeated interactions at more than two sites, correlating positively with ρ = 0.69, ρ = 0.61 and ρ = 0.63 at S1 to S3, respectively. Among the taxa groups, the insects had the strongest interactions.
The biological-biological relationship between the benthic macroinvertebrates was the highest at site 2. Site 3 had biological-physicochemical as its lowest interaction and physicochemical-physicochemical as its highest interaction ( Figure 5).
Water 2020, 12, x FOR PEER REVIEW 12 of 17 The interactions displayed in Figure 4 show thick, stronger edges for biological-biological variables, including Simuliidae and Hydropsychidae, and between Rhyacophilidae, Limnephillidae and Ephemerellidae. Among these organisms, Hydropsychidae and Simuliidae were the only organisms with repeated interactions at more than two sites, correlating positively with ρ = 0.69, ρ = 0.61 and ρ = 0.63 at S1 to S3, respectively. Among the taxa groups, the insects had the strongest interactions.
The biological-biological relationship between the benthic macroinvertebrates was the highest at site 2. Site 3 had biological-physicochemical as its lowest interaction and physicochemicalphysicochemical as its highest interaction ( Figure 5).  Table 3 shows that five variables had a major influence on the physicochemical conditions of the study area in the following descending order: ammonia concentration, temperature, dissolved oxygen, BOD and discharge. The ammonia concentration correlated with six variables (suspended solids, pH, discharge, phosphate concentration, conductivity and temperature), the temperature correlated with four variables (i.e., dissolved oxygen and the phosphate, nitrate and ammonia concentrations), the dissolved oxygen correlated with three variables (phosphate and nitrate concentrations and temperature), the BOD correlated with two variables (pH and conductivity) and the discharge correlated with two variables (TOM and ammonia concentration). Figures 4 and 5 show that the discharge correlated positively with ammonia concentration at sites 2 and 4 (ρ = 0.78 and ρ = 0.63, respectively), and with the total organic matter concentration for the same sites (ρ = 0.69 and ρ = 0.69, respectively). The ammonia concentration was also positively associated with the suspended solids concentration at sites 1 and 3 (ρ = 0.71 and ρ = 0.63, respectively), and negatively associated with pH at sites 3 and 4 (ρ = −0.64 and ρ = −0.65, respectively). A negative correlation between the temperature and dissolved oxygen levels was recorded at all sites, except site 2 (S1, S3 to S5: ρ = −0.70, ρ = −0.80, ρ = −0.89 and ρ = −0.65, respectively). The temperature and phosphate concentration were negatively associated at site S2 (ρ = −0.63). At sites 3 to 5, positive associations were seen between the temperature and phosphate concentration (ρ = 0.77, ρ = 0.74 and ρ = 0.72, respectively), between the temperature and nitrate concentration (ρ = 0.83, ρ = 0.64 and ρ = 0.77, respectively) and the temperature and phosphate concentrations (ρ = 0.84, ρ = 0.75 and ρ = 0.90, respectively). The individual networks at each site enabled the interactions to be viewed clearly, while the merged networks ( Figure 4) showed repeated edges, especially for variables that occurred frequently.  Table 3 shows that five variables had a major influence on the physicochemical conditions of the study area in the following descending order: ammonia concentration, temperature, dissolved oxygen, BOD and discharge. The ammonia concentration correlated with six variables (suspended solids, pH, discharge, phosphate concentration, conductivity and temperature), the temperature correlated with four variables (i.e., dissolved oxygen and the phosphate, nitrate and ammonia concentrations), the dissolved oxygen correlated with three variables (phosphate and nitrate concentrations and temperature), the BOD correlated with two variables (pH and conductivity) and the discharge correlated with two variables (TOM and ammonia concentration). Figures 4 and 5 show that the discharge correlated positively with ammonia concentration at sites 2 and 4 (ρ = 0.78 and ρ = 0.63, respectively), and with the total organic matter concentration for the same sites (ρ = 0.69 and ρ = 0.69, respectively). The ammonia concentration was also positively associated with the suspended solids concentration at sites 1 and 3 (ρ = 0.71 and ρ = 0.63, respectively), and negatively associated with pH at sites 3 and 4 (ρ = −0.64 and ρ = −0.65, respectively). A negative correlation between the temperature and dissolved oxygen levels was recorded at all sites, except site 2 (S1, S3 to S5: ρ = −0.70, ρ = −0.80, ρ = −0.89 and ρ = −0.65, respectively). The temperature and phosphate concentration were negatively associated at site S2 (ρ = −0.63). At sites 3 to 5, positive associations were seen between the temperature and phosphate concentration (ρ = 0.77, ρ = 0.74 and ρ = 0.72, respectively), between the temperature and nitrate concentration (ρ = 0.83, ρ = 0.64 and ρ = 0.77, respectively) and the temperature and phosphate concentrations (ρ = 0.84, ρ = 0.75 and ρ = 0.90, respectively). The individual networks at each site enabled the interactions to be viewed clearly, while the merged networks ( Figure 4) showed repeated edges, especially for variables that occurred frequently.

BIOENV Analysis
By applying the river characteristics, including the catchment area, altitude and slope, together with all the physicochemical variables and benthic macroinvertebrates (Section 3.1), Table 5 shows that the conductivity, discharge and catchment area were the variables that significantly correlated with the benthic macroinvertebrate communities, with a correlation coefficient of ρ = 0.274. However, with a correlation coefficient of ρ = 0.272 (0.002 less than 0.274), the BIOENV results indicated that conductivity and catchment area were the strongest abiotic variables.

Discussion and Conclusions
The discharge, dissolved oxygen, conductivity and catchment areas were found to be the major variables impacting the river (Sections 3.2-3.4). The discharge had a positive effect on the river ecosystem [52,53] due to the transport of drift, organic matter and food sources for benthic invertebrates and the dilution of contaminants. Although the differences between the upstream (higher altitude) and downstream (lower altitude) locations in the study areas were small (31-140 m), changes in the concentrations between the upper and lower sites were noted. For example, higher levels of conductivity, nutrients and BOD were recorded at the downstream locations (S3-S5), which suggest the impact of increased urban land cover and catchment contributions [47] from the wastewater treatment works' effluent and the intermittent discharge from combined sewer overflows.
The upstream section of the river had higher counts of benthic invertebrates when compared to the lower sections of the river. The distribution of the functional feeding groups (FFGs) ( Tables 2 and 4) showed the dominance of the collector-filterers (40%), which suggested the role of discharge in transporting drift that served as food to the groups [1], indicating the absence of non-forest covers and increased urban cover [14], especially at the more urban sections of the river. Windsor et al. [54] noted that contaminated rivers found in the highly urban sites of South Wales, U.K., were characterised by their reduced taxonomic and functional diversity and simplified food web structure. The Song Stream watershed in Korea [55] showed that streams close to non-vegetative sources had lower taxa richness and abundance, and therefore, were dominated by tolerant colonisers, such as Tubificidae and Hydropsychidae, which dominated the stream due to catchment contributions and sand deposition. The role of catchment contributions in the abundance of collector-filterers was also reported by [56] in a study carried out in a tropical lagoon in West Africa. While the disappearance of scrapers could be linked to the high level of organic matter and other contaminants, some studies [57,58] have shown that scrapers are intolerant of low water quality and high nutrient concentrations. In this study, the scrapers were found at S1 and S2, where the physicochemical concentrations were relatively low but still higher at the downstream sites. The near-equal distribution of collector-gatherers, predators and shredders, along with the abundance of collector-filterers across the watershed, showed that the river was impacted by diverse environmental variables. These were impacted by the discharge, nutrient input, suspended solids, temperature, contribution of contamination from the surrounding watershed, organic matter processing [59] and the role of intermittent discharge in the transport of drift.
Whilst most studies tend to focus on interspecies interactions [60], this study provided a framework using interfamily and physicochemical relationships to visualise the interactions between and within groups of biological variables and physicochemical variables in the same co-occurrence network in a river ecosystem. The results also successfully identified strong associations between physicochemical variables at three of the five study sites tested in the study. One metric used in the study was the degree of connection in terms of the number of nodes [4], which were linked by edges (Table 4). Through the metric, the research revealed that variables that controlled the river quality and invertebrate abundance were the discharge, dissolved oxygen, BOD and temperature. These physicochemical variables were further highlighted by the number of edges that connected the nodes for each sample site ( Figure 5), showing the impact of the physicochemical-physicochemical contribution to the river, especially at the downstream sections. By comparing the variables identified through network and BIOENV analyses, this study found that the variable that influenced the river concentrations at the study sites was the catchment area.
When used with other metrics, such as diversity and functional feeding groups, network analysis provides rich insight into river ecosystems, and in particular, the urban rivers. Network analysis can link many types of data across levels; therefore, it can relate individual taxa at the community or ecosystem levels. These analyses allow for the top ecosystem interactions to be highlighted by removing the low-scoring correlations from the network to provide insight into the sensitivity of the organisms to the river quality and other stressors and to identify subtle interactions. For this reason, network analysis is a powerful tool for analysing complex data relationships, and if considered in terms of ecological metrics [4], it could open up the use of the method in ecological analysis.
Furthermore, through the network visualisation, the relationships between variables are clearer and appealing [61]. The network outputs could provide opportunities for sharing scientific information with diverse groups of people, e.g., during public engagement, for evidence-based advocacy required by decision-makers and with other researchers. Therefore, network analysis has the potential to contribute to a wide range of ecological studies and provides an efficient scientific method as a catchment-based approach to river management.