Response of Chironomids (Diptera, Chironomidae) to Environmental Factors at Different Spatial Scales

Simple Summary Chironomids are probably the most speciose family among aquatic insects, colonizing almost all freshwater habitats. The emphasis of the present research is on: (1) taxonomic composition as the most efficient tool for describing biodiversity in natural habitats; and (2) natural habitat type as the most important factor able to explain different chironomid assemblages. This is because the habitat type summarizes a combination of different biotic and abiotic factors present at each site, determining taxa assemblages. Different spatial scales are often proposed as relevant factors for modeling community composition, but with regard to chironomids, spatial factors act only at a small scale, while environmental factors are the most important determinants for species distribution. Abstract Factors responsible for species distribution of benthic macroinvertebrates, including responses at different spatial scales, have been previously investigated. The aim of the present research was to review the most relevant factors explaining chironomid species distribution focusing on factors operating at different spatial scales, such as latitude, longitude, altitude, substrate, salinity, water temperature, current velocity, conductivity, acidity, dissolved oxygen, nutrient content etc. acting at regional levels and at a large or small water basin level. Data including chironomid species abundances from different lentic and lotic waters in Italy and other surrounding countries were analyzed using partial canonical correspondence analysis (pCCA) and multiple discriminant analysis (DISCR). Spatial analyses, including univariate Moran’s I correlograms, multivariate Mantel correlograms and Moran’s eigenvector maps (MEMs), were thereafter carried out. The results showed that habitat type, including different types of lotic waters (i.e., kryal, crenal, rhithral, potamal) and different lake types (i.e., littoral, sublittoral, profundal zones), is the most significant factor separating chironomid assemblages, while spatial factors act only as indirect influencers.


Introduction
Chironomids are amongst the most speciose families of aquatic insects, with about 15,000 described species, contributing to a large portion of the insect diversity in the benthos.Their ecological tolerance varies among species, with some being tolerant to extreme environmental conditions [1].Various species have been shown to respond to a broad suite of different environmental (=abiotic) factors, such as water temperature, salinity and sediment composition [2], although biotic factors (competition, predation) are also important [3].Furthermore, interactions between different combinations of abiotic and biotic factors as well as spatial and temporal variables are complex, making the species distributions difficult to interpret.Such complexities are further exacerbated in environmentally unstable aquatic habitats such as those in the Mediterranean [4], which are known to show broad hydro-morphological variations among years.The hydrological regime of those habitats will likely become even more unstable due to global climate change [5].
Historical biogeographic factors are known to influence chironomid species composition in aquatic habitats at large (continental) spatial scales [6], but in more specific habitats such as headwater streams, it can be supposed that environmental factors may act at a smaller spatial scale [7].Despite this described complexity, often only a few key factors may account for the largest portion of observed variation [8].The scarcity of samples and data available for study coupled with different sampling methods and tools and differing taxonomic levels used (i.e., family, genus, species), produces further uncertainty in interpreting results.Many studies considered responses at different spatial scales [9], but few studies evaluated the responses of chironomids at different spatial scales [10], and they were primarily focused on small spatial scales [11,12].
The aim of the present study is to confirm the relevance of environmental factors acting at small spatial scales and to assess the potential importance of factors acting at larger spatial scales by using traditional multivariate methods, such as partial canonical correspondence analysis (pCCA) and multiple discriminant analysis (DISCR), and more recent tools used to analyze spatial data based on Moran's eigenvector maps (MEMs) [13].

Materials and Methods
The input data matrix included 788 study sites in 11 different habitat types (Table 1) encompassing lotic and lentic water bodies; the former were classified according to Illies and Botosaneanu [14], and the latter according to Buraschi et al. [15] and Tartari et al. [16].
Table 1.Brief description of the habitat types sampled, with abbreviations used in figures (codes) and the number of sites sampled.See [15,16] for more details.

AL03
Large lakes with 100 km 2 surface, maximum depth above 120 m, below 800 m a.s.l.altitude, including littoral, sublittoral and profundal zones 68 AL04 Lakes with maximum depth < 15 m, below 800 m a.s.l.altitude, polymictic, without a clear thermal stratification 8
The samples were collected with different sampling techniques according to habitat type.Kick sampling was used in kryal, crenal and rhithral habitats, drift nets were used in potamal stretches of rivers, PONAR grabs were used in different lake types, Ekman grabs were used in Alpine lakes.Species abundances were standardized to 1 m 2 bottom surface to allow comparison between samples collected in different habitat.Samples were collected mostly in Italy [2,8], and secondarily in other countries: Austria, Germany [17], Switzerland [18,19], Montenegro [20], Greece [21], Algeria [22,23] and France (Garonna River).The maps with sampling sites were prepared using QGIS version 3.34.1 Prizren (2009) (https://www.qgis.org/it/site/,accessed on 1 February 2024) [24].
Raw data comprised 19,531 samples collected in 788 study sites with 255 variables, i.e., 22 environmental factors/variables and 233 chironomid species.Only samples including values of at least ten variables and variables present in at least 100 samples were considered for analysis.This selection left 782 sites with 101 species and nine environmental variables: altitude (altit) (m a.s.l.), distance from the source (dist) (km), conductivity (cond) (µS cm −1 ), dissolved oxygen (O 2 ) (mg L −1 ), oxygen saturation (O 2 %) (% saturation), water temperature (temp) ( • C), pH, total phosphorous (TP) (µg L −1 ) and ammonium hydroxide (NH 4 -N) (mg L −1 ).Latitude and longitude expressed in the coordinate reference system WGS 84/UTM zone 32, three spatial scales (regional, large and small water basin), habitat type, year and month were included as factors in the database.Mean values of the variables calculated for each of the 39 large water basins are provided in Supplementary Table S1.
To analyze factors responsible for species distribution in the presence of potential influence of spatial factors, data were analyzed with a pCCA after log(x + 1) transformation of the species matrix and standardization of environmental variables.The pCCA was carried out with the species matrix as dependent variables, the nine environmental variables as constraining variables (independent variables) and spatial coordinates as conditioning variables (covariates).Forward and backward selection of environmental variables was performed to select the variables accounting for the largest source of variation.
To confirm the results of pCCA, an inverse pCCA (pCCAi) was calculated, with spatial variables as constraining factors and environmental variables as conditioning factors; in other words, constraining and conditioning factors were exchanged.
To confirm the importance of habitat type in driving species composition, a DISCR analysis was performed with species as dependent variables and habitat type as a discriminant factor.Data were analyzed using the R-package 'vegan' [25].
To analyze the influence of spatial factors, the sites having the same spatial coordinates, but sampled on more than one date, were pooled into a single record, leaving a matrix with 264 sites, with 54 species present in at least 50 sites.
The analysis of spatial factors is needed because both environmental variables and species abundances can be influenced by values of the same variable measured at the surrounding sites.A correlogram measures this influence at increasing spatial scale.Univariate Moran's I spatial correlograms were calculated to analyze the response of each environmental variable and of each species to increasing spatial scale.Multivariate Mantel's correlograms were also calculated to analyze the spatial influence on the full set of environmental variables and the full set of species [25].Correlograms were analyzed using the R-packages 'vegan' and 'spdep' [26].
To further examine the species response at different spatial scales, MEMs were calculated using the R function mem, which transforms the distance matrix into an eigenvector matrix, with sites as rows and eigenvectors as columns; the eigenvectors represent the spatial structure at decreasing spatial scales [26].The species matrix was submitted to a multiscale pattern analysis (MSPA) [27] using the R function mspa, to recalculate the spatial structure of the data and to identify the scales of spatial variation more correlated with species.Calculations were performed using the R-packages 'adespatial', 'spdep' and 'vegan'.As a further step, a spatial weighting matrix (SWM) was calculated using the R functions mem.select, listw.candidatesand listw.select[28] to further select the subset of MEM eigenvectors which yields the highest correlation (i.e., the highest adjusted R 2 ) with species matrix.As a last step, a canonical analysis (=redundancy analysis using the function pcaiv in the package 'ade4') was used to explain the structure of the species matrix constrained by MEM spatial variables selected starting from the best SWM (sel.lw$best).See Jombart et al. [27] for a detailed explanation of all of these steps.Details of MEM, MSPA and SWM calculations are can be found in Bauman et al. [28] and Dray [29] and at the web site: https://cran.r-project.org/web/packages/adespatial/vignettes/tutorial.html(accessed on 1 February 2024).
The last step in data analysis was the calculation of diversity using the R-package 'vegan'.To estimate diversity, three spatial scales (i.e., region, large and small water basin) were considered.The mean species number present in each area (=alpha diversity), calculated as the mean of values for each site within the area, the total number of species present in the area (=gamma diversity) and the ratio between gamma and alpha diversity (=beta diversity) were then calculated.

Results
The high number of species found (233) was expected, because of the diverse habitat analyzed, including both running waters and lakes from different countries.The list of species included in the data analysis is provided in Table 2, with the number of samples for each species.The response of species to environmental factors, removing the potential confounding effect of spatial variables, was analyzed using pCCA.The nine environmental variables included as constraining variables accounted for 11% of total inertia, while spatial variables included as covariates explained less than 5% (Table 3a).The inverse pCCAi, with spatial variables as constraining variables and environmental ones as conditioning variables was carried out to establish if spatial variables can have a direct influence on species response, having removed the possible influence of environmental variables.In pCCAi, spatial variables accounted for an even lower proportion of inertia, less than 4% (Table 3b).
The first eigenvalue accounted for 5.5%, and the second eigenvalue for 2.6% of variation in the pCCA (Table 4a) but only 1.1% and <1%, respectively, in the pCCAi (Table 4b); the other eigenvalues accounted for negligible proportions in both analyses.Results of direct pCCA show a clear separation of high-altitude sites with low water temperature, from lowland sites (Figure 1a).The separation was evident along the first axis and was explained by an altitudinal temperature gradient.The second axis emphasized a clear trophic gradient, with sites rich in total phosphorus and poor in oxygen concentration separated from sites with low total phosphorous and high oxygen concentration.
The plot of sites evidenced a clear habitat separation (Figure 1b), with kryal sites grouped to the left and potamal sites at the bottom right.Large and small lakes (AL03-AL06) prevailed on the right.The alpine lakes (ALAlps) at high altitude were plotted near the kryal sites.The separation of lowland lakes (ME04 and ME07) from rhithral sites was less evident, even if rhithral sites were grouped in the center of the graph, while lakes surrounded them and potamal sites were well separated.
Species distribution in the plot (Figure 1c) mirrored that of sites (Figure 1b), with taxa characteristic of kryal and alpine lakes plotted on the left, species characterizing potamal and rhithral plotted on the right.The separation of species preferring crenal habitats or lakes was less evident here.
The pCCAi emphasized that the second axis separated sites and species according to latitude (Y) and longitude (X) (Figure 1d).The axes describing smaller spatial scales are all plotted in the left part of the graph.No clear separation of species according to their preferences is visible; only Tanytarsini are separated better than the species belonging to other tribes, but the species ordination did not separate groups with different ecological needs.
DISCR analysis using habitat types as factors showed a very good agreement between expected and observed classification (Figure 2, Tables 5 and S2).Some habitats such as AL03 and rhithral had more predicted sites than initially assigned, while others such as lakes (AL05-AL06) and potamal had a lower number of predicted sites than assigned, but in general the agreement was very good.These results confirmed direct pCCA results, i.e., that habitat type was a good predictor of chironomid assemblages.
The response of each environmental variable and of each species to increasing spatial scale, analyzed with univariate spatial correlograms, showed that spatial autocorrelation decreased rapidly with increasing distance class for all variables.The autocorrelations were generally low, below 1.The highest autocorrelation of environmental variables was observed for conductivity (Figure 3a), the highest autocorrelation of species was observed for T. gregarius, which appeared to be one of a few species with autocorrelations observable also at larger distances, but it was an exception (Figure 3b).Low values of autocorrelation mean that the value of a variable at a given site is only slightly related to the values in nearby sites and a still lower influence is observed at increasing spatial scales.2.
DISCR analysis using habitat types as factors showed a very good agreement between expected and observed classification (Figure 2, Table 5, Table S2).Some habitats such as AL03 and rhithral had more predicted sites than initially assigned, while others such as lakes (AL05-AL06) and potamal had a lower number of predicted sites than assigned, but in general the agreement was very good.These results confirmed direct pCCA results, i.e., that habitat type was a good predictor of chironomid assemblages.Plot of sites on the first two axes.Sites are grouped according to habitat and plotted with a different color.Abbreviations are as in Table 1.(c) pCCA results.Plot of species on the first two axes.Species abbreviations are as in Table 2.(d) pCCAi results: plot of spatial factors as constraining variables, with environmental variables as conditioning variables, X = longitude, Y = latitude.Species abbreviations are given in Table 2.   1.
This result was confirmed examining the Mantel correlograms, which showed that Mantel's autocorrelation was very low and vanished just after the first distance class; this was evident both for environmental variables (Figure 3c), and for chironomid assemblages (Figure 3d), confirming the absence of spatial autocorrelations except at the lowest distance.1.
This result was confirmed examining the Mantel correlograms, which showed that Mantel's autocorrelation was very low and vanished just after the first distance class; this was evident both for environmental variables (Figure 3c), and for chironomid assemblages (Figure 3d), confirming the absence of spatial autocorrelations except at the lowest distance.The MEMs generated from SWM were calculated and numbered from 1 to 264, with 264 being the number of sites; MEM1 is the eigenvector map associated with the highest eigenvalue and maps the largest distances, MEM264 is the one associated with the lowest eigenvalue and maps the shortest distances.The full matrix of MEMs is given in Table S3.The most significant MEMs were plotted in European maps; the eigenvectors were divided into five classes and are represented by different colors (Figure 4).The MEMs generated from SWM were calculated and numbered from 1 to 264, with 264 being the number of sites; MEM1 is the eigenvector map associated with the highest eigenvalue and maps the largest distances, MEM264 is the one associated with the lowest eigenvalue and maps the shortest distances.The full matrix of MEMs is given in Table S3.The most significant MEMs were plotted in European maps; the eigenvectors were divided into five classes and are represented by different colors (Figure 4).6 with their R 2 values.Values are grouped into five classes, with different colors (blue, green, yellow, orange, red) from highest to lowest; the full MEM matrix is given in Table S3.
The MEM of the first axis was related to latitude, the MEM of the second and third to longitude, while the other axes could not be associated to a well-defined factor.Few significant correlations between environmental variables and MEM eigenvectors were observed (see Table S4).The R 2 correlations between the species matrix and MEM spatial predictors, calculated from the best SWM (see Data analysis), were filed in decreasing order (Table 6).In Table S5, the completed list of R 2 correlations calculated with two different methods (mem.gab.selfrom mem.select and sel.lw$best from listw.select, see Section 2) are given [27].6 with their R 2 values.Values are grouped into five classes, with different colors (blue, green, yellow, orange, red) from highest to lowest; the full MEM matrix is given in Table S3.The MEM of the first axis was related to latitude, the MEM of the second and third to longitude, while the other axes could not be associated to a well-defined factor.Few significant correlations between environmental variables and MEM eigenvectors were observed (see Table S4).The R 2 correlations between the species matrix and MEM spatial predictors, calculated from the best SWM (see Data analysis), were filed in decreasing order (Table 6).In Table S5, the completed list of R 2 correlations calculated with two different methods (mem.gab.selfrom mem.select and sel.lw$best from listw.select, see Section 2) are given [27].
Finally, redundancy analysis (RDA) was carried out to relate the sites x species matrix with the MEM eigenvector matrix.Detailed results of the RDA can be examined in Table S6, where the scores of all species, sites and MEM variables are given.A summary of results is shown in Figures 5 and 6.The separation of sites is still evident according to habitat type (Figure 5); in this case, the lakes are separated and plotted on the left, even if large, small and volcanic lakes are not well separated, kryal and alpine lakes are in the upper part of figure, while rhithral sites are on the right in the center, and potamal sites are at the bottom right.The separation according to habitat is in agreement with pCCA and DISCR analysis, even if the position of each habitat is changed.The species separation is consequent, with the species with highest spatial autocorrelation as T. gregarius, and P. H. choreus have the highest scores in RDA analysis (Figure 6a, Table S6).Species preferring lentic habitat are plotted on the right, cold stenothermal species at the top and species characteristic of lotic habitats at the bottom right.The MEM variables are also plotted (Figure 6b); those with the highest R 2 values are the ones most distant from the center, as is clearly seen when comparing Table 6 with Figure 6b.RDA results carried out using the species matrix constrained by spatial variables: map of sites grouped according to habitat, plotted with different colors.Habitat abbreviations are given in Table 1.

Figure 5.
RDA results carried out using the species matrix constrained by spatial variables: map of sites grouped according to habitat, plotted with different colors.Habitat abbreviations are given in Table 1.Diversity for different habitats and for different spatial scales was then calculated to support the previous conclusions.The highest gamma diversity was observed in rhithral (95 species) and potamal (87) sites, and the lowest in volcanic lakes (28), while the highest alpha diversity was observed in lakes AL06 (16) and the lowest in alpine lakes (ALAlps) (13).The highest beta diversity was observed in rhithral (6) habitats and the lowest in volcanic lakes (2).A classification based on water basins shows that the highest gamma diversity (93) was observed in the Sarca basin, and the highest beta diversity in Adda (7); at a lower spatial scale, the highest gamma diversity (77) was observed in Lambro stream, and the highest beta diversity (5) in Bormida stream (see Table S7 for detailed results).The gamma and beta diversity decreased from the largest to the smallest spatial scale, respectively measuring 67 and 4.6 at the largest scale (regional), 47 and 3.2 at the intermediate scale (large water basin) and 26 and 1.8 at the smallest scale (small water basin).

Discussion
Our study showed that environmental variables such as substrate, salinity, water temperature, current velocity, conductivity, acidity, dissolved oxygen, nutrient content, etc., can be summarized in a single factor, i.e., habitat type, classified as kryal, crenal, rhithral, potamal zones of running waters, littoral, sublittoral, and profundal zones of lakes, summarizing preferences for different factors interacting with each other.Habitat type is able to explain the chironomid species composition, while geographic (i.e., latitude, longitude, altitude) and spatial factors (i.e., source distance, water depth) have only an indirect influence, corroborating results of previous studies [2,8].In fact, pCCA emphasized that the highest proportion of inertia was accounted for by environmental variables summarized in the habitat type (Figure 1b), while spatial factors (i.e., latitude, longitude and their polynomial expansions) accounted for a much smaller proportion of inertia.DISCR analysis of the species matrix carried out using habitat as the discriminant factor confirmed that there was very good agreement between the 'a priori' and 'a posteriori' classification based on habitat type.
The hypothesis that spatial factors at different scales could significantly influence chironomid response was tested with spatial correlograms and with Moran's maps.At small Diversity for different habitats and for different spatial scales was then calculated to support the previous conclusions.The highest gamma diversity was observed in rhithral (95 species) and potamal (87) sites, and the lowest in volcanic lakes (28), while the highest alpha diversity was observed in lakes AL06 (16) and the lowest in alpine lakes (ALAlps) (13).The highest beta diversity was observed in rhithral (6) habitats and the lowest in volcanic lakes (2).A classification based on water basins shows that the highest gamma diversity (93) was observed in the Sarca basin, and the highest beta diversity in Adda (7); at a lower spatial scale, the highest gamma diversity (77) was observed in Lambro stream, and the highest beta diversity (5) in Bormida stream (see Table S7 for detailed results).The gamma and beta diversity decreased from the largest to the smallest spatial scale, respectively measuring 67 and 4.6 at the largest scale (regional), 47 and 3.2 at the intermediate scale (large water basin) and 26 and 1.8 at the smallest scale (small water basin).

Discussion
Our study showed that environmental variables such as substrate, salinity, water temperature, current velocity, conductivity, acidity, dissolved oxygen, nutrient content, etc., can be summarized in a single factor, i.e., habitat type, classified as kryal, crenal, rhithral, potamal zones of running waters, littoral, sublittoral, and profundal zones of lakes, summarizing preferences for different factors interacting with each other.Habitat type is able to explain the chironomid species composition, while geographic (i.e., latitude, longitude, altitude) and spatial factors (i.e., source distance, water depth) have only an indirect influence, corroborating results of previous studies [2,8].In fact, pCCA emphasized that the highest proportion of inertia was accounted for by environmental variables summarized in the habitat type (Figure 1b), while spatial factors (i.e., latitude, longitude and their polynomial expansions) accounted for a much smaller proportion of inertia.DISCR analysis of the species matrix carried out using habitat as the discriminant factor confirmed that there was very good agreement between the 'a priori' and 'a posteriori' classification based on habitat type.
The hypothesis that spatial factors at different scales could significantly influence chironomid response was tested with spatial correlograms and with Moran's maps.At small scales, spatial autocorrelation was evident only at the shortest distances, with the possible exception of few species such as T. gregarius, which also showed a moderate autocorrelation at larger distances, but for most species it was negligible also at the shortest distances.Considering large spatial scales, the Moran's maps evidenced a spatial separation of species according to latitude and longitude, but this was interpreted as an artifact, because the sites at the highest altitudes with lower temperature are more frequent in the western part of the sampled area (Western Alps), suggesting a misleading influence of longitude, and the sites at lower latitude suffer an indirect effect of temperature, having higher temperatures [30,31] than the northern sites.
As regards diversity [32], in previous studies, the influence of spatial factors on chironomid diversity [11,12] was analyzed considering the effect of river order.The highest diversity of chironomids was found for streams of intermediate order, suggesting that factors acting at intermediate spatial scales are more relevant than factors at large and small spatial scales, in agreement with the old intermediate disturbance hypothesis [33].We did not test influence of river order on diversity; our data emphasized that beta and gamma diversity decreased at the three different spatial scales (regional, large and small water basin), but this was expected given the decreasing extent of the geographic areas.We also observed the highest gamma and beta diversity in rhithral and potamal sites, and the lowest in kryal habitats, again confirming the importance of habitat in explaining chironomid distribution.
Taxonomic composition was considered too cumbersome in ecological studies in recent years by many authors, who tried to substitute or augment taxonomic composition with functional composition descriptors [34,35] or with species trait analysis [36,37].The choice between taxonomic, functional and trait composition influenced the interpretation of results at different spatial scales.For example, it was emphasized [5] that species trait composition could be less affected by regional scale than taxonomic composition, while ecoregion and season accounted for 20.5% of the variance in functional composition, but only 10.9% of the variance in taxonomic structure [10].In the present case, we considered only taxonomic composition, because in our opinion, trait analysis and functional descriptors are not sufficiently developed to be applied to chironomid species [37].
There are few or no publications that can be compared with our study in considering response of chironomids to different spatial scales.Feld & Hering [10] investigated benthic macroinvertebrates at different spatial scales (ecoregion, catchment, reach, site) in four different countries (Sweden, the Netherlands, Germany and Poland).Mykrä et al. [38] focused their study on the Fennoscandia bioregion with pCCA, and concluded that local factors are prevalent in explaining macroinvertebrate distribution, although regional factors are also relevant at a larger spatial scale.In the present analysis, the area sampled included Italy, France, Switzerland, Austria, Germany, Greece, Montenegro and Algeria, although samples from Italy were largely prevalent and included Western, Central, Eastern Alps, Prealps, lowland Po River, different rivers in Central-South Italy, large and small prealpine lakes and volcanic lakes (Table S1), and the whole area was analyzed at three different spatial scales (regional, large and small water basin).Similar analyses (i.e., pCCA) were used [10,38] as in the current study; however, for various reasons, a direct comparison with our results is complicated.For instance, in those studies, only running water samples were analyzed, while we included both lotic and lentic samples.Furthermore, although both the current and previous studies included species-level data, in the current study, we focused only on chironomid assemblages, while in the studies of Feld & Hering [10] and Mykrä et al. [38], the whole benthic macrofauna was analyzed.Finally, fewer environmental and spatial variables were included in the current study (nine environmental and two spatial variables) compared to the previous studies: in [10], 31 and in [38], 15 environmental variables at four [10] (i.e., mega, macro, meso, micro), or three [38] (i.e., bioregion, ecoregion, drainage) hierarchical extents.Feld & Hering [10] concluded that interactions at different spatial scales confounded the interpretation of the results, but the meso-scale variables accounted for the largest source of variation; Mykrä et al. [38] considered local factors more important.
We can conclude that spatial scales are not relevant in modeling chironomid taxa assemblages.It is well known that at a very large spatial scale, considering wide global areas (Palaearctic, Nearctic, Oriental, Neotropic, Australian), chironomid species composition is quite different [6], but within the restricted area considered (i.e., the Mediterranean-Alpine area), the spatial factors have no influence, or they are mediated by other factors (e.g., altitude, water temperature etc.).Anthropogenic stress was not a focus of the present research [39], but even if many altered sites were included in the dataset, they did not substantially alter our conclusions about the prominent effect of habitat, although the well-known response of chironomids to oxygen shortage and eutrophication (measured as TP) [2,8,18] was confirmed here (Figure 1a).

Conclusions
Considering the response of benthic macroinvertebrates, and of chironomids in particular, to environmental factors, one concern is that important predictors may be missing in the model.This issue can be addressed by considering a more general factor such as habitat type, classified as littoral, sublittoral and profundal zone for lakes, and kryal, crenal, rhithral and potamal zone for running waters, as the best predictor of benthic macroinvertebrates in general and of chironomids in particular.Habitat type summarizes a suite of different environmental variables, such as substrate, salinity, water temperature, current velocity, conductivity, acidity, dissolved oxygen, nutrient content etc., avoiding the risk that some important predictors may be overlooked.The present database was analyzed with multivariate methods (pCCA, DISCR) able to combine different variables into few descriptors, i.e., eigenvectors with high inertia; we found that the eigenvectors with the highest inertia could be easily associated with habitat types.This was also confirmed by the analysis of spatial autocorrelation and Moran's I eigenvector maps, which supported the conclusion that spatial factors act only as indirect drivers determining species composition, at least within an area with an extension comparable to the Mediterranean region.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/insects15040272/s1,Table S1: Mean values of input data matrix used in data analysis, Table S2: Multiple discriminant analysis results, Table S3: Factors and eigenvector map values of 264 sites, Table S4: Correlations between environmental variables and MEM eigenvector values, Table S5: R 2 between species matrix and MEM eigenvector matrix calculated from two different SWM distance matrices, Table S6: MSPA and RDA results from species matrix constrained to eigenvector matrix, Table S7: Alpha, beta and gamma diversity groups according to habitat and stations.

Figure 1 .
Figure 1.(a) pCCA results.Plot of environmental variables on the first two axes.Abbreviations: altitude (altit), distance from the source (dist), conductivity (cond), dissolved oxygen (O2), water temperature (temp), pH, total phosphorous (TP), ammonium hydroxide (NH4).(b) pCCA results.Plot of sites on the first two axes.Sites are grouped according to habitat and plotted with a different color.Abbreviations are as in Table1.(c) pCCA results.Plot of species on the first two axes.Species abbreviations are as in Table2.(d) pCCAi results: plot of spatial factors as constraining variables, with environmental variables as conditioning variables, X = longitude, Y = latitude.Species abbreviations are given in Table2.

Table 5 .Figure 1 .
Figure 1.(a) pCCA results.Plot of environmental variables on the first two axes.Abbreviations: altitude (altit), distance from the source (dist), conductivity (cond), dissolved oxygen (O 2 ), water temperature (temp), pH, total phosphorous (TP), ammonium hydroxide (NH 4 ).(b) pCCA results.Plot of sites on the first two axes.Sites are grouped according to habitat and plotted with a different color.Abbreviations are as in Table1.(c) pCCA results.Plot of species on the first two axes.Species abbreviations are as in Table2.(d) pCCAi results: plot of spatial factors as constraining variables, with environmental variables as conditioning variables, X = longitude, Y = latitude.Species abbreviations are given in Table2.

Figure 2 .
Figure 2. Discriminant analysis: plot of sites using habitat as a discriminant factor; different habitats are marked with different colors.Habitat type abbreviations are given in Table1.

Figure 2 .
Figure 2. Discriminant analysis: plot of sites using habitat as a discriminant factor; different habitats are marked with different colors.Habitat type abbreviations are given in Table1.

Figure 3 .
Figure 3. Univariate correlogram of conductivity (a) and of T. gregarius (b).Multivariate Mantel's correlogram of environmental variables (c) and of species (d).Black squares mean significant values.

Figure 3 .
Figure 3. Univariate correlogram of conductivity (a) and of T. gregarius (b).Multivariate Mantel's correlogram of environmental variables (c) and of species (d).Black squares mean significant values.

Figure 4 .
Figure 4. Plot of six (MEM1, MEM2, MEM3, MEM4, MEM5, MEM23) of the out of ten most significant Moran's I eigenvectors, given in Table6with their R 2 values.Values are grouped into five classes, with different colors (blue, green, yellow, orange, red) from highest to lowest; the full MEM matrix is given in TableS3.

Figure 4 .
Figure 4. Plot of six (MEM1, MEM2, MEM3, MEM4, MEM5, MEM23) of the out of ten most significant Moran's I eigenvectors, given in Table6with their R 2 values.Values are grouped into five classes, with different colors (blue, green, yellow, orange, red) from highest to lowest; the full MEM matrix is given in TableS3.

Insects 2024 , 19 Figure 5 .
Figure5.RDA results carried out using the species matrix constrained by spatial variables: map of sites grouped according to habitat, plotted with different colors.Habitat abbreviations are given in Table1.

Figure 6 .
Figure 6.RDA results carried out using the species matrix constrained by spatial variables.(a) Plot of species.Species abbreviations are given in Table 2. (b) Plot of MEM variables.

Figure 6 .
Figure 6.RDA results carried out using the species matrix constrained by spatial variables.(a) Plot of species.Species abbreviations are given in Table 2. (b) Plot of MEM variables.

Author Contributions:
Conceptualization, B.R.; investigation, B.R. and L.M.; methodology, software, data curation, B.R.; writing-Original Draft Preparation, B.R.; Writing-Review & Editing, L.M. and B.R. All authors have read and agreed to the published version of the manuscript.Funding: This research received no external funding.

Table 2 .
List of species included in data analysis, in phylogenetic order according to the GBIF dataset (https://www.gbif.org/dataset/90d9e8a6-0ce1-472d-b682-3451095dbc5a,accessed on 1 February 2024), with abbreviations used in figures and the number of samples for each species (n).

Table 4 .
Results of partial canonical constrained ordination (pCCA) and its inverse (pCCAi).Eigenvalues of the first CCA axes, and their contribution to the scaled chi-square after removing the contribution of conditioning variables are shown.

Table 5 .
Number of sites originally present in each habitat and predicted in the same or in other habitats.The row sums are the total number of sites originally classified in one habitat; the column sums are the number of sites predicted in the same habitat.

Table 6 .
Moran's eigenvector maps (MEMs), ordered according to R 2 correlation with species matrix using listw.candidatesandlistw.selectfunctions.Only the ten most significant vectors are given.The complete matrix is provided in TableS5.

Table 6 .
Moran's eigenvector maps (MEMs), ordered according to R 2 correlation with species matrix using listw.candidatesandlistw.selectfunctions.Only the ten most significant vectors are given.The complete matrix is provided in TableS5.