Using Cilioplankton as an Indicator of the Ecological Condition of Aquatic Ecosystems

Abstract: We assess the quality of surface water in water bodies located in the Middle Volga region (Russian Federation). The water quality is assessed using 19 chemical compounds and cilioplankton indicators, such as the total number of species, the abundance of each species, and, based on both of them, the saprobity index and the Shannon–Weaver diversity index (H). We classify the water quality from polluted to extremely dirty by using abiotic indicators, and from conditionally clean to dirty by means of biotic indicators. Using the logistic regression method, we are able to predict the water quality (clean or dirty) as defined by the species diversity index (H) and to clarify how the quality of the water is related to its physicochemical properties. The seven most significant chemical predictors of both natural origin (mineralization, hydrocarbonates, and chlorides) and natural-anthropogenic origin (organic substances (according to BOD₅), nitrates, total petroleum hydrocarbons, iron), identified during the stepwise selection procedure, have a substantial influence on the outcome of the model. Qualitative and quantitative indicators of development of ciliates, as well as indices calculated on their basis, allow assessing with a very high level of accuracy the water quality and the condition of aquatic ecosystems in general. The Shannon index calculated for the number of ciliates can be successfully used for ranking water bodies as "clean/dirty". Using 10-fold cross-validation with five repeats, it was shown that the model has a prediction accuracy of 75%. This positively characterizes the resulting model.


Introduction
Throughout recent decades, the system of assessment of surface water quality has undergone significant changes. After the adoption of the Water Framework Directive in 2000, bioindication methods have been widely used for this purpose [1]. Phytoplankton, zooplankton, and zoobenthos organisms are used as the main indicator groups. However, microzooplankton organisms, in particular, infusoria, as an indicator group are not taken into account at all in this system. Many researchers admit that infusoria are an important component of the plankton community in freshwater ecosystems [2][3][4][5][6][7][8][9][10][11][12] and a convenient object for characterizing the ecological condition of water bodies by synecological approaches [13,14].
When developing in large numbers, infusoria actively feed on bacterioplankton and phytoplankton, detritus, and small rotifers, thus being an intermediate link between producers and larger consumers, i.e., zooplankton, fish larvae, and others. The communities of ciliates that populate water bodies with various trophic levels and distinct habitats (plankton, benthos, periphyton) differ from each other. Infusoria are highly sensitive to changes in environmental conditions and react to them before other organisms. Under an extreme external influence, ciliates may reduce their number, and under a prolonged action, their species composition can change. A distinctive feature of infusoria is the presence of zoochlorella in many species, which allows them to inhabit biotopes with low oxygen content. Therefore, the use of infusoria allows one to adequately assess the water quality and characterize the ecological condition of water bodies, especially highly polluted ones, where planktonic and benthic organisms practically do not survive.
Despite all the advantages of using this group of protozoa to assess the quality of water bodies, in practice, these organisms are rarely used for monitoring purposes. Some researchers, however, use this group as an indicator of water quality [15,16].
The research aims to study the species diversity in cilioplankton, its quantitative characteristics in water bodies of the Middle Volga region, assess the water quality using cilioplankton indicators, evaluate their reliability, and contrast the assessment with hydrochemical methods.

Study Area
We carried out studies of cilioplankton on different types of water bodies, namely, lakes, rivers, and the reservoir of the Middle Volga basin (Figure 1). The territory of the Middle Volga region is located within the Russian Plain, in the temperate climate zone, at the junction of three zones: broadleaved forests, southern taiga, and forest-steppe. The study area is cut by large European waterways: the Volga, Kama, and Vjatka rivers, with their numerous tributaries, as well as small and large reservoirs on them [17].
The largest water body considered in the study area is the Kujbyshev Reservoir, which is the result of the closure of the Volga River by the dam of the Kujbyshev Hydroelectric Station in the area of the Zhiguli Mountains. The reservoir was filled from the end of October 1955 to May 1957, when the water level reached the normal impound level. It has a total capacity of 57.3 km³ and a useful capacity of 33.9 km³. The catchment area is 1210 km², while the surface area reaches 6.15 thousand km². The maximum width of the reservoir is about 26 km (near the Kama Estuary); its total length is 467 km along the Volga River and 280 km along the Kama River. The average depth is 9.3 m, while the maximum depth reaches 41 m (Volga Hydroelectric Station dam). The total shore length is 2604 km [18].
We conducted the studies in the Kujbyshev Reservoir and its tributaries of different orders: the rivers Vjatka, Steppe Zai, Kazanka, Mesha, Svijaga, Ilet, and its tributary Yushut. We also studied the Raifa Lake for comparison purposes and contrasted the condition of this aquatic ecosystem with other water bodies. Since all investigated water bodies are located within the same region, their hydrochemical characteristics are similar (see Table 1). There are various anthropogenic sources of pollution of natural waters in this region. The main ones are the following: domestic wastewater; industrial contaminated wastewater; melting snow and rainwater from the territory of towns and factories; drainage water from amelioration systems; surface runoff with melt and rainwater from agricultural lands; and wastewater from livestock complexes and poultry farms [19].
The territories of several cities (Kazan, Naberezhnye Chelny, Nizhnekamsk, Zelenodolsk, and Volzhsk), including engineering, chemical, petrochemical, and processing factories, are located in the catchment area of the Kujbyshev Reservoir. The impact of urban ecosystems manifests itself as the abstraction of water, discharges of industrial and domestic wastewater, and surface runoff of snowmelt and rainwater from their territories. The main pollutants from these facilities are phenols, total petroleum hydrocarbons (TPH), manganese, copper, iron, and nitrogen compounds [19].
Anthropogenic impact on small rivers (Mesha, Svijaga) is mainly due to agricultural facilities, as well as organized discharges of sewage from cities and towns (Steppe Zai, Kazanka, and Vjatka). The Steppe Zai River is additionally affected by oil extraction.
The Ilet River with its tributary Yushut is located in a sparsely populated forest zone, where there are no large towns. Therefore, the anthropogenic impact on these ecological systems is minimal.
Raifa Lake is located on the territory of the Volga-Kama State Nature Reserve and is in pristine condition.
Thus, the objects of study are characterized by different levels of anthropogenic influence ranging from high (Kujbyshev Reservoir) to low (Ilet and Yushut rivers, Raifa Lake).

Sampling and Analysis Methods
Water samples were collected from May to October during a long period from 1997 to 2004 and from 2018 to 2019, with a frequency of one sample every two weeks or once a month.
Samples of ciliates were collected with a bathometer in amounts of 0.5 L of water at each station. Then, 300 mL of water were taken from the sample and concentrated to a volume of 10 mL by gravity filtration (without vacuum) through a membrane filter No. 9. Next, 0.5 mL of the obtained concentrate were distributed in drops on a glass plate and examined under a microscope to count small forms (from 15 to 50 µm). The remaining 9.5 mL of water were examined in a Bogorov chamber to count large forms (from 50 to 400 µm). The number of ciliates per 1 m³ was calculated by the formula given in [5], where N is the number of ciliates in 1 L (× 1000 = 1 m³), n₁ is the number of ciliates in 0.5 mL of the sample for small forms, and n₂ is the number of large ciliates in 10 mL of the sample.
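The scaling from microscope counts to abundance can be sketched as follows. This is only an illustration under stated assumptions, not the formula from [5]: we assume that the small-form count in 0.5 mL is scaled to the whole 10 mL concentrate, that the large-form count is used as counted, and that the concentrate represents the 300 mL of filtered water. The function names and default volumes are hypothetical.

```python
def ciliates_per_litre(n_small, n_large,
                       filtered_ml=300.0,
                       concentrate_ml=10.0,
                       small_subsample_ml=0.5):
    """Estimate ciliate abundance per litre from two microscope counts.

    Assumed scaling (NOT taken from the source formula):
    - n_small was counted in `small_subsample_ml` of the concentrate and is
      scaled up to the full concentrate volume;
    - n_large is used as counted for the concentrate;
    - the concentrate stands for `filtered_ml` of the original sample.
    """
    small_in_concentrate = n_small * (concentrate_ml / small_subsample_ml)
    total_in_filtered = small_in_concentrate + n_large
    return total_in_filtered * (1000.0 / filtered_ml)


def ciliates_per_m3(n_small, n_large, **volumes):
    """Convert the per-litre estimate to individuals per cubic metre."""
    return ciliates_per_litre(n_small, n_large, **volumes) * 1000.0
```

Under these assumptions, 3 small and 30 large ciliates correspond to (3 × 20 + 30) individuals in 300 mL, that is, about 300 per litre.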
Species guides [10,20] were used to identify the species. The Shannon-Weaver index [21] was used to assess species diversity. Water quality was assessed by the Pantley-Buck saprobity index [22] modified by Sladecek [23]. The degree of pollution was assessed according to a system commonly adopted in Russia [24,25]. As specified by this system, the water quality is characterized as clean if the saprobity index is less than 1.5, weakly polluted if it is 1.5-2.5, polluted if it is 2.5-3.5, dirty if it is 3.5-4, and extremely dirty if it is greater than 4. A total of 165 water samples were collected and analyzed throughout the study.
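The two biotic indices can be illustrated with a short sketch. It assumes the textbook forms H = −Σ p_i log p_i and S = Σ(s_i h_i) / Σ h_i; the logarithm base (2 here) is our assumption, and the Sladecek modification (indicator weights) is not included. The class boundaries follow the text; assigning a value exactly on a boundary to the lower class is also an assumption. All function names are hypothetical.

```python
import math


def shannon_index(abundances, base=2.0):
    """Shannon-Weaver diversity H = -sum(p_i * log(p_i)) over species."""
    total = sum(abundances)
    if total <= 0:
        return 0.0
    h = 0.0
    for n in abundances:
        if n > 0:  # absent species contribute nothing
            p = n / total
            h -= p * math.log(p, base)
    return h


def saprobity_index(indicator_values, frequencies):
    """Pantle-Buck index S = sum(s_i * h_i) / sum(h_i).

    indicator_values: saprobic value s_i of each indicator species
    frequencies:      relative frequency (abundance score) h_i of each species
    """
    if len(indicator_values) != len(frequencies):
        raise ValueError("one indicator value per species is required")
    weight = sum(frequencies)
    if weight <= 0:
        raise ValueError("no indicator species present")
    return sum(s * h for s, h in zip(indicator_values, frequencies)) / weight


def saprobity_class(s):
    """Map S to the water-quality classes listed in the text."""
    if s < 1.5:
        return "clean"
    if s <= 2.5:
        return "weakly polluted"
    if s <= 3.5:
        return "polluted"
    if s <= 4.0:
        return "dirty"
    return "extremely dirty"
```

For example, four equally abundant species give H = 2 with base 2, and indicator values 1, 2, 3 with frequencies 1, 3, 5 give S = 22/9 ≈ 2.44, which falls in the weakly polluted class.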
In parallel, we performed a hydrochemical analysis to determine the composition of water by 19 components: pH, hardness, mineralization, bicarbonates, dissolved oxygen, biological and chemical oxygen demand (BOD₅ and COD), N-NO₂⁻, N-NO₃⁻, N-NH₄⁺, chlorides, sulfates, total iron, phenols, total petroleum hydrocarbons, Cu²⁺, Zn²⁺, total Cr, and Mn²⁺. Moreover, we analyzed long-term (over the past 15 years) hydrochemical information. To assess the water quality by chemical indicators, we calculated the specific combinatoric index of water pollution (SCIWP), which takes into account the multiplicity and frequency ratio of exceeding the maximum permissible concentrations (MPC) [25].

Data Analysis Methods
As a biological indicator, we calculated the species diversity index (H) based on the number of ciliates; its values varied from 0 to 4.5. According to H, all studied water bodies were characterized as "moderately polluted" or "dirty"; they were therefore divided into two classes of water quality, the first one being clean (0) and the second one, dirty (1). The database included 126 observations: 85 clean (0) and 41 dirty (1). Class 0 is characterized by a higher species diversity of cilioplankton compared to class 1.
We applied logistic regression to assess the relationship between the class of water quality (clean-0 or dirty-1) determined by the species diversity index (H), calculated for the cilioplankton, and data obtained from the hydrochemical analysis.
The binary logistic regression is a type of regression used when the dependent variable is a dichotomous one and the independent variables are quantitative data.
The binary logistic regression model with several independent variables is given as

$$p = \frac{1}{1 + e^{-z}}, \qquad z = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n,$$

where p is the probability that Y = 1 for a given set of values of X, and z is the standard regression equation with coefficients $b_0, \dots, b_n$ [26].
Logistic regression is a popular statistical method used in biology and ecology [27][28][29]. It is a standard analysis method in situations where the dependent variable (Y) is dichotomous, taking the values 0 and 1.
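The model can be evaluated numerically with a few lines of code. The sketch below assumes the standard logistic form, p = 1 / (1 + e^(−z)) with a linear predictor z; the function names and the coefficient values in the usage note are hypothetical and are not the fitted values of this study.

```python
import math


def linear_predictor(intercept, coefficients, x):
    """z = b0 + b1*x1 + ... + bn*xn."""
    if len(coefficients) != len(x):
        raise ValueError("one coefficient per predictor is required")
    return intercept + sum(b * xi for b, xi in zip(coefficients, x))


def logistic_probability(z):
    """p = 1 / (1 + exp(-z)), written to stay stable for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)
```

With an illustrative intercept of −1 and coefficients (0.5, 2.0), the predictor values (2.0, 0.25) give z = 0.5 and p ≈ 0.62; z = 0 always gives p = 0.5.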
In the case considered in the study, the dependent variable (Y) is the class of water quality (clean-0 or dirty-1). The independent variables (predictors) are 19 chemical indicators of water quality. During the initial data analysis, we found that some variables have a pronounced asymmetric distribution; we therefore log-transformed these variables before modeling. The entire original sample was randomly divided into two sets of 80% (training data) and 20% (test data).
A stepwise procedure that includes predictors one at a time and excludes "weak" ones was applied to determine the significant variables that most strongly affect the resulting variable.
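The preprocessing described here (log transformation of skewed variables and a random 80/20 split) can be sketched as follows. The natural logarithm, the requirement of strictly positive values, the rounding of the training-set size, and the fixed seed are assumptions made for the illustration; the function names are hypothetical.

```python
import math
import random


def log_transform(values):
    """Natural-log transform for a right-skewed, strictly positive variable."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform needs strictly positive values")
    return [math.log(v) for v in values]


def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Randomly split row indices into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]
```

For the 126 observations of this study, such a split yields 101 training and 25 test rows.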
The quality of the logistic regression model was assessed by ROC analysis (Receiver Operating Characteristic). The predictive capability of the constructed model was assessed by 10-fold cross-validation with five repeats.
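The resampling scheme (10 folds, five repeats) can be sketched as below. This is a plain random partition written for illustration; it does not reproduce the internals of any R package, and it does not stratify folds by class. The `fit` and `predict` arguments are hypothetical user-supplied callables.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a new random partition for every repeat
        for fold in range(k):
            test_idx = indices[fold::k]  # every k-th shuffled index forms one fold
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield train_idx, test_idx


def cross_validated_accuracy(y, fit, predict, k=10, repeats=5, seed=0):
    """Mean held-out accuracy over all folds and repeats.

    fit(train_idx) -> model; predict(model, test_idx) -> list of 0/1 labels.
    """
    scores = []
    for train_idx, test_idx in repeated_kfold(len(y), k, repeats, seed):
        model = fit(train_idx)
        predicted = predict(model, test_idx)
        correct = sum(1 for i, p in zip(test_idx, predicted) if p == y[i])
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)
```

With k = 10 and five repeats, the model is refitted 50 times, and every observation is held out exactly once per repeat.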
The model construction and all the calculations were conducted using programs written in R. We used the standard R libraries and several packages, such as vegan, ROCR, boot, ggplot2, and caret [30].

Hydrochemical Features of the Objects of Study
The chemical composition of the water in the investigated water bodies is given in Tables A1 and A2 in Appendix A.
The water of the rivers Ilet, Kazanka, Steppe Zai, and Noksa features elevated hardness and mineralization. The mineralization of the water in the Kazanka and Ilet rivers (351 to 1457 mg/L) is mainly due to sulfates and is associated with the intensive discharge of groundwater. The mineralization of water in the Steppe Zai River (429 to 1480 mg/L) is determined by the high content of chlorides coming from groundwater contaminated as a result of oil extraction activities on the catchment of this river. The water of the Vjatka River is characterized by lower mineralization (113 to 511 mg/L); the same is true for the Kujbyshev Reservoir and Raifa Lake (171 to 263 mg/L). Other indicators are mainly connected to factors of anthropogenic nature.
The Kujbyshev Reservoir at most sites was classified as "dirty" according to its hydrochemical composition. The main contribution to the surface water pollution is made by copper compounds, organic compounds (in terms of BOD₅ and COD), ammonium nitrogen, nitrite nitrogen, and TPH. The frequency of exceeding the MPC of these substances ranges from 30 to 50%. The oxygen regime in the reservoir was satisfactory throughout the study.
The water of the Kazanka and Steppe Zai rivers was also classified as dirty. The main chemical indicators of pollution include organic matter (in terms of COD and BOD₅), nitrite nitrogen, copper and iron, TPH, ammonium nitrogen, and sulfates. Chlorides are significant indicators only in the Steppe Zai River.
The rivers Vjatka, Mesha, Svijaga, and Noksa were classified as highly polluted by their water quality. Organic substances (in terms of COD), sulfates, and copper are characteristic indicators in these rivers. Additionally, nitrite nitrogen appeared as a characteristic indicator in the Noksa River. We observed a deterioration of the oxygen regime only in the Noksa River in summer.
The degree of pollution of Raifa Lake is also very high; we detected an increased content of organic substances (in terms of COD and BOD₅), as well as compounds of both iron and copper.
The Ilet and Yushut rivers were classified as polluted. In these rivers, we found an increased content of sulfates, iron, and copper. In the Yushut, we detected organic substances leading to characteristic pollution (in terms of COD).
Moreover, a section of the Kujbyshev Reservoir was characterized as extremely dirty. This corresponds to the wastewater discharge area at the Mari El Pulp and Paper Mill (MPPM), near the town of Volzhsk. Water pollution by phenols, organic substances (in terms of COD and BOD₅), and ammonium nitrogen compounds was persistently detected at this site. The complete absence of oxygen in the water should also be noted.
Thus, the objects of research were classified in accordance with their water quality and ranked from polluted to extremely dirty.

Features of the Development of Ciliates
The surveyed water bodies and watercourses differ widely in their infusoria fauna and in the ranges of fluctuation of ciliate numbers (see Table 2). The number of species of ciliates and their abundance are the most important indicators characterizing the condition of ecosystems, since a sudden external influence leads ciliates to reduce their number, whereas a prolonged one causes changes in species composition, along with a simplification of trophic structure.
In the examined water bodies, we identified 156 species of ciliates (see Table 2). The Kujbyshev Reservoir shows the greatest species diversity, with 99 species. The least diversity was detected in the area of wastewater discharge at the MPPM: only 10 species.
Among the examined rivers, the largest number of species was found in the Kazanka River (111 species), and the smallest in the Noksa river (13 species).
A low species diversity and little variation in the number of species and abundance are typical for water bodies that are less affected by anthropogenic factors (background condition) and have a good oxygen regime (7.  (Kahl, 1935), and Limnostrombidium pelagicum (Kahl, 1932).
The saprobity index (S) varied from 1.3 to 2.1, which corresponds to waters ranging from conditionally clean to moderately polluted. A significant variation in the number of cilioplankton was detected in rivers that were characterized as "dirty" due to increased contents of organic substances in terms of both BOD₅
According to the species diversity index (H), calculated for the number of cilioplankton (see Table 2), all the examined water bodies were classified as moderately polluted and dirty (H > 3 corresponds to clean, H from 1 to 3 is moderately polluted, and H < 1 is dirty).

Statistical Analysis Logistic Regression Model
We used logistic regression to study the relationship between hydrochemical indicators that characterize a water body and its water quality, which is classified as "clean-0/dirty-1" according to the species diversity index.
Using a stepwise procedure, we chose the eight most important predictors: temperature, chlorides, mineralization, hydrocarbonates, biological oxygen demand (BOD₅), nitrates, iron, and total petroleum hydrocarbons.
Based on the eight selected variables, we modeled the probability p of an object falling into class 1. The resulting logistic regression equation has the form $p = 1 / (1 + e^{-z})$ with $z = b_0 + b_1 X_1 + \dots + b_8 X_8$, where $X_1$ is temperature, $X_2$ stands for chlorides (Cl), $X_3$ for mineralization (Min), $X_4$ for hydrocarbonates (HCO₃⁻), $X_5$ for biological oxygen demand (BOD₅), $X_6$ for nitrates (NO₃⁻), $X_7$ for iron (Fe), and $X_8$ for total petroleum hydrocarbons. Table 3 contains the estimated coefficients, standard errors, and Wald test statistics (Z value) for the logistic regression. All coefficients in the model are statistically significant. The obtained coefficients reflect the contribution of the corresponding predictor.
The optimal cutoff threshold for the probability was set to 0.35. This cutoff threshold was found as the intersection of three important indicators of the model: specificity, sensitivity, and efficiency. An analysis of the training sample showed that the specificity and sensitivity of the constructed model were both 78%, while the accuracy was 77%. An analysis of the test sample showed that the accuracy of the model is 74%, the specificity is 52%, and the sensitivity is 85%.
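The way a probability cutoff turns model output into classes, and how sensitivity, specificity, and accuracy follow from it, can be illustrated as follows. The sketch assumes that class 1 ("dirty") is the positive class and that a probability equal to the cutoff is assigned to class 1; the function names and the toy data in the usage note are hypothetical.

```python
def classify(probabilities, cutoff=0.35):
    """Assign class 1 ("dirty") when the predicted probability reaches the cutoff."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def confusion_metrics(y_true, y_pred):
    """Return sensitivity, specificity and accuracy for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # dirty, predicted dirty
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # clean, predicted clean
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, predicted dirty
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # dirty, predicted clean
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }
```

For the toy labels (1, 1, 0, 0, 0) and probabilities (0.9, 0.3, 0.4, 0.1, 0.2), the cutoff of 0.35 gives the predictions (1, 0, 1, 0, 0), that is, a sensitivity of 0.5, a specificity of 2/3, and an accuracy of 0.6. Lowering the cutoff raises sensitivity at the expense of specificity.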
To assess the quality of the resulting logistic regression model, we constructed the ROC curve in the coordinates x = 1 − Specificity and y = Sensitivity (Figure 2). The area under the curve, AUC = 0.833, characterizes the quality of the model as good.
Using 10-fold cross-validation with five repeats, it was shown that the model has a prediction accuracy of 75%. This positively characterizes the resulting model.
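The area under the ROC curve has a simple rank interpretation that can be computed directly. The sketch below uses the pairwise (Mann-Whitney) formulation, in which the AUC is the share of dirty/clean pairs where the dirty observation receives the higher predicted probability; it is an illustration, not the routine used in this study, and the function name is hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.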

Discussion
According to their chemical parameters, the studied water bodies were characterized as polluted and dirty, due mainly to pollutants of anthropogenic origin. The most significant among them were organic substances (in terms of both COD and BOD₅), sulfates, chlorides, nitrogen compounds (ammonium nitrogen, nitrates, nitrites), as well as heavy metals (iron, copper, manganese, aluminum), total petroleum hydrocarbons, and phenols.
The SCIWP calculated by chemical parameters varied in the studied water bodies from 2.66 to 11.35. The lowest values were recorded in the Ilet and Yushut rivers; this characterizes these rivers as contaminated or polluted. The highest value of the index was in the area of the MPPM, which allowed attributing this section of the Kujbyshev Reservoir to the extremely dirty category.
The other sections of the Kujbyshev Reservoir, as well as the Steppe Zai and Kazanka correspond to the dirty interval (SCIWP 4.1-4.63). The Noksa, Svijaga, Mesha, Vjatka, and Raifa Lake were rated as highly polluted (SCIWP 3.34-3.64). This means that, in terms of water quality, most of the examined water bodies in the Middle Volga fall into the category of highly polluted and dirty.
The water-quality assessment by the saprobity index showed that most water bodies correspond to the category of slightly polluted. Moreover, one object was classified as conditionally clean-slightly polluted (Raifa Lake); the Noksa River was rated as slightly polluted, the Steppe Zai River as contaminated, and the wastewater discharge area at the MPPM as contaminated to dirty. According to the index of species diversity, most water bodies fall into the category of moderately polluted (=slightly polluted) waters, the Vjatka River was rated as clean, and the wastewater discharge area at the MPPM was rated as dirty. Therefore, the assessment of water quality by the saprobity index does not coincide with the assessment by chemical indicators and, for most water bodies, the former indicated better water quality than the latter. Another disadvantage of the saprobity index is that it cannot be used to evaluate extremely dirty ecosystems, where ciliates are completely absent. In this case, the species diversity index (H) is more appropriate, as it takes into account the qualitative and quantitative development of organisms in the environment.
The constructed model of logistic regression can determine the quality of water bodies and classify them as "clean/dirty". For this purpose, it uses the species diversity index (H) of cilioplankton and seven significant chemical predictors of both natural origin (mineralization, hydrocarbonates, and chlorides) and natural-anthropogenic origin (organic matter content in terms of BOD₅, nitrates, total petroleum hydrocarbons, iron), as well as a physical factor, the water temperature, which mostly determines the development of cilioplankton.
Previous studies have proven the existence of dependence between the indicators of cilioplankton and the content of organic substances both in surface water [15,16,31,32] and in wastewater [33]. However, it has been noted that high concentrations of organic substances (in terms of COD and BOD) and ammonium salts adversely affect the development of ciliates [34]. The most toxic for ciliates are high concentrations of ammonium and salts of heavy metals: cadmium, copper, chromium, lead, and nickel [35,36]. Moreover, it has been noticed that changes in the ecological and morphological parameters of ciliates occur under the influence of nitrates, in particular, a decrease in body size, contractile vacuole, and peristome [37].
The highest contents of ammonium and nitrates (4.8 and 9.6 mg/L), as well as organic substances Thus, when the concentration of nitrogen in the water increases, the species diversity of infusoria and their abundance decrease, and small-sized forms prevail. These patterns were noted earlier by other researchers [38][39][40]. The Shannon species diversity index calculated for the number of cilioplankton has been suggested by other researchers [31,32] for the assessment of the quality of water bodies of different types (in particular, seawater). It has been pointed out that protozoan community diversity indices could be used as indicators that are more effective in biomonitoring aquatic ecosystems. This is because infusoria have a short life cycle and, as a result, the community response to external influence is rapid.
Recently, new approaches based on synecological methods have been introduced in addition to traditional methods for assessing water quality using infusoria (saprobity index, index of species diversity). We can mention, in particular, a method of functional diversity of infusoria, which is based on the structural features of ciliates and the type of nutrition [13]. The use of this technique in different communities of infusoria (benthic, epiphytic, planktonic) showed that the functional diversity of communities has a high potential for monitoring the ecological condition of the aquatic environment [41,42]. In our opinion, these methods, based on the approaches of functional and evolutionary ecology, are designed to study the condition of an ecosystem as a whole and we plan to consider these methods in future studies.

Conclusions
Water bodies of the Middle Volga region are subjected to varying degrees of anthropogenic impact. This affects the quality of surface water, which can be rated from contaminated to extremely dirty.
Various cilioplankton communities develop in water bodies with different levels of pollution. They differ in species composition and abundance. According to the saprobity index (S) and the species diversity index (H), the water quality was rated as moderately polluted in most water bodies.
The statistical analysis showed the existence of a relationship between the water quality of a water body (classified as "clean/dirty" according to the species diversity index (H) of cilioplankton) and hydrochemical indicators that characterize the water quality. Using logistic regression, we determined eight significant predictors: water temperature, chlorides, mineralization, hydrocarbonates, BOD₅, nitrates, iron, and total petroleum hydrocarbons. These are the most significant when determining the water quality class by the Shannon index calculated for the number of ciliates.
Qualitative and quantitative indicators of development of ciliates, as well as indices calculated on their basis, allow assessing with a very high level of accuracy the water quality and the condition of aquatic ecosystems in general. In particular, the Shannon index calculated for the number of ciliates can be successfully used for ranking water bodies as "clean/dirty".
Thus, ciliates as an object of bioindication are a very convenient and promising group for biomonitoring needs, including the implementation of methods of the Water Framework Directive.

Conflicts of Interest:
The authors declare no conflict of interest.