Inferring Water Quality in the Songhua River Basin Using Random Forest Regression Based on Satellite Imagery and Geoinformation
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Area
2.2. Data Collection
2.3. Methods
2.3.1. Construction of Inversion Band Combinations
2.3.2. Pearson Correlation Analysis
2.3.3. Model Construction
2.3.4. Model Evaluation
3. Results
3.1. Pearson Correlation Results
3.2. Evaluation of Different Models
3.3. Spatial Variation in Water Quality Parameters
4. Discussion
5. Conclusions
- (1) The conductivity results show that water quality improves toward the source. The TN content in Yanbian Prefecture was higher than in the downstream cities, possibly because of tributary inputs affected by the rapid growth of tourism in recent years.
- (2) The overall results show that water quality in the upper reaches of the Songhua River is better than in the lower reaches, and that water quality in the West-flowing Songhua River and the Nenjiang River is considerably better than in the East-flowing Songhua River.
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
TN | Total nitrogen |
TP | Total phosphorus |
SRB | Songhua River Basin |
USGS | United States Geological Survey |
WSR | West-flowing Songhua River |
ESR | East-flowing Songhua River |
GEE | Google Earth Engine |
IHO | International Hydrographic Organization |
IOC | Intergovernmental Oceanographic Commission |
DEM | Digital elevation model |
R² | Coefficient of determination |
MAE | Mean absolute error |
MSE | Mean squared error |
RMSE | Root mean square error |

Feature Classification | Band Combination | Number |
---|---|---|
Single band | | 6 |
Band square index | | 2 |
Band sum index | | 1 |
Band subtraction index | | 2 |
Band mixture index | | 1 |
Modified Normalized Difference Water Index (MNDWI) | | 1 |
Normalized Difference Vegetation Index (NDVI) | | 1 |

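
The two named indices in the feature table follow the usual normalized-difference form. The sketch below uses the standard definitions, NDVI = (NIR − Red) / (NIR + Red) and MNDWI = (Green − SWIR) / (Green + SWIR). The reflectance values and the `pixel` dictionary are hypothetical, and the sensor bands the authors actually used are not stated in this section, so the band assignment is an assumption.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b); fall back to 0.0 if the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index (standard definition)."""
    return normalized_difference(nir, red)


def mndwi(green, swir):
    """Modified Normalized Difference Water Index (standard definition)."""
    return normalized_difference(green, swir)


# Hypothetical surface reflectance for a single water pixel (not from the paper).
pixel = {"green": 0.08, "red": 0.06, "nir": 0.03, "swir": 0.01}

print("NDVI :", round(ndvi(pixel["nir"], pixel["red"]), 3))
print("MNDWI:", round(mndwi(pixel["green"], pixel["swir"]), 3))
```

Open water typically gives a positive MNDWI and a negative NDVI, which is why both indices are common candidate features for water-quality inversion.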

Variable | Statistic | DEM | Precipitation | Grain Yield | Cropland Area | Proportion 1 | Slope | Temperature |
---|---|---|---|---|---|---|---|---|
DEM | Pearson Correlation | -- | | | | | | |
DEM | Significance | -- | | | | | | |
Precipitation | Pearson Correlation | 0.444 ** | -- | | | | | |
Precipitation | Significance | 0.000 | -- | | | | | |
Grain yield | Pearson Correlation | −0.365 ** | 0.040 | -- | | | | |
Grain yield | Significance | 0.000 | 0.495 | -- | | | | |
Cropland area | Pearson Correlation | −0.127 * | −0.068 | 0.569 ** | -- | | | |
Cropland area | Significance | 0.031 | 0.247 | 0.000 | -- | | | |
Proportion 1 | Pearson Correlation | −0.084 | 0.288 ** | 0.674 ** | 0.273 ** | -- | | |
Proportion 1 | Significance | 0.155 | 0.000 | 0.000 | 0.000 | -- | | |
Slope | Pearson Correlation | 0.461 ** | −0.228 ** | −0.492 ** | −0.138 * | −0.500 ** | -- | |
Slope | Significance | 0.000 | 0.000 | 0.000 | 0.019 | 0.000 | -- | |
Temperature | Pearson Correlation | −0.002 | 0.290 ** | −0.130 * | −0.339 ** | 0.244 ** | −0.328 ** | -- |
Temperature | Significance | 0.980 | 0.000 | 0.026 | 0.000 | 0.000 | 0.000 | -- |
Conductivity | Pearson Correlation | −0.492 ** | −0.171 ** | 0.215 ** | −0.107 | 0.263 ** | −0.350 ** | 0.186 ** |
Conductivity | Significance | 0.000 | 0.003 | 0.000 | 0.070 | 0.000 | 0.000 | 0.001 |


Variable | Statistic | DEM | Precipitation | Grain Yield | Cropland Area | Proportion 1 | Slope | Temperature |
---|---|---|---|---|---|---|---|---|
DEM | Pearson Correlation | -- | | | | | | |
DEM | Significance | -- | | | | | | |
Precipitation | Pearson Correlation | 0.374 ** | -- | | | | | |
Precipitation | Significance | 0.000 | -- | | | | | |
Grain yield | Pearson Correlation | −0.324 ** | 0.078 | -- | | | | |
Grain yield | Significance | 0.000 | 0.186 | -- | | | | |
Cropland area | Pearson Correlation | −0.012 | −0.032 | 0.505 ** | -- | | | |
Cropland area | Significance | 0.840 | 0.591 | 0.000 | -- | | | |
Proportion 1 | Pearson Correlation | −0.116 * | 0.324 ** | 0.662 ** | 0.221 ** | -- | | |
Proportion 1 | Significance | 0.047 | 0.000 | 0.000 | 0.000 | -- | | |
Slope | Pearson Correlation | 0.519 ** | −0.288 ** | −0.426 ** | −0.023 | −0.512 ** | -- | |
Slope | Significance | 0.000 | 0.000 | 0.000 | 0.694 | 0.000 | -- | |
Temperature | Pearson Correlation | −0.217 ** | 0.325 ** | 0.001 | −0.285 ** | 0.367 ** | −0.548 ** | -- |
Temperature | Significance | 0.000 | 0.000 | 0.983 | 0.000 | 0.000 | 0.000 | -- |
TN | Pearson Correlation | −0.306 ** | −0.015 | 0.326 ** | −0.060 | 0.383 ** | −0.356 ** | 0.296 ** |
TN | Significance | 0.000 | 0.003 | 0.000 | 0.070 | 0.000 | 0.000 | 0.001 |


Variable | Statistic | DEM | Precipitation | Grain Yield | Cropland Area | Proportion 1 | Slope | Temperature |
---|---|---|---|---|---|---|---|---|
DEM | Pearson Correlation | -- | | | | | | |
DEM | Significance | -- | | | | | | |
Precipitation | Pearson Correlation | 0.461 ** | -- | | | | | |
Precipitation | Significance | 0.000 | -- | | | | | |
Grain yield | Pearson Correlation | −0.372 ** | 0.011 | -- | | | | |
Grain yield | Significance | 0.000 | 0.852 | -- | | | | |
Cropland area | Pearson Correlation | −0.097 | −0.064 | 0.588 ** | -- | | | |
Cropland area | Significance | 0.100 | 0.280 | 0.000 | -- | | | |
Proportion 1 | Pearson Correlation | −0.105 | 0.254 ** | 0.675 ** | 0.325 ** | -- | | |
Proportion 1 | Significance | 0.074 | 0.000 | 0.000 | 0.000 | -- | | |
Slope | Pearson Correlation | 0.495 ** | −0.172 ** | −0.448 ** | −0.107 | −0.476 ** | -- | |
Slope | Significance | 0.000 | 0.003 | 0.000 | 0.069 | 0.000 | -- | |
Temperature | Pearson Correlation | −0.083 | 0.264 ** | −0.132 * | −0.338 ** | 0.218 ** | −0.395 ** | -- |
Temperature | Significance | 0.156 | 0.000 | 0.024 | 0.000 | 0.000 | 0.000 | -- |
TP | Pearson Correlation | −0.463 ** | −0.264 ** | 0.012 | −0.130 * | −0.091 | −0.085 | −0.039 |
TP | Significance | 0.000 | 0.000 | 0.840 | 0.027 | 0.120 | 0.151 | 0.504 |

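
Each correlation table pairs a Pearson coefficient with a significance value. As a minimal sketch of how such a pair is obtained, the code below computes r and the associated t statistic, t = r·sqrt((n − 2) / (1 − r²)); the p-value then comes from a t distribution with n − 2 degrees of freedom (not computed here, to keep the example dependency-free). The elevation and conductivity samples are hypothetical, and the reading of `*` and `**` as the 0.05 and 0.01 levels is an assumption, since the table footnote is not preserved in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical samples (not from the paper): elevation in m, conductivity in uS/cm.
elevation = [120, 250, 400, 610, 900]
conductivity = [230, 210, 190, 160, 150]

r = pearson_r(elevation, conductivity)
print("r =", round(r, 3), " t =", round(t_statistic(r, len(elevation)), 3))
```

A negative r here mirrors the sign of the DEM correlations in the tables, where higher elevation goes with lower conductivity, TN, and TP.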

Model | Parameter | Dataset | R² | MAE | MSE | RMSE |
---|---|---|---|---|---|---|
Random forest | Conductivity | Training | 0.95 | 9.39 | 174.06 | 13.19 |
Random forest | Conductivity | Test | 0.67 | 27.92 | 1314.70 | 36.26 |
Random forest | TN | Training | 0.76 | 0.27 | 0.13 | 0.36 |
Random forest | TN | Test | 0.52 | 0.39 | 0.28 | 0.53 |
Random forest | TP | Training | 0.73 | 0.02 | 0.00 | 0.03 |
Random forest | TP | Test | 0.47 | 0.03 | 0.02 | 0.04 |
AdaBoost | Conductivity | Training | 0.78 | 26.79 | 939.28 | 30.65 |
AdaBoost | Conductivity | Test | 0.54 | 30.73 | 1622.65 | 40.28 |
AdaBoost | TN | Training | 0.58 | 0.40 | 0.22 | 0.47 |
AdaBoost | TN | Test | 0.25 | 0.52 | 0.45 | 0.68 |
AdaBoost | TP | Training | 0.63 | 0.03 | 0.00 | 0.04 |
AdaBoost | TP | Test | 0.29 | 0.04 | 0.00 | 0.06 |
LightGBM | Conductivity | Training | 0.93 | 11.10 | 243.73 | 15.61 |
LightGBM | Conductivity | Test | 0.59 | 28.41 | 1463.03 | 38.25 |
LightGBM | TN | Training | 0.87 | 0.20 | 0.07 | 0.26 |
LightGBM | TN | Test | 0.39 | 0.46 | 0.37 | 0.61 |
LightGBM | TP | Training | 0.89 | 0.01 | 0.00 | 0.02 |
LightGBM | TP | Test | 0.37 | 0.04 | 0.00 | 0.05 |

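
The four scores in the model comparison can be reproduced from paired observed and predicted values. The sketch below uses the standard definitions of MAE, MSE, RMSE, and R² (one minus the ratio of residual to total sum of squares). The function name `regression_metrics` and the sample values are illustrative and not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MSE, and RMSE for paired observations and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("need two non-empty sequences of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observations are equal")
    r2 = 1.0 - (mse * n) / ss_tot  # mse * n is the residual sum of squares
    return {"R2": r2, "MAE": mae, "MSE": mse, "RMSE": rmse}


# Hypothetical observed vs. predicted TN concentrations in mg/L (not from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
scores = regression_metrics(observed, predicted)
print({name: round(value, 3) for name, value in scores.items()})
```

Because RMSE is the square root of MSE, the two columns can be cross-checked: for the random forest conductivity training row, the square root of 174.06 is about 13.19, matching the reported RMSE.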

River | Conductivity (μS/cm) | TN (mg/L) | TP (mg/L) |
---|---|---|---|
West-flowing Songhua River | 161.62 | 1.62 | 0.10 |
Nenjiang River | 167.79 | 1.71 | 0.11 |
East-flowing Songhua River | 212.72 | 2.07 | 0.14 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yu, Z.; Yu, H.; Li, L.; Yu, J.; Yu, J.; Gao, X. Inferring Water Quality in the Songhua River Basin Using Random Forest Regression Based on Satellite Imagery and Geoinformation. Hydrology 2025, 12, 61. https://doi.org/10.3390/hydrology12030061