Article

Prediction of the Area of High-Turbidity Water in the Yatsushiro Sea, Japan, Using Machine Learning with Satellite, Meteorological, and Oceanographic Data

by Kazutaka Nagayama and Hideyuki Tonooka *
Graduate School of Science and Engineering, Ibaraki University, Hitachi 3168511, Japan
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(6), 1652; https://doi.org/10.3390/rs15061652
Submission received: 20 February 2023 / Accepted: 17 March 2023 / Published: 18 March 2023

Abstract

Turbid water is known to affect aquatic ecosystems. If the spread of turbid water can be predicted, the damage it causes to rich aquatic ecosystems and aquaculture farms can be anticipated and countermeasures can be prepared. In this study, we developed a method for predicting the area of high-turbidity water using machine learning with a satellite-observed total suspended solids (TSS) product and readily available past meteorological and oceanographic data (rainfall, wind direction and speed, atmospheric pressure, and tide level), and evaluated it for the Kuma River estuary of the Yatsushiro Sea in Japan. The highest accuracy was obtained using random forest regression, with a coefficient of determination of 0.552, when the area of high-turbidity water based on the previous day’s TSS product and hourly meteorological and oceanographic data from the previous day were used as inputs. The most important factor for the prediction was the previous day’s area of high-turbidity water, followed by wind and tide level; the effect of rainfall was small, probably because of the flood-control function of the river. Our future work will be to evaluate the applicability of the method to other areas, improve the accuracy, and predict the distribution area.

1. Introduction

The preservation of the aquatic environment is an important issue for modern society. Turbid water is known to affect organisms living in the aquatic environment, and an accurate understanding of turbid water is useful in confirming the state of pollution. There are various substances that cause turbid water, ranging from inorganic substances such as sand and gravel to organic substances such as leaves and plankton. In particular, insoluble inorganic and organic matter that is less than 2 mm and greater than 1 μm is called total suspended solids (TSS) [1].
Many researchers have studied the effects of turbid water on aquatic ecosystems. For example, it has been confirmed that although “sand discharge”, the downstream release of sand and other substances that accumulate in dams, is not directly related to fish injury or mortality, it causes stress to fish by raising their hemoglobin levels when concentrations are high [2,3,4,5]. Cases of fish showing avoidance behavior toward turbid water have also been reported [4]. In addition, when the particle size of suspended solids in turbid water exceeds a range of 1.237–35.977 μm, the particles tend to adhere to the gills, leading to gill blockage [6]. Tuna are said to be susceptible to harmful algae such as red tide because they require a large amount of oxygen and take in large amounts of seawater at a time [7]. In a case where a large number of bluefin tuna died during a red tide at a tuna aquaculture farm, it was pointed out that the high concentration of turbid water reduced visibility in the water, causing the tuna to come into contact with nets and other objects and resulting in injuries [8]. In addition to fish, turbid water also adversely affects the growth of brown algae such as wakame and kajime seaweed. For example, it has been reported that suspended solids inhibit the sedimentation of migratory spores produced by brown algae, increasing the time required for their attachment, which leads to a decrease in brown algae [9]; it has also been reported that suspended solids inhibit the attachment of Nori shell spores [10,11].
If the turbid water spreads to areas with rich aquatic ecosystems and aquaculture farms, the damage described above is expected to become more pronounced. However, if its characteristics and scale can be predicted in advance, the prediction could be used to anticipate damage and evacuate aquaculture facilities. There are several examples of studies on the prediction of turbidity. Wang et al. (2021) focused on the relationship between turbidity and tidal level and compared several methods for predicting turbidity based on tidal level [12]. The results showed that an artificial neural network that uses the tide level as an input and the turbidity index “Nephelometric Turbidity Unit (NTU)” obtained from buoys installed in coastal areas as an output has the highest accuracy, and that the two previous tidal cycles, excluding the forecast period, are important for the input tide-level data. Turbidity prediction based on weather information was studied by Zhang et al. (2021) and Tsai et al. (2017) [13,14]. Zhang et al. developed a method to predict lake turbidity information obtained from smartphone photography by inputting wind speed, wind direction, temperature, and rainfall into a random forest [13]. The coefficient of determination was more than 0.89, and the most important input parameters were wind direction and wind speed. Tsai et al. developed a model to predict weir turbidity from rainfall and water volume and obtained a mean-square error of 5.787 [14]. Alizadeh et al. (2018) used buoys placed at an estuary to obtain data on turbidity, water temperature, salinity, etc., and combined them with river flow data to study a method for predicting turbidity in estuaries [15]. The results showed that the river flow one hour earlier was important for the prediction. Kumar et al. (2022) predicted turbidity at multiple sites in an estuary in Hong Kong based on meteorological information, pH, and oxygen-dissolved solids [16], and achieved an average prediction accuracy of 88.45% with LSTM-RNN. However, since most existing studies focus on turbidity at individual points, it is difficult to capture the area of turbid water. In addition, they often use detailed field data, so they are difficult to apply in places where the observation environment is not well developed.
Based on this background, we propose a machine learning method to predict the area of high-turbidity water around an estuary on the following day, using satellite TSS products and easily obtainable local meteorological and oceanographic data as inputs, with the Yatsushiro Sea in Japan as the study area. We believe that the prediction of the area of high-turbidity water using these types of data is a unique attempt not seen in previous studies. For machine learning, Support Vector Regression (SVR) and Random Forest Regression (RFR) are evaluated in terms of accuracy. The satellite data are the TSS product of the Geostationary Ocean Color Imager-I (GOCI-I) [17,18], which is capable of observing coastal areas at high frequency from geostationary orbit.

2. Materials and Methods

2.1. Satellite-Based TSS Observations and GOCI

Sensors such as GOCI, the Second-generation Global Imager (SGLI) onboard the Global Change Observation Mission-Climate (GCOM-C) satellite, and the Moderate Resolution Imaging Spectroradiometer (MODIS) instruments onboard the Terra and Aqua satellites have bands suitable for observing coastal areas and are capable of estimating TSS [19,20]. Among these sensors, GOCI has a lower spatial resolution (500 m) than SGLI and MODIS and its observation area is limited to East Asia, but it is capable of TSS observation with high frequency (eight times a day) from a geostationary orbit and of cloud removal by time-series compositing [17]. Since no other satellite sensor has these characteristics, we selected the TSS product of the GOCI-I instrument for this study, although this limits the analysis to sea areas that do not include narrow bays.
Table 1 shows the spectral bands of the GOCI-I instrument [21]. The TSS product of GOCI-I is calculated from the bands at 490 nm and 745 nm by the following equation [22].
$$\mathrm{TSS} = 10^{\,1.0758 + 1.1230 \times R_{rs}(745)/R_{rs}(490)},$$
where Rrs(490) and Rrs(745) are the remote sensing reflectance values for the 490 nm and 745 nm bands, respectively. This equation is less accurate when TSS is low in concentration and should be used with caution in the open ocean but provides more reliable TSS information in coastal areas where TSS is generally high in concentration [22].
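As a minimal sketch of the band-ratio equation above, the following Python function evaluates it for a single pixel. The function name and the sample reflectance values are hypothetical and chosen only for illustration; they are not taken from the GOCI-I product documentation.

```python
def goci_tss(rrs_490: float, rrs_745: float) -> float:
    """Evaluate TSS = 10 ** (1.0758 + 1.1230 * Rrs(745) / Rrs(490)).

    rrs_490 and rrs_745 are the remote sensing reflectances of the
    490 nm and 745 nm bands for one pixel.
    """
    if rrs_490 <= 0:
        raise ValueError("Rrs(490) must be positive to form the band ratio")
    ratio = rrs_745 / rrs_490
    return 10 ** (1.0758 + 1.1230 * ratio)


# Hypothetical reflectance values, for illustration only.
clear_pixel = goci_tss(rrs_490=0.010, rrs_745=0.001)
turbid_pixel = goci_tss(rrs_490=0.010, rrs_745=0.004)
print(clear_pixel < turbid_pixel)  # a larger 745/490 ratio gives a larger TSS
```

Because the band ratio sits in the exponent, the estimate grows rapidly as the 745 nm reflectance rises relative to the 490 nm reflectance.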

2.2. Proposed Method

In this study, we propose a method to predict the area of high-turbidity water one day ahead by machine learning, using satellite-based TSS products and local meteorological and oceanographic data acquired over the past several days. The local meteorological and oceanographic data should be parameters that are closely related to turbidity; in this study, wind direction and speed and tide level were selected based on the work of Zhang et al. [13] and Wang et al. [12], and rainfall and pressure were selected based on the work of Kumar et al. [16]. Glacial meltwater is also known to affect turbidity where it is present [23], but this effect was not considered because the study area is located at mid-latitudes.
The processing procedure of the proposed method is as follows.
  • In the target area, taking a specific time T on the prediction day as the prediction reference, collect the satellite-based TSS product at time T one day earlier, and hourly rainfall, wind speed and direction, pressure, and tide-level data from 1 to N days earlier. That is, the TSS input is a single image, whereas the meteorological and oceanographic inputs comprise 24 values per day (from time T to 23 h before that time).
  • Binarize the TSS product by thresholding according to the presence or absence of high-turbidity water. The number of high-turbidity water pixels in the binarized TSS image is then counted to obtain the area of the high-turbidity zone.
  • Standardize the satellite-derived area of high-turbidity water and the four meteorological and oceanographic parameters so that each has a mean of 0 and a standard deviation of 1, and apply these values to the trained machine learning model to estimate by regression the area of high-turbidity water (normalized value) for the prediction day. The machine learning model needs to be trained in advance using training data sets built for each of the prediction dates for which TSS data are available.
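The standardization in the last step can be sketched as follows. This is only an illustration: the function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale a list of numbers to mean 0 and standard deviation 1.

    Assumption: the population standard deviation is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical high-turbidity areas (km^2) for a few days, for illustration.
areas = [5.25, 20.50, 12.00, 31.75]
z = standardize(areas)
print([round(v, 3) for v in z])
```

Each input variable (area, rainfall, wind components, pressure, tide level) would be rescaled in the same way before being passed to the regression model.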
Figure 1 shows the above process flow. In the proposed method, the importance of each input data and the value of N are evaluated in the sections that follow. We also compare and evaluate SVR and RFR as the machine learning model.

2.3. Study Area

In this study, a part of the Yatsushiro Sea, Japan, defined as the region of (32.43°N, 130.58°E) to (32.54°N, 130.42°E) including the estuary of the Kuma River (around 32.50°N, 130.57°E) flowing through Kumamoto Prefecture, was selected as the study area. The location of the study area is shown in Figure 2. The Yatsushiro Sea is relatively calm because it is an inland sea surrounded by the Kyushu mainland and the Amakusa Islands. This sea is rich in fishery resources, especially in the cultivation of sea bream, yellowtail, nori, tiger prawn, and pearls [24]. We chose this area rather than a larger area because the range of turbidity variation and the mechanism of turbidity are different in the coastal and open ocean areas, and the coastal areas generally have diverse aquatic ecosystems and are important for aquaculture.

2.4. Data Used

The period covered by this study was defined as 11 years, from 2011 to 2021.
TSS products were obtained from data observed by GOCI-I at 10:16 JST (1:16 UTC) during this period, and only those data with cloud coverage of 10% or less in the target water area were selected. In addition, a cutout was made from each TSS image to include the study area according to the turbid water coverage of each image, and only pure water pixels without land were selected using a land mask created from coastline vector data and high-resolution satellite imagery. Then, from all the TSS data obtained, one data set was defined as a pair of TSS images obtained on two consecutive days. As a result, a total of 219 TSS data sets were obtained for the study period.
Next, for each of the 219 data sets, hourly meteorological and oceanographic data were obtained from the Japan Meteorological Agency (JMA) website for each day up to nine days prior to the prediction date [26]. Rainfall data were obtained from the Automated Meteorological Data Acquisition System (AMeDAS) [27] at eight sites located in the catchment area of the Kuma River (Yatsushiro, Isshochi, Yamae, Hitoyoshi, Itsuki, Kami, Taraki, and Yamae-Yokoya). Wind direction and speed data were obtained from AMeDAS at one location (Yatsushiro) near the mouth of the river, and east–west and north–south vectors were calculated from these data and used as input data (for example, for a wind speed of 2 m/s from the east-southeast, the east–west vector is 1.848 m/s and the north–south vector is −0.766 m/s). The barometric pressure was obtained from the meteorological observatory nearest to the mouth of the river (Hitoyoshi). Figure 3 shows the location of each observation station.
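The decomposition of wind direction and speed into east–west and north–south vectors can be sketched as below. The function name is hypothetical, and the sign convention (components taken along the compass bearing that the wind blows from, measured clockwise from north) is an assumption inferred from the worked example in the text.

```python
import math


def wind_components(speed, direction_deg):
    """Split a wind observation into (east-west, north-south) components.

    direction_deg: compass bearing the wind blows from, in degrees
    clockwise from north (east-southeast = 112.5).
    Assumed convention: components point along that bearing.
    """
    theta = math.radians(direction_deg)
    east_west = speed * math.sin(theta)
    north_south = speed * math.cos(theta)
    return east_west, north_south


# The example from the text: 2 m/s from the east-southeast.
ew, ns = wind_components(2.0, 112.5)
print(f"{ew:.3f} {ns:.3f}")  # prints "1.848 -0.765"
```

The result agrees with the values quoted in the text to within rounding. Representing the wind as two components avoids the discontinuity of a direction angle at 0°/360°, which is a common reason for this encoding in regression inputs.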
For tide data, we used astronomical tide data near the mouth of the river (Yatsushiro) provided by JMA [28]. Astronomical tide level is a predicted value of the change in tide level caused by lunar and solar tidal forces, and although it differs from the actual measured tide level, it was adopted because it is readily available.
In applying the above data to machine learning, the 219 data sets were divided in half, with 109 sets for training and 110 sets for testing.

2.5. Evaluation Method

2.5.1. Evaluation (A)

In the first step of the procedure in Section 2.2, N is the number of days back for the meteorological/oceanographic data to be input. In this study, nine cases were evaluated, ranging from N = 9 (using data from one to nine days before the prediction date) to N = 1 (using only data from one day before). In addition, for SVR, the radial basis function (RBF) kernel was applied with epsilon = 0.1 and 0.01 and the cost parameter C = 1 and 10; for RFR, the number of trees, trees = 100 and 1000, and the maximum depth of each tree, depth = 100 and 1000, were investigated. Thus, combining these conditions, the total number of cases evaluated is 9 (N) × { 4 (SVR parameters) + 4 (RFR parameters) } = 72 cases.
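The count of 72 cases can be checked by enumerating the grid. This is a small sketch; the dictionary keys are illustrative labels, not the parameter names of any particular library.

```python
from itertools import product

# N = 1 .. 9 days of meteorological/oceanographic history.
n_days_options = range(1, 10)

# SVR with an RBF kernel: 2 epsilon values x 2 cost values = 4 settings.
svr_grid = [
    {"model": "SVR", "kernel": "rbf", "epsilon": eps, "C": cost}
    for eps, cost in product([0.1, 0.01], [1, 10])
]

# RFR: 2 tree counts x 2 maximum depths = 4 settings.
rfr_grid = [
    {"model": "RFR", "trees": trees, "depth": depth}
    for trees, depth in product([100, 1000], [100, 1000])
]

cases = [(n, params) for n in n_days_options for params in svr_grid + rfr_grid]
print(len(cases))  # 72
```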

2.5.2. Evaluation (B)

In Evaluation (A), N means that all meteorological and oceanographic data for each day from 1 to N days before the prediction date are used, but alternatively, a model that uses only meteorological and oceanographic data from n days ago is also possible. Therefore, we evaluated this alternate model by inputting only the meteorological and oceanographic data from n days ago and no TSS images to each machine learning model. In this case, the number of evaluation cases is 9 (n) × { 4 + 4 } = 72 cases.
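The difference between the two input windows can be illustrated with list slicing. This is a sketch under the assumption that each hourly series is stored most-recent-first (index 0 is time T one day before the prediction reference); the function names are hypothetical.

```python
HOURS_PER_DAY = 24


def window_cumulative(hourly, n_days):
    """Evaluation (A): all hourly values from 1 to N days before."""
    return hourly[: HOURS_PER_DAY * n_days]


def window_single_day(hourly, n):
    """Evaluation (B): only the 24 hourly values of day n before."""
    start = HOURS_PER_DAY * (n - 1)
    return hourly[start : start + HOURS_PER_DAY]


# Nine days of placeholder hourly values (0, 1, 2, ...), for illustration.
hourly = list(range(9 * HOURS_PER_DAY))
print(len(window_cumulative(hourly, 3)))   # 72 values for N = 3
print(len(window_single_day(hourly, 3)))   # 24 values for n = 3
```

For n = 1 and N = 1 the two windows coincide, so the only difference between the evaluations in that case is whether the previous day's high-turbidity area is included as an input.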

2.5.3. Evaluation (C)

Using the model that showed the highest accuracy in evaluations (A) and (B), we evaluated the impact of each input parameter on estimation accuracy by training the model using only one of the five input parameters: TSS, rainfall, wind direction and speed, barometric pressure, and tide level.

3. Results

3.1. TSS Image and Its Binarization

Figure 4 shows, as an example, the Landsat 8 image at 9:45 JST on 17 November 2020 together with the GOCI-I TSS image and its binarized image at 10:16 on the same day. In the binarized TSS image (c), pixels with high turbidity were selected from the original TSS image (b) with a threshold of 4 mg/L. The area of high-turbidity water in this example is 20.50 km² (=82 pixels), and its normalized area is 0.390, where each normalized area was obtained by dividing the area of high-turbidity water by the maximum area of high-turbidity water of 52.50 km² (=210 pixels) obtained from all data from 2011 to 2021.
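The pixel counting and normalization described above can be sketched as follows. The pixel area of 0.25 km² follows from the 500 m resolution (82 pixels = 20.50 km²); treating the threshold as a strict inequality and marking masked pixels with None are assumptions made for this illustration, and the function names are hypothetical.

```python
PIXEL_AREA_KM2 = 0.25   # one 500 m x 500 m GOCI pixel
MAX_AREA_KM2 = 52.50    # maximum high-turbidity area, 2011 to 2021
THRESHOLD_MG_L = 4.0    # binarization threshold used in the study


def high_turbidity_area(tss_grid, threshold=THRESHOLD_MG_L):
    """Count pixels above the threshold and convert the count to km^2.

    tss_grid: list of rows of TSS values; None marks a masked pixel
    (assumed convention for land or cloud).
    """
    count = sum(
        1
        for row in tss_grid
        for value in row
        if value is not None and value > threshold
    )
    return count * PIXEL_AREA_KM2


def normalized_area(area_km2, max_area_km2=MAX_AREA_KM2):
    """Divide an area by the maximum observed area."""
    return area_km2 / max_area_km2


# The example from the text: 82 high-turbidity pixels.
area = 82 * PIXEL_AREA_KM2
print(f"{area:.2f} {normalized_area(area):.3f}")  # prints "20.50 0.390"
```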
Figure 5 shows the relationship between the threshold value for extracting high-turbidity water and the number of pixels extracted as high-turbidity pixels. In the case of the threshold value of 4 mg/L used in this study, the number of extracted high-turbidity pixels corresponds to approximately 10% of the total number of water pixels.

3.2. Results of Evaluation (A)

Table 2 shows the coefficient of determination (R²), the correlation coefficient (R), and the root-mean-square error (RMSE) of the area (km²) for each result. The results show that SVR has a low R² and is unreliable, while RFR has a higher R² and has the potential to make predictions. The most accurate model was RFR with N = 1, trees = 100, and depth = 100, resulting in R² = 0.552, R = 0.746, and RMSE = 7.260 km² (against an area range of 0 to a maximum of 52.50 km²). In addition, Table 3 shows the results of feature importance (FE) analysis for each input variable: rainfall (RF), wind vector (WV), barometric pressure (BP), tide level (TL), and area of high-turbidity water (HT). From this table, it can be seen that the factor that most influences the prediction is the area of high-turbidity water on the previous day, followed by wind and tide level. Figure 6 is a scatterplot showing the relationship between the observed and predicted normalized high-turbidity area for each case of N = 1 to 9 in Evaluation (A) using RFR with trees = 100 and depth = 100.
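For reference, the three scores reported in the tables can be computed as in the sketch below. The text does not give the formulas, so the definition of R² as one minus the ratio of residual to total sum of squares is an assumption; it is consistent with the negative values reported for some cases, which a squared correlation coefficient could not produce.

```python
import math


def regression_scores(observed, predicted):
    """Return (R2, R, RMSE) for paired observed and predicted values.

    R2 is assumed to be 1 - SS_res / SS_tot and can be negative when
    the model is worse than always predicting the observed mean.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum(
        (o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted)
    )
    r2 = 1.0 - ss_res / ss_tot
    r = cov / math.sqrt(ss_tot * ss_pred)
    rmse = math.sqrt(ss_res / n)
    return r2, r, rmse


# Hypothetical areas in km^2, for illustration only.
obs = [5.0, 12.5, 20.5, 31.0]
pred = [7.0, 11.0, 18.0, 27.5]
r2, r, rmse = regression_scores(obs, pred)
print(r2 <= r ** 2)  # R2 by this definition does not exceed R squared here
```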
Figure 6 shows several outliers. Investigation revealed that some of these outliers are associated with several consecutive days of strong winds. Therefore, from the 219 data sets, we excluded the five data sets that had five consecutive days with daily maximum instantaneous wind speeds exceeding 10 m/s, and trained the RFR (trees = 100, depth = 100) again. The obtained coefficients of determination, correlation coefficients, and RMSE are shown in Table 4, and the scatter plots are shown in Figure 7. A comparison of Figure 6 and Figure 7 shows that most of the outliers were removed and the correlation was improved for all N. In fact, Table 4 shows that R² = 0.636, R = 0.810, and RMSE = 6.081 km² are obtained for N = 1, which is an improvement over the corresponding values in Table 2.
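The exclusion rule can be expressed as a check for a run of consecutive strong-wind days. This is a sketch only: the function name is hypothetical, and which days before the prediction date were examined is not stated in the text, so the input here is simply a list of daily maximum instantaneous wind speeds in chronological order.

```python
def has_strong_wind_run(daily_max_wind, threshold=10.0, run_length=5):
    """Return True if run_length consecutive days exceed the threshold.

    daily_max_wind: daily maximum instantaneous wind speeds in m/s.
    """
    streak = 0
    for speed in daily_max_wind:
        streak = streak + 1 if speed > threshold else 0
        if streak >= run_length:
            return True
    return False


# Hypothetical wind records (m/s), for illustration only.
windy_period = [11.2, 12.8, 13.1, 10.4, 10.9, 8.7]
broken_period = [11.2, 12.8, 9.1, 13.0, 11.5, 12.2, 10.6]
print(has_strong_wind_run(windy_period))   # True
print(has_strong_wind_run(broken_period))  # False
```

A data set flagged by this check would be dropped before training and testing, which corresponds to the five excluded data sets described above.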

3.3. Results of Evaluation (B)

Table 5 shows the R², R, and RMSE of the area (km²) for each result in Evaluation (B). Outliers due to high winds were not excluded. Both SVR and RFR are less accurate than in Evaluation (A), even for n = 1, which corresponds to the N = 1 case that gave the highest accuracy in Evaluation (A). Evaluation (B) uses only meteorological and oceanographic data from n days ago and does not use the high-turbidity-area data from the previous day, which suggests that the previous day’s high-turbidity area must be included when predicting the area of high-turbidity water. The accuracy was slightly higher for n = 3, where the importance analysis of RFR with trees = 100 and depth = 100 gave 0.061, 0.643, 0.055, and 0.241 for rainfall, wind, barometric pressure, and tide level, respectively, indicating that wind had the greatest influence on the prediction. However, the accuracy varied with n overall due to the influence of meteorological factors.

3.4. Results of Evaluation (C)

The results of Evaluation (A) and (B) showed that N = 1 was the most accurate for SVR and RFR. Therefore, Evaluation (C) was conducted to examine the contribution of each factor (rainfall, wind direction and speed, pressure, tide level, and high-turbidity area) at N = 1 in each model. Outliers due to high winds were not excluded. The results obtained are shown in Table 6. It can be seen that, as in Evaluation (B), it is difficult to predict the area of high-turbidity water from meteorological and oceanographic data alone, and that the area of high-turbidity water on the previous day is important.

4. Discussion

The results of Evaluation (A) showed that the RFR with trees = 100 and depth = 100 was the most accurate at N = 1, with R² = 0.552, R = 0.746, and RMSE = 7.260 km² (against an area range of 0 to a maximum of 52.50 km²). The accuracy improved to R² = 0.636, R = 0.810, and RMSE = 6.081 km² when data with five or more consecutive days of daily maximum instantaneous wind speeds exceeding 10 m/s were excluded. The importance analysis of RFR showed that the previous day’s high-turbidity area had the strongest influence, followed by wind direction and speed. On the other hand, rainfall had little influence. The coefficients of determination were close to zero or negative for most of the inputs in Evaluations (B) and (C), suggesting that any model would have difficulty predicting without using the previous day’s high-turbidity area. However, the highest prediction accuracy using only the area of the previous day’s high-turbidity water in Evaluation (C) (R² = 0.474, R = 0.712, and RMSE = 7.861 km² in SVR with ε = 0.01 and C = 1.0 for N = 1) is lower than the highest accuracy in Evaluation (A). Thus, combining the area of high-turbidity water on the previous day with meteorological and oceanographic data is better than using the area alone. This may indicate that the area of high-turbidity water can be approximated by the area on the previous day, while the meteorological and oceanographic conditions (especially wind) up to the previous day explain the degree of its diffusion.
The comparison results between SVR and RFR show that the latter is able to predict the high-turbidity area with more stable accuracy for all inputs, though SVR showed a better performance in Evaluation (C), which was a simpler regression problem.
The present assessment indicated that the influence of rainfall was low. This may be due to the characteristics of the catchment area of the Kuma River. Several flood-control dams have been installed on the Kuma River. In addition, there are about 3000 hectares of rice paddies around the Kuma River, which are used as “rice field dams” during heavy rains [29]. Furthermore, since the soil in the catchment is highly permeable, the proportion of rainfall that flows directly into the river is low, and the amount of sand and other TSS-forming substances transported to the estuary is considered relatively low [30]. Due to these effects, the impact of rainfall on TSS dispersion is considered to be relatively small in the study area [29]. In addition, the strong influence of wind direction and wind speed, followed by tide level, in the importance analysis of RFR shown in Table 3 may be related to the characteristics of the Yatsushiro Sea. The Yatsushiro Sea is an inland sea with a high degree of closure. In other words, the area away from the connecting waters is less influenced by the ocean currents of the open sea and more susceptible to the influence of tidal currents, resulting in a relatively calm sea. In such calm seas, wind is thought to have a stronger influence on the diffusion of surface water.
In this study, the Kuma River estuary in the Yatsushiro Sea was targeted, but the method can be applied to other ocean areas as well. However, the Yatsushiro Sea is a highly enclosed inland sea in which the exchange of seawater with the open sea is limited, and it is strongly influenced by tidal currents and susceptible to wind; conditions will be very different in an estuary facing the open sea. In addition, the Kuma River has three flood-control dams and about 3000 hectares of rice paddies in its vicinity, so the impact of rainfall is suppressed for normal levels of rainfall, although not for extreme rainfall such as that from typhoons; in rivers where these conditions are different, the effects of rainfall are expected to be more significant. Furthermore, although GOCI TSS products were used in this study, the observation area of GOCI is limited to East Asia, so products from polar-orbiting satellite sensors such as SGLI and MODIS are necessary to apply this method to other regions of the world. Nevertheless, this study suggests that the combination of satellite-based TSS products and readily available meteorological and oceanographic data can be used to predict the approximate area of high-turbidity water, and future developments are expected.

5. Conclusions

In this study, we developed a method to predict the area of high-turbidity water around the estuary of the Kuma River in the Yatsushiro Sea, Japan, by feeding satellite TSS products and meteorological/oceanographic data to machine learning models. This method differs from previous studies in that it does not predict turbidity for each location where observation buoys are installed, but rather predicts the area of high-turbidity water around the estuary of the river. The evaluation results showed that the highest accuracy was obtained when RFR with trees = 100 and depth = 100 was used with the TSS product and meteorological/oceanographic data from the previous day as inputs, giving a determination coefficient of 0.552, a correlation coefficient of 0.746, and an RMSE of 7.260 km² against an area range of 0 to 52.50 km². The accuracy improved to R² = 0.636, R = 0.810, and RMSE = 6.081 km² when data with five or more consecutive days of daily maximum instantaneous wind speeds exceeding 10 m/s were excluded.
In the importance analysis of the RFR, the most important factor for prediction was the area of high-turbidity water on the previous day, followed by wind and tide level due to the study area being an inland sea. On the other hand, rainfall had a smaller impact because there are three flood-control dams on the Kuma River and approximately 3000 hectares of rice paddies in the surrounding area. However, it is highly likely that the accuracy and parameter contributions will differ in other ocean areas where these conditions are different, and particularly the impact of rainfall is expected to be more significant. In addition, we excluded narrow bays in this study due to limitations of the spatial resolution of GOCI. However, since the occurrence and diffusion mechanisms of turbidity in narrow bays may differ from one another, the prediction of turbidity in narrow bays using a higher resolution image is a topic that should be addressed in the future.
In this study, we have demonstrated the possibility of predicting the area of high-turbidity water around an estuary on the following day using satellite TSS products and local meteorological and oceanographic data as inputs. This result is expected to lead to research that can predict damage to rich aquatic ecosystems and aquaculture farms caused by turbid water and provide information for evacuating aquaculture facilities. Our future work is to evaluate its applicability to other ocean areas, improve its accuracy, and predict its distribution area.

Author Contributions

Conceptualization, H.T.; methodology, K.N. and H.T.; software, K.N.; validation, K.N.; formal analysis, K.N.; investigation, K.N.; resources, K.N. and H.T.; data curation, K.N.; writing—original draft preparation, K.N.; writing—review and editing, H.T.; visualization, K.N. and H.T.; supervision, H.T.; project administration, H.T.; funding acquisition, H.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to thank Ken Endo, Shinichi Shida, Toshifumi Hiramatsu, and Ayata Susa (Pasco Corporation); and Katsuya Saito and Takashi Yabuki (Japan Fisheries Information Service Center) for valuable discussions on the damage of turbid water in aquaculture and its countermeasures.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ministry of Land, Infrastructure, Transport and Tourism. Available online: https://www.mlit.go.jp/river/shishin_guideline/kasen/suishitsu/houhou.html (accessed on 9 December 2022).
  2. Kinoshita, A.; Fujita, M.; Mizuyama, T.; Sawada, T. An evaluation method of the impacts on char of turbid water by sediment flushing from dams. Proc. Hydraul. Eng 2003, 47, 1129–1134.
  3. Kinoshita, A.; Fujita, M.; Tagawa, M.; Mizuyama, T.; Sawada, T. The physiological impact of turbid water caused by sediment flushing on fish and a prediction method. Jour. Jpn. Soc. Eros. Control Eng. 2005, 58, 34–43.
  4. Hori, M.; Wakabayashi, H.; Yamamoto, K.; Kato, S.; Kojima, T. Assessment of influence of sediment flushing from dam on fishery products. Bull. Soc. Sea Water Sci. Jpn. 2007, 61, 352–359.
  5. Kinoshita, A.; Fujita, M.; Mizuyama, T.; Sawada, T. Study about the decrease of the local refuge space of chars at mountain stream by sediment deposition on bed. J. Jpn. Soc. Civ. Eng. B1 2012, 64, 1117–1122.
  6. Muraoka, K.; Amano, K.; Doi, T.; Kubota, H.; Miwa, J. Effects of suspended solid concentrations and particle size on survival of ayu (plecoglossus altivelis altivelis). Jpn. J. Ichthyol. 2011, 58, 141–151.
  7. Kumai, H. Studies on bluefin tuna artificial hatching, rearing and reproduction. Jpn. Soc. Sci. Fish. Sci. 1998, 64, 601–605.
  8. Ishida, N.; Yamatogi, T.; Ura, K.; Hirae, S.; Aoki, K.; Koike, K. Mortality factors of cultured bluefin tuna thunnus orientalis in the coastal area of Tsushima, Nagasaki prefecture, Japan. Jpn. Soc. Fish. Sci. 2017, 83, 41–51.
  9. Arakawa, H.; Matsuike, K. Influence on sedimentation velocity of brown algae zoospores and their base-plate insertion exerted by suspended matters. Japan. Jpn. Soc. Fish. Sci. 1990, 56, 1741–1748.
  10. Suzuki, Y.; Maruyama, T.; Miura, A. Effect of suspended matters on the adhesion of porphyra yezoensis conchospores. Jpn. Soc. Civil Eng. 1997, 580, 19–26.
  11. Suzuki, Y.; Maruyama, T.; Miura, A.; Shin, J. Effects of suspended or accumulated kaolinite particles on adhesion and germination of porphyra yezoensis conchospores. Jpn. Soc. Civil. Eng. 1997, 559, 73–79.
  12. Wang, Y.; Chen, J.; Cai, H.; Yu, Q.; Zhou, Z. Predicting water turbidity in a macro-tidal coastal bay using machine learning approaches. Estuar. Coast. Shelf Sci. 2021, 252, 107276.
  13. Zhang, Y.; Yao, X.; Wu, Q.; Huang, Y.; Zhou, Z.; Jun, Y.; Xiaowei, L. Turbidity prediction of lake-type raw water using random forest model based on meteorological data: A case study of Tai lake, China. J. Environ. Manag. 2021, 290, 112657.
  14. Tsai, T.M.; Yen, P.H. GMDH algorithms applied to turbidity forecasting. Appl. Water Sci. 2017, 7, 1151–1160.
  15. Alizadeh, M.J.; Kavianpour, M.R.; Danesh, M.; Jason, A.; Shahabbodin, S.; Kwok, W.C. Effect of river flow on the quality of estuarine and coastal waters using machine learning models. Eng. Appl. Comput. Fluid Mech. 2018, 12, 810–823.
  16. Kumar, L.; Afzal, M.S.; Ahmad, A. Prediction of water turbidity in a marine environment using machine learning: A case study of Hong Kong. Reg. Stud. Mar. Sci. 2022, 52, 102260.
  17. Sakuno, Y. Accuracy evaluation of chlorophyll product data of geostationary ocean color satellite, “GOCI” in inner bay. J. Jpn. Soc. Civ. Eng. B3 2012, 68, I_582–I_587.
  18. Korea Ocean Satellite Center. Available online: https://kosc.kiost.ac.kr/index.nm?menuCd=48&lang=en (accessed on 9 December 2022).
  19. Hori, M.; Murakami, H.; Miyazaki, R.; Honda, Y.; Nasahara, K.; Kajiwara, K.; Nakajima, T.Y.; Irie, H.; Toratani, M.; Hirakawa, T.; et al. GCOM-C data validation plan for land, atmosphere, ocean, and cryosphere. Trans. Jpn. Soc. Aeronaut. Space Sci. 2018, 16, 218–223. [Google Scholar] [CrossRef] [Green Version]
  20. Chen, S.; Han, L.; Chen, X.; Li, D.; Sun, L.; Li, Y. Estimating wide range total suspended solids concentrations from MODIS 250-m imageries: An improved method. ISPRS J. Photogramm. Remote Sens. 2015, 99, 58–69. [Google Scholar] [CrossRef]
  21. Ishizaka, J.; Kusunoki, T.; Youngie, P. Geostationary Ocean Color Mission (GOCI-I, II). Bull. Coast. Oceanogr. 2016, 54, 23–28. [Google Scholar]
  22. Marine Satellite Data Online Analysis Platform. Available online: https://www.satco2.com/index.php?m=content&c=index&a=show&catid=317&id=179 (accessed on 9 December 2022).
  23. Osinka, M.; Bialik, R.J.; Wójcik-Długoborska, K.A. Interrelation of quality parameters of surface waters in five tidewater glacier coves of King George Island, Antarctica. Sci. Total Environ. 2021, 771, 144780. [Google Scholar] [CrossRef]
  24. Ministry of the Environment, Japan. 2006. Available online: https://www.env.go.jp/council/20ari-yatsu/y200-23/mat02_3-9.pdf (accessed on 9 December 2022).
  25. Geospatial Information Authority of Japan. Available online: https://maps.gsi.go.jp/vector/ (accessed on 9 December 2022).
  26. Japan Meteorological Agency. Available online: https://www.data.jma.go.jp/obd/stats/data/kaisetu/shishin/shishin_all.pdf (accessed on 9 December 2022).
  27. Kobayashi, T.; Satoshi Shirai, S.; Kitadate, S. AMeDAS: Supporting Mitigation and Minimization of Weather-related Disasters. FUJITSU Sci. Tech. J. 2017, 53, 53–61. [Google Scholar]
  28. Japan Meteorological Agency. Available online: https://www.data.jma.go.jp/kaiyou/db/tide/suisan/index.php (accessed on 9 December 2022).
  29. Kumamoto Prefecture. Available online: https://www.pref.kumamoto.jp/uploaded/life/92667_133166_misc.pdf (accessed on 9 December 2022).
  30. Ministry of Agriculture, Forestry and Fisheries. Available online: https://www.maff.go.jp/j/nousin/kanri/pdf/attach/02-4.pdf (accessed on 9 December 2022).
Figure 1. Process flow of the proposed method.
Figure 2. Location of the study area. Original map was provided by Geospatial Information Authority of Japan [25].
Figure 3. Locations of observation stations in the Kuma River catchment area: (1) Yatsushiro, (2) Isshochi, (3) Yamae, (4) Hitoyoshi, (5) Itsuki, (6) Kami, (7) Taraki, and (8) Yamae-Yokotani. Original map was provided by Geospatial Information Authority of Japan [25].
Figure 4. (a) Landsat-8 visible image observed over the study area at 9:45 on 17 November 2020 JST; (b) TSS image observed by GOCI-I at 10:16 on the same day, shown with a range of 0 to 10 mg/L; and (c) binarized image from (b) with a threshold of 4 mg/L.
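The binarization in panel (c) of Figure 4, and the conversion from a pixel count to an area, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the toy TSS grid, the use of `None` for masked (cloud or land) pixels, and the strict `>` comparison are all hypothetical. Only the 4 mg/L threshold (Figure 4 caption) and the 500 m pixel size (Table 1) come from the article.

```python
from typing import List, Optional

PIXEL_SIZE_KM = 0.5     # GOCI-I spatial resolution of 500 m (Table 1)
THRESHOLD_MG_L = 4.0    # threshold quoted in the Figure 4 caption

Grid = List[List[Optional[float]]]


def binarize_tss(tss: Grid, threshold: float = THRESHOLD_MG_L) -> List[List[bool]]:
    """Mark pixels whose TSS exceeds the threshold; masked pixels (None) are False."""
    return [[v is not None and v > threshold for v in row] for row in tss]


def high_turbidity_area_km2(mask: List[List[bool]],
                            pixel_size_km: float = PIXEL_SIZE_KM) -> float:
    """Convert the number of flagged pixels into an area in square kilometres."""
    n_pixels = sum(1 for row in mask for flagged in row if flagged)
    return n_pixels * pixel_size_km ** 2


# Toy 3 x 3 TSS grid in mg/L (hypothetical values, None = masked pixel).
tss_grid: Grid = [
    [1.2, 4.5, 6.0],
    [None, 3.9, 4.0],
    [8.1, 2.0, 5.5],
]
mask = binarize_tss(tss_grid)
area = high_turbidity_area_km2(mask)
print(f"high-turbidity area: {area} km2")
```

Repeating the call with different `threshold` values gives the kind of threshold-versus-pixel-count curve that Figure 5 describes.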
Figure 5. Relationship between the threshold value for extracting high-turbidity water and the number of pixels extracted as high-turbidity pixels.
Figure 6. Scatter plot of observed and predicted normalized areas of high-turbidity water for each case of N = 1 to 9 in Evaluation (A) using RFR with trees = 100 and depth = 100, with the kernel function set to RBF.
Figure 7. Scatter plot of observed and predicted normalized areas of high-turbidity water for each case of N = 1 to 9 using the best model (RFR with trees = 100 and depth = 100) trained on the data excluding samples of five or more consecutive days with maximum instantaneous wind speeds exceeding 10 m/s.
Table 1. Spectral bands of the GOCI-I instrument.

| Band | Center Wavelength | Band Width | Spatial Resolution |
|---|---|---|---|
| 1 | 412 nm | 20 nm | 500 m |
| 2 | 443 nm | 20 nm | 500 m |
| 3 | 490 nm | 20 nm | 500 m |
| 4 | 555 nm | 20 nm | 500 m |
| 5 | 660 nm | 20 nm | 500 m |
| 6 | 680 nm | 10 nm | 500 m |
| 7 | 745 nm | 20 nm | 500 m |
| 8 | 865 nm | 40 nm | 500 m |
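Table 2. Coefficient of determination (R²), correlation coefficient (R), and RMSE (km²) for each model in each case of N = 1 to 9 in Evaluation (A).

| Model | Index | N = 1 | N = 2 | N = 3 | N = 4 | N = 5 | N = 6 | N = 7 | N = 8 | N = 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| SVR (ε = 0.1, C = 1.0) | R² | 0.237 | 0.109 | 0.073 | 0.050 | 0.021 | 0.028 | 0.018 | −0.016 | −0.034 |
| SVR (ε = 0.1, C = 1.0) | R | 0.561 | 0.433 | 0.390 | 0.381 | 0.343 | 0.350 | 0.326 | 0.271 | 0.238 |
| SVR (ε = 0.1, C = 1.0) | RMSE (km²) | 9.463 | 10.234 | 10.435 | 10.569 | 10.725 | 10.684 | 10.742 | 10.928 | 11.026 |
| SVR (ε = 0.01, C = 1.0) | R² | 0.373 | 0.222 | 0.140 | 0.132 | 0.107 | 0.108 | 0.096 | 0.068 | 0.046 |
| SVR (ε = 0.01, C = 1.0) | R | 0.635 | 0.506 | 0.388 | 0.381 | 0.345 | 0.348 | 0.329 | 0.270 | 0.228 |
| SVR (ε = 0.01, C = 1.0) | RMSE (km²) | 8.579 | 9.559 | 10.054 | 10.096 | 10.241 | 10.232 | 10.308 | 10.467 | 10.587 |
| SVR (ε = 0.1, C = 10.0) | R² | 0.251 | 0.120 | 0.097 | 0.066 | 0.017 | 0.056 | 0.025 | −0.013 | −0.030 |
| SVR (ε = 0.1, C = 10.0) | R | 0.568 | 0.429 | 0.417 | 0.388 | 0.337 | 0.350 | 0.329 | 0.284 | 0.257 |
| SVR (ε = 0.1, C = 10.0) | RMSE (km²) | 9.375 | 10.165 | 10.287 | 10.465 | 10.727 | 10.527 | 10.706 | 10.907 | 11.005 |
| SVR (ε = 0.01, C = 10.0) | R² | 0.402 | 0.248 | 0.194 | 0.161 | 0.111 | 0.137 | 0.107 | 0.072 | 0.056 |
| SVR (ε = 0.01, C = 10.0) | R | 0.645 | 0.511 | 0.459 | 0.422 | 0.371 | 0.378 | 0.340 | 0.283 | 0.255 |
| SVR (ε = 0.01, C = 10.0) | RMSE (km²) | 8.378 | 9.389 | 9.718 | 9.910 | 10.198 | 10.059 | 10.242 | 10.438 | 10.534 |
| RFR (trees = 100, depth = 100) | R² | 0.552 | 0.532 | 0.544 | 0.531 | 0.519 | 0.480 | 0.518 | 0.504 | 0.493 |
| RFR (trees = 100, depth = 100) | R | 0.746 | 0.743 | 0.742 | 0.741 | 0.729 | 0.710 | 0.738 | 0.718 | 0.719 |
| RFR (trees = 100, depth = 100) | RMSE (km²) | 7.260 | 7.418 | 7.326 | 7.425 | 7.521 | 7.821 | 7.531 | 7.640 | 7.723 |
| RFR (trees = 1000, depth = 100) | R² | 0.530 | 0.524 | 0.524 | 0.509 | 0.511 | 0.517 | 0.520 | 0.515 | 0.516 |
| RFR (trees = 1000, depth = 100) | R | 0.737 | 0.736 | 0.736 | 0.726 | 0.725 | 0.728 | 0.729 | 0.725 | 0.725 |
| RFR (trees = 1000, depth = 100) | RMSE (km²) | 7.436 | 7.478 | 7.483 | 7.596 | 7.581 | 7.535 | 7.514 | 7.552 | 7.547 |
| RFR (trees = 100, depth = 1000) | R² | 0.537 | 0.517 | 0.526 | 0.520 | 0.472 | 0.525 | 0.534 | 0.512 | 0.520 |
| RFR (trees = 100, depth = 1000) | R | 0.744 | 0.729 | 0.738 | 0.730 | 0.709 | 0.735 | 0.736 | 0.724 | 0.729 |
| RFR (trees = 100, depth = 1000) | RMSE (km²) | 7.380 | 7.537 | 7.462 | 7.514 | 7.880 | 7.476 | 7.404 | 7.577 | 7.516 |
| RFR (trees = 1000, depth = 1000) | R² | 0.538 | 0.520 | 0.520 | 0.517 | 0.511 | 0.509 | 0.515 | 0.508 | 0.515 |
| RFR (trees = 1000, depth = 1000) | R | 0.741 | 0.734 | 0.731 | 0.729 | 0.726 | 0.725 | 0.726 | 0.721 | 0.725 |
| RFR (trees = 1000, depth = 1000) | RMSE (km²) | 7.370 | 7.516 | 7.513 | 7.539 | 7.586 | 7.599 | 7.553 | 7.605 | 7.549 |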
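The three indices in Table 2 can be computed as in the sketch below. It assumes the common definition of the coefficient of determination, 1 − SS_res/SS_tot, which is consistent with the negative R² entries in the table (the square of R could never be negative). The function names and the toy observed/predicted values are hypothetical and are not taken from the article.

```python
import math
from typing import Sequence


def r2_score(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (can be negative)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def pearson_r(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between observations and predictions."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root-mean-square error, in the same units as the inputs (km2 here)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


# Toy areas in km2 (hypothetical). The predictions follow the trend of the
# observations but are biased and compressed, so R is high while R2 is negative.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [40.0, 42.0, 44.0, 46.0]

print("R2  :", r2_score(observed, predicted))
print("R   :", pearson_r(observed, predicted))
print("RMSE:", rmse(observed, predicted))
```

The toy case shows why a row can pair a positive R with a negative R²: correlation ignores bias and scale, whereas R² penalizes any prediction that does worse than the observed mean.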
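Table 3. Feature importance analysis results for N = 1 to 9 for each input variable in RFR: rainfall (RF), wind vector (WV), barometric pressure (BP), tide level (TL), and area of high-turbidity water (HT).

| Model | Variable | N = 1 | N = 2 | N = 3 | N = 4 | N = 5 | N = 6 | N = 7 | N = 8 | N = 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| RFR (trees = 100, depth = 100) | RF | 0.001 | 0.012 | 0.033 | 0.039 | 0.051 | 0.058 | 0.068 | 0.053 | 0.032 |
| RFR (trees = 100, depth = 100) | WV | 0.238 | 0.252 | 0.242 | 0.238 | 0.204 | 0.202 | 0.223 | 0.234 | 0.224 |
| RFR (trees = 100, depth = 100) | BP | 0.012 | 0.015 | 0.018 | 0.012 | 0.015 | 0.021 | 0.016 | 0.020 | 0.021 |
| RFR (trees = 100, depth = 100) | TL | 0.062 | 0.085 | 0.070 | 0.085 | 0.109 | 0.111 | 0.084 | 0.115 | 0.097 |
| RFR (trees = 100, depth = 100) | HT | 0.688 | 0.637 | 0.637 | 0.626 | 0.621 | 0.608 | 0.609 | 0.578 | 0.626 |
| RFR (trees = 1000, depth = 100) | RF | 0.001 | 0.017 | 0.032 | 0.047 | 0.045 | 0.048 | 0.048 | 0.046 | 0.044 |
| RFR (trees = 1000, depth = 100) | WV | 0.249 | 0.242 | 0.233 | 0.217 | 0.224 | 0.223 | 0.229 | 0.230 | 0.236 |
| RFR (trees = 1000, depth = 100) | BP | 0.013 | 0.014 | 0.017 | 0.016 | 0.016 | 0.017 | 0.016 | 0.018 | 0.017 |
| RFR (trees = 1000, depth = 100) | TL | 0.075 | 0.077 | 0.085 | 0.096 | 0.100 | 0.095 | 0.098 | 0.105 | 0.106 |
| RFR (trees = 1000, depth = 100) | HT | 0.662 | 0.650 | 0.633 | 0.624 | 0.615 | 0.617 | 0.608 | 0.601 | 0.597 |
| RFR (trees = 100, depth = 1000) | RF | 0.001 | 0.023 | 0.030 | 0.040 | 0.059 | 0.034 | 0.048 | 0.044 | 0.031 |
| RFR (trees = 100, depth = 1000) | WV | 0.263 | 0.229 | 0.229 | 0.229 | 0.219 | 0.232 | 0.235 | 0.242 | 0.231 |
| RFR (trees = 100, depth = 1000) | BP | 0.015 | 0.017 | 0.022 | 0.015 | 0.014 | 0.021 | 0.015 | 0.020 | 0.022 |
| RFR (trees = 100, depth = 1000) | TL | 0.075 | 0.100 | 0.079 | 0.082 | 0.094 | 0.079 | 0.081 | 0.077 | 0.106 |
| RFR (trees = 100, depth = 1000) | HT | 0.646 | 0.631 | 0.641 | 0.635 | 0.614 | 0.634 | 0.621 | 0.617 | 0.609 |
| RFR (trees = 1000, depth = 1000) | RF | 0.001 | 0.017 | 0.033 | 0.048 | 0.042 | 0.044 | 0.055 | 0.046 | 0.049 |
| RFR (trees = 1000, depth = 1000) | WV | 0.250 | 0.248 | 0.229 | 0.219 | 0.223 | 0.221 | 0.226 | 0.233 | 0.229 |
| RFR (trees = 1000, depth = 1000) | BP | 0.013 | 0.014 | 0.016 | 0.015 | 0.015 | 0.017 | 0.018 | 0.017 | 0.016 |
| RFR (trees = 1000, depth = 1000) | TL | 0.068 | 0.078 | 0.088 | 0.091 | 0.096 | 0.101 | 0.092 | 0.110 | 0.102 |
| RFR (trees = 1000, depth = 1000) | HT | 0.668 | 0.643 | 0.634 | 0.626 | 0.625 | 0.616 | 0.609 | 0.594 | 0.603 |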
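Table 3 reports one importance value per input variable, although the model inputs are hourly series. One plausible way to obtain such per-variable values is to sum the per-feature importances of all columns belonging to the same variable. The sketch below illustrates that aggregation only. It is an assumption about the procedure, not a description of the authors' implementation, and the column naming scheme (`WV_h00`, `HT_d1`, and so on), the helper name, and the numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence


def group_importances(feature_names: Sequence[str],
                      importances: Sequence[float]) -> Dict[str, float]:
    """Sum per-feature importances by the variable prefix before the first '_'."""
    totals: Dict[str, float] = defaultdict(float)
    for name, value in zip(feature_names, importances):
        variable = name.split("_", 1)[0]   # e.g. "WV_h00" -> "WV"
        totals[variable] += value
    return dict(totals)


# Hypothetical per-feature importances, e.g. as returned by a fitted random
# forest; a real model would have one entry per hourly input column.
names = ["HT_d1", "WV_h00", "WV_h01", "TL_h00", "RF_h00"]
values = [0.60, 0.15, 0.10, 0.10, 0.05]

grouped = group_importances(names, values)
for variable, total in sorted(grouped.items(), key=lambda kv: -kv[1]):
    print(f"{variable}: {total:.3f}")
```

Because impurity-based importances of a random forest sum to one across features, the grouped values also sum to one, which matches the columns of Table 3 up to rounding.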
Table 4. Coefficient of determination (R²), correlation coefficient (R), and RMSE (km²) in each case of N = 1 to 9 using the best model (RFR with trees = 100 and depth = 100) trained on the data excluding samples of five or more consecutive days with maximum instantaneous wind speeds exceeding 10 m/s.

| Model | Index | N = 1 | N = 2 | N = 3 | N = 4 | N = 5 | N = 6 | N = 7 | N = 8 | N = 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| RFR (trees = 100, depth = 100) | R² | 0.636 | 0.632 | 0.633 | 0.626 | 0.601 | 0.619 | 0.613 | 0.617 | 0.600 |
| RFR (trees = 100, depth = 100) | R | 0.810 | 0.804 | 0.800 | 0.794 | 0.775 | 0.788 | 0.784 | 0.788 | 0.777 |
| RFR (trees = 100, depth = 100) | RMSE (km²) | 6.081 | 6.110 | 6.100 | 6.165 | 6.367 | 6.217 | 6.269 | 6.238 | 6.375 |
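The exclusion rule in the Table 4 caption can be read as: find every run of at least five consecutive days on which the maximum instantaneous wind speed exceeded the threshold, and drop the samples on those days. The sketch below implements that reading. The interpretation, the function name, and the toy gust series are assumptions for illustration, and the threshold is taken to be in m/s.

```python
from typing import Sequence, Set


def days_in_long_windy_runs(max_gust: Sequence[float],
                            threshold: float = 10.0,
                            min_run: int = 5) -> Set[int]:
    """Return indices of days inside runs of >= min_run consecutive windy days."""
    flagged: Set[int] = set()
    run_start = None
    # A trailing sentinel below any threshold closes a run that reaches the end.
    for i, gust in enumerate(list(max_gust) + [float("-inf")]):
        if gust > threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                flagged.update(range(run_start, i))
            run_start = None
    return flagged


# Daily maximum instantaneous wind speed (hypothetical values).
gusts = [8.0, 11.0, 12.0, 13.0, 11.0, 12.0, 9.0, 11.0, 12.0, 7.0]

excluded = days_in_long_windy_runs(gusts)
kept_days = [day for day in range(len(gusts)) if day not in excluded]
print("excluded days:", sorted(excluded))
print("kept days    :", kept_days)
```

In the toy series, days 1 to 5 form a five-day run and are excluded, while the two-day run on days 7 and 8 is kept because it is shorter than `min_run`.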
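Table 5. Coefficient of determination (R²), correlation coefficient (R), and RMSE (km²) for each model in each case of n = 1 to 9 in Evaluation (B).

| Model | Index | n = 1 | n = 2 | n = 3 | n = 4 | n = 5 | n = 6 | n = 7 | n = 8 | n = 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| SVR (ε = 0.1, C = 1.0) | R² | −0.027 | 0.012 | 0.077 | 0.007 | −0.043 | −0.090 | −0.057 | −0.161 | −0.086 |
| SVR (ε = 0.1, C = 1.0) | R | 0.197 | 0.222 | 0.329 | 0.238 | 0.144 | 0.043 | 0.142 | 0.049 | 0.140 |
| SVR (ε = 0.1, C = 1.0) | RMSE (km²) | 10.987 | 10.777 | 10.417 | 10.803 | 11.071 | 11.319 | 11.151 | 11.684 | 11.300 |
| SVR (ε = 0.01, C = 1.0) | R² | 0.026 | 0.022 | 0.074 | 0.040 | −0.031 | −0.043 | 0.022 | −0.155 | −0.023 |
| SVR (ε = 0.01, C = 1.0) | R | 0.220 | 0.206 | 0.289 | 0.233 | 0.125 | 0.099 | 0.205 | 0.055 | 0.166 |
| SVR (ε = 0.01, C = 1.0) | RMSE (km²) | 10.704 | 10.723 | 10.436 | 10.622 | 11.008 | 11.074 | 10.721 | 11.654 | 10.970 |
| SVR (ε = 0.1, C = 10.0) | R² | −0.126 | −0.032 | 0.031 | −0.019 | −0.104 | −0.206 | −0.198 | −0.376 | −0.148 |
| SVR (ε = 0.1, C = 10.0) | R | 0.178 | 0.210 | 0.348 | 0.247 | 0.173 | 0.034 | 0.059 | 0.011 | 0.170 |
| SVR (ε = 0.1, C = 10.0) | RMSE (km²) | 11.506 | 11.015 | 10.676 | 10.946 | 11.394 | 11.906 | 11.866 | 12.720 | 11.619 |
| SVR (ε = 0.01, C = 10.0) | R² | −0.122 | −0.030 | 0.052 | −0.005 | −0.124 | −0.160 | −0.174 | −0.392 | −0.163 |
| SVR (ε = 0.01, C = 10.0) | R | 0.170 | 0.202 | 0.354 | 0.242 | 0.136 | 0.091 | 0.120 | 0.037 | 0.163 |
| SVR (ε = 0.01, C = 10.0) | RMSE (km²) | 11.485 | 11.004 | 10.557 | 10.870 | 11.498 | 11.677 | 11.747 | 12.793 | 11.691 |
| RFR (trees = 100, depth = 100) | R² | −0.098 | −0.103 | 0.044 | −0.013 | −0.048 | −0.127 | −0.102 | −0.222 | −0.183 |
| RFR (trees = 100, depth = 100) | R | 0.201 | 0.253 | 0.340 | 0.259 | 0.193 | 0.146 | 0.101 | 0.058 | 0.079 |
| RFR (trees = 100, depth = 100) | RMSE (km²) | 11.360 | 11.390 | 10.600 | 10.913 | 11.100 | 11.512 | 11.384 | 11.986 | 11.795 |
| RFR (trees = 1000, depth = 100) | R² | −0.058 | −0.131 | 0.050 | 0.019 | −0.037 | −0.078 | −0.096 | −0.192 | −0.164 |
| RFR (trees = 1000, depth = 100) | R | 0.223 | 0.219 | 0.336 | 0.274 | 0.193 | 0.175 | 0.106 | 0.054 | 0.089 |
| RFR (trees = 1000, depth = 100) | RMSE (km²) | 11.153 | 11.530 | 10.566 | 10.738 | 11.041 | 11.258 | 11.349 | 11.837 | 11.700 |
| RFR (trees = 100, depth = 1000) | R² | −0.058 | −0.121 | 0.060 | −0.015 | −0.069 | −0.071 | −0.091 | −0.203 | −0.141 |
| RFR (trees = 100, depth = 1000) | R | 0.221 | 0.246 | 0.345 | 0.255 | 0.169 | 0.184 | 0.116 | 0.051 | 0.099 |
| RFR (trees = 100, depth = 1000) | RMSE (km²) | 11.155 | 11.480 | 10.511 | 10.924 | 11.211 | 11.222 | 11.327 | 11.895 | 11.580 |
| RFR (trees = 1000, depth = 1000) | R² | −0.064 | −0.130 | 0.035 | 0.024 | −0.048 | −0.066 | −0.100 | −0.226 | −0.166 |
| RFR (trees = 1000, depth = 1000) | R | 0.224 | 0.223 | 0.326 | 0.280 | 0.185 | 0.187 | 0.097 | 0.036 | 0.086 |
| RFR (trees = 1000, depth = 1000) | RMSE (km²) | 11.183 | 11.526 | 10.653 | 10.711 | 11.099 | 11.194 | 11.371 | 12.008 | 11.711 |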
Table 6. Coefficient of determination (R²), correlation coefficient (R), and RMSE (km²) for each model for each meteorological and oceanographic parameter, in Evaluation (C) with N = 1. Each parameter was separately input to each model.

| Model | Index | Rainfall | Wind Vector | Barometric Pressure | Tide Level | Area of High-Turbidity Water |
|---|---|---|---|---|---|---|
| SVR (ε = 0.1, C = 1.0) | R² | −0.011 | −0.039 | −0.009 | 0.088 | 0.354 |
| SVR (ε = 0.1, C = 1.0) | R | 0.080 | 0.209 | 0.019 | 0.305 | 0.646 |
| SVR (ε = 0.1, C = 1.0) | RMSE (km²) | 10.903 | 11.053 | 10.894 | 10.356 | 8.713 |
| SVR (ε = 0.01, C = 1.0) | R² | −0.138 | 0.050 | −0.156 | 0.046 | 0.474 |
| SVR (ε = 0.01, C = 1.0) | R | 0.082 | 0.228 | 0.028 | 0.325 | 0.712 |
| SVR (ε = 0.01, C = 1.0) | RMSE (km²) | 11.565 | 10.571 | 11.658 | 10.593 | 7.861 |
| SVR (ε = 0.1, C = 10.0) | R² | −0.014 | −0.039 | −0.113 | 0.076 | 0.335 |
| SVR (ε = 0.1, C = 10.0) | R | 0.073 | 0.209 | 0.019 | 0.339 | 0.621 |
| SVR (ε = 0.1, C = 10.0) | RMSE (km²) | 10.918 | 11.053 | 11.439 | 10.421 | 8.841 |
| SVR (ε = 0.01, C = 10.0) | R² | −0.149 | 0.042 | −0.084 | 0.033 | 0.368 |
| SVR (ε = 0.01, C = 10.0) | R | 0.062 | 0.222 | 0.125 | 0.327 | 0.652 |
| SVR (ε = 0.01, C = 10.0) | RMSE (km²) | 11.624 | 10.615 | 11.292 | 10.664 | 8.617 |
| RFR (trees = 100, depth = 100) | R² | 0.055 | −0.203 | −0.205 | −0.178 | 0.350 |
| RFR (trees = 100, depth = 100) | R | 0.236 | 0.033 | −0.057 | 0.282 | 0.676 |
| RFR (trees = 100, depth = 100) | RMSE (km²) | 10.538 | 11.891 | 11.902 | 11.766 | 8.743 |
| RFR (trees = 1000, depth = 100) | R² | 0.060 | −0.163 | −0.209 | −0.133 | 0.376 |
| RFR (trees = 1000, depth = 100) | R | 0.245 | 0.060 | −0.047 | 0.284 | 0.680 |
| RFR (trees = 1000, depth = 100) | RMSE (km²) | 10.514 | 11.691 | 11.922 | 11.541 | 8.566 |
| RFR (trees = 100, depth = 1000) | R² | 0.065 | −0.142 | −0.221 | −0.127 | 0.377 |
| RFR (trees = 100, depth = 1000) | R | 0.257 | 0.069 | −0.068 | 0.299 | 0.679 |
| RFR (trees = 100, depth = 1000) | RMSE (km²) | 10.483 | 11.586 | 11.983 | 11.513 | 8.559 |
| RFR (trees = 1000, depth = 1000) | R² | 0.060 | −0.165 | −0.210 | −0.112 | 0.373 |
| RFR (trees = 1000, depth = 1000) | R | 0.246 | 0.051 | −0.053 | 0.289 | 0.678 |
| RFR (trees = 1000, depth = 1000) | RMSE (km²) | 10.512 | 11.704 | 11.926 | 11.436 | 8.583 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Nagayama, K.; Tonooka, H. Prediction of the Area of High-Turbidity Water in the Yatsushiro Sea, Japan, Using Machine Learning with Satellite, Meteorological, and Oceanographic Data. Remote Sens. 2023, 15, 1652. https://doi.org/10.3390/rs15061652

