Retrieval and Evaluation of Chlorophyll-A Spatiotemporal Variability Using GF-1 Imagery: Case Study of Qinzhou Bay, China

Chlorophyll-a (Chl-a) concentration is a measure of phytoplankton biomass, and has been used to identify ‘red tide’ events. However, nearshore waters are optically complex, making the accurate determination of the chlorophyll-a concentration challenging. Therefore, in this study, a typical area affected by the Phaeocystis ‘red tide’ bloom, Qinzhou Bay, was selected as the study area. Based on the Gaofen-1 remote sensing satellite image and water quality monitoring data, the sensitive bands and band combinations of the nearshore Chl-a concentration of Qinzhou Bay were screened, and a Qinzhou Bay Chl-a retrieval model was constructed through stepwise regression analysis. The main conclusions of this work are as follows: (1) The Chl-a concentration retrieval regression model based on 1/B4 (near-infrared band (NIR)) has the best accuracy (R2 = 0.67, root-mean-square-error = 0.70 μg/L, and mean absolute percentage error = 0.23) for the remote sensing of Chl-a concentration in Qinzhou Bay. (2) The spatiotemporal distribution of Chl-a in Qinzhou Bay is varied, with lower concentrations (0.50 μg/L) observed near the shore and higher concentrations (6.70 μg/L) observed offshore, with a gradual decreasing trend over time (−0.8).


Introduction
Coastal waters are an important ecosystem that humans depend on. They play an important role in fisheries, industry, and tourism [1,2]. With the rapid development of coastal economies, the pressures on the natural environment along the coast from urban expansion, population growth, and industrialization are increasing. Many environmental problems caused by unreasonable planning, unscientific management, and uncoordinated production have also emerged [3].
The explosive growth of marine algae (i.e., phytoplankton blooms) can cause the sea color to change to red or brown, depending on the blooming species and concentration [4][5][6]. This phenomenon primarily occurs in nearshore environments, including upwelling shadows [7,8]. Phytoplankton blooms that have a toxic or harmful effect on marine life are called harmful algal blooms (HABs) [9,10]. Although some 'red tides' are harmless, these events can, for example, block the exchange of oxygen and cause aquatic organisms to die from oxygen deficiency [11,12].
Previous studies have reported that the coastal waters of China have been polluted by 'red tides' to varying degrees [13]. This is mainly reflected in the increasing frequency of 'red tides', the types of algae observed, and the areas affected by 'red tides' [14,15]. In the 1970s, only nine 'red tide' events were recorded in China. This number increased to 75 in the 1980s, and then to 262 in the 1990s [16]. The cumulative area of 'red tide' blooms from 2000 to 2019 reached 210,000 km² (http://www.mnr.gov.cn, accessed on 10 March 2021). These 'red tide' events have become an increasingly serious environmental problem in China [13,17,18].
The concentration of chlorophyll-a (Chl-a) is directly related to phytoplankton biomass and can be used as an indicator of the presence of 'red tides' [19,20]. Therefore, accurately monitoring Chl-a concentrations is critical for studying 'red tide' dynamics [21]. Traditional field monitoring is time-consuming and laborious, and some areas are difficult to reach; therefore, the needs of real-time and large-scale monitoring are difficult to meet. Remote-sensing monitoring can overcome these shortcomings through real-time, continuous, large-scale data collection, and has gradually become an important marine monitoring approach [22][23][24][25].
The optical properties of open-ocean water far from the shore are stable and mainly affected by phytoplankton, so oceanic Chl-a concentration can be accurately obtained from the ratio of blue to green bands [26]. However, the optical characteristics of water near the shore are affected by phytoplankton, total suspended matter (TSM), and colored dissolved organic matter (CDOM). Owing to these factors, the retrieval of Chl-a concentration remains a challenge [25,27]. The problem is especially serious for coastal waters with low Chl-a concentrations, where the signal-to-noise ratio of Chl-a is very low. Existing empirical [28], semi-analytical [29], analytical [30], and machine learning [31,32] retrieval methods have been demonstrated to have great potential in retrieving the concentration of nearshore Chl-a. However, because the geographical environment [33], riverine sources, and phytoplankton differ from region to region, the applicability of the aforementioned models is extremely limited. Therefore, it is necessary to build a retrieval model for Chl-a concentration in a specific region according to the actual situation.
The concentration of Chl-a in the coastal waters of Qinzhou Bay in Guangxi is at a relatively low level. Due to rapid economic development, the discharge of land-based pollutants has increased. In 2014, the Phaeocystis bloom blocked the cold source water intake system of a nuclear power plant [34]. This was the first incident in China that threatened the safety of nuclear power due to a bloom of algae. Based on this, this study screened out the significant variables that affect the concentration of Chl-a in Qinzhou Bay based on the Gaofen-1 (GF-1) image, and established a retrieval model for the concentration of Chl-a in Qinzhou Bay. The spatiotemporal distribution of Chl-a in Qinzhou Bay was discussed and its trend was determined, providing a reference method for the prevention and management of 'red tides' in the coastal waters of Qinzhou Bay.

Study Area
Qinzhou Bay is located in southern Guangxi, China, and is part of Beibu Bay, covering an area of 908.37 km² between 21°33′20″-21°54′30″ N and 108°28′20″-108°45′30″ E. The bay consists of inner (Maowei Sea) and outer (Qinzhou Bay) bays, and is generally 'gourd-shaped' [35]. Qinzhou Bay has a good ecological environment and rich solar radiation energy. The total radiation is 104-129 cal/cm³, the annual average temperature is about 21-23 °C, the annual average sunshine is 1801 h, and the annual average rainfall is 1500-1800 mm [36].

In Situ Data
Water quality monitoring data were obtained from the Guangxi Academy of Sciences. The sampling period extended from October 2017 to February 2018. Together with the Guangxi Academy of Sciences, the Institute of Oceanology, Chinese Academy of Sciences monitors algal blooms in Qinzhou Bay. The layout of the monitoring points is shown in Figure 1. The specific longitudes and latitudes of the sampling points are listed in Table 1. The concentrations of measured Chl-a are shown in Table 2. Sustainability 2021, 13, x FOR PEER REVIEW 3 of 13

Satellite Data
In this study, the remote sensing data are multispectral images from China's GF-1 satellite. The GF-1 satellite was launched on 26 April 2013. It has four wide field-of-view (WFV, 16 m spatial resolution) cameras and one panchromatic and multispectral (PMS) sensor [37]. The revisit cycle of GF-1 data is four days, which is shorter than the 16-day repeat cycle of Landsat data. Obtaining cloud-free remote sensing data matching the field monitoring time is thus more likely [38,39]. Based on the water quality sampling monitoring times, four GF-1 WFV images covering the whole study area with cloud cover of less than 20% were selected. The payload parameters of the GF-1 satellite are listed in Table 3. Information on the selected GF-1 images and field monitoring times is shown in Table 4. The water quality monitoring was conducted within two days before or after the GF-1 satellite imaging. The weather conditions were good, with a breeze and no precipitation.

Data Processing

Radiometric Calibration
By using Equation (1), the digital numbers of the GF-1 multispectral image were converted to radiance, which eliminates the error of the sensor itself. This step can be carried out with the radiometric calibration tool in the ENVI 5.3 image processing software.
$$L_e = \mathrm{gain} \times DN + \mathrm{bias} \quad (1)$$
where Le is the radiance, DN is the digital number observed by the sensor, and gain and bias are the calibration coefficients [40].
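As an illustration of the calibration step, the following minimal sketch applies a linear DN-to-radiance conversion, assuming the usual form in which radiance equals gain times DN plus bias. The function name and the example coefficients are hypothetical; real gain and bias values come from the calibration coefficients published for the specific sensor, and the study itself performed this step in ENVI 5.3 rather than in code.

```python
def dn_to_radiance(dn_values, gain, bias):
    """Convert digital numbers to radiance, assuming Le = gain * DN + bias."""
    return [gain * dn + bias for dn in dn_values]


# Hypothetical coefficients for illustration only (not GF-1 calibration values).
EXAMPLE_GAIN = 0.5
EXAMPLE_BIAS = 1.0

if __name__ == "__main__":
    pixel_dns = [0, 10, 200]
    print(dn_to_radiance(pixel_dns, EXAMPLE_GAIN, EXAMPLE_BIAS))
```

In practice the same expression would be applied band by band, since each band has its own coefficients.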

Atmospheric Correction
Most of the radiation received by satellite sensors originates from the atmosphere [41]. Therefore, atmospheric correction of remote sensing images is vital for water quality retrieval. The FLAASH module in ENVI 5.3, which is based on the MODTRAN 5 radiation transfer model and includes all MODTRAN atmospheric and aerosol types, can effectively remove water vapor and aerosol scattering effects as well as the adjacency effect between pixels [42]. Because the study area is at a low latitude, the atmospheric model was set to tropical and the aerosol type to maritime.

Orthorectification
The orthorectification of a remote-sensing image is conducted in order to correct the distortion that occurs in the process of imaging. It helps eliminate errors caused by terrain, camera geometry, and sensors, allowing us to obtain a geometrically consistent image [43]. The downloaded L1-level GF-1 imaging product has not been geometrically corrected, so its position is offset, but it carries a rational polynomial coefficient (RPC) file from which an accurate geographic location can be obtained. Therefore, for GF-1 image data, orthorectification based on the RPC file and RPC model is used to realize geometric correction [44]. This process can be implemented with the orthorectification tool of ENVI 5.3. The digital elevation model (DEM) used is the global terrain elevation data that comes with ENVI 5.3, with a resolution of 900 m.

Correlation Analysis and Regression Models
Previous studies have shown that the blue-green ratio can accurately estimate the concentration of chlorophyll-a in clear waters [26]. There is also a fluorescence peak in the red to near-infrared bands [45]. However, due to the complex composition of nearshore water, waters with different Chl-a concentrations present different spectral curves. Based on this, we combined the blue, green, red, and near-infrared bands, and used the Pearson correlation coefficient to find the bands and band combinations most sensitive to Chl-a concentration in Qinzhou Bay.
Based on the results of the correlation analysis, the regression relationship between the point values of remote sensing reflectance and the measured data was established following a stepwise regression analysis method. Variables were introduced to the model individually, and the F-test was conducted each time to remove insignificant variables and those causing multicollinearity. This is an iterative process that ends when no further significant variables enter the model, and the resulting variable set is taken as the preferred set of predictors.
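The correlation screening described above can be sketched as follows. This is a minimal illustration, not the SPSS workflow used in the study: the helper names, the particular band combinations, and the default threshold argument are assumptions, and the band order (B1 blue, B2 green, B3 red, B4 near-infrared) follows the GF-1 WFV convention.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def candidate_predictors(b1, b2, b3, b4):
    """Build a few illustrative single-band and band-combination predictors."""
    return {
        "B4": list(b4),
        "1/B4": [1.0 / v for v in b4],
        "B1/B2": [blue / green for blue, green in zip(b1, b2)],
        "B4/B3": [nir / red for nir, red in zip(b4, b3)],
    }


def screen_predictors(predictors, chl_a, threshold=0.7):
    """Keep predictors whose |r| against measured Chl-a exceeds the threshold."""
    kept = {}
    for name, values in predictors.items():
        r = pearson_r(values, chl_a)
        if abs(r) > threshold:
            kept[name] = r
    return kept
```

The predictors that pass this screen would then be the candidates offered to the stepwise regression; the stepwise procedure itself (F-test based entry and removal) is not reproduced here.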

Model Calibration and Validation
To minimize the effect of random sample selection on the modeling, we used the leave-one-out cross-validation (LOOCV) [46,47] method for model calibration and validation. In each round, 31 of the 32 data samples were used for modeling, while the remaining sample was used for model validation. This process was repeated 32 times so that each sample was used once for validation, which helps guard against over-fitting and under-fitting of the model.
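A minimal sketch of the leave-one-out procedure is given below, assuming a single-predictor ordinary least-squares line as the model being validated. The function names are hypothetical, and the inputs are plain Python lists of predictor values and measured values.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def loocv_predictions(x, y):
    """Hold out each sample once, fit on the rest, and predict the held-out one."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(slope * x[i] + intercept)
    return predictions
```

With 32 samples this loop fits 32 models on 31 samples each, and the list of held-out predictions can then be compared against the measured values.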
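To evaluate the performance of our retrieval model, the fitting coefficient (R2), root mean square error (RMSE), and mean absolute percentage error (MAPE) were used. The R2 value reflects the fit between the retrieval values and measured values, and is a statistical index used to evaluate the reliability of a regression model. RMSE is the square root of the mean squared deviation between the retrieval and measured values, and is used to measure the deviation between them. MAPE reflects the relative error between the retrieval and measured values. The calculation formulas of each evaluation index are as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_{\text{retrieval},i} - y_{\text{in-situ},i}\right)^2}{\sum_{i=1}^{n}\left(y_{\text{in-situ},i} - \bar{y}_{\text{in-situ}}\right)^2} \quad (2)$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{\text{retrieval},i} - y_{\text{in-situ},i}\right)^2} \quad (3)$$
$$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_{\text{retrieval},i} - y_{\text{in-situ},i}}{y_{\text{in-situ},i}}\right| \quad (4)$$
where y in-situ is the measured Chl-a concentration, y retrieval is the value calculated by the retrieval model, ȳ in-situ is the mean of the measured values, and n is the number of samples.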
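The three indices can be computed as in the sketch below. It assumes the standard definitions: R2 as one minus the ratio of the residual sum of squares to the total sum of squares, and MAPE expressed as a fraction rather than a percentage, which is consistent with the reported value of 0.23. The function names are illustrative.

```python
import math


def r_squared(measured, retrieved):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - r) ** 2 for m, r in zip(measured, retrieved))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, retrieved):
    """Root mean square error, in the same units as the inputs."""
    n = len(measured)
    return math.sqrt(sum((r - m) ** 2 for m, r in zip(measured, retrieved)) / n)


def mape(measured, retrieved):
    """Mean absolute percentage error as a fraction; measured values must be nonzero."""
    n = len(measured)
    return sum(abs((r - m) / m) for m, r in zip(measured, retrieved)) / n
```

Because MAPE divides by the measured value, it weights errors at low Chl-a concentrations more heavily than RMSE does, which is one reason to report both.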

Results of Correlation Analysis
SPSS was used to conduct a Pearson correlation analysis between the reflectance of each band or band combination and the measured Chl-a concentration data to determine the significant variables of the retrieval model; the results are given in Table 5. Table 5 shows that 1/B4 had the highest correlation, with correlation coefficients reaching 0.745 and 0.781, and was significantly correlated with the measured Chl-a concentration at the 99% confidence level. Generally, the closer the absolute value of a Pearson correlation coefficient is to 1, the stronger the linear relationship between the band combination (model independent variable) and the Chl-a concentration (model dependent variable). Therefore, this study used the band combinations with the highest Pearson correlation coefficients (|r| > 0.7) for modeling analysis.

Model Validation
The bands and band combinations with correlation coefficients greater than 0.7 were set as candidate variables, and insignificant variables were removed by stepwise regression analysis. One to three extremely significant variables were retained to establish the candidate regression models. To further determine the optimal retrieval model, the model retrieval values were compared against measured data that had not been used in modeling. Based on the LOOCV method, the statistical values of the model validation results are shown in Table 6, and the scatter plots of the retrieval values and measured values of the six regression models are shown in Figure 2. Among all models, the regression of Ln(Chl-a) on 1/B4 has the best performance (R2 = 0.67, RMSE = 0.70 µg/L, MAPE = 0.23), and the samples are concentrated near the 95% prediction interval. The formula is as follows:
Ln(Chl-a) = 487 × X − 0.86, X = 1/B4 (5)
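To make Equation (5) concrete, the sketch below inverts the logarithm to obtain Chl-a in µg/L from a band 4 value. The function name is hypothetical, and the scaling of the B4 input is an assumption: the text does not state it, so the value must be in the same units as the B4 values used to fit the model.

```python
import math

# Coefficients taken from Equation (5): Ln(Chl-a) = 487 * (1 / B4) - 0.86
SLOPE = 487.0
INTERCEPT = -0.86


def chl_a_from_b4(b4):
    """Return Chl-a (µg/L) from a band 4 (NIR) value via Equation (5).

    Assumes b4 uses the same scaling as the B4 values behind the fitted model.
    """
    if b4 <= 0:
        raise ValueError("B4 must be positive for 1/B4 to be meaningful")
    return math.exp(SLOPE / b4 + INTERCEPT)
```

Because the predictor is 1/B4, the retrieved concentration falls as the band 4 value rises, and applying the function pixel by pixel over water would yield a concentration map like those discussed in the following sections.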

Spatial Variation
The retrieval model in Equation (5) was used to retrieve the Chl-a concentration in Qinzhou Bay, and the Chl-a concentration distribution from October 2017 to February 2018 is shown in Figures 3-5.

According to Figures 3 and 4, low Chl-a concentrations dominate most areas of Qinzhou Bay. Generally, higher concentrations were found offshore, while lower concentrations were found near the shore, and concentrations decreased with time. These findings are consistent with previous studies [48].
Due to the Venturi effect, the water velocity along the northern coast is faster than in other areas [49]. Most of the nutrients carried by the surface runoff are therefore transported offshore and fully mixed with the seawater, providing a material basis for the growth and reproduction of phytoplankton [50][51][52]. In addition, aquaculture is practiced near the coast [53,54], and filter feeding by fish, shrimp, shellfish, and other animals exerts an important control on the biomass of phytoplankton [55,56]. Together, these factors cause the concentrations to be low near the shore but high offshore.
As the seasons change, the temperature gradually decreases from 27 °C in October to 18 °C in February. Phytoplankton lack a suitable growth environment and the concentration of Chl-a decreases as a whole over this period.

Details of Change in Chl-a Concentration
Based on Figure 3a, in October 2017, the Chl-a concentration in Qinzhou Bay was high overall, with the highest concentration reaching 6.70 µg/L. The Chl-a concentrations increased gradually from north to south, which may be due to the ebb and flow of the tide [57]. As shown in Table 7, the tide was ebbing at this time and reached its lowest point at noon, after which it began to rise gradually. The ebb tide carries Qinzhou Bay's phytoplankton out to sea with the water. In addition, due to the impact of severe typhoon Khanun in October 2017 [58], the water velocity increased, and a large number of phytoplankton moved with the seawater from east to west or southwest. Therefore, the concentration of Chl-a is higher in the southwest. Based on Figure 3b, during November 2017, the highest Chl-a concentration was 3.55 µg/L (no more than 5 µg/L), and the Chl-a distribution in most areas was relatively uniform (approximately 2.2 µg/L). However, the area with higher Chl-a concentration appeared as a 'ring', as shown in Figure 5, which may be due to the influence of the hydraulic conditions. October-November is the transitional period between anticyclonic circulation and cyclonic circulation in Qinzhou Bay [60,61], and the wind weakens. Affected by the southwestward coastal circulation, the offshore water flows toward the shore and mixes with the runoff input to produce upwelling [62]. This triggers the vertical exchange of nutrients from the seafloor, providing suitable conditions for the growth and reproduction of phytoplankton and algae.
Based on Figure 3c,d, the concentration of Chl-a in Qinzhou Bay from December 2017 to February 2018 was generally low (1.35 µg/L). In winter, the temperature in Qinzhou Bay falls to its lowest value of 18 °C, and the flow of the river into the sea weakens due to the decrease in rainfall, resulting in a decrease in the input of nitrogen and phosphorus nutrients from land sources [63]. The lack of suitable temperature and nutrients potentially limits phytoplankton production. The distribution of the Chl-a concentration is affected by various factors, and it is difficult to analyze the spatiotemporal variation patterns by relying solely on retrieval images from four months. Seasonal changes, weather changes, hydraulic conditions, and other factors may cause fluctuations in the Chl-a content.

Conclusions
In this study, the 16 m resolution GF-1 remote sensing image was used to construct a Chl-a concentration model for Qinzhou Bay to explore the spatiotemporal variations in the Chl-a concentration of Qinzhou Bay, and the following conclusions were drawn: (1) The regression retrieval model with 1/B4 variable was the best, with fitting coefficient (R 2 ), root mean square error (RMSE), and mean absolute percentage error (MAPE) values of 0.67, 0.70 µg/L and 0.23, respectively, which can meet the requirements of retrieving the Chl-a concentration in Qinzhou Bay via remote sensing.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Not applicable.

Conflicts of Interest: The authors declare no conflict of interest.