1. Introduction
Islands account for only 5% of the world's land area, but they support more than 20% of its biodiversity and provide abundant marine resources for human beings [1,2]. Under the influence of human activities such as urban expansion, industrial land use, agriculture, fisheries, and tourism, the seawater of island archipelagos can become seriously polluted, which aggravates the vulnerability and instability of archipelago ecosystems [3,4,5,6,7,8,9]. Dissolved inorganic nitrogen (DIN) is the main index of seawater pollution in archipelago ecosystems [10]. Excess DIN may lead to seawater acidification [11] and eutrophication [12], which can poison marine organisms [13,14]. Therefore, monitoring the DIN content in the seawater of archipelago ecosystems is of great significance for protecting marine organisms and maintaining the ecological environment.
Traditional seawater DIN monitoring uses ship-borne or buoy-mounted equipment to measure the contents of ammonia nitrogen (NH4-N), nitrite nitrogen (NO2-N), and nitrate nitrogen (NO3-N) and then sums the contents of the three inorganic nitrogen forms/species [15,16]. This method is limited by factors such as spatial location, weather, manpower, and equipment cost. As a result, large-scale, high-density, and high-frequency monitoring cannot be performed, and it is difficult to obtain a spatially continuous DIN distribution over wide areas [16,17]. With the development of remote sensing technology, the spectral information received by remote sensing satellites can be used to establish a physical, empirical, or semi-analytical model of surface seawater quality, and the results have the advantages of high spatial coverage, spatio-temporal continuity, and low cost [18,19]. Nitrogen has only a weak optical response [20], so many researchers have used empirical models to estimate the content of individual nitrogen compounds or of mixed nitrogen in surface water.
Isenstein and Park [21] and Li et al. [22] developed linear regression equations for the inversion of the total nitrogen in small water bodies such as lakes and reservoirs, but the R² was only 0.75. Yu et al. [23] divided a large sea area into three smaller regions and constructed a stepwise regression model for each, obtaining an R² fitting accuracy of greater than 0.77 for all three regions. However, in recent years, researchers have found that the relationship between the dissolved nutrients or mixed nitrogen and the band reflectivity in surface water is nonlinear, so the accuracy of linear fitting is unsatisfactory. Meanwhile, machine learning models have achieved good fitting results [18]. For example, Huang et al. [19] used a back propagation neural network (BPNN) model to invert the DIN in Shenzhen Bay in China, obtaining a fitting accuracy R² of 0.9; Guo et al. [17] used a variety of machine learning methods to explore the total nitrogen of small lakes, and the highest fitting accuracy R² reached 0.88; Wang et al. [24] established a support vector machine regression model for NH4-N content in small rivers, and the fitting accuracy R² was as high as 0.98; and Vakili et al. [25] constructed an artificial neural network (ANN) model for reservoir total nitrogen, and the fitting accuracy R² reached 0.93. These machine learning models can perform well in small water bodies, but because the known data are limited in spatial coverage and quantity, they cannot resolve the complex differences in optical properties between different regions of medium- and large-scale water bodies [26]. In view of this, researchers have attempted to add geographical information elements to the inversion of large-scale sea areas to account for the nonlinear spatial distribution of water quality. For example, Du et al. [27] proposed a geographically neural network weighted regression (GNNWR) model to evaluate the water quality of the Zhejiang coastal sea in China, obtaining a model accuracy R² of >0.84. Du et al. [28] used a geographically and cycle-temporally weighted regression (GCTWR) model to explore the spatially continuous distribution of chlorophyll in coastal waters, and the fitting accuracy R² was 0.8721. However, these large-scale models cannot capture small- and medium-scale variations in content, and they require a large volume of measured data. Island sea areas are mesoscale water bodies, lying between small- and large-scale water bodies. On the one hand, the water quality of mesoscale water bodies resembles that of large-scale sea areas, in that it is affected by the horizontal and vertical movement of the ocean currents and by unbalanced primary productivity, and its spatial distribution can be complex in different regions; simple spatial regression models struggle to smooth this complex spatial nonlinearity [27,28]. On the other hand, the volume of measured data in mesoscale research areas is often small, making it difficult to establish suitable water quality inversion models for mesoscale sea areas [29,30].
At the same time, the existing inversion models rely on simple band selection or simple band combination. For example, Huang et al. [19] directly utilized the b (coastal), b (blue), b (green), b (red), and b (NIR) bands of Landsat 8 and the b (blue), b (green), b (red), and b (NIR) bands of Landsat 5; Guo et al. [17] adopted the Sentinel-2 B3, B4, B5, B6, B7, and B8 bands; Wang et al. [24] used the R1, R2, R3, and R4 bands of SPOT-5; Du et al. [28] used the Moderate Resolution Imaging Spectroradiometer (MODIS) B1, B3, B5, and B7 bands; Vakili et al. [25] used Landsat 8 Band 4 and the band ratio of Band 3/Band 2; and Torbick et al. [31] used the Landsat Thematic Mapper (TM) TM3 band and the TM1/TM3 and TM3/TM1 band ratios. However, the reflectance of remote sensing bands can exhibit the phenomenon of different spectra for the same object, and even characteristic bands that are all highly correlated with nitrogen carry different amounts of nitrogen information. Simple band selection and band combination only consider whether a band has a high contribution and ignore the differences between the contributions of these characteristic bands during the inversion process, which reduces the fitting accuracy. Therefore, it is necessary to analyze the contribution of the related bands to nitrogen and to use a weighted calculation in which the band information with a large contribution and low error is retained in the fitting regression calculation.
Moreover, NH4-N, NO2-N, and NO3-N pose different risks to marine organisms: NH4-N can lead to asphyxia, acidosis, and decreased blood oxygen in fish [32,33,34]; NO2-N can lead to serious electrolyte imbalances in aquatic animals [35,36]; and the toxicity of NO3-N to aquatic organisms is extremely low and almost negligible [12]. When the dissolved oxygen (DO) content in seawater is normal, nitrification occurs: NH4-N is converted into NO2-N and then into NO3-N, and the seawater quality gradually improves. When the DO content is insufficient, denitrification occurs, and the water quality cannot improve as the NO2-N content increases [35,37,38]. To date, the continuous distribution of NH4-N, NO2-N, and NO3-N has not been mapped synchronously in the existing research. Neither an individual nitrogen compound nor the total DIN content alone allows a comprehensive analysis of seawater toxicity or a detailed judgment of the water purification situation. Therefore, it is necessary to monitor the contents of NH4-N, NO2-N, and NO3-N synchronously, which not only determines whether the DIN content exceeds the standard but also allows a more thorough judgment of the specific state of water pollution at that time.
In this paper, to synchronously monitor the contents of the three inorganic nitrogen forms/species in mesoscale archipelago waters, while accounting for the nonlinear spatial distribution of the nitrogen compounds and using more of the effective information in the selected feature bands, we propose a multiple weighted regression model considering spatial characteristics (S-WSVR). In the S-WSVR model, the spatial features and correlated bands are taken as the input parameters, and a multivariate weighting module based on the Mahalanobis distance is added to help the support vector regression (SVR) model calculate the weight relationship between the input bands and the target parameters; the regression relationship is then established by finding the optimal hyperplane. The coastal area of the Changshan Islands, China, was selected as the experimental area, and 320 sets of measured NH4-N, NO2-N, and NO3-N data from 2013 to 2022 were collected. The band reflectance of medium spatial resolution Landsat 8 multi-spectral remote sensing imagery was matched to the measurements according to the time and spatial scale. Three experimental datasets were established, one each for NH4-N, NO2-N, and NO3-N, each containing the correlated bands and the spatial characteristics. Finally, the NH4-N, NO2-N, and NO3-N regression models were obtained by splitting the data into training and testing sets at an 8:2 ratio. In addition, we mapped the spatial and temporal distribution of NH4-N, NO2-N, and NO3-N in the archipelagic waters of the study area during the spring and autumn based on the resulting regression models. We also analyzed the distribution of the three types of nitrogen using the “spatial quantification of the relationship between human activities and marine ecosystems” (SQRHM) model and the dissolved oxygen content.
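As a rough illustration of the workflow outlined above, the following Python sketch stacks spatial features and band reflectances, applies a Mahalanobis-distance-based transformation, and fits an SVR on an 8:2 split. It is not the S-WSVR implementation: this section does not give the weighting formula, so the sketch assumes one simple option, namely whitening the inputs with the inverse covariance so that the Euclidean distance inside the RBF kernel equals the Mahalanobis distance between samples. The data are synthetic, and the helper names (`fit_whitener`, `whiten`), coordinate ranges, number of bands, and SVR hyperparameters are all hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR


def fit_whitener(X, ridge=1e-8):
    """Return (mean, L) with inv(cov) = L @ L.T, estimated from X.

    Assumption: this whitening step stands in for the paper's
    Mahalanobis-based weighting module, whose formula is not given here.
    """
    mean = X.mean(axis=0)
    # A small ridge keeps the covariance safely invertible.
    cov = np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1])
    inv_cov = np.linalg.inv(cov)
    L = np.linalg.cholesky(inv_cov)  # lower triangular factor
    return mean, L


def whiten(X, mean, L):
    """Map X so that Euclidean distance equals Mahalanobis distance."""
    return (X - mean) @ L


# Synthetic stand-in data, NOT the measured dataset: 320 samples with two
# spatial features and three band reflectances (all values hypothetical).
rng = np.random.default_rng(0)
n_samples = 320
lon = rng.uniform(122.0, 123.5, n_samples)
lat = rng.uniform(38.8, 39.6, n_samples)
bands = rng.uniform(0.01, 0.20, size=(n_samples, 3))
X = np.column_stack([lon, lat, bands])

# Hypothetical nonlinear target standing in for a nitrogen concentration.
y = (
    0.3
    + 2.0 * bands[:, 0] * bands[:, 2]
    + 0.1 * np.sin(5.0 * lon) * np.cos(5.0 * lat)
    + rng.normal(0.0, 0.01, n_samples)
)

# 8:2 split into training and testing sets, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Estimate the transformation on the training set only, then apply to both.
mean, L = fit_whitener(X_train)
Z_train = whiten(X_train, mean, L)
Z_test = whiten(X_test, mean, L)

# RBF-kernel SVR; the hyperparameters are illustrative, not tuned.
model = SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma="scale")
model.fit(Z_train, y_train)
y_pred = model.predict(Z_test)

r = np.corrcoef(y_test, y_pred)[0, 1]
rmse = np.sqrt(np.mean((y_test - y_pred) ** 2))
print(f"r = {r:.4f}, RMSE = {rmse:.4f}")
```

Fitting the transformation on the training set alone avoids leaking test-set statistics into the model. Under the same assumptions, one such model would be trained per nitrogen form, each with its own set of correlated bands.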
5. Conclusions
In this paper, a multiple weighted regression (S-WSVR) model taking into account spatial information has been proposed to monitor the continuous distribution of NH4-N, NO2-N, and NO3-N in the surface layer of the seawater in a mesoscale archipelagic environment. The model smooths the spatial complexity of the mesoscale water quality distribution and retains more useful information from the characteristic bands. Based on the Mahalanobis distance and mathematical and statistical analysis, the contribution of the different characteristic wavebands was calculated, and the spatial information was utilized as one of the input parameters. The accuracy of the experimental results for NH4-N, NO2-N, and NO3-N was better than that of the original model, with r-values of 0.9063, 0.8900, and 0.9755 and RMSEs of 0.2097 mg/L, 0.1230 mg/L, and 0.1573 mg/L, respectively, which represent an increase in r-value of 43.53%, 45.35%, and 52.10% and a decrease in RMSE of 0.0487 mg/L, 0.2977 mg/L, and 0.1571 mg/L, respectively, compared with the original model. The spatial information improved the accuracy the most, while the multivariate weighting resulted in a small improvement, which indicates that the three nitrogen compounds are nonlinear and heterogeneous in spatial distribution and that the characteristic bands contribute differently to the calculation. Moreover, the inversion results for the three nitrogen compounds were summed and compared to the measured DIN concentration, obtaining an r-value of 0.9028. In addition, we obtained the Landsat 8 characteristic bands associated with each of the three nitrogen compounds: B2, B3, B4, B5, B6, and B7 for NH4-N; B3, B6, and B7 for NO2-N; and B5, B6, and B7 for NO3-N.
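The two accuracy measures reported above, and the comparison of the summed inversion results against measured DIN, can be sketched in a few lines of Python. This is a minimal illustration only: the function names and all concentration values below are hypothetical placeholders and are not taken from the study.

```python
import numpy as np


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two 1-D arrays."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    obs_c = obs - obs.mean()
    pred_c = pred - pred.mean()
    return float(np.sum(obs_c * pred_c) / np.sqrt(np.sum(obs_c**2) * np.sum(pred_c**2)))


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


# Hypothetical inversion results for four sampling points (mg/L).
nh4_pred = np.array([0.10, 0.20, 0.15, 0.30])
no2_pred = np.array([0.02, 0.03, 0.02, 0.05])
no3_pred = np.array([0.25, 0.40, 0.30, 0.55])

# DIN is the sum of the three inorganic nitrogen forms.
din_pred = nh4_pred + no2_pred + no3_pred

# Hypothetical measured DIN at the same points (mg/L).
din_obs = np.array([0.35, 0.66, 0.45, 0.92])

print(f"r = {pearson_r(din_obs, din_pred):.4f}")
print(f"RMSE = {rmse(din_obs, din_pred):.4f} mg/L")
```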
Furthermore, we input the 2013–2022 images into the S-WSVR model to map the spatial and temporal distributions of NH4-N, NO2-N, and NO3-N, and then analyzed the distribution patterns of the three nitrogen compounds in the Changshan Islands area from the resulting maps. It was found that the whole Changshan Islands sea area has been polluted by DIN since 2014, but the seawater is still in a relatively healthy state. The SQRHM analysis revealed that the intensity of human activities had a greater impact on NH4-N and NO3-N. Daily life on the islands, tourism development, and harbor shipping were the main sources of pollution, while the shellfish and algae on surface culture floating rafts were conducive to nitrogen purification. In addition, the water discharged by the Yalu River is anoxic, and the closer the sea area is to the Yalu River estuary, the lower the DO content and the higher the NO2-N content. Overall, the concentration of inorganic nitrogen pollution decreases with distance from human activities, and it is necessary to regulate human activities and the Yalu River discharge in order to protect the ecological environment of this sea area.
Consequently, this study shows that the S-WSVR model can comprehensively monitor the contents of the three inorganic nitrogen forms/species in seawater at the island archipelago scale, and thereby the overall seawater DIN condition, so it can be used to monitor the seawater environment. However, the similarity between the relevant bands of the model and the inorganic nitrogen forms/species changes with the external conditions under which the data are measured. The contribution of the relevant bands to the different inorganic nitrogen forms/species therefore changes as well, so the weight coefficients need to be adjusted when using the modified model, which requires more quasi-real-time measured data. In the future, the input parameters could be adjusted to improve the correlation with the inorganic nitrogen forms/species by first combining the bands and then calculating their similarity with the inorganic nitrogen forms/species, so that more valid information is retained and the accuracy is further improved.